Merge pull request #1333 from millingw/master
Merge ClusterAPI notes into main branch
millingw authored Jan 22, 2025
2 parents ada2ccf + 531ffd8 commit b0c637a
Showing 9 changed files with 1,361 additions and 0 deletions.
102 changes: 102 additions & 0 deletions notes/millingw/ClusterAPIScripts/20250122-manila-test.txt
# test access to cephfs service
We should be able to access ceph shares directly in a pod.
However, as of 2025-01-22 this wasn't working!

In the Horizon GUI, manually create a share. Create a cephx access rule, then copy the access key and the full storage path.
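The same can be done from the command line, assuming the python-manilaclient plugin for the OpenStack CLI is installed; a hedged sketch, where the share name, size, and cephx user are purely illustrative:

```
# Illustrative only: share name, size and cephx user are placeholders.
openstack share create --name test-cephfs-share CephFS 10
openstack share access create test-cephfs-share cephx kubernetes-test-user
openstack share access list test-cephfs-share            # shows the access key
openstack share export location list test-cephfs-share   # shows the full storage path
```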

Create a secret containing the access key:

ceph-secret.yaml
```
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
stringData:
  key: ****
```
```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f ceph-secret.yaml
```

Create a test pod that mounts the ceph share as a volume. The ceph share path needs to be separated into a list of monitor addresses and the relative path, e.g.:

pod.yaml

```
---
apiVersion: v1
kind: Pod
metadata:
  name: test-cephfs-share-pod
spec:
  containers:
    - name: web-server
      image: nginx
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: testpvc
          mountPath: /var/lib/www
        - name: cephfs
          mountPath: "/mnt/cephfs"
  volumes:
    - name: testpvc
      persistentVolumeClaim:
        claimName: test-cephfs-share-pvc
        readOnly: false
    - name: cephfs
      cephfs:
        monitors:
          - 10.4.200.9:6789
          - 10.4.200.13:6789
          - 10.4.200.17:6789
          - 10.4.200.25:6789
          - 10.4.200.26:6789
        secretRef:
          name: ceph-secret
        readOnly: false
        path: "/volumes/_nogroup/ca890f73-3e33-4e07-879c-f7ec0f5a8a17/52bcd13b-a358-40f0-9ffa-4334eb1e06ae"
```
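Splitting the export path is mechanical: everything before the final `:/` is the comma-separated monitor list, and the remainder is the share path. A small shell sketch using the path copied from Horizon above:

```
# Split a CephFS export location into its monitor list and share path.
EXPORT="10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/ca890f73-3e33-4e07-879c-f7ec0f5a8a17/52bcd13b-a358-40f0-9ffa-4334eb1e06ae"
MONITORS="${EXPORT%%:/*}"    # drop everything from the first ':/' -> monitor list
SHARE_PATH="/${EXPORT#*:/}"  # drop everything up to the first ':/' -> path
echo "$MONITORS"
echo "$SHARE_PATH"
```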

The example uses nginx, so install it:

```
helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig nginx bitnami/nginx
```

Deploy the pod:
```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f manila-csi-kubespray/pod.yaml
```

Inspect the pod to verify that the ceph share was successfully mounted.
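For example, with the pod and mount point defined above:

```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig describe pod test-cephfs-share-pod
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig exec test-cephfs-share-pod -- df -h /mnt/cephfs
```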

# test jhub deployment, check where user areas get created

Deploy jhub, then check where the user area is created:

```
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig upgrade --install jhub jupyterhub/jupyterhub --version=3.3.8
```
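Wait for the hub and proxy pods to reach Running before port-forwarding (a quick check; pod names will vary):

```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pods
```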

# port forward on control VM
```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace=default port-forward service/proxy-public 8080:http
```

# port forward on laptop:
```
ssh -i "gaia_jade_test_malcolm.pem" -L 8080:127.0.0.1:8080 [email protected]
```
Browse to 127.0.0.1:8080 and log in, e.g. as user 'hhh'.

# on control VM, list pvs/pvcs
```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                STORAGECLASS        VOLUMEATTRIBUTESCLASS   REASON   AGE
pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f   10Gi       RWO            Delete           Bound    default/claim-hhh    csi-manila-cephfs   <unset>                          6h51m
pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9   1Gi        RWO            Delete           Bound    default/hub-db-dir   csi-manila-cephfs   <unset>                          6h56m

kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pvc
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        VOLUMEATTRIBUTESCLASS   AGE
claim-hhh    Bound    pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f   10Gi       RWO            csi-manila-cephfs   <unset>                 6h52m
hub-db-dir   Bound    pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9   1Gi        RWO            csi-manila-cephfs   <unset>                 6h58m
```
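The backing share path for a user area is recorded in the PV's CSI volume attributes, e.g. for the claim-hhh volume above:

```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pv pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f -o jsonpath='{.spec.csi.volumeAttributes}'
```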



22 changes: 22 additions & 0 deletions notes/millingw/ClusterAPIScripts/KubeadmConfigTemplate.yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: iris-gaia-red-ceph-md-0
  namespace: default
spec:
  template:
    spec:
      mounts: []
      preKubeadmCommands: ["apt-get update;", "apt-get install ceph-common -y;", "mkdir -p /mnt/kubernetes_scratch_share", "echo 10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/280b44fc-d423-4496-8fb8-79bfc1f58b97/35e407e9-a34b-4c64-b480-3380002d64f8 /mnt/kubernetes_scratch_share ceph name=kubernetes-scratch-share,noatime,_netdev 0 2 >> /etc/fstab"]
      files:
        - path: /etc/ceph/ceph.conf
          content: |
            [global]
            fsid = a900cf30-f8a3-42bf-98d6-af7ce92f1a1a
            mon_host = [v2:10.4.200.13:3300/0,v1:10.4.200.13:6789/0] [v2:10.4.200.9:3300/0,v1:10.4.200.9:6789/0] [v2:10.4.200.17:3300/0,v1:10.4.200.17:6789/0] [v2:10.4.200.26:3300/0,v1:10.4.200.26:6789/0] [v2:10.4.200.25:3300/0,v1:10.4.200.25:6789/0]
        - path: /etc/ceph/ceph.client.kubernetes-scratch-share.keyring
          content: |
            [client.kubernetes-scratch-share]
            key = **REDACTED**
      postKubeadmCommands: ["sudo mount -a"]
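A quick way to confirm the scratch share is mounted on a node after it joins (an assumed check, not part of the original notes; `<node-ip>` is a placeholder and the `ubuntu` login assumes the Ubuntu images used elsewhere in these notes):

```
# <node-ip> is a hypothetical placeholder for a worker address.
ssh ubuntu@<node-ip> findmnt /mnt/kubernetes_scratch_share
```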
46 changes: 46 additions & 0 deletions notes/millingw/ClusterAPIScripts/Readme.MD
## ClusterAPI build scripts

Building a cluster involves multiple steps and many configuration files, and each site we deploy to is likely to have different storage configurations, networks, and credentials.
Here I collect together the set of config files for each site we are deploying to, and use a single deployment script, build_my_cluster.sh, to make deployment a bit less manual.
build_my_cluster.sh assumes that all preparatory work has already been done, i.e. a management cluster has been created, compatible ClusterAPI images have been built and tested in the target OpenStack environments, and a cluster template has been generated.
The following tools must be installed before running the script: kubectl, clusterctl, and the openstack CLI.
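A quick sanity check that the tools are available (illustrative):

```
kubectl version --client
clusterctl version
openstack --version
```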

The script reads a config file, which sets all the necessary environment variables that the script expects:

```
export KUBECONFIG=<path to the KUBECONFIG file for our ClusterAPI management cluster>
export CLUSTER_NAME=<name of the cluster we are deploying, must match the cluster specification file>
export CLUSTER_SPECIFICATION_FILE=<generated ClusterAPI specification file, contains all the credentials and templates for our cluster creation>
export CLUSTER_CREDENTIAL_FILE=<path to configuration file containing credentials for target OpenStack project and load balancer / networking configuration>
export CINDER_SECRETS_FILE=<path to credentials file for installing the Cinder storage driver into our cluster (we assume OpenStack will always have Cinder available)>
```
The following is Manila-specific. On Arcus and Somerville the Manila service is available, which gives us access to CephFS. Other sites may not provide it, in which case set USE_MANILA=false:
```
USE_MANILA=true
MANILA_PROTOCOLS_FILE=values.yaml
MANILA_SECRETS_FILE=secrets.yaml
MANILA_STORAGE_CLASS_FILE=sc.yaml
DEFAULT_STORAGE_CLASS=manila
```
Running ./build_my_cluster.sh will build a new cluster in the targeted OpenStack project.
The following stages are run:
* Build the initial cluster
* Wait for the initial control plane to become available
* Wait for the basic service to start
* Install the cluster network plugin (Calico)
* Wait for initialisation
* Install the Cinder storage driver
* (Optionally install Manila storage driver)
* Wait for all workers to join

Cluster creation can be monitored with clusterctl, i.e. `clusterctl describe cluster $CLUSTER_NAME`.
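The wait stages can be scripted against the management cluster; a minimal sketch of the kind of loop build_my_cluster.sh might use (an assumption, not the actual script):

```
# Hypothetical wait loop; assumes KUBECONFIG points at the management cluster.
until kubectl wait --for=condition=ControlPlaneReady "cluster/${CLUSTER_NAME}" --timeout=5m; do
  echo "control plane not ready yet, retrying..."
done
```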

Note that a cluster may be ready for use before all workers are ready; the script may loop indefinitely if the target project can't provide the requested number of workers.

On successful completion, the script outputs a KUBECONFIG file for the newly created cluster, which can then be used to install kubernetes services in the usual fashion.

The intention is to maintain a set of production scripts for each deployment site, with a separate master configuration file for each site to be sourced by the build script.
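A per-site run might then look like this (hypothetical; the config filename is illustrative):

```
# site-configs/arcus-red.conf is an illustrative per-site config file.
source site-configs/arcus-red.conf
./build_my_cluster.sh
```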

11 changes: 11 additions & 0 deletions notes/millingw/ClusterAPIScripts/appcred-iris-gaia-red-demo.conf
[Global]
auth-url=https://arcus.openstack.hpc.cam.ac.uk:5000
region="RegionOne"
application-credential-id="**REDACTED**"
application-credential-secret="**REDACTED**"

[LoadBalancer]
use-octavia=true
floating-network-id=d5560abe-c5d5-4653-a2f7-59636448f8fe
network-id=37ad320e-18e7-4fac-8538-3232c6eeeec4

176 changes: 176 additions & 0 deletions notes/millingw/ClusterAPIScripts/arcus-red-demo.yaml
apiVersion: v1
data:
  cacert: **REDACTED**
  clouds.yaml: **REDACTED**
kind: Secret
metadata:
  labels:
    clusterctl.cluster.x-k8s.io/move: "true"
  name: iris-gaia-red-demo-cloud-config
  namespace: default
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: iris-gaia-red-demo-md-0
  namespace: default
spec:
  template:
    spec:
      files: []
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cloud-provider: external
            provider-id: openstack:///'{{ instance_id }}'
          name: '{{ local_hostname }}'
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: iris-gaia-red-demo
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
    serviceDomain: cluster.local
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: iris-gaia-red-demo-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OpenStackCluster
    name: iris-gaia-red-demo
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: iris-gaia-red-demo-md-0
  namespace: default
spec:
  clusterName: iris-gaia-red-demo
  replicas: 7
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: iris-gaia-red-demo-md-0
      clusterName: iris-gaia-red-demo
      failureDomain: nova
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: OpenStackMachineTemplate
        name: iris-gaia-red-demo-md-0
      version: 1.30.2
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: iris-gaia-red-demo-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: external
      controllerManager:
        extraArgs:
          cloud-provider: external
    files: []
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
          provider-id: openstack:///'{{ instance_id }}'
        name: '{{ local_hostname }}'
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: external
          provider-id: openstack:///'{{ instance_id }}'
        name: '{{ local_hostname }}'
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: OpenStackMachineTemplate
      name: iris-gaia-red-demo-control-plane
  replicas: 3
  version: 1.30.2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackCluster
metadata:
  name: iris-gaia-red-demo
  namespace: default
spec:
  apiServerLoadBalancer:
    enabled: true
  externalNetwork:
    id: 57add367-d205-4030-a929-d75617a7c63e
  identityRef:
    cloudName: iris-gaia-red
    name: iris-gaia-red-demo-cloud-config
  managedSecurityGroups:
    allNodesSecurityGroupRules:
      - description: Created by cluster-api-provider-openstack - BGP (calico)
        direction: ingress
        etherType: IPv4
        name: BGP (Calico)
        portRangeMax: 179
        portRangeMin: 179
        protocol: tcp
        remoteManagedGroups:
          - controlplane
          - worker
      - description: Created by cluster-api-provider-openstack - IP-in-IP (calico)
        direction: ingress
        etherType: IPv4
        name: IP-in-IP (calico)
        protocol: "4"
        remoteManagedGroups:
          - controlplane
          - worker
  managedSubnets:
    - cidr: 10.6.0.0/24
      dnsNameservers:
        - 8.8.8.8
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: iris-gaia-red-demo-control-plane
  namespace: default
spec:
  template:
    spec:
      flavor: gaia.vm.cclake.4vcpu
      image:
        filter:
          name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2
      sshKeyName: iris-malcolm-kube-test-keypair
      rootVolume:
        sizeGiB: 100
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OpenStackMachineTemplate
metadata:
  name: iris-gaia-red-demo-md-0
  namespace: default
spec:
  template:
    spec:
      flavor: gaia.vm.cclake.54vcpu
      image:
        filter:
          name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2
      sshKeyName: iris-malcolm-kube-test-keypair
      rootVolume:
        sizeGiB: 200
