From ad728b71ebe80d2d097e64b21b43064fa6ccddfd Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Thu, 12 Dec 2024 16:51:02 +0000 Subject: [PATCH 01/12] Add files via upload Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/DeployClusterAPI.md | 798 +++++++++++++++++++++++++++++ 1 file changed, 798 insertions(+) create mode 100644 notes/millingw/DeployClusterAPI.md diff --git a/notes/millingw/DeployClusterAPI.md b/notes/millingw/DeployClusterAPI.md new file mode 100644 index 00000000..c696795b --- /dev/null +++ b/notes/millingw/DeployClusterAPI.md @@ -0,0 +1,798 @@ +# Deploy Kubernetes Cluster on Arcus with ClusterCtl + +Based on Amy's notes https://git.ecdf.ed.ac.uk/akrause/openstack-bits-and-pieces/-/blob/main/ClusterAPI/CreateCluster.md + +Manila deployment based on Paul Browne's notes https://gitlab.developers.cam.ac.uk/pfb29/manila-csi-kubespray + +Used VM "gaia_dataset_one" in somerville gaia_jade project as command and control VM. + +Management cluster created in Somerville gaia_jade project using CAPI Magnum command line client, although management cluster could in theory be anywhere with vpn access. + +Prerequisites: + +Existing kubernetes cluster (management cluster): used existing cluster "malcolm_k8s" on somerville, created using Magnum python client. +However, process for creating initial cluster should not matter here. +Access to target OpenStack instance where new cluster will be generated. +A source recent ubuntu image must already be present in the target OpenStack project. +These notes assume a useable project-level router has already been provisioned in the target OpenStack project. + +Required software: +On command / control machine, need to install python, ansible, kubectl, clusterctl, packer (and dependencies). +Need ansible / packer to build images on target OpenStack instance +Need clusterctl for cluster template generation / deployment +Need kubeconfig for management cluster, access credentials for target openstack cluster. + +On gaia_dataset_one VM (on Somerville): + +Install kubectl: + +``` +curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" +sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl +``` + +Install dependencies +``` +pip install python-dev +pip install python-openstackclient +pip install python-magnumclient +pip install ansible +sudo dnf install make +sudo dnf install git +sudo dnf install wget +sudo dnf install yq +``` +# Create and export boostrap cluster details so that we can access it with kubectl (assuming clouds.yaml etc already points to bootstrap OpenStack instance) +``` +openstack coe cluster config --dir /home/rocky/openstack/k8sdir --force --output-certs malcolm_k8s --os-cloud somerville-jade +export KUBECONFIG=/home/rocky/openstack/k8sdir/config +KUBECONFIG now points at our (yet-to-be-initialised) management cluster +``` +# Install clusterctl: + +``` +curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.8.1/clusterctl-linux-amd64 -o clusterctl +sudo install -o root -g root -m 0755 clusterctl /usr/local/bin/clusterctl +``` + +Initialise the management cluster for deploying k8s into OpenStack clouds. +This turns our starting magnum-created kubernetes cluster into a ClusterAPI management cluster. + +``` +clusterctl init --infrastructure openstack +``` + +Our cluster on Somerville is now our management cluster. 
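As a quick sanity check before moving on, it is worth confirming that the ClusterAPI and OpenStack provider controllers came up; the namespace names below assume the default layout created by `clusterctl init`:

```
kubectl get pods -n capi-system
kubectl get pods -n capo-system
clusterctl version
```

All pods should reach the Running state before attempting any cluster deployments.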
+ +# Build CAPI image in target OpenStack environment: + +Next, we need to build a control image in our target OpenStack environment + +Install Packer on command/control VM: + +``` +curl https://releases.hashicorp.com/packer/1.11.2/packer_1.11.2_linux_amd64.zip --output packer_1.11.2_linux_amd64.zip +unzip packer_1.11.2_linux_amd64.zip +cd packer +sudo mv packer /usr/local/bin/packer +``` + +Create reqs-build.pkr.hcl + +``` +packer { + required_plugins { + openstack = { + version = ">= 1.1.2" + source = "github.com/hashicorp/openstack" + } + } +} +packer { + required_plugins { + ansible = { + version = ">= 1.1.1" + source = "github.com/hashicorp/ansible" + } + } +} + +packer init reqs-build.pkr.hcl +``` + +create packer_var_file.json, edited for arcus red project + +Note that I had to add packer_build_ingest security group to arcus project to allow ssh access for packer to build image +"networks" is existing router in OpenStack project, did not have to create this +CUDN-Internet is existing floating ip pool name in gaia red project +Had to work out flavor and image name from looking at options in the arcus gaia red OpenStack project and doing some trial VM creations to get good combinations +source_image has to be the name of an existing Ubuntu image in the target OpenStack project +image_name is the name of the CAPI magnum image that will be built in the target OpenStack project (ie a new image will be built with this name) + +``` +{ + "source_image": "Ubuntu-Jammy-22.04-20240514", + "network_discovery_cidrs": "10.1.0.0/24", + "networks": "77c534e1-1de2-400b-a315-9d1c9768c99f", + "flavor": "gaia.vm.cclake.26vcpu", + "floating_ip_network": "CUDN-Internet", + "image_name": "Ubuntu-Jammy-22.04-20240514-kube-1.30.2", + "image_visibility": "private", + "image_disk_format": "raw", + "volume_type": "", + "ssh_username": "ubuntu", + "kubernetes_deb_version": "1.30.2-1.1", + "kubernetes_semver": "v1.30.2", + "kubernetes_series": "v1.30", + "security_groups": "packer_build_ingest" +} +``` + +build the CAPI image in the target OpenStack project: + +``` +cd image-builder/images/capi +PACKER_VAR_FILES=/path/to/packer_var_file.json make build-openstack-ubuntu-2204 +take some time to run, generates new image Ubuntu-Jammy-22.04-20240514-kube-1.30.2 in the target OpenStack project +Check in the OpenStack project that the image built ok (either via the openstack client, or via the Horizon GUI for the target OpenStack project +``` + +# Create new Kubernetes cluster for actual use + +The following assumes the management cluster is up and running. + +## Create application credentials + +Create application credentials in Openstack for the target project (here, iris-gaia-red on Arcus) where the Kubernetes cluster will be created and store in `arcus-red.yaml`. 
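If a suitable application credential does not already exist, one way to create it is with the OpenStack command line client while authenticated against the target project (the credential name here is just an example); the command prints the id and secret that go into the clouds file below:

```
openstack application credential create capi-iris-gaia-red
```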
+ +``` +arcus-red.yaml +clouds: + + + iris-gaia-red: + auth: + auth_url: https://arcus.openstack.hpc.cam.ac.uk:5000 + application_credential_id: "*********" + application_credential_secret: "******" + region_name: "RegionOne" + interface: "public" + identity_api_version: 3 + auth_type: "v3applicationcredential" +``` + +## Set up environment + +Get the OpenStack API server certificates by browsing to the horizon interface, click on the padlock symbol, view certificates, download certificate chain +If necessary, create a new keypair in the OpenStack project that will used to access OpenStack during the cluster creation +Notes assume server certificates saved to arcus-openstack-hpc-cam-ac-uk.pem + +Create environment variable script for configuring clusterctl deployment. +Note that a value must be supplied for OPENSTACK_DNS_NAMESERVERS must be supplied for the config file generation; however, it may be necessary to edit or delete this from the generated config file (see below). +(We've seen that on Arcus the value is ignored, but on BSC it is used directly) + +``` +capi-arcus-red-vars.sh: + +#! /bin/bash + +b64encode(){ + # Check if wrap is supported. Otherwise, break is supported. + if echo | base64 --wrap=0 &> /dev/null; then + base64 --wrap=0 $1 + else + base64 --break=0 $1 + fi +} + +export OPENSTACK_CLOUD=iris-gaia-red +export OPENSTACK_CLOUD_YAML_B64=$( cat arcus-red.yaml | b64encode ) +export OPENSTACK_CLOUD_CACERT_B64=$( cat arcus-openstack-hpc-cam-ac-uk.pem | b64encode ) +export OPENSTACK_FAILURE_DOMAIN=nova +export OPENSTACK_EXTERNAL_NETWORK_ID=57add367-d205-4030-a929-d75617a7c63e +export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=vm.v1.small +export OPENSTACK_NODE_MACHINE_FLAVOR=gaia.vm.cclake.26vcpu +export OPENSTACK_IMAGE_NAME=Ubuntu-Jammy-22.04-20240514-kube-1.30.2 +export OPENSTACK_SSH_KEY_NAME=iris-malcolm-kube-test-keypair +export OPENSTACK_DNS_NAMESERVERS=8.8.8.8 + +export KUBERNETES_VERSION=1.30.2 + +# optional +export CLUSTER_NAME=iris-gaia-red +export CONTROL_PLANE_MACHINE_COUNT=3 +export WORKER_MACHINE_COUNT=4 +``` + +Source the above file to populate the environment variables: +``` +source capi-arcus-red-vars.sh +``` + +To interact with the management cluster, ensure that you are using the correct kubeconfig: +``` +export KUBECONFIG=/home/rocky/openstack/k8sdir/config +``` + +## Create ClusterAPI config + +# generate a template file for the new cluster using the environment variables we set +# capi-red.yaml will be an openstack-specific, project specific template file for building a new k8s cluster +# this does not actually create a cluster, just a new template for building a cluster + +clusterctl generate cluster iris-gaia-red > capi-red.yaml + +Note that we can't check the generated yaml file into public github, as it contains (base64-encoded) access credentials for OpenStack + +The DNS configuration isn't required although the generate script insists that the environment variable is set. +You can remove the dns server reference from the config yaml ("dnsNameservers", see below), if not required. (See above note about BSC) + +Specify the loadbalancer provider `ovn`in capi-red.yaml: + +``` +kind: OpenStackCluster +metadata: + name: iris-gaia-red + namespace: default +spec: + apiServerLoadBalancer: + enabled: true + provider: ovn + ... +``` + +By default ClusterAPI will try to create a new private network for the kubernetes cluster. +We don't always want this. 
For example, if the network needs to talk to other services that we haven't configured in the template (such as ceph), we may want to use an existing network. +In the generated template, a section "managedSubnets" will appear under "OpenStackCluster". Remove the definition of cluster.managedSubnets and instead use cluster.network to specify an existing network. For example: + +``` +kind: OpenStackCluster +metadata: + name: iris-gaia-red + namespace: default +spec: + ... + network: + filter: + name: kubernetes-bootstrap-network +``` + +``` +managedSubnets: + - cidr: 10.6.0.0/24 + dnsNameservers: + - 84.88.52.35 +``` + + +If we are building a new network, the value we specified for the dns name server is injected via the value for dnsNameservers. +The behaviour here appears to be system-dependent. +On Arcus, the value we set appears to be ignored +On BSC, the value, if supplied, is used directly and must be correct. However, if dnsNameservers is deleted from the config file, the correct dns name server is used by default. + +Probably a good idea to have fairly large root volumes on our nodes; kubernetes seems to want to fill these fast. +Set rootVolume in our templates in the following places: + +``` +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: OpenStackMachineTemplate +metadata: + name: iris-gaia-red-ceph-control-plane + namespace: default +spec: + template: + spec: + flavor: gaia.vm.cclake.4vcpu + image: + filter: + name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2 + sshKeyName: iris-malcolm-kube-test-keypair + rootVolume: + sizeGiB: 100 +--- +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: OpenStackMachineTemplate +metadata: + name: iris-gaia-red-ceph-md-0 + namespace: default +spec: + template: + spec: + flavor: gaia.vm.cclake.26vcpu + image: + filter: + name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2 + sshKeyName: iris-malcolm-kube-test-keypair + rootVolume: + sizeGiB: 200 +``` + +## Create cluster +Use the management cluster to actually build the new cluster, in our target environment, using the image that we prebuilt earlier in the target project. + +``` +kubectl apply -f capi-red.yaml +``` + +## Check progress + +``` +export CLUSTER_NAME=iris-gaia-red +clusterctl describe cluster ${CLUSTER_NAME} +``` + +Once the first machines in the control plane have been created: + +Download kubeconfig: + +``` +clusterctl get kubeconfig ${CLUSTER_NAME} > ${CLUSTER_NAME}.kubeconfig +``` + +## Complete setup + +The cluster will not complete until the network configuration is created. + +Install Calico CNI +``` +curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml -O +kubectl --kubeconfig=${CLUSTER_NAME}.kubeconfig apply -f calico.yaml +``` + +Get network id of the private network of the cluster. The name starts with `k8s-clusterapi-`. +Get this from the Horizon GUI, or from the openstack client +(If we specified an existing network, get its ID instead) +Note that if we use an existing network, the configuration file only needs to be edited once, as the network ID will be fixed unless the network is deleted / recreated + +Create the Openstack cloud controller configuration `appcred-iris-gaia-red.conf`, add the application credentials and the private network id. +This file will be used to create a kubernetes secret, which will then be used by the system setup +On Arcus, we just use the default load balancer, amphora. 
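The private network id can also be looked up with the openstack client rather than Horizon, for example (the network name shown is the generated one for this cluster; substitute as appropriate):

```
openstack network show k8s-clusterapi-cluster-default-iris-gaia-red -f value -c id
```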
+ +``` +[Global] +auth-url=https://arcus.openstack.hpc.cam.ac.uk:5000 +region="RegionOne" +application-credential-id="****" +application-credential-secret="****" + +[LoadBalancer] +use-octavia=true +floating-network-id=d5560abe-c5d5-4653-a2f7-59636448f8fe +network-id=34de53cc-5b49-489b-9d02-93a31ab7812f +``` + +Finish network setup and install the Openstack cloud controller to the cluster. + +``` +kubectl --kubeconfig=${CLUSTER_NAME}.kubeconfig create secret -n kube-system generic cloud-config --from-file=cloud.conf=appcred-iris-gaia-red.conf +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-roles.yaml +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/openstack-cloud-controller-manager-ds.yaml +``` + +Now the cluster setup completes. +Watch progress +``` +clusterctl describe cluster ${CLUSTER_NAME} +``` + +The cluster initialises with no available storage classes, therefore applications cannot immediately be deployed. + +# Install cinder driver +Install the cinder helm chart + + +Edit cinder-values.yaml to match our deployed cluster. We point it at the secret we already created during the calico installation + +``` +secret: + enabled: true + name: cloud-config +``` + +# now deploy into our cluster +helm install --namespace=kube-system -f cinder-values.yaml --kubeconfig=./${CLUSTER_NAME}.kubeconfig cinder-csi cpo/openstack-cinder-csi + +# verify the storage classes were created +```` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get storageclass +NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE +csi-cinder-sc-delete cinder.csi.openstack.org Delete Immediate true 11d +csi-cinder-sc-retain cinder.csi.openstack.org Retain Immediate true 11d +```` + + +# Network configuration +If we specified an already-existing network in our template, we assume that the network has already had all the necessary configuration applied. +If we didn't specify a network, we need to do some work in the Horizon GUI to connect our generated network to the CEPHFS network. +Our generated network will have a name k8s-clusterapi-cluster-default-<$CLUSTER_NAME> + +In Horizon: +Cephfs router -> Add New Interface -> select k8s-clusterapi-cluster-default-iris-gaia-red, add unused IP address e.g. 10.6.0.10 +Networks -> select k8s-clusterapi-cluster-default-iris-gaia-red-> Edit Subnet -> Subnet Details. Added host route 10.4.200.0/24,10.6.0.10 +Add a new bastion host VM on k8s-clusterapi-cluster-default-iris-gaia-red network, add new floating ip address to permit ssh access +Log into bastion host to access kubernetes worker nodes +On each node, as root run sudo ip route add 10.4.200.0/24 via 10.6.0.10 +(We need to manually apply the routing on each node as the routing is normally only applied on VM creation) + +Note: it should be possible to automate this through the ClusterAPI template, but still work in progress for now ... + +# mount data shares +At this point our cluster is ready to use. However, we need to be able to access the GAIA DR3 (and potentially other) data from our services. 
+On the arcus deployment, data is held in a separate project ("iris-gaia-data") within the same physical hardware. +In the Horizon GUI, select iris-gaia-data in the project list, then navigate to "shares". +Identify the required data share, and note the share path and the associated cephx access rule and key. +In Horizon, if one doesn't already exist, create a bastion VM on the same network as the kubernetes cluster, and assign a public floating ip address to allow ssh access. +Log into the bastion VM, and log into each of the worker nodes. +Note that ceph is very fussy about consistent naming throughout. The name of the keyring file must be consistent with the name of the access rule ("grants access to") itself. +Do the following on each worker node, for each data share that we want to mount (access via bastion host). +ceph.conf file shown here for ceph on Arcus. Will be different for other systems. + + +``` +# apt update; apt dist-upgrade -y; apt-get install ceph-common -y +# vim /etc/ceph/ceph.conf +# cat /etc/ceph/ceph.conf +[global] +fsid = a900cf30-f8a3-42bf-98d6-af7ce92f1a1a +mon_host = [v2:10.4.200.13:3300/0,v1:10.4.200.13:6789/0] [v2:10.4.200.9:3300/0,v1:10.4.200.9:6789/0] [v2:10.4.200.17:3300/0,v1:10.4.200.17:6789/0] [v2:10.4.200.26:3300/0,v1:10.4.200.26:6789/0] [v2:10.4.200.25:3300/0,v1:10.4.200.25:6789/0] + + +# Provision the Manila-generated CephX key +root@pfb29-test:~# vim ceph.client.dr3_data_share.keyring +root@pfb29-test:~# chmod 0600 ceph.client.dr3_data_share.keyring +root@pfb29-test:~# cat ceph.client.dr3_data_share.keyring +[client.dr3_data_share] + key = $REDACTED + + +# Provision the Manila-generated export path to an env-var, make client mountpoint directory +# here, EXPORT_PATH is the data share path shown in Horizon for the share +root@pfb29-test:~# export EXPORT_PATH="10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/fa5309a4-1b69-4713-b298-c8d7a479f86f/d53177c6-c45c-4583-9947-d50ab931445c" +root@pfb29-test:~# mkdir -p /mnt/dr3_data_share + + +# Mount and stat the CephFS share +root@pfb29-test:~# mount -t ceph $EXPORT_PATH /mnt/dr3_data_share -o name=dr3_data_share +root@pfb29-test:~# df -h -t ceph +Filesystem Size Used Avail Use% Mounted on +10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/fa5309a4-1b69-4713-b298-c8d7a479f86f/d53177c6-c45c-4583-9947-d50ab931445c 10G 0 10G 0% /mnt/cephfs +``` + +Note to self - write a script to automate the above! + +Now that all our workers have the data share mounted, we can access it via a hostPath mount from our pods, eg + +``` +spec: + volumes: + - name: mount-this + hostPath: + path: /mnt/dr3_data_share + type: Directory + containers: + - volumeMounts: + - mountPath: /mnt/dr3_data_share + name: mount-this + readOnly: true +``` + +The (read-only) DR3 data should now be accessible in the pod at /mnt/dr3_data_share + +## rescale cluster + +The management cluster is used to view active workers and rescale a running worker cluster, via the machinedeployments class. +e.g. 
+ +``` +$ kubectl get machinedeployment +NAME CLUSTER REPLICAS READY UPDATED UNAVAILABLE PHASE AGE VERSION +bsc-gaia-md-0 bsc-gaia 3 3 3 0 Running 25h v1.30.2 +iris-gaia-red-ceph-md-0 iris-gaia-red-ceph 4 4 4 0 Running 22d v1.30.2 +iris-gaia-red-demo-md-0 iris-gaia-red-demo 7 7 7 0 Running 6d2h v1.30.2 + +$ kubectl scale machinedeployment iris-gaia-red-demo-md-0 --replicas=9 + +``` + +Note that with our current deployment, new VMs will not automatically get the ceph mounts. This will require manual intervention to perform the ceph configuration + +# Deleting a cluster + +Before deleting a cluster, note that CAPI struggles to delete resources that were created within the cluster, such as services, load balancers etc. +Applications should be deleted in reverse order of creation before trying to delete the cluster, especially those managing load balancers and floating ip addresses. +This may be useful in making deletions cleaner, haven't tried it yet ... https://github.com/azimuth-cloud/cluster-api-janitor-openstack + +To delete a CAPI-deployed cluster: + +``` +kubectl delete cluster ${CLUSTER_NAME} +``` + +Note we don't specify --kubeconfig here, as we are using the management cluster (ie pointed to by ${KUBECONFIG}) to control the cluster teardown + +## Manual deletion + +Sometimes things don't go smoothly during deployment, particularly when getting up and running at a new site. +The management cluster can get confused about the state of the remote cluster. +If this happens, easiest way to clean up is to manually delete all the created resources in the target environment, then purge references from the management cluster. +The following classes need to be purged for the failed cluster, in the following order: OpenStackMachines, OpenStackMachineTemplates, OpenStackClusterTemplate + +e.g. + +``` +$ kubectl get openstackmachines +NAME CLUSTER INSTANCESTATE READY PROVIDERID MACHINE AGE +bsc-gaia-control-plane-r94xt bsc-gaia ACTIVE true openstack:///25a0e44a-f037-4418-a515-cb2da0e4f3ff bsc-gaia-control-plane-r94xt 25h +bsc-gaia-md-0-xqdtp-52fm7 bsc-gaia ACTIVE true openstack:///dc4a2f10-6277-41e5-a6f6-10ef6278df97 bsc-gaia-md-0-xqdtp-52fm725h + +$kubectl delete openstackmachine bsc-gaia-md-0-xqdtp-52fm7 +``` + +Once all resources have been deleted from the management cluster, the cluster itself can be deleted. +To force deletion, it may be necessary to delete the cluster finaliser by editing the clustertemplate object + +``` +$ kubectl get openstackclusters +NAME CLUSTER READY NETWORK BASTION IP AGE +bsc-gaia bsc-gaia true b32e99b0-e3f8-4318-b0fb-9fa1ea3d4bf9 25h + +$ kubectl edit openstackcluster bsc-gaia (opens config in vim) +replace value for finalisers with [] and save out + +# Management cluster failure / deletion + +If we lose the management cluster for any reason, its not the end of the world. +The deployed clusters will still function independently, assuming we have their KUBECONFIG files. +However, we should do everything to avoid this happening ... + + +## Ceph and Manila CSI configuration + +Warning! Work in progress from this point ... 
+ + +# install the ceph csi driver +# followed notes at https://gitlab.developers.cam.ac.uk/pfb29/manila-csi-kubespray + +``` +helm repo add ceph-csi https://ceph.github.io/csi-charts +helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig install --namespace kube-system ceph-csi-cephfs ceph-csi/ceph-csi-cephfs +``` + +# install the manila csi driver + +manila-values.yaml + +``` +--- +shareProtocols: + - protocolSelector: CEPHFS + fsGroupPolicy: None + fwdNodePluginEndpoint: + dir: /var/lib/kubelet/plugins/cephfs.csi.ceph.com + sockFile: csi.sock +``` + +``` +helm repo add cpo https://kubernetes.github.io/cloud-provider-openstack +helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace kube-system manila-csi cpo/openstack-manila-csi -f manila-values.yaml +``` + +# Create a secret for deploying our manila storage class, assumes we created an access credential in the target OpenStack project with suitable priviledges + +secrets.yaml + +``` +apiVersion: v1 +kind: Secret +metadata: + name: csi-manila-secrets + namespace: default +stringData: + # Mandatory + os-authURL: "https://arcus.openstack.hpc.cam.ac.uk:5000/v3" + os-region: "RegionOne" + + # Authentication using user credentials + os-applicationCredentialID: "*****" + os-applicationCredentialSecret: "*******" +``` + +``` +kubectl apply --kubeconfig=./${CLUSTER_NAME}.kubeconfig -f secrets.yaml +``` + +# create a manila storage class using the access secret we just created + +``` + +sc.yaml +--- +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + name: csi-manila-cephfs +provisioner: cephfs.manila.csi.openstack.org +parameters: + type: ceph01_cephfs # Manila share type + cephfs-mounter: kernel + csi.storage.k8s.io/provisioner-secret-name: csi-manila-secrets + csi.storage.k8s.io/provisioner-secret-namespace: default + csi.storage.k8s.io/node-stage-secret-name: csi-manila-secrets + csi.storage.k8s.io/node-stage-secret-namespace: default + csi.storage.k8s.io/node-publish-secret-name: csi-manila-secrets + csi.storage.k8s.io/node-publish-secret-namespace: default +``` + +``` +kubectl apply --kubeconfig=./${CLUSTER_NAME}.kubeconfig -f sc.yaml +``` + +# make manila the default storage class + +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig patch storageclass csi-manila-cephfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' +``` + +# list the storage classes in the cluster +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get storageclass +NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE +csi-cinder-sc-delete cinder.csi.openstack.org Delete Immediate true 12d +csi-cinder-sc-retain cinder.csi.openstack.org Retain Immediate true 12d +csi-manila-cephfs (default) cephfs.manila.csi.openstack.org Delete Immediate false 5d5 +``` + +# test access to cephfs service +In Horizon GUI, manually create a share. Create a cephx access rule, then copy the access key and full storage path + +Create a secret containing the access key + +ceph-secret.yaml +``` +apiVersion: v1 +kind: Secret +metadata: + name: ceph-secret +stringData: + key: **** +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f ceph-secret.yaml + +Create a test pod that mounts the ceph share as a volume. 
The ceph share path needs to be separated into a list of monitor addresses and the relative path, eg + +pod.yaml + +``` +--- +apiVersion: v1 +kind: Pod +metadata: + name: test-cephfs-share-pod +spec: + containers: + - name: web-server + image: nginx + imagePullPolicy: IfNotPresent + volumeMounts: + - name: testpvc + mountPath: /var/lib/www + - name: cephfs + mountPath: "/mnt/cephfs" + volumes: + - name: testpvc + persistentVolumeClaim: + claimName: test-cephfs-share-pvc + readOnly: false + - name: cephfs + cephfs: + monitors: + - 10.4.200.9:6789 + - 10.4.200.13:6789 + - 10.4.200.17:6789 + - 10.4.200.25:6789 + - 10.4.200.26:6789 + secretRef: + name: ceph-secret + readOnly: false + path: "/volumes/_nogroup/ca890f73-3e33-4e07-879c-f7ec0f5a8a17/52bcd13b-a358-40f0-9ffa-4334eb1e06ae" +``` + +Example uses nginx, so install that: + +``` +helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig nginx bitnami/nginx +``` + +deploy the pod +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f manila-csi-kubespray/pod.yaml +``` + +Inspect the pod to verify that the ceph share was successfully mounted + +# test jhub deployment, check where user areas get created + +deploy jhub, check where user area is created + +``` +helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/ +helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig upgrade --install jhub jupyterhub/jupyterhub --version=3.3.8 +``` + +# port forward on control VM +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace=default port-forward service/proxy-public 8080:http +``` + +# port forward on laptop: +ssh -i "gaia_jade_test_malcolm.pem" -L 8080:127.0.0.1:8080 rocky@192.41.122.174 +browse to 127.0.0.1:8080 and login, eg as user 'hhh' + +# on control VM, list pvs/pvcs +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pv +NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE 6h56m +pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO Delete Bound default/claim-hhh csi-manila-cephfs 6h51m +pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO Delete Bound default/hub-db-dir csi-manila-cephfs + +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pvc +NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE +claim-hhh Bound pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO csi-manila-cephfs 6h52m +hub-db-dir Bound pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO csi-manila-cephfs 6h58m + +## Thoughts on automation and migration + +Each system that we deploy to will have different networking setup, storage services, image names, machine flavour. +Each system requires that a ClusterAPI image be built in that system from an Ubuntu image already present in that system. +For each system, we generate a configuration file using clusterctl generate. +Getting a working generation image and working combinations of images / flavours likely to be a trial and error process, little prospect for automation +Once we have a working template for a given site, that template can be reused for that site, but that site only. +Given a particular site with a working template, it should be possibe to automate creation of a cluster at that site. +Each site will require specific post-creation configuration, e.g. ceph mounts on Arcus, nfs(?) mounts on BSC + +Manual stages: +Install packer, clusterctl, server certificates etc. +Manually build / test image in target environment, get working combinations of flavours and boot disk sizes. +Generate template file, adjust any arguments. 
+Once we've got this far, can automate using the template. +Note that we can't check templates into a repo, as they contain security information + +Automated stages: + +kubectl apply template file +clusterctl describe until ready +get kubeconfig file +apply calico +use openstack to lookup network id for new network (how do we get cluster name? from environment variable?) +build application secret conf file +build secret in target environment +complete setup +install cinder storage classes + +do site-specific post-installation: +get list of worker names via kubectl get nodes +install ceph client on each worker node +configure ceph on each worker node +- mount ceph shares on Arcus. need list of shares to mount, lookup keys and create share mount on each worker VM +- attach shared volumes on Somerville, BSC? ) +- modify /etc/fstab rather than configuring from directory? + +Things to try: +Automatic configuration of ceph network on arcus +attach manila shares to pod instead of using ceph mounts (wont be available at every site) + +Generic scripts: + +lookup network id, build conf file +lookup keys for ceph shares +install list of ceph shares on VMs +get list of worker node names and ip addresses + + + + + + + + From 05709388a950fa38c33e198eda30f16e4812eba7 Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Thu, 12 Dec 2024 16:54:04 +0000 Subject: [PATCH 02/12] Create Readme.MD Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/ClusterAPIScripts/Readme.MD | 1 + 1 file changed, 1 insertion(+) create mode 100644 notes/millingw/ClusterAPIScripts/Readme.MD diff --git a/notes/millingw/ClusterAPIScripts/Readme.MD b/notes/millingw/ClusterAPIScripts/Readme.MD new file mode 100644 index 00000000..6d065a69 --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/Readme.MD @@ -0,0 +1 @@ +### Placeholder for example ClusterAPI related scripts and things From 806f4905efbe54f20c4a1d24373bfb338090ff6d Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Fri, 13 Dec 2024 11:02:31 +0000 Subject: [PATCH 03/12] Add files via upload Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- .../appcred-iris-gaia-red-demo.conf | 11 ++ .../ClusterAPIScripts/arcus-red-demo.yaml | 176 ++++++++++++++++++ .../ClusterAPIScripts/capi-arcus-demo.sh | 32 ++++ 3 files changed, 219 insertions(+) create mode 100644 notes/millingw/ClusterAPIScripts/appcred-iris-gaia-red-demo.conf create mode 100644 notes/millingw/ClusterAPIScripts/arcus-red-demo.yaml create mode 100644 notes/millingw/ClusterAPIScripts/capi-arcus-demo.sh diff --git a/notes/millingw/ClusterAPIScripts/appcred-iris-gaia-red-demo.conf b/notes/millingw/ClusterAPIScripts/appcred-iris-gaia-red-demo.conf new file mode 100644 index 00000000..bd576630 --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/appcred-iris-gaia-red-demo.conf @@ -0,0 +1,11 @@ +[Global] +auth-url=https://arcus.openstack.hpc.cam.ac.uk:5000 +region="RegionOne" +application-credential-id="**REDACTED**" +application-credential-secret="**REDACTED**" + +[LoadBalancer] +use-octavia=true +floating-network-id=d5560abe-c5d5-4653-a2f7-59636448f8fe +network-id=37ad320e-18e7-4fac-8538-3232c6eeeec4 + diff --git a/notes/millingw/ClusterAPIScripts/arcus-red-demo.yaml b/notes/millingw/ClusterAPIScripts/arcus-red-demo.yaml new file mode 100644 index 00000000..0403285d --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/arcus-red-demo.yaml @@ -0,0 +1,176 @@ +apiVersion: v1 +data: + cacert: 
**REDACTED** + clouds.yaml: ***REDACTED** +kind: Secret +metadata: + labels: + clusterctl.cluster.x-k8s.io/move: "true" + name: iris-gaia-red-demo-cloud-config + namespace: default +--- +apiVersion: bootstrap.cluster.x-k8s.io/v1beta1 +kind: KubeadmConfigTemplate +metadata: + name: iris-gaia-red-demo-md-0 + namespace: default +spec: + template: + spec: + files: [] + joinConfiguration: + nodeRegistration: + kubeletExtraArgs: + cloud-provider: external + provider-id: openstack:///'{{ instance_id }}' + name: '{{ local_hostname }}' +--- +apiVersion: cluster.x-k8s.io/v1beta1 +kind: Cluster +metadata: + name: iris-gaia-red-demo + namespace: default +spec: + clusterNetwork: + pods: + cidrBlocks: + - 192.168.0.0/16 + serviceDomain: cluster.local + controlPlaneRef: + apiVersion: controlplane.cluster.x-k8s.io/v1beta1 + kind: KubeadmControlPlane + name: iris-gaia-red-demo-control-plane + infrastructureRef: + apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 + kind: OpenStackCluster + name: iris-gaia-red-demo +--- +apiVersion: cluster.x-k8s.io/v1beta1 +kind: MachineDeployment +metadata: + name: iris-gaia-red-demo-md-0 + namespace: default +spec: + clusterName: iris-gaia-red-demo + replicas: 7 + selector: + matchLabels: null + template: + spec: + bootstrap: + configRef: + apiVersion: bootstrap.cluster.x-k8s.io/v1beta1 + kind: KubeadmConfigTemplate + name: iris-gaia-red-demo-md-0 + clusterName: iris-gaia-red-demo + failureDomain: nova + infrastructureRef: + apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 + kind: OpenStackMachineTemplate + name: iris-gaia-red-demo-md-0 + version: 1.30.2 +--- +apiVersion: controlplane.cluster.x-k8s.io/v1beta1 +kind: KubeadmControlPlane +metadata: + name: iris-gaia-red-demo-control-plane + namespace: default +spec: + kubeadmConfigSpec: + clusterConfiguration: + apiServer: + extraArgs: + cloud-provider: external + controllerManager: + extraArgs: + cloud-provider: external + files: [] + initConfiguration: + nodeRegistration: + kubeletExtraArgs: + cloud-provider: external + provider-id: openstack:///'{{ instance_id }}' + name: '{{ local_hostname }}' + joinConfiguration: + nodeRegistration: + kubeletExtraArgs: + cloud-provider: external + provider-id: openstack:///'{{ instance_id }}' + name: '{{ local_hostname }}' + machineTemplate: + infrastructureRef: + apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 + kind: OpenStackMachineTemplate + name: iris-gaia-red-demo-control-plane + replicas: 3 + version: 1.30.2 +--- +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: OpenStackCluster +metadata: + name: iris-gaia-red-demo + namespace: default +spec: + apiServerLoadBalancer: + enabled: true + externalNetwork: + id: 57add367-d205-4030-a929-d75617a7c63e + identityRef: + cloudName: iris-gaia-red + name: iris-gaia-red-demo-cloud-config + managedSecurityGroups: + allNodesSecurityGroupRules: + - description: Created by cluster-api-provider-openstack - BGP (calico) + direction: ingress + etherType: IPv4 + name: BGP (Calico) + portRangeMax: 179 + portRangeMin: 179 + protocol: tcp + remoteManagedGroups: + - controlplane + - worker + - description: Created by cluster-api-provider-openstack - IP-in-IP (calico) + direction: ingress + etherType: IPv4 + name: IP-in-IP (calico) + protocol: "4" + remoteManagedGroups: + - controlplane + - worker + managedSubnets: + - cidr: 10.6.0.0/24 + dnsNameservers: + - 8.8.8.8 +--- +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: OpenStackMachineTemplate +metadata: + name: iris-gaia-red-demo-control-plane + namespace: default +spec: + 
template: + spec: + flavor: gaia.vm.cclake.4vcpu + image: + filter: + name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2 + sshKeyName: iris-malcolm-kube-test-keypair + rootVolume: + sizeGiB: 100 +--- +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: OpenStackMachineTemplate +metadata: + name: iris-gaia-red-demo-md-0 + namespace: default +spec: + template: + spec: + flavor: gaia.vm.cclake.54vcpu + image: + filter: + name: Ubuntu-Jammy-22.04-20240514-kube-1.30.2 + sshKeyName: iris-malcolm-kube-test-keypair + rootVolume: + sizeGiB: 200 diff --git a/notes/millingw/ClusterAPIScripts/capi-arcus-demo.sh b/notes/millingw/ClusterAPIScripts/capi-arcus-demo.sh new file mode 100644 index 00000000..ec005cef --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/capi-arcus-demo.sh @@ -0,0 +1,32 @@ +#! /bin/bash + +#source /tmp/env.rc appcred-rundeckdemo01-clouds.yaml openstack + +b64encode(){ + # Check if wrap is supported. Otherwise, break is supported. + if echo | base64 --wrap=0 &> /dev/null; then + base64 --wrap=0 $1 + else + base64 --break=0 $1 + fi +} + +export OPENSTACK_CLOUD=iris-gaia-red +export OPENSTACK_CLOUD_YAML_B64=$( cat arcus-red.yaml | b64encode ) +export OPENSTACK_CLOUD_CACERT_B64=$( cat arcus-openstack-hpc-cam-ac-uk-chain.pem | b64encode ) +export OPENSTACK_FAILURE_DOMAIN=nova +# export OPENSTACK_EXTERNAL_NETWORK_ID=dcb035587-60e2-48eb-ac97-ff5fa38084eba +export OPENSTACK_EXTERNAL_NETWORK_ID=57add367-d205-4030-a929-d75617a7c63e +export OPENSTACK_DNS_NAMESERVERS=8.8.8.8 +export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=gaia.vm.cclake.4vcpu +export OPENSTACK_NODE_MACHINE_FLAVOR=gaia.vm.cclake.54vcpu +export OPENSTACK_IMAGE_NAME=Ubuntu-Jammy-22.04-20240514-kube-1.30.2 +export OPENSTACK_SSH_KEY_NAME=iris-malcolm-kube-test-keypair + +export KUBERNETES_VERSION=1.30.2 + +# optional +export CLUSTER_NAME=iris-gaia-red-demo +export CONTROL_PLANE_MACHINE_COUNT=3 +export WORKER_MACHINE_COUNT=2 + From ef61c7db9fd77a971da6abf8ccae028e257b68d9 Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Mon, 13 Jan 2025 17:04:16 +0000 Subject: [PATCH 04/12] Script for autobuild of a ClusterAPI cluster Script for building a kubernetes cluster in an OpenStack project. Assumes control images, management cluster etc have already been provisioned Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- .../ClusterAPIScripts/build_my_cluster.sh | 229 ++++++++++++++++++ 1 file changed, 229 insertions(+) create mode 100644 notes/millingw/ClusterAPIScripts/build_my_cluster.sh diff --git a/notes/millingw/ClusterAPIScripts/build_my_cluster.sh b/notes/millingw/ClusterAPIScripts/build_my_cluster.sh new file mode 100644 index 00000000..9382f82c --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/build_my_cluster.sh @@ -0,0 +1,229 @@ +#!/bin/bash + +# we make the following assumptions: +# KUBECONFIG needs to be set to point at the ClusterAPI management cluster +# CLUSTER_SPECIFICATION_FILE is a ClusterAPI yaml file containing templates for the cluster we want to build +# CLUSTER_NAME is consistent with cluster name references in the specification file +# CINDER_SECRETS_FILE contains cinder config details +# CLUSTER_CREDENTIAL_FILE is configured to use an existing OpenStack network, so that we don't need to look up a network id +# TODO handle dynamic network creation; if we're using ceph, better to use a preconfigured network cos otherwise its all a bit of a nightmare + +# TODO read this all from a yaml config file, instead of specifying it all here! 
+export KUBECONFIG=/home/rocky/openstack/k8sdir/config +export CLUSTER_NAME=iris-gaia-red-ceph +#export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph.yaml +#export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-secret.yaml +export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-file-test.yaml +export CLUSTER_CREDENTIAL_FILE=appcred-iris-gaia-red-fixed-bootstrap.conf +export CINDER_SECRETS_FILE=cinder-values.yaml + +USE_MANILA=true +MANILA_PROTOCOLS_FILE=./manila-csi-kubespray/values.yaml +MANILA_SECRETS_FILE=./manila-csi-kubespray/secrets.yaml +MANILA_STORAGE_CLASS_FILE=./manila-csi-kubespray/sc.yaml +DEFAULT_STORAGE_CLASS=manila + +# check all our expected environment variables are set +if [ -z "${KUBECONFIG}" ]; then + echo environment variable KUBECONFIG not set + exit 1 +fi + +if [ -z "${CLUSTER_NAME}" ]; then + echo environment variable CLUSTER_NAME not set + exit 1 +fi + +if [ -z "${CLUSTER_SPECIFICATION_FILE}" ]; then + echo environment variable CLUSTER_SPECIFICATION_FILE not set + exit 1 +fi + +if [ -z "${CLUSTER_CREDENTIAL_FILE}" ]; then + echo environment variable CLUSTER_CREDENTIAL_FILE not set + exit 1 +fi + +if [ -z "${CINDER_SECRETS_FILE}" ]; then + echo environment variable CINDER_SECRETS_FILE not set + exit 1 +fi + +# check all the input config files exist + +if [ ! -f "${KUBECONFIG}" ]; then + echo file ${KUBECONFIG} not found + exit 1 +fi + +if [ ! -f "${CLUSTER_SPECIFICATION_FILE}" ]; then + echo file ${CLUSTER_SPECIFICATION_FILE} not found + exit 1 +fi + +if [ ! -f "${CLUSTER_CREDENTIAL_FILE}" ]; then + echo file ${CLUSTER_CREDENTIAL_FILE} not found + exit 1 +fi + +if [ ! -f "${CINDER_SECRETS_FILE}" ]; then + echo file ${CINDER_SECRETS_FILE} not found + exit 1 +fi + + +# check manila-specific environment variables and files +if [ $USE_MANILA = true ]; then + + if [ -z "${MANILA_PROTOCOLS_FILE}" ]; then + echo environment variable MANILA_PROTOCOLS_FILE not set + exit 1 + fi + + if [ -z "${MANILA_SECRETS_FILE}" ]; then + echo environment variable MANILA_SECRETS_FILE not set + exit 1 + fi + + if [ -z "${MANILA_PROTOCOLS_FILE}" ]; then + echo environment variable MANILA_STORAGE_CLASS_FILE not set + exit 1 + fi + + if [ ! -f "${MANILA_PROTOCOLS_FILE}" ]; then + echo file ${MANILA_PROTOCOLS_FILE} not found + exit 1 + fi + + if [ ! -f "${MANILA_SECRETS_FILE}" ]; then + echo file ${MANILA_SECRETS_FILE} not found + exit 1 + fi + + if [ ! -f "${MANILA_PROTOCOLS_FILE}" ]; then + echo file ${MANILA_STORAGE_CLASS_FILE} not set + exit 1 + fi +fi + + + +# create the cluster via the management cluster +echo building the cluster ... +kubectl apply -f ${CLUSTER_SPECIFICATION_FILE} + +# wait a couple of minutes, then loop loooking for the first control plane machine +echo Waiting for cluster to initialise ... 
+sleep 120 + +echo Looping till first control plane machine is available +control_plane_status='False' +until [ $control_plane_status == 'True' ]; +do + sleep 60 + control_plane_status=$(clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | grep -E "Machine/${CLUSTER_NAME}-control-plane" | awk -v OFS='\t' 'FNR == 1{print $3}') + echo $control_plane_status +done + +# we should be able to get the cluster's KUBECONFIG file now +clusterctl get kubeconfig ${CLUSTER_NAME} > ${CLUSTER_NAME}.kubeconfig + + +# +# check we can get the initial set of nodes, otherwise we need to wait a bit longer +# we should get at least our first control plane machine listed, with role 'control-plane' +echo looping till control plane nodes responding +control_plane_ready=false +until [ $control_plane_ready = true ]; +do + sleep 60 + get_nodes=$(kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get nodes | awk -v OFS='\t' 'FNR == 2{print $3}') + echo $get_nodes + + # if it's ready, get_nodes should contain 'control-plane', otherwise keep looping + if [ $get_nodes == 'control-plane' ]; then + control_plane_ready=true + fi + echo $control_plane_ready +done + + + + +# start installing the control layer components +echo installing calico components + +curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml -O +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f calico.yaml + +# create ceph secret before we build our worker nodes; +# config will use this to kernel mount our ceph shares +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f cephx-secret.yaml + + +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig create secret -n kube-system generic cloud-config --from-file=cloud.conf=${CLUSTER_CREDENTIAL_FILE} +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-roles.yaml +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/cloud-controller-manager-role-bindings.yaml +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f https://raw.githubusercontent.com/kubernetes/cloud-provider-openstack/master/manifests/controller-manager/openstack-cloud-controller-manager-ds.yaml +# now we loop and wait till the cluster reports success +echo waiting for cluster completion +cluster_status='False' +until [ $cluster_status == 'True' ]; +do + sleep 60 + cluster_status=$( clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | awk -v OFS='\t' 'FNR == 2{print $2}' ) + echo $cluster_status +done +echo Cluster creation complete + +# we assume all OpenStack systems will have a Cinder service +# (is this a safe assumption?) 
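# Note (added suggestion, not part of the original script): the cinder install below pulls its
# chart from the 'cpo' repo, but that repo is only added further down in the Manila section.
# If it is not already present on this machine, add it first, e.g.:
# helm repo add cpo https://kubernetes.github.io/cloud-provider-openstack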
+echo Installing cinder driver +helm install --namespace=kube-system -f ${CINDER_SECRETS_FILE} --kubeconfig=./${CLUSTER_NAME}.kubeconfig cinder-csi cpo/openstack-cinder-csi + +echo Completed Cluster creation and installed Cinder storage classes + + +# Ceph / Manila installation +if [ $USE_MANILA = true ]; then +echo Installing Manilla storage class + +# install the ceph csi driver +# followed notes at https://gitlab.developers.cam.ac.uk/pfb29/manila-csi-kubespray + +helm repo add ceph-csi https://ceph.github.io/csi-charts +helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig install --namespace kube-system ceph-csi-cephfs ceph-csi/ceph-csi-cephfs + +# install the manila csi driver +helm repo add cpo https://kubernetes.github.io/cloud-provider-openstack +helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace kube-system manila-csi cpo/openstack-manila-csi -f ${MANILA_PROTOCOLS_FILE} + +# configure our access credentials for the manila service +kubectl apply --kubeconfig=./${CLUSTER_NAME}.kubeconfig -f ${MANILA_SECRETS_FILE} + +# create a storage class to let us use Manila from kubernetes +kubectl apply --kubeconfig=./${CLUSTER_NAME}.kubeconfig -f ${MANILA_STORAGE_CLASS_FILE} + +# make Manila the default storage class, if specified +if [ $DEFAULT_STORAGE_CLASS == 'manila' ]; then +echo Making manila the default storage class +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig patch storageclass csi-manila-cephfs -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' +fi + +echo Manila installation complete + +fi + +# TODO - wait for our workers to become available? + +echo Looping till workers are available +worker_nodes_status='False' +until [ $worker_nodes_status == 'True' ]; +do + sleep 60 + worker_nodes_status=$(clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | grep -E "MachineDeployment" | awk -v OFS='\t' '{print $2}') + echo $worker_nodes_status +done + + + From bc6d1f73bcfb39b043fe0d858f504ec02cf5ef2c Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Mon, 13 Jan 2025 17:25:09 +0000 Subject: [PATCH 05/12] example worker config Example worker config with ceph share kernel-mounted into the node Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- .../KubeadmConfigTemplate.yaml | 22 +++++++++++++++++++ 1 file changed, 22 insertions(+) create mode 100644 notes/millingw/ClusterAPIScripts/KubeadmConfigTemplate.yaml diff --git a/notes/millingw/ClusterAPIScripts/KubeadmConfigTemplate.yaml b/notes/millingw/ClusterAPIScripts/KubeadmConfigTemplate.yaml new file mode 100644 index 00000000..4cd8a198 --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/KubeadmConfigTemplate.yaml @@ -0,0 +1,22 @@ +kind: KubeadmConfigTemplate +metadata: + name: iris-gaia-red-ceph-md-0 + namespace: default +spec: + template: + spec: + mounts: [] + preKubeadmCommands: ["apt-get update;", "apt-get install ceph-common -y;", "mkdir -p /mnt/kubernetes_scratch_share", "echo 10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/280b44fc-d423-4496-8fb8-79bfc1f58b97/35e407e9-a34b-4c64-b480-3380002d64f8 /mnt/kubernetes_scratch_share ceph name=kubernetes-scratch-share,noatime,_netdev 0 2 >> /etc/fstab"] + files: + - path: /etc/ceph/ceph.conf + content: | + [global] + fsid = a900cf30-f8a3-42bf-98d6-af7ce92f1a1a + mon_host = [v2:10.4.200.13:3300/0,v1:10.4.200.13:6789/0] [v2:10.4.200.9:3300/0,v1:10.4.200.9:6789/0] [v2:10.4.200.17:3300/0,v1:10.4.200.17:6789/0] 
[v2:10.4.200.26:3300/0,v1:10.4.200.26:6789/0] [v2:10.4.200.25:3300/0,v1:10.4.200.25:6789/0] + + - path: /etc/ceph/ceph.client.kubernetes-scratch-share.keyring + content: | + [client.kubernetes-scratch-share] + key = **REDACTED** + + postKubeadmCommands: ["sudo mount -a"] \ No newline at end of file From 329b199d6a665ceada720543aeead42f31df891a Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Tue, 21 Jan 2025 15:48:38 +0000 Subject: [PATCH 06/12] More output, better config More verbose output; config environment variables now read from a resource file Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- .../ClusterAPIScripts/build_my_cluster.sh | 67 ++++++++++++------- 1 file changed, 44 insertions(+), 23 deletions(-) diff --git a/notes/millingw/ClusterAPIScripts/build_my_cluster.sh b/notes/millingw/ClusterAPIScripts/build_my_cluster.sh index 9382f82c..9b962899 100644 --- a/notes/millingw/ClusterAPIScripts/build_my_cluster.sh +++ b/notes/millingw/ClusterAPIScripts/build_my_cluster.sh @@ -9,19 +9,34 @@ # TODO handle dynamic network creation; if we're using ceph, better to use a preconfigured network cos otherwise its all a bit of a nightmare # TODO read this all from a yaml config file, instead of specifying it all here! -export KUBECONFIG=/home/rocky/openstack/k8sdir/config -export CLUSTER_NAME=iris-gaia-red-ceph -#export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph.yaml -#export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-secret.yaml -export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-file-test.yaml -export CLUSTER_CREDENTIAL_FILE=appcred-iris-gaia-red-fixed-bootstrap.conf -export CINDER_SECRETS_FILE=cinder-values.yaml - -USE_MANILA=true -MANILA_PROTOCOLS_FILE=./manila-csi-kubespray/values.yaml -MANILA_SECRETS_FILE=./manila-csi-kubespray/secrets.yaml -MANILA_STORAGE_CLASS_FILE=./manila-csi-kubespray/sc.yaml -DEFAULT_STORAGE_CLASS=manila +#export KUBECONFIG=/home/rocky/openstack/k8sdir/config +#export CLUSTER_NAME=iris-gaia-red-ceph +##export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph.yaml +##export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-secret.yaml +#export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-file-test.yaml +#export CLUSTER_CREDENTIAL_FILE=appcred-iris-gaia-red-fixed-bootstrap.conf +#export CINDER_SECRETS_FILE=cinder-values.yaml + +#USE_MANILA=true +#MANILA_PROTOCOLS_FILE=./manila-csi-kubespray/values.yaml +#MANILA_SECRETS_FILE=./manila-csi-kubespray/secrets.yaml +#MANILA_STORAGE_CLASS_FILE=./manila-csi-kubespray/sc.yaml +#DEFAULT_STORAGE_CLASS=manila + +# setup the environment variables for our build +source cluster_config.rc + +echo KUBECONFIG $KUBECONFIG +echo CLUSTER_NAME $CLUSTER_NAME +echo CLUSTER_SPECIFICATION_FILE $CLUSTER_SPECIFICATION_FILE +echo CLUSTER_CREDENTIAL_FILE $CLUSTER_CREDENTIAL_FILE +echo CINDER_SECRETS_FILE $CINDER_SECRETS_FILE + +echo USE_MANILA $USE_MANILA +echo MANILA_PROTOCOLS_FILE $MANILA_PROTOCOLS_FILE +echo MANILA_SECRETS_FILE $MANILA_SECRETS_FILE +echo MANILA_STORAGE_CLASS_FILE $MANILA_STORAGE_CLASS_FILE +echo DEFAULT_STORAGE_CLASS $DEFAULT_STORAGE_CLASS # check all our expected environment variables are set if [ -z "${KUBECONFIG}" ]; then @@ -109,7 +124,7 @@ fi # create the cluster via the management cluster -echo building the cluster ... 
+echo building cluster $CLUSTER_NAME kubectl apply -f ${CLUSTER_SPECIFICATION_FILE} # wait a couple of minutes, then loop loooking for the first control plane machine @@ -122,7 +137,7 @@ until [ $control_plane_status == 'True' ]; do sleep 60 control_plane_status=$(clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | grep -E "Machine/${CLUSTER_NAME}-control-plane" | awk -v OFS='\t' 'FNR == 1{print $3}') - echo $control_plane_status + echo Control plane status: $control_plane_status done # we should be able to get the cluster's KUBECONFIG file now @@ -137,17 +152,19 @@ control_plane_ready=false until [ $control_plane_ready = true ]; do sleep 60 + echo Polling nodes to check if basic services up yet get_nodes=$(kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get nodes | awk -v OFS='\t' 'FNR == 2{print $3}') - echo $get_nodes + #echo $get_nodes # if it's ready, get_nodes should contain 'control-plane', otherwise keep looping if [ $get_nodes == 'control-plane' ]; then control_plane_ready=true fi - echo $control_plane_ready + #echo $control_plane_ready + echo not ready yet, waiting ... done - +echo Nodes responding, installing control layer components # start installing the control layer components @@ -158,7 +175,7 @@ kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f calico.yaml # create ceph secret before we build our worker nodes; # config will use this to kernel mount our ceph shares -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f cephx-secret.yaml +#kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f cephx-secret.yaml kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig create secret -n kube-system generic cloud-config --from-file=cloud.conf=${CLUSTER_CREDENTIAL_FILE} @@ -170,9 +187,10 @@ echo waiting for cluster completion cluster_status='False' until [ $cluster_status == 'True' ]; do + echo polling cluster status ... sleep 60 cluster_status=$( clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | awk -v OFS='\t' 'FNR == 2{print $2}' ) - echo $cluster_status + #echo Cluster status: $cluster_status done echo Cluster creation complete @@ -215,15 +233,18 @@ echo Manila installation complete fi # TODO - wait for our workers to become available? +# at this point we should have a functional k8s cluster +# but it might take some time for all the workers to become available +# or never, if we asked for too many machines ... 
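# (added note, not part of the original script) if the requested machine count can never be
# satisfied, the loop below will spin forever; a simple guard would be to count polls and give
# up after a limit, e.g. inside the loop:
#   polls=$((polls+1)); [ $polls -ge 120 ] && { echo "gave up waiting for workers"; exit 1; }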
-echo Looping till workers are available +echo Looping till all workers are available worker_nodes_status='False' until [ $worker_nodes_status == 'True' ]; do sleep 60 worker_nodes_status=$(clusterctl describe cluster ${CLUSTER_NAME} --grouping=false | grep -E "MachineDeployment" | awk -v OFS='\t' '{print $2}') - echo $worker_nodes_status + echo worker status: $worker_nodes_status done - +echo Cluster $CLUSTER_NAME creation complete From 6f2511f2547c1bac8c45bddd3797662cc8f2cbec Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Tue, 21 Jan 2025 15:51:04 +0000 Subject: [PATCH 07/12] Added config file Environment variables for the cluster building script Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/ClusterAPIScripts/cluster_config.rc | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 notes/millingw/ClusterAPIScripts/cluster_config.rc diff --git a/notes/millingw/ClusterAPIScripts/cluster_config.rc b/notes/millingw/ClusterAPIScripts/cluster_config.rc new file mode 100644 index 00000000..71a5d08d --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/cluster_config.rc @@ -0,0 +1,12 @@ +# TODO read this all from a yaml config file, instead of specifying it all here! +export KUBECONFIG=/home/rocky/openstack/k8sdir/config +export CLUSTER_NAME=iris-gaia-red-ceph +export CLUSTER_SPECIFICATION_FILE=capi-iris-gaia-red-ceph-file-test.yaml +export CLUSTER_CREDENTIAL_FILE=appcred-iris-gaia-red-fixed-bootstrap.conf +export CINDER_SECRETS_FILE=cinder-values.yaml + +USE_MANILA=true +MANILA_PROTOCOLS_FILE=./manila-csi-kubespray/values.yaml +MANILA_SECRETS_FILE=./manila-csi-kubespray/secrets.yaml +MANILA_STORAGE_CLASS_FILE=./manila-csi-kubespray/sc.yaml +DEFAULT_STORAGE_CLASS=manila From 204fd76f589e3e18edfd01dd761f7ba2ef6ad67e Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Tue, 21 Jan 2025 16:50:45 +0000 Subject: [PATCH 08/12] Updated with script config notes Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/ClusterAPIScripts/Readme.MD | 44 +++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/notes/millingw/ClusterAPIScripts/Readme.MD b/notes/millingw/ClusterAPIScripts/Readme.MD index 6d065a69..5896654e 100644 --- a/notes/millingw/ClusterAPIScripts/Readme.MD +++ b/notes/millingw/ClusterAPIScripts/Readme.MD @@ -1 +1,43 @@ -### Placeholder for example ClusterAPI related scripts and things +## ClusterAPI build scripts + +Building a cluster involves multiple steps and lots of configuration files. +Each site that we deploy to is likely to have different storage configurations, networks, credentials +Here I am trying to collect together the set of config files for each site that we are deploying to, and using a single deployment script, build_my_cluster.sh +build_my_cluster.sh assumes that all preparatory work has already been done, ie a management cluster has been created, compatible ClusterAPI images have been created and tested in the target OpenStack environments, and a cluster template has been generated. 
+The following tools must be installed prior to running the script: kubectl, clusterctl, openstack cli + +The script reads a config file, which sets all the necessary environment variables that the script expects: + +export KUBECONFIG= +export CLUSTER_NAME= +export CLUSTER_SPECIFICATION_FILE= +export CLUSTER_CREDENTIAL_FILE= +export CINDER_SECRETS_FILE= Date: Tue, 21 Jan 2025 17:03:07 +0000 Subject: [PATCH 09/12] Markdown formatting Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/ClusterAPIScripts/Readme.MD | 25 +++++++++++----------- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/notes/millingw/ClusterAPIScripts/Readme.MD b/notes/millingw/ClusterAPIScripts/Readme.MD index 5896654e..5406db82 100644 --- a/notes/millingw/ClusterAPIScripts/Readme.MD +++ b/notes/millingw/ClusterAPIScripts/Readme.MD @@ -2,36 +2,37 @@ Building a cluster involves multiple steps and lots of configuration files. Each site that we deploy to is likely to have different storage configurations, networks, credentials -Here I am trying to collect together the set of config files for each site that we are deploying to, and using a single deployment script, build_my_cluster.sh +Here I am trying to collect together the set of config files for each site that we are deploying to, and using a single deployment script, build_my_cluster.sh, to make deployment a bit less manual. build_my_cluster.sh assumes that all preparatory work has already been done, ie a management cluster has been created, compatible ClusterAPI images have been created and tested in the target OpenStack environments, and a cluster template has been generated. The following tools must be installed prior to running the script: kubectl, clusterctl, openstack cli The script reads a config file, which sets all the necessary environment variables that the script expects: +``` export KUBECONFIG= export CLUSTER_NAME= export CLUSTER_SPECIFICATION_FILE= export CLUSTER_CREDENTIAL_FILE= export CINDER_SECRETS_FILE= Date: Wed, 22 Jan 2025 11:22:04 +0000 Subject: [PATCH 10/12] More tidying up of notes Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/DeployClusterAPI.md | 230 +++++++++-------------------- 1 file changed, 71 insertions(+), 159 deletions(-) diff --git a/notes/millingw/DeployClusterAPI.md b/notes/millingw/DeployClusterAPI.md index c696795b..f79ce005 100644 --- a/notes/millingw/DeployClusterAPI.md +++ b/notes/millingw/DeployClusterAPI.md @@ -62,11 +62,13 @@ This turns our starting magnum-created kubernetes cluster into a ClusterAPI mana clusterctl init --infrastructure openstack ``` -Our cluster on Somerville is now our management cluster. +Our cluster on Somerville is now our management cluster. We can use it to deploy and manage Kubernetes clusters on multiple OpenStack sites. +If the management cluster is accidentally deleted, then our worker clusters become independent and will still work, but won't be manageable via ClusterAPI. # Build CAPI image in target OpenStack environment: -Next, we need to build a control image in our target OpenStack environment +Next, we need to build a control image in our target OpenStack environment. The management cluster will use this image to create clusters in the target project. +Prerequisites are an existing Ubuntu image in the target project, and OpenStack credentials with project-level permissions for the target project.
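Before starting a build it is worth confirming those prerequisites with the openstack CLI; the sketch below assumes a clouds.yaml entry named "iris-gaia-red" and reuses the image and flavour names that appear elsewhere in these notes purely as examples.

```
# check the source Ubuntu image and the build flavour exist in the target project
openstack --os-cloud iris-gaia-red image show Ubuntu-Jammy-22.04-20240514
openstack --os-cloud iris-gaia-red flavor show gaia.vm.cclake.26vcpu
```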
Install Packer on command/control VM: @@ -102,7 +104,7 @@ packer init reqs-build.pkr.hcl create packer_var_file.json, edited for arcus red project -Note that I had to add packer_build_ingest security group to arcus project to allow ssh access for packer to build image +Note that I had to add a "packer_build_ingest" security group to the arcus project to allow ssh access for packer to build the image "networks" is existing router in OpenStack project, did not have to create this CUDN-Internet is existing floating ip pool name in gaia red project Had to work out flavor and image name from looking at options in the arcus gaia red OpenStack project and doing some trial VM creations to get good combinations @@ -169,7 +171,7 @@ Notes assume server certificates saved to arcus-openstack-hpc-cam-ac-uk.pem Create environment variable script for configuring clusterctl deployment. Note that a value must be supplied for OPENSTACK_DNS_NAMESERVERS must be supplied for the config file generation; however, it may be necessary to edit or delete this from the generated config file (see below). -(We've seen that on Arcus the value is ignored, but on BSC it is used directly) +(We've seen that on Arcus the value is ignored, but on BSC it is used directly and messes things up) ``` capi-arcus-red-vars.sh: @@ -216,13 +218,13 @@ export KUBECONFIG=/home/rocky/openstack/k8sdir/config ## Create ClusterAPI config -# generate a template file for the new cluster using the environment variables we set -# capi-red.yaml will be an openstack-specific, project specific template file for building a new k8s cluster -# this does not actually create a cluster, just a new template for building a cluster +generate a template file for the new cluster using the environment variables we set +capi-red.yaml will be an openstack-specific, project specific template file for building a new k8s cluster +Note this does not actually create a cluster, just a new template for building a cluster clusterctl generate cluster iris-gaia-red > capi-red.yaml -Note that we can't check the generated yaml file into public github, as it contains (base64-encoded) access credentials for OpenStack +Warning! Note that we can't check the generated yaml file into public github, as it contains (base64-encoded) access credentials for OpenStack The DNS configuration isn't required although the generate script insists that the environment variable is set. You can remove the dns server reference from the config yaml ("dnsNameservers", see below), if not required. (See above note about BSC) @@ -376,12 +378,12 @@ Watch progress clusterctl describe cluster ${CLUSTER_NAME} ``` -The cluster initialises with no available storage classes, therefore applications cannot immediately be deployed. +The cluster initialises with no available storage classes, therefore applications cannot immediately be deployed. +We assume OpenStack systems will always provide a Cinder storage service, so install the Cinder storage driver into our new cluster. # Install cinder driver Install the cinder helm chart - Edit cinder-values.yaml to match our deployed cluster. We point it at the secret we already created during the calico installation ``` @@ -419,6 +421,10 @@ Note: it should be possible to automate this through the ClusterAPI template, bu # mount data shares At this point our cluster is ready to use. However, we need to be able to access the GAIA DR3 (and potentially other) data from our services. 
+ +If we used a pre-existing network already configured to use the site-specific storage service network, and configured mount instructions in the worker template, then we shouldn't have anything further to do to access the data. Otherwise, we have some work to do in configuring routers and manually mounting services. + +The following instructions are for Arcus. Other sites will have different requirements. On the arcus deployment, data is held in a separate project ("iris-gaia-data") within the same physical hardware. In the Horizon GUI, select iris-gaia-data in the project list, then navigate to "shares". Identify the required data share, and note the share path and the associated cephx access rule and key. @@ -459,7 +465,46 @@ Filesystem 10.4.200.9:6789,10.4.200.13:6789,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/fa5309a4-1b69-4713-b298-c8d7a479f86f/d53177c6-c45c-4583-9947-d50ab931445c 10G 0 10G 0% /mnt/cephfs ``` -Note to self - write a script to automate the above! +Doing this for each machine in our cluster is clearly not ideal. The ClusterAPI template allows us to specify extended configuration information as follows. +Here, before worker machines join our cluster, we install and configure ceph, and create keyring files for our shares, and create mount entries in /etc/fstab +Then, we force a remount as the worker joins the cluster. (This does assume the ceph network has already been configured, otherwise the worker will likely fail). + +``` +kind: KubeadmConfigTemplate +metadata: + name: iris-gaia-red-ceph-md-0 + namespace: default +spec: + template: + spec: + mounts: [] + preKubeadmCommands: ["apt-get update;", "apt-get install ceph-common -y;", "mkdir -p /mnt/kubernetes_scratch_share", "echo 10.4.200.9:6789,10.4.200.13:67 +89,10.4.200.17:6789,10.4.200.25:6789,10.4.200.26:6789:/volumes/_nogroup/280b44fc-d423-4496-8fb8-79bfc1f58b97/35e407e9-a34b-4c64-b480-3380002d64f8 /mnt/kubernet +es_scratch_share ceph name=kubernetes-scratch-share,noatime,_netdev 0 2 >> /etc/fstab"] + files: + - path: /etc/ceph/ceph.conf + content: | + [global] + fsid = a900cf30-f8a3-42bf-98d6-af7ce92f1a1a + mon_host = [v2:10.4.200.13:3300/0,v1:10.4.200.13:6789/0] [v2:10.4.200.9:3300/0,v1:10.4.200.9:6789/0] [v2:10.4.200.17:3300/0,v1:10.4.200.17:6789/0 +] [v2:10.4.200.26:3300/0,v1:10.4.200.26:6789/0] [v2:10.4.200.25:3300/0,v1:10.4.200.25:6789/0] + + - path: /etc/ceph/ceph.client.kubernetes-scratch-share.keyring + content: | + [client.kubernetes-scratch-share] + key = REDACTED + + postKubeadmCommands: ["sudo mount -a"] + + joinConfiguration: + nodeRegistration: + kubeletExtraArgs: + cloud-provider: external + provider-id: openstack:///'{{ instance_id }}' + name: '{{ local_hostname }}' +``` + +(It should be possible to configure other storage types, such as nfs, in a similar fashion) Now that all our workers have the data share mounted, we can access it via a hostPath mount from our pods, eg @@ -481,7 +526,7 @@ The (read-only) DR3 data should now be accessible in the pod at /mnt/dr3_data_sh ## rescale cluster -The management cluster is used to view active workers and rescale a running worker cluster, via the machinedeployments class. +The management cluster can be used to view active workers and rescale a running worker cluster, via the machinedeployments class. e.g. 
``` @@ -490,12 +535,16 @@ NAME CLUSTER REPLICAS READY UPDATED UNAV bsc-gaia-md-0 bsc-gaia 3 3 3 0 Running 25h v1.30.2 iris-gaia-red-ceph-md-0 iris-gaia-red-ceph 4 4 4 0 Running 22d v1.30.2 iris-gaia-red-demo-md-0 iris-gaia-red-demo 7 7 7 0 Running 6d2h v1.30.2 +``` -$ kubectl scale machinedeployment iris-gaia-red-demo-md-0 --replicas=9 +Increase number of workers for one of our clusters +``` +$ kubectl scale machinedeployment iris-gaia-red-demo-md-0 --replicas=9 ``` -Note that with our current deployment, new VMs will not automatically get the ceph mounts. This will require manual intervention to perform the ceph configuration +If we specified the storage mounts in our cluster template, then these should automatically be applied when the new worker joins the cluster. +However, if we created the mounts manually, this will need to be repeated manually for the new worker. # Deleting a cluster @@ -547,12 +596,11 @@ The deployed clusters will still function independently, assuming we have their However, we should do everything to avoid this happening ... -## Ceph and Manila CSI configuration - -Warning! Work in progress from this point ... +## Manila configuration +On Arcus and Somerville we have access to a Manila service. This effectively acts as a higher level storage service, and supports multiple protocols. +Currently these sites are configured to support ceph via Manila, so we can install the manila storage driver into our cluster. - -# install the ceph csi driver +# First install the ceph csi driver as manila will need it # followed notes at https://gitlab.developers.cam.ac.uk/pfb29/manila-csi-kubespray ``` @@ -629,6 +677,9 @@ parameters: kubectl apply --kubeconfig=./${CLUSTER_NAME}.kubeconfig -f sc.yaml ``` +We now have the manila storage driver installed. +We can make this the default storage class, so any user volumes are automatically created as ceph shares instead of cinder volumes + # make manila the default storage class ``` @@ -644,150 +695,11 @@ csi-cinder-sc-retain cinder.csi.openstack.org Retain csi-manila-cephfs (default) cephfs.manila.csi.openstack.org Delete Immediate false 5d5 ``` -# test access to cephfs service -In Horizon GUI, manually create a share. Create a cephx access rule, then copy the access key and full storage path - -Create a secret containing the access key - -ceph-secret.yaml -``` -apiVersion: v1 -kind: Secret -metadata: - name: ceph-secret -stringData: - key: **** -``` -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f ceph-secret.yaml - -Create a test pod that mounts the ceph share as a volume. 
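The exact command used to change the default is elided in the hunk above; the usual approach is to set the `storageclass.kubernetes.io/is-default-class` annotation on the manila class (and clear it from any class that currently carries it), roughly as sketched below using the class name from the listing.

```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig patch storageclass csi-manila-cephfs \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```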
The ceph share path needs to be separated into a list of monitor addresses and the relative path, eg - -pod.yaml - -``` ---- -apiVersion: v1 -kind: Pod -metadata: - name: test-cephfs-share-pod -spec: - containers: - - name: web-server - image: nginx - imagePullPolicy: IfNotPresent - volumeMounts: - - name: testpvc - mountPath: /var/lib/www - - name: cephfs - mountPath: "/mnt/cephfs" - volumes: - - name: testpvc - persistentVolumeClaim: - claimName: test-cephfs-share-pvc - readOnly: false - - name: cephfs - cephfs: - monitors: - - 10.4.200.9:6789 - - 10.4.200.13:6789 - - 10.4.200.17:6789 - - 10.4.200.25:6789 - - 10.4.200.26:6789 - secretRef: - name: ceph-secret - readOnly: false - path: "/volumes/_nogroup/ca890f73-3e33-4e07-879c-f7ec0f5a8a17/52bcd13b-a358-40f0-9ffa-4334eb1e06ae" -``` - -Example uses nginx, so install that: - -``` -helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig nginx bitnami/nginx -``` - -deploy the pod -``` -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f manila-csi-kubespray/pod.yaml -``` - -Inspect the pod to verify that the ceph share was successfully mounted - -# test jhub deployment, check where user areas get created - -deploy jhub, check where user area is created - -``` -helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/ -helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig upgrade --install jhub jupyterhub/jupyterhub --version=3.3.8 -``` - -# port forward on control VM -``` -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace=default port-forward service/proxy-public 8080:http -``` - -# port forward on laptop: -ssh -i "gaia_jade_test_malcolm.pem" -L 8080:127.0.0.1:8080 rocky@192.41.122.174 -browse to 127.0.0.1:8080 and login, eg as user 'hhh' - -# on control VM, list pvs/pvcs -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pv -NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE 6h56m -pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO Delete Bound default/claim-hhh csi-manila-cephfs 6h51m -pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO Delete Bound default/hub-db-dir csi-manila-cephfs - -kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pvc -NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE -claim-hhh Bound pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO csi-manila-cephfs 6h52m -hub-db-dir Bound pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO csi-manila-cephfs 6h58m - -## Thoughts on automation and migration - -Each system that we deploy to will have different networking setup, storage services, image names, machine flavour. -Each system requires that a ClusterAPI image be built in that system from an Ubuntu image already present in that system. -For each system, we generate a configuration file using clusterctl generate. -Getting a working generation image and working combinations of images / flavours likely to be a trial and error process, little prospect for automation -Once we have a working template for a given site, that template can be reused for that site, but that site only. -Given a particular site with a working template, it should be possibe to automate creation of a cluster at that site. -Each site will require specific post-creation configuration, e.g. ceph mounts on Arcus, nfs(?) mounts on BSC - -Manual stages: -Install packer, clusterctl, server certificates etc. -Manually build / test image in target environment, get working combinations of flavours and boot disk sizes. -Generate template file, adjust any arguments. 
-Once we've got this far, can automate using the template. -Note that we can't check templates into a repo, as they contain security information - -Automated stages: - -kubectl apply template file -clusterctl describe until ready -get kubeconfig file -apply calico -use openstack to lookup network id for new network (how do we get cluster name? from environment variable?) -build application secret conf file -build secret in target environment -complete setup -install cinder storage classes +At this point our new cluster should be ready to accept kubernetes services in the normal fashion, using the KUBECONFIG file that was generated during the cluster creation. -do site-specific post-installation: -get list of worker names via kubectl get nodes -install ceph client on each worker node -configure ceph on each worker node -- mount ceph shares on Arcus. need list of shares to mount, lookup keys and create share mount on each worker VM -- attach shared volumes on Somerville, BSC? ) -- modify /etc/fstab rather than configuring from directory? -Things to try: -Automatic configuration of ceph network on arcus -attach manila shares to pod instead of using ceph mounts (wont be available at every site) -Generic scripts: -lookup network id, build conf file -lookup keys for ceph shares -install list of ceph shares on VMs -get list of worker node names and ip addresses From 1bdc69aa4db237478ca4b3f4b55ef88e4b4675fb Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Wed, 22 Jan 2025 11:25:54 +0000 Subject: [PATCH 11/12] Added notes for manila test Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- .../20250122-manila-test.txt | 102 ++++++++++++++++++ 1 file changed, 102 insertions(+) create mode 100644 notes/millingw/ClusterAPIScripts/20250122-manila-test.txt diff --git a/notes/millingw/ClusterAPIScripts/20250122-manila-test.txt b/notes/millingw/ClusterAPIScripts/20250122-manila-test.txt new file mode 100644 index 00000000..56a8ef85 --- /dev/null +++ b/notes/millingw/ClusterAPIScripts/20250122-manila-test.txt @@ -0,0 +1,102 @@ +# test access to cephfs service +We should be able to access ceph shares directly in a pod. +However, as of 2025-01-22 this wasn't working! + +In Horizon GUI, manually create a share. Create a cephx access rule, then copy the access key and full storage path + +Create a secret containing the access key + +ceph-secret.yaml +``` +apiVersion: v1 +kind: Secret +metadata: + name: ceph-secret +stringData: + key: **** +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f ceph-secret.yaml + +Create a test pod that mounts the ceph share as a volume. 
The ceph share path needs to be separated into a list of monitor addresses and the relative path, eg + +pod.yaml + +``` +--- +apiVersion: v1 +kind: Pod +metadata: + name: test-cephfs-share-pod +spec: + containers: + - name: web-server + image: nginx + imagePullPolicy: IfNotPresent + volumeMounts: + - name: testpvc + mountPath: /var/lib/www + - name: cephfs + mountPath: "/mnt/cephfs" + volumes: + - name: testpvc + persistentVolumeClaim: + claimName: test-cephfs-share-pvc + readOnly: false + - name: cephfs + cephfs: + monitors: + - 10.4.200.9:6789 + - 10.4.200.13:6789 + - 10.4.200.17:6789 + - 10.4.200.25:6789 + - 10.4.200.26:6789 + secretRef: + name: ceph-secret + readOnly: false + path: "/volumes/_nogroup/ca890f73-3e33-4e07-879c-f7ec0f5a8a17/52bcd13b-a358-40f0-9ffa-4334eb1e06ae" +``` + +Example uses nginx, so install that: + +``` +helm install --kubeconfig=./${CLUSTER_NAME}.kubeconfig nginx bitnami/nginx +``` + +deploy the pod +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig apply -f manila-csi-kubespray/pod.yaml +``` + +Inspect the pod to verify that the ceph share was successfully mounted + +# test jhub deployment, check where user areas get created + +deploy jhub, check where user area is created + +``` +helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/ +helm --kubeconfig=./${CLUSTER_NAME}.kubeconfig upgrade --install jhub jupyterhub/jupyterhub --version=3.3.8 +``` + +# port forward on control VM +``` +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig --namespace=default port-forward service/proxy-public 8080:http +``` + +# port forward on laptop: +ssh -i "gaia_jade_test_malcolm.pem" -L 8080:127.0.0.1:8080 rocky@192.41.122.174 +browse to 127.0.0.1:8080 and login, eg as user 'hhh' + +# on control VM, list pvs/pvcs +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pv +NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE 6h56m +pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO Delete Bound default/claim-hhh csi-manila-cephfs 6h51m +pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO Delete Bound default/hub-db-dir csi-manila-cephfs + +kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pvc +NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE +claim-hhh Bound pvc-8b970f5c-440b-48f8-ae19-4fb35d20e85f 10Gi RWO csi-manila-cephfs 6h52m +hub-db-dir Bound pvc-7d104b45-7efe-4250-b9fe-5bf441eb65a9 1Gi RWO csi-manila-cephfs 6h58m + + + From 531ffd8309479715c8c669de1d518dd3e83a43ad Mon Sep 17 00:00:00 2001 From: millingw <13414895+millingw@users.noreply.github.com> Date: Wed, 22 Jan 2025 11:28:12 +0000 Subject: [PATCH 12/12] Update Readme.MD Signed-off-by: millingw <13414895+millingw@users.noreply.github.com> --- notes/millingw/ClusterAPIScripts/Readme.MD | 2 ++ 1 file changed, 2 insertions(+) diff --git a/notes/millingw/ClusterAPIScripts/Readme.MD b/notes/millingw/ClusterAPIScripts/Readme.MD index 5406db82..a11aafbe 100644 --- a/notes/millingw/ClusterAPIScripts/Readme.MD +++ b/notes/millingw/ClusterAPIScripts/Readme.MD @@ -38,6 +38,8 @@ Cluster creation can be monitored with clusterctl, ie clusterctl describe cluste Note that a cluster may be ready for use before all workers are ready; the script may loop indefinitely if the target project can't provide the requested number of workers. +On successfull completion of the script, a KUBECONFIG file should be output that can be used to install services on the newly created cluster. 
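A quick way to confirm the generated KUBECONFIG is usable is to point kubectl at it and list the nodes; this sketch assumes the file is named ${CLUSTER_NAME}.kubeconfig as elsewhere in these notes.

```
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get nodes
kubectl --kubeconfig=./${CLUSTER_NAME}.kubeconfig get pods -A
```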
+ The resulting cluster and KUBECONFIG file can then be used to install kubernetes services in the usual fashion. The intention is to maintain a set of production scripts for each deployment site, with a separate master configuration file for each site to be sourced by the build script.