Kubernetes on Hetzner with CRI-O, Flannel and Hetzner balancer #1043

Status: Open. Wants to merge 30 commits into base: master

Conversation

@DTiunov commented Jan 16, 2025

I have read and understood the Contributor's Certificate of Origin available at the end of
https://raw.githubusercontent.com/hetzneronline/community-content/master/tutorial-template.md
and I hereby certify that I meet the contribution criteria described in it.
Signed-off-by: Dmitry Tiunov [email protected]

@svenja11 added the review wanted label on Jan 17, 2025
@svenja11 added the needs action label and removed the review wanted label on Jan 20, 2025
@DTiunov (Author) left a comment

Done

@svenja11 added the review wanted label and removed the needs action label on Jan 24, 2025
@svenja11 (Collaborator) commented:

I tested your tutorial and it worked fine up until Flannel. With Flannel I get the following error:

Error registering network: failed to acquire lease: node "ubuntu-4gb-hel1-1" pod cidr not assigned

Could you please check if something is missing in the files (values.yaml / InitConfiguration.yaml / JoinConfiguration.yaml)?
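
For reference, whether the node actually received a pod CIDR can be checked directly (assuming kubectl is already configured against the cluster); empty output here matches the Flannel error above:

# Show each node and its assigned pod CIDR (empty output = "pod cidr not assigned")
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'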

@svenja11 added the needs action label and removed the review wanted label on Jan 27, 2025
@DTiunov (Author) commented Jan 28, 2025

@svenja11 Fixed. The missing piece was networking.podSubnet: 10.244.0.0/16 in InitConfiguration.yaml.
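
For reference, the networking section of the ClusterConfiguration part of InitConfiguration.yaml now contains the pod subnet matching Flannel's default (excerpt from the file shown later in this thread):

networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16     # must match the pod network Flannel is deployed with
  serviceSubnet: 10.96.0.0/12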

@svenja11 added the review wanted label and removed the needs action label on Jan 28, 2025
@svenja11 (Collaborator) commented Jan 29, 2025

@DTiunov Thank you for the quick fix. I just tested it again, but now the API server seems to fail and it gets stuck during initialization (kubeadm init --config InitConfiguration.yaml). Could you look into this too please?

@DTiunov (Author) commented Jan 30, 2025

> @DTiunov Thank you for the quick fix. I just tested it again, but now the API server seems to fail and it gets stuck during initialization (kubeadm init --config InitConfiguration.yaml). Could you look into this too please?

I just checked this. It works fine on a new node. Here is the tail of the kubeadm init output, my InitConfiguration.yaml, and the last commands I ran:

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.10.0.1:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:0c0a1c731aaac6b5e66b2f52d8a4e7ac83b4bdb1f90fb4cb83d6b606b7cf19fc
root@ubuntu-4gb-hel1-1:~# cat InitConfiguration.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.0.1
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    cloud-provider: external
    node-ip: 10.10.0.1
  imagePullPolicy: IfNotPresent
  name: k8s-m-1
  taints: null
---
apiServer:
  certSANs: ['k8s-m-1.example.com', '10.10.0.1']
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
root@ubuntu-4gb-hel1-1:~# history | tail -n 5
   17  kubeadm init --config InitConfiguration.yaml
   18  nano InitConfiguration.yaml
   19  kubeadm init --config InitConfiguration.yaml
   20  cat InitConfiguration.yaml
   21  history | tail -n 5
root@ubuntu-4gb-hel1-1:~#

Could you send me your InitConfiguration.yaml file?
Did you reinitialize the cluster on a new node, or did you reuse the previous one? (If the node was reused, see the cleanup sketch below.)
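
If the previous node was reused, a rough cleanup before re-running kubeadm init might look like this (a sketch based on kubeadm's own reset hints; exact steps may vary):

# Sketch: clean up a previously initialized node before retrying (adjust as needed)
kubeadm reset -f                 # tear down the previous control-plane state
rm -rf /etc/cni/net.d            # kubeadm reset does not remove CNI configuration
rm -f $HOME/.kube/config         # drop the stale kubeconfig from the old cluster
systemctl restart crio kubelet   # restart the container runtime and the kubelet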

@svenja11 (Collaborator) commented:

I tested it on a new node with Ubuntu 24.04

Contents of InitConfiguration.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: <my_token>
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.0.0.2
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/crio/crio.sock
  kubeletExtraArgs:
    cloud-provider: external
    node-ip: 10.0.0.2
  imagePullPolicy: IfNotPresent
  name: example-server
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs: ['<my_domain>', '10.0.0.2']
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.30.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}

I also double-checked my domain.

Domain check:
root@example-server:~# hostname -I
203.0.113.1 10.0.0.2 2001:db8:5678::1

root@example-server:~# dig <my_domain>
;; ANSWER SECTION:
<my_domain>.            3600    IN      A       203.0.113.1
Output of kubeadm init --config InitConfiguration.yaml:
root@example-server:~# kubeadm init --config InitConfiguration.yaml
[init] Using Kubernetes version: v1.30.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0130 09:19:59.621179    1188 checks.go:844] detected that the sandbox image "registry.k8s.io/pause:3.10" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [example-server kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local <my_domain>] and IPs [10.96.0.1 1.0.0.2 10.0.0.2]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [example-server localhost] and IPs [1.0.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [example-server localhost] and IPs [1.0.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.179187ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is not healthy after 4m0.000650453s

Unfortunately, an error has occurred:
        context deadline exceeded

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///var/run/crio/crio.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///var/run/crio/crio.sock logs CONTAINERID'
error execution phase wait-control-plane: could not initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Output of kubectl get pods -n kube-system:
root@example-server:~# kubectl get pods -n kube-system
E0130 09:26:48.726175    2060 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0130 09:26:48.726836    2060 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0130 09:26:48.728554    2060 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0130 09:26:48.729130    2060 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0130 09:26:48.730727    2060 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?

root@example-server:~# systemctl status etcd
Unit etcd.service could not be found.
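
Note: with kubeadm, etcd runs as a static pod managed by the kubelet rather than as a systemd unit, so a missing etcd.service is expected. Assuming CRI-O as the runtime (as above), the etcd container and its logs would be inspected with crictl instead, for example:

# etcd is a static pod under kubeadm, so check it through the CRI rather than systemd
crictl --runtime-endpoint unix:///var/run/crio/crio.sock ps -a | grep etcd
crictl --runtime-endpoint unix:///var/run/crio/crio.sock logs <etcd-container-id>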

If I run this afterwards and initialize without the file, it doesn't fail:

kubeadm reset
kubeadm init
Output of kubeadm reset and kubeadm init:
root@example-server:~# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0130 09:31:16.325148    2105 reset.go:124] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://1.0.0.2:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0130 09:31:16.325264    2105 preflight.go:56] [reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0130 09:31:17.912428    2105 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.



root@example-server:~# kubeadm init
I0130 09:32:09.323043    2184 version.go:256] remote version is much newer: v1.32.1; falling back to: stable-1.30
[init] Using Kubernetes version: v1.30.9
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0130 09:32:09.780740    2184 checks.go:844] detected that the sandbox image "registry.k8s.io/pause:3.10" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "registry.k8s.io/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [example-server kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 203.0.113.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [example-server localhost] and IPs [203.0.113.1 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [example-server localhost] and IPs [203.0.113.1 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 500.95601ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 3.502226014s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node example-server as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node example-server as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: <my_token>
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 203.0.113.1:6443 --token <my_token> \
        --discovery-token-ca-cert-hash sha256:0c0a1c731aaac6b5e66b2f52d8a4e7ac83b4bdb1f90fb4cb83d6b606b7cf19fc
root@example-server:~#

@svenja11 added the needs action label and removed the review wanted label on Jan 30, 2025
@DTiunov (Author) commented Jan 31, 2025

> I tested it on a new node with Ubuntu 24.04
>
> kind: InitConfiguration
> localAPIEndpoint:
>   advertiseAddress: 1.0.0.2
>   bindPort: 6443
Tested. It works fine for me. You could try setting localAPIEndpoint.advertiseAddress to your local IP, 10.0.0.2 (see the sketch below).
I have added a description of this to the tutorial.
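
With that change, the localAPIEndpoint block would look roughly like this (a sketch using the values from your config; everything else stays the same):

# localAPIEndpoint in InitConfiguration.yaml (sketch, values taken from this thread)
localAPIEndpoint:
  advertiseAddress: 10.0.0.2   # the node's private IP (instead of 1.0.0.2)
  bindPort: 6443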

Labels: needs action (Something has to be updated)
2 participants