k3s 1.18.8 pods stuck in CrashLoopBackOff #2158
What do you see if you describe the pod?

---
Below is the output when describing the coredns pod.
The logging of the coredns pod shows the following:
All other pods have the same type of event:
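(For reference, this kind of output can be gathered with standard kubectl commands; the pod name below is illustrative.)

```bash
# List all pods and their restart counts
k3s kubectl get pods --all-namespaces

# Show the events for the failing coredns pod (pod name is illustrative)
k3s kubectl -n kube-system describe pod coredns-66f496764-xyz12
```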
It seems that containerd is not able to create the container?

---
@vvanouytsel given your installation method I suspect that you do not have the SELinux policy packages installed.
So, either disable SELinux (e.g. by setting it permissive), or install the required policy packages before starting k3s.
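A minimal sketch of the two options on CentOS, assuming the package names from the k3s docs (the k3s-selinux RPM version and URL shown are illustrative):

```bash
# Option 1: put SELinux in permissive mode (quick test, not recommended long-term)
setenforce 0

# Option 2: install the SELinux policy packages before starting k3s
# (k3s-selinux comes from the Rancher RPM repository, rpm.rancher.io;
#  the exact URL/version below is illustrative)
yum install -y container-selinux selinux-policy-base
yum install -y https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
```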
---
I've manually installed the following packages.
I've deleted the coredns pod and it still throws the following error:
I've also tried the k3s install script on a clean CentOS 7.8 vagrant box, and that worked perfectly.

---
After some more testing I can confirm that everything works fine as long as I install the SELinux packages before running `k3s server`.
On the broken system where I installed the selinux packages after I already started k3s, I had to do the following:
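(The exact commands were trimmed from this transcript; based on the restorecon step described below, the recovery looked roughly like this:)

```bash
# Relabel the k3s binary and data directory now that the policy is installed
restorecon -R -v /usr/local/bin/k3s /var/lib/rancher

# Restart k3s so the pods are recreated with the correct context
systemctl restart k3s
```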
Containerd is now able to successfully start the containers.
---
If you'd installed from the script, it would have checked for this on install to ensure that you got the package in before starting k3s for the first time. It would also have dropped a k3s-killall.sh that you could have used to terminate the pods so that they could be recreated with the correct context. Although in this case, I suspect that just restarting the node (which would have recreated the pods) would have fixed your issue as well.

---
@vvanouytsel it looks like your issue has been solved then per above? If so, I will close this out. FWIW, I recreated the problem you had and came to the same resolution as the workaround. As @brandond mentioned, the simpler install steps would just be to install k3s from the script:
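(i.e. the standard one-liner from the k3s docs:)

```bash
curl -sfL https://get.k3s.io | sh -
```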
If you run the install script before installing selinux policy, you'll get an error like:
---
@brandond restarting the node does not solve the issue.
After restarting k3s, the pods still stay in `CrashLoopBackOff`.
The logs of the pods still show the same error.
---
After installing the SELinux packages I ran restorecon.
When restarting k3s I found the following SELinux denials in the audit.log file.
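For anyone following along, the denials can be pulled out of the audit log like this (assuming the audit tools are installed):

```bash
# Show recent AVC denials
ausearch -m avc -ts recent

# Or grep the raw log directly
grep denied /var/log/audit/audit.log
```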
---
When further investigating the broken system I can see that containerd tries to start a container with the `docker.io/rancher/pause` image.
We can see that the image maps to the `/pause` binary.
When matching this with my previous comment, it seems that something SELinux-related is not set up correctly yet for that image.
---
When looking through journalctl we can also see that the SELinux error is related to the `/pause` file.
---
I was able to work around the SELinux problem related to the `/pause` file, which is used in the `docker.io/rancher/pause` image, by running the following:
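(The exact command was trimmed from this transcript; the usual audit2allow custom-module workaround, also described in later comments, looks like this. The module name is arbitrary.)

```bash
# Build a local policy module from the 'pause' denials and load it
grep pause /var/log/audit/audit.log | audit2allow -M k3spause
semodule -i k3spause.pp
```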
Although I am not sure why I had to do this manually after installing the following required SELinux packages...
---
Just to make this issue complete, it seemed that there were 2 events related to the `/pause` file:
---
As you've noticed, it's a fair bit of work to try to repair a system that's been brought up without the correct packages and policies available. It's even more work to try to do it without simply deleting everything and starting over. @davidnuzik I suggest that we handle this as a documentation issue - the install script checks for this and prevents users from starting k3s without the selinux policies in place, but users that want to drop the binary directly without using the script (or RPM) should be responsible for ensuring that this is done themselves.

---
I am also able to reproduce SELinux issues when defining a custom 'data-dir' path.
In journalctl the following error message is shown related to SELinux.
Adding a custom module for all 'pause' SELinux alerts works around the issue.
After solving the 'pause' issue, the next SELinux related alert pops up.
Again, by creating a custom module you can work around the issue.
After these two manual actions, k3s is running properly with a custom data directory.

---
The install script does not validate or take any action on the flags; they're passed through to the systemd/openrc service config as-is. Setting the correct context on the data-dir (whether default or custom) would definitely be part of the documentation for manual installs with selinux.
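A sketch of that context fix, assuming a hypothetical custom data dir of /data/k3s (an SELinux equivalence rule plus a relabel):

```bash
# Label the custom data dir the same way as the default one
# (/data/k3s is a hypothetical path)
semanage fcontext -a -e /var/lib/rancher/k3s /data/k3s
restorecon -R -v /data/k3s
```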
---

Documentation for how we'll recommend a k3s install on a selinux-enforcing system is being handled here: #2058. The plan is to support the yum repo installation method.

---
I believe this should be covered in the documentation now.

---
Environmental Info:
K3s Version:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
1 server functioning as master and node
Describe the bug:
When running `k3s server`, all pods will stay in `CrashLoopBackOff`. The error message seems to be related to the fact that no `*.log` file is created. However, the directory does exist.
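A minimal way to reproduce the observation from the bare binary (standard k3s subcommands):

```bash
# Start the server from the manually installed binary
k3s server &

# Watch the pods cycle through CrashLoopBackOff
k3s kubectl get pods --all-namespaces --watch
```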
Steps To Reproduce:
Expected behavior:
I would expect the pods to be in a Running state.
Actual behavior:
The pods are in a `CrashLoopBackOff` state and are constantly restarting.
The log files of the pods specify the following:
After some retries the file is created and no error message is shown anymore; however, the pod is still in `CrashLoopBackOff` status.