
Usernetes doesn't work with Pasta mode of Podman (works fine with slirp4netns mode): Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout #2260

Open
AkihiroSuda opened this issue Dec 2, 2024 · 14 comments

@AkihiroSuda

Usernetes (Kubernetes in Rootless Docker/Podman/nerdctl) works fine with Rootless Podman v5 + slirp4netns.

However, it doesn't seem to work with Pasta:

$ kubectl get -n kube-flannel pods
NAME                    READY   STATUS             RESTARTS      AGE
kube-flannel-ds-pnnrt   0/1     CrashLoopBackOff   7 (58s ago)   15m

: ↑ kubectl can connect to kube-apiserver

$ kubectl logs -n kube-flannel daemonsets/kube-flannel-ds 
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
Error from server: Get "https://192.168.5.15:10250/containerLogs/kube-flannel/kube-flannel-ds-pnnrt/kube-flannel": dial tcp 192.168.5.15:10250: i/o timeout

: ↑ kube-apiserver is failing to connect to kubelet

$ podman exec usernetes_node_1 sh -euxc 'cat /var/log/containers/kube-flannel-ds-*_kube-flannel_kube-flannel-*.log'
+ cat /var/log/containers/kube-flannel-ds-pnnrt_kube-flannel_kube-flannel-81d4059f4344ffb796b1ac0de247cf71d4b5dfc837a03e0307a54103e8e618ed.log
2024-12-02T20:59:22.893780871Z stderr F I1202 20:59:22.892812       1 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
2024-12-02T20:59:22.893836867Z stderr F W1202 20:59:22.893326       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2024-12-02T20:59:52.908415802Z stderr F E1202 20:59:52.908036       1 main.go:229] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-pnnrt': Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout

: ↑ the flannel pod is failing to connect to KUBERNETES_SERVICE_HOST

I haven't figured out whether this is Podman's misconfiguration of pasta, or a bug in pasta itself.
I'm opening an issue here anyway to flag that Podman shouldn't drop support for slirp4netns yet.

Reproduction steps

  • Set network.default_rootless_network_cmd to "pasta" or "slirp4netns"
mkdir -p "$HOME/.config/containers/containers.conf.d"
cat <<EOF >"$HOME/.config/containers/containers.conf.d/network.conf"
[network]
# "pasta" (default since Podman v5) or "slirp4netns"
default_rootless_network_cmd="slirp4netns"
EOF
  • Install Podman, Podman Compose, and misc utilities
sudo dnf install -y podman podman-compose git make jq kubectl
  • Configure cgroup v2 delegation
sudo mkdir -p /etc/systemd/system/user@.service.d
sudo tee /etc/systemd/system/user@.service.d/delegate.conf <<EOF >/dev/null
[Service]
Delegate=cpu cpuset io memory pids
EOF
sudo systemctl daemon-reload
  • Load kernel modules
sudo modprobe br_netfilter
sudo modprobe vxlan
  • Set up a node of Usernetes using Rootless Podman
git clone https://github.com/rootless-containers/usernetes.git
cd usernetes
git checkout gen2-v20241203.0

export CONTAINER_ENGINE=podman
make up
make kubeadm-init
make install-flannel
make kubeconfig
export KUBECONFIG="$(pwd)/kubeconfig"
  • Check the status of kube-flannel-ds

🔴 pasta (CrashLoopBackOff):

$ kubectl get -n kube-flannel pods
NAME                    READY   STATUS             RESTARTS      AGE
kube-flannel-ds-pnnrt   0/1     CrashLoopBackOff   7 (58s ago)   15m

: ↑ kubectl can connect to kube-apiserver

$ kubectl logs -n kube-flannel daemonsets/kube-flannel-ds 
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
Error from server: Get "https://192.168.5.15:10250/containerLogs/kube-flannel/kube-flannel-ds-pnnrt/kube-flannel": dial tcp 192.168.5.15:10250: i/o timeout

: ↑ kube-apiserver is failing to connect to kubelet

$ podman exec usernetes_node_1 sh -euxc 'cat /var/log/containers/kube-flannel-ds-*_kube-flannel_kube-flannel-*.log'
+ cat /var/log/containers/kube-flannel-ds-pnnrt_kube-flannel_kube-flannel-81d4059f4344ffb796b1ac0de247cf71d4b5dfc837a03e0307a54103e8e618ed.log
2024-12-02T20:59:22.893780871Z stderr F I1202 20:59:22.892812       1 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
2024-12-02T20:59:22.893836867Z stderr F W1202 20:59:22.893326       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2024-12-02T20:59:52.908415802Z stderr F E1202 20:59:52.908036       1 main.go:229] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-pnnrt': Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout

: ↑ the flannel pod is failing to connect to KUBERNETES_SERVICE_HOST

🟢 slirp4netns (Running):

$ kubectl get -n kube-flannel pods
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-cbkh4   1/1     Running   0          13m

$ kubectl logs -n kube-flannel daemonsets/kube-flannel-ds
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
I1202 20:54:04.703825       1 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: et
cdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[
] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptables
ResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
W1202 20:54:04.704890       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1202 20:54:04.733564       1 kube.go:139] Waiting 10m0s for node controller to sync
I1202 20:54:04.733734       1 kube.go:469] Starting kube subnet manager
I1202 20:54:05.734597       1 kube.go:146] Node controller sync successful
I1202 20:54:05.734862       1 main.go:232] Created subnet manager: Kubernetes Subnet Manager - u7s-lima-vm1
I1202 20:54:05.734880       1 main.go:235] Installing signal handlers
I1202 20:54:05.735058       1 main.go:469] Found network config - Backend type: vxlan
[...]
I1202 20:54:05.889340       1 main.go:413] Wrote subnet file to /run/flannel/subnet.env
I1202 20:54:05.889892       1 main.go:417] Running backend.
I1202 20:54:05.890676       1 vxlan_network.go:65] watching for new subnet leases
I1202 20:54:05.913585       1 main.go:438] Waiting for all goroutines to exit
I1202 20:54:05.923924       1 iptables.go:372] bootstrap done
I1202 20:54:05.938226       1 iptables.go:372] bootstrap done

Host Environment

podman-5.2.5-1.fc41.x86_64
podman-compose-1.2.0-2.fc41.noarch

passt-0^20240906.g6b38f07-1.fc41.x86_64
passt-selinux-0^20240906.g6b38f07-1.fc41.noarch

libslirp-4.8.0-2.fc41.x86_64
slirp4netns-1.3.1-1.fc41.x86_64
@AkihiroSuda
Author

Found a minimal reproducer that doesn't need Usernetes:

HOST_IP="$(hostname  -I | cut -f1 -d' ')"
podman network create foo
podman run --name foo -d --network foo -p 80:80 docker.io/library/nginx:alpine
podman exec foo wget -O- "http://${HOST_IP}"
# wget hangs with pasta, works with slirp4netns

Perhaps this is already a known issue?
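As a side note on the reproducer's first line: hostname -I prints all of the host's addresses separated by spaces, and cut -f1 -d' ' keeps only the first one. A self-contained sketch with a simulated address list (the real output is machine-specific):

```shell
# Stand-alone sketch of the HOST_IP extraction above. The address list is
# simulated with a fixed string, since real `hostname -I` output varies per
# machine; `cut -f1 -d' '` keeps the first space-separated field.
simulated_hostname_I="192.168.5.15 10.100.122.1"
HOST_IP="$(printf '%s\n' "$simulated_hostname_I" | cut -f1 -d' ')"
echo "$HOST_IP"   # prints 192.168.5.15
```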

@mheon
Member

mheon commented Dec 3, 2024

@Luap99 This doesn't look like the two known issues we were looking at, right? I wasn't aware of any issues with hitting the host by its IP.

@Luap99
Member

Luap99 commented Dec 4, 2024

Found a minimal reproducer that doesn't need Usernetes:

HOST_IP="$(hostname  -I | cut -f1 -d' ')"
podman network create foo
podman run --name foo -d --network foo -p 80:80 docker.io/library/nginx:alpine
podman exec foo wget -O- "http://${HOST_IP}"
# wget hangs with pasta, works with slirp4netns

Perhaps this is already a known issue?

If HOST_IP == the IP that pasta picked for the namespace, then yes: https://blog.podman.io/2024/10/podman-5-3-changes-for-improved-networking-experience-with-pasta/

Routing-wise, that IP will always stay in the namespace, as it is the IP of the pasta interface; you will have to use host.containers.internal to connect to the host.

@Luap99
Member

Luap99 commented Dec 4, 2024

Although I do wonder: this could work, as we create the firewall rules in the rootless-netns, so I would assume the NAT mapping takes care of it. I have to take a closer look at the exact details here.

@Luap99
Member

Luap99 commented Dec 4, 2024

Found a minimal reproducer that doesn't need Usernetes:

HOST_IP="$(hostname  -I | cut -f1 -d' ')"
podman network create foo
podman run --name foo -d --network foo -p 80:80 docker.io/library/nginx:alpine
podman exec foo wget -O- "http://${HOST_IP}"
# wget hangs with pasta, works with slirp4netns

Perhaps this is already a known issue?

These steps work fine for me on f40 with

$ rpm -q podman passt 
podman-5.3.1-1.fc40.x86_64
passt-0^20241030.gee7d0b6-1.fc40.x86_64

@AkihiroSuda
Author

After running sudo dnf upgrade, the podman exec foo wget -O- "http://${HOST_IP}" example now works, but Usernetes still fails as shown in the OP.

podman-5.3.1-1.fc41.x86_64
passt-0^20241127.gc0fbc7e-1.fc41.x86_64

@kwilczynski
Member

@AkihiroSuda are you now on Fedora 41?

@AkihiroSuda
Author

AkihiroSuda commented Dec 5, 2024

@AkihiroSuda are you now on Fedora 41?

Yes
(I used F41 in the OP too. F41 seems to install podman-5.2.5-1.fc41.x86_64 by default until running sudo dnf upgrade)

@kwilczynski
Member

@AkihiroSuda are you now on Fedora 41?

Yes (I used F41 in the OP too. F41 seems to install podman-5.2.5-1.fc41.x86_64 by default until running sudo dnf upgrade)

@AkihiroSuda, the reason I asked is that @sohankunkerkar has been running into similar (network-related) issues, where things also stopped working after an upgrade to Fedora 41.

However, I am talking here about a setup using a local Kubernetes cluster via hack/local-up-cluster.sh, with CRI-O and the simple bridge CNI (latest release of the plugins and older alike), which no longer works after the upgrade. Things still work fine on Fedora 39 and 40.

We also tested containerd with the same local development cluster, which uses the simple bridge CNI; it also does not work under Fedora 41 but works in older releases.

What we are seeing is that the internal in-cluster networking is nonfunctional: nothing can reach any ClusterIPs, etc. External access and the local (to the host) network work fine.

I wonder if the newer kernel, which brings a lot of Netfilter changes (or something done specifically in Fedora), is the culprit. I am not sure yet.

@AkihiroSuda
Author

Unlikely to be a kernel issue, as slirp4netns still works.
The situation is the same on CentOS Stream 9, too.

@kwilczynski
Member

Unlikely to be a kernel issue, as slirp4netns still works. The situation is the same on CentOS Stream 9, too.

It is the same on RHEL 9+ as well, I believe.

I assumed that this might have been the kernel, as the newer GNOME version does not immediately strike me as the culprit. There were also rumours that legacy iptables support is not working correctly. I don't have any concrete data to back this up, though.

I don't know much about how slirp4netns works internally. However, the bridge CNI is so simple that it should still work.

Has anyone tried this on any recent Debian or Ubuntu?

@Luap99
Member

Luap99 commented Dec 5, 2024

Well, the question is what exactly the cluster is doing. If you try to connect to the main host IP (i.e. the IP of the default-route interface, which pasta picks by default), this is not going to work, as that IP is the same inside the namespace and is thus never routed to the host. The only reason the simple podman reproducer shown above can work is that we add the same firewall DNAT rules in the rootless-netns, so it is able to redirect the traffic there. But if you try to reach something that is only listening on the host, it is not going to work.
As mentioned in the blog, you will need to use host.containers.internal as the hostname, which is mapped to a special IP, in order to reach the host.
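To make that concrete with the earlier minimal reproducer (a hedged sketch, not verified against a real pasta setup here): the wget target would change from the host's primary IP to the special hostname. The snippet below only constructs and prints the command that would be run, since executing it requires a rootless Podman host with the foo container from the reproducer.

```shell
# Variant of the earlier reproducer using the supported special hostname
# instead of HOST_IP. We only print the command here; running it needs the
# rootless Podman setup and the "foo" nginx container from the reproducer.
cmd='podman exec foo wget -O- "http://host.containers.internal"'
echo "$cmd"
```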

I don't have time to set up the cluster to check myself right now, but how exactly does the network setup look? What addresses are assigned where, and which connection is failing?

@AkihiroSuda
Author

AkihiroSuda commented Dec 6, 2024

If you try to connect to the main host IP (i.e. the IP of the default-route interface, which pasta picks by default), this is not going to work, as that IP is the same inside the namespace and is thus never routed to the host.

The main host IP (192.168.5.15) is used so that every node in the Kubernetes cluster can connect to the other nodes using the same IPs.

This IP (192.168.5.15) is not the same in the namespace, as a custom network is created:

networks:
  default:
    ipam:
      config:
        # Each of the nodes has to have a different IP.
        # The node IP here is not accessible from other nodes.
        - subnet: ${U7S_NODE_SUBNET}

host.containers.internal

This is resolved to 192.168.5.15.
kube-apiserver is failing to connect to the kubelet via this IP: dial tcp 192.168.5.15:10250: i/o timeout

how exactly does the network setup look like?

The IP of the Podman host: 192.168.5.15

[suda@lima-podman usernetes]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:55:55:4e:07:c5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.5.15/24 brd 192.168.5.255 scope global dynamic noprefixroute eth0
       valid_lft 2482sec preferred_lft 2482sec
    inet6 fe80::5055:55ff:fe4e:7c5/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

The IP of the Podman container: 10.100.122.100 (not 192.168.5.15, as a custom network is created)

[suda@lima-podman usernetes]$ podman network inspect usernetes_default | jq -r .[0].subnets.[0].subnet
10.100.122.0/24

[suda@lima-podman usernetes]$ podman exec usernetes_node_1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 06:9f:05:c6:69:1c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.100.122.100/24 brd 10.100.122.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::283c:67ff:feda:da4e/64 scope link 
       valid_lft forever preferred_lft forever

The IP subnet of the Kubernetes services: 10.96.0.0/16

[suda@lima-podman usernetes]$ cat kubeadm-config.yaml
[...]
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration    
networking:
  serviceSubnet: "10.96.0.0/16"                                                 
  podSubnet: "10.244.0.0/16"
[...]
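For context on the failing address: 10.96.0.1 is the first address of the serviceSubnet above, i.e. the in-cluster ClusterIP of the kubernetes.default Service that fronts kube-apiserver. A stand-alone check (plain shell arithmetic, no cluster required) confirming that the unreachable 10.96.0.1 falls inside 10.96.0.0/16:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

addr=$(ip_to_int 10.96.0.1)   # the address the flannel pod cannot reach
net=$(ip_to_int 10.96.0.0)    # serviceSubnet base from kubeadm-config.yaml
prefix=16
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))

if [ $(( addr & mask )) -eq $(( net & mask )) ]; then
  echo "10.96.0.1 is inside 10.96.0.0/16 (the ClusterIP service range)"
fi
```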

What addresses are assigned where and then which connection is failing?

(duplicated from the OP)

$ kubectl get -n kube-flannel pods
NAME                    READY   STATUS             RESTARTS      AGE
kube-flannel-ds-pnnrt   0/1     CrashLoopBackOff   7 (58s ago)   15m

: ↑ kubectl can connect to kube-apiserver

$ kubectl logs -n kube-flannel daemonsets/kube-flannel-ds 
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
Error from server: Get "https://192.168.5.15:10250/containerLogs/kube-flannel/kube-flannel-ds-pnnrt/kube-flannel": dial tcp 192.168.5.15:10250: i/o timeout

: ↑ kube-apiserver is failing to connect to kubelet

$ podman exec usernetes_node_1 sh -euxc 'cat /var/log/containers/kube-flannel-ds-*_kube-flannel_kube-flannel-*.log'
+ cat /var/log/containers/kube-flannel-ds-pnnrt_kube-flannel_kube-flannel-81d4059f4344ffb796b1ac0de247cf71d4b5dfc837a03e0307a54103e8e618ed.log
2024-12-02T20:59:22.893780871Z stderr F I1202 20:59:22.892812       1 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
2024-12-02T20:59:22.893836867Z stderr F W1202 20:59:22.893326       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2024-12-02T20:59:52.908415802Z stderr F E1202 20:59:52.908036       1 main.go:229] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-pnnrt': Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout

: ↑ the flannel pod is failing to connect to KUBERNETES_SERVICE_HOST

@Luap99
Member

Luap99 commented Dec 6, 2024

The pasta/slirp4netns IP would be visible in podman unshare --rootless-netns ip a.
