Usernetes doesn't work with Pasta mode of Podman (works fine with slirp4netns mode): Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout
#2260
Comments
Found a minimal reproducer that doesn't need Usernetes:
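(The original reproducer snippet did not survive extraction; the following is a hypothetical sketch of the same failure mode, assuming Podman ≥ 5 with pasta as the default rootless network. The image names and the HOST_IP extraction are illustrative only.)

```bash
# Hypothetical sketch, not the original reproducer.
# HOST_IP is the address of the host's default-route interface, which pasta
# also assigns inside the namespace (field position may vary; adjust as needed).
HOST_IP=$(ip -4 route get 1.1.1.1 | awk '{print $7; exit}')

# Publish a trivial TCP service on the host via a rootless container...
podman run -d --name web -p 8080:80 docker.io/library/nginx:alpine

# ...then try to reach it from another rootless container via HOST_IP.
# Reportedly this times out under pasta but prints the page under slirp4netns.
podman run --rm docker.io/library/alpine \
    wget -qO- -T 5 "http://${HOST_IP}:8080" || echo "timed out"
```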
Perhaps this is already a known issue? |
@Luap99 This doesn't look like the two known issues we were looking at, right? I wasn't aware of any issues with hitting the host by its IP. |
If HOST_IP == the IP that pasta picked for the namespace, then yes. https://blog.podman.io/2024/10/podman-5-3-changes-for-improved-networking-experience-with-pasta/ Routing-wise, that IP will always stay in the namespace, as it is the IP of the pasta interface; you will have to use host.containers.internal to connect to the host. |
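(A quick way to observe this, assuming Podman ≥ 5 with pasta as the default rootless network; the image name is just an example:)

```bash
# The namespace's eth0 carries the same address as the host's
# default-route interface, so that address terminates in the namespace:
podman run --rm docker.io/library/alpine ip -4 addr show eth0

# Podman adds host.containers.internal to the container's /etc/hosts;
# use that name (not the host IP) to reach services on the host:
podman run --rm docker.io/library/alpine getent hosts host.containers.internal
```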
Although I do wonder: this could work, as we create the firewall rules in the rootless-netns, so I would assume the NAT mapping takes care of it. I have to take a closer look at the exact details here. |
These steps work fine for me on f40 with
|
After running
|
@AkihiroSuda are you now on Fedora 41? |
Yes |
@AkihiroSuda, the reason I asked is that @sohankunkerkar has been running into similar (or at least network-related) issues, where things have also stopped working after an upgrade to Fedora 41. However, I am talking here about a set-up with a local Kubernetes cluster via hack/local-up-cluster.sh: with CRI-O and the simple bridge CNI (latest release of the plugins or older alike), it no longer works after the upgrade, while things still work fine on Fedora 39 and 40. We also tested containerd with the same local development cluster, which uses the simple bridge CNI, and it also does not work under Fedora 41 but works in older releases. What we are seeing is that the internal in-cluster networking is nonfunctional: nothing can reach any ClusterIPs etc. External access and local (to the host) networking work fine. I wonder if the newer kernel, which brings a lot of Netfilter changes (or something done specifically in Fedora), is the culprit. I am not sure yet. |
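(A quick way to confirm that specific symptom, with a hypothetical pod name: in-cluster DNS itself sits behind a ClusterIP, so this times out when ClusterIP routing is broken even while external access works.)

```bash
# kube-dns is normally reachable at a ClusterIP (e.g. 10.96.0.10), so a
# lookup from inside the cluster exercises the ClusterIP path end to end:
kubectl run nettest --rm -it --image=docker.io/library/busybox \
    --restart=Never -- nslookup kubernetes.default.svc.cluster.local
```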
Unlikely to be a kernel issue, as slirp4netns still works. |
Also on RHEL 9+, I believe. I assumed that this might be the kernel, as the newer GNOME version does not immediately strike me as the culprit. There were also rumours that legacy iptables support is not working correctly; I don't have any concrete data to back this up, though. I don't know much about how slirp4netns works internally. However, the bridge CNI is so simple that it should still work. Has anyone tried this on any recent Debian or Ubuntu? |
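(One way to check the legacy-iptables theory on an affected host; the version string in the comment is only an example:)

```bash
# Report which backend the iptables binary uses: "(nf_tables)" means the
# nft shim, "(legacy)" means the old x_tables backend.
iptables --version          # e.g. "iptables v1.8.10 (nf_tables)"

# If both binaries exist, compare where the rules actually land:
iptables-legacy -t nat -S | head
iptables-nft -t nat -S | head
```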
Well, the question is what exactly the cluster is doing. If you try to connect to the main host IP (i.e. the IP of the default-route interface, which is picked by pasta as the default), then this is not going to work, as the IP is the same in the namespace and thus never routed to the host. The only reason this can work with the simple podman reproducer shown is that we add the same firewall DNAT rules in the rootless-netns, so it is able to redirect the traffic there. But if you try to reach something only listening on the host, it is not going to work. I don't have time to set up the cluster to check myself right now, but how exactly does the network setup look? What addresses are assigned where, and which connection is failing? |
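(For reference, the DNAT rules mentioned live in Podman's rootless network namespace and can be inspected directly; whether netavark wrote nftables or iptables rules depends on the configured firewall driver, and the respective tool must be installed on the host:)

```bash
# The NAT rules netavark created for published ports, as seen from inside
# the rootless network namespace:
podman unshare --rootless-netns nft list ruleset    # nftables driver
podman unshare --rootless-netns iptables -t nat -S  # iptables driver
```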
The main host IP (192.168.5.15) is used so that every node in the Kubernetes cluster can connect to the other nodes with the same IPs. This IP (192.168.5.15) is not the same in the namespace, as a custom network is created:

```yaml
networks:
  default:
    ipam:
      config:
        # Each of the nodes has to have a different IP.
        # The node IP here is not accessible from other nodes.
        - subnet: ${U7S_NODE_SUBNET}
```

This is resolved to 192.168.5.15.
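(Even with this custom network, the pasta-managed rootless network namespace should still own the host IP, which can be confirmed directly:)

```bash
# The containers themselves get 10.100.122.x, but 192.168.5.15 is also
# configured on the pasta interface inside the rootless-netns, so
# connections to it from the containers terminate there, not on the host:
podman unshare --rootless-netns ip -4 addr show
```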
The IP of the Podman host: 192.168.5.15

```console
[suda@lima-podman usernetes]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:55:55:4e:07:c5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.5.15/24 brd 192.168.5.255 scope global dynamic noprefixroute eth0
       valid_lft 2482sec preferred_lft 2482sec
    inet6 fe80::5055:55ff:fe4e:7c5/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
```

The IP of the Podman container: 10.100.122.100 (not 192.168.5.15, as a custom network is created)

```console
[suda@lima-podman usernetes]$ podman network inspect usernetes_default | jq -r .[0].subnets.[0].subnet
10.100.122.0/24
[suda@lima-podman usernetes]$ podman exec usernetes_node_1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 06:9f:05:c6:69:1c brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.100.122.100/24 brd 10.100.122.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::283c:67ff:feda:da4e/64 scope link
       valid_lft forever preferred_lft forever
```

The IP subnet of the Kubernetes services: 10.96.0.0/16

```console
[suda@lima-podman usernetes]$ cat kubeadm-config.yaml
[...]
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "10.244.0.0/16"
[...]
```
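(For context: 10.96.0.1, the address the flannel pod fails to reach in the logs below, is the kubernetes.default ClusterIP, conventionally the first address of serviceSubnet. It can be confirmed with:)

```bash
# The in-cluster API endpoint that pods use via KUBERNETES_SERVICE_HOST:
kubectl get svc kubernetes -o jsonpath='{.spec.clusterIP}'   # 10.96.0.1
```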
(duplicated from the OP)

```console
$ kubectl get -n kube-flannel pods
NAME                    READY   STATUS             RESTARTS      AGE
kube-flannel-ds-pnnrt   0/1     CrashLoopBackOff   7 (58s ago)   15m
: ↑ kubectl can connect to kube-apiserver

$ kubectl logs -n kube-flannel daemonsets/kube-flannel-ds
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
Error from server: Get "https://192.168.5.15:10250/containerLogs/kube-flannel/kube-flannel-ds-pnnrt/kube-flannel": dial tcp 192.168.5.15:10250: i/o timeout
: ↑ kube-apiserver is failing to connect to kubelet

$ podman exec usernetes_node_1 sh -euxc 'cat /var/log/containers/kube-flannel-ds-*_kube-flannel_kube-flannel-*.log'
+ cat /var/log/containers/kube-flannel-ds-pnnrt_kube-flannel_kube-flannel-81d4059f4344ffb796b1ac0de247cf71d4b5dfc837a03e0307a54103e8e618ed.log
2024-12-02T20:59:22.893780871Z stderr F I1202 20:59:22.892812 1 main.go:212] CLI flags config: {etcdEndpoints:http://127.0.0.1:4001,http://127.0.0.1:2379 etcdPrefix:/coreos.com/network etcdKeyfile: etcdCertfile: etcdCAFile: etcdUsername: etcdPassword: version:false kubeSubnetMgr:true kubeApiUrl: kubeAnnotationPrefix:flannel.alpha.coreos.com kubeConfigFile: iface:[] ifaceRegex:[] ipMasq:true ifaceCanReach: subnetFile:/run/flannel/subnet.env publicIP: publicIPv6: subnetLeaseRenewMargin:60 healthzIP:0.0.0.0 healthzPort:0 iptablesResyncSeconds:5 iptablesForwardRules:true netConfPath:/etc/kube-flannel/net-conf.json setNodeNetworkUnavailable:true}
2024-12-02T20:59:22.893836867Z stderr F W1202 20:59:22.893326 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
2024-12-02T20:59:52.908415802Z stderr F E1202 20:59:52.908036 1 main.go:229] Failed to create SubnetManager: error retrieving pod spec for 'kube-flannel/kube-flannel-ds-pnnrt': Get "https://10.96.0.1:443/api/v1/namespaces/kube-flannel/pods/kube-flannel-ds-pnnrt": dial tcp 10.96.0.1:443: i/o timeout
: ↑ the flannel pod is failing to connect to KUBERNETES_SERVICE_HOST
```
|
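(The same failure can be reproduced without waiting for flannel to crash, assuming nc and timeout are available in the node image:)

```bash
# From inside the node container, the API server's ClusterIP should accept
# a TCP connection; under pasta this reportedly hangs until the timeout:
podman exec usernetes_node_1 sh -c \
    'timeout 5 nc -z 10.96.0.1 443 && echo reachable || echo timeout'
```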
The pasta/slirp4netns IP would be visible in |
Usernetes (Kubernetes in Rootless Docker/Podman/nerdctl) works fine with Rootless Podman v5 + slirp4netns.
However, it doesn't seem to work with Pasta:
I haven't figured out whether this is a Podman misconfiguration of pasta or a bug in pasta itself.
I'm opening an issue here anyway to note that Podman shouldn't drop support for slirp4netns yet.
Reproduction steps
- Set network.default_rootless_network_cmd to "pasta" or "slirp4netns" (see the containers.conf sketch below)
- Check the status of the kube-flannel-ds pod
🔴 pasta (CrashLoopBackOff):
🟢 slirp4netns (Running):
Host Environment