[CALICO] Wrong interface in ip route #3772

Open
antoinetran opened this issue Jan 13, 2025 · 2 comments

Comments

@antoinetran

What is wrong
DNS lookups from inside pods fail intermittently. For example:

nslookup kubernetes.default.svc

RKE version:
1.4.8
Docker version: (docker version,docker info preferred)

Operating system and kernel: (cat /etc/os-release, uname -r preferred)
Almalinux 8
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
Cloud OVH
cluster.yml file:

Steps to Reproduce:

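  • Deploy Rancher with the default CNI, Calico, using this network section in cluster.yml:
        network:
          plugin: calico
          mtu: 1450
  • Force the network interface used by calico-node, then restart the calico-node pods: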
        kubectl --kubeconfig ~/kube_config_rancher-cluster.yaml -n kube-system set env daemonset/calico-node IP_AUTODETECTION_METHOD=interface={{ network_calico_autodetect }}
        kubectl --kubeconfig ~/kube_config_rancher-cluster.yaml -n kube-system delete pod -l k8s-app=calico-node

Results:
The calico-node pods are all redeployed with the correct network interface. However, some routes for the Kubernetes pod CIDRs are still created on the wrong interface, while others use the correct one.

Workaround

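# Delete pod-CIDR routes (10.41.x.x here) that do not go through the expected network (192.168.21.x here).
while read -r line ; do
  echo "Deleting $line"
  # $line is left unquoted on purpose so that it splits into separate "ip route" arguments.
  sudo ip route delete $line
done < <(ip route | grep -v blackhole | grep -F "10.41." | grep -vF "192.168.21")
# Check: list any pod-CIDR routes that still bypass the expected network.
ip route | grep -F "10.41." | grep -vF "192.168.21"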

Immediately afterwards, the calico-node pods recreate the routes on the correct network interface.
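
To review which routes the workaround would delete before touching the routing table, a dry-run filter can help. This is only a sketch: `stale_routes` is a hypothetical helper name, and the prefix "10.41." and the network "192.168.21" are the values from my cluster, so adapt them to yours. It reads `ip route` output on stdin and prints the non-blackhole routes whose destination starts with the pod prefix but which do not mention the expected network.

```shell
# Hypothetical helper (not part of Calico or RKE): dry-run version of the workaround.
# $1 = pod CIDR prefix, e.g. "10.41."
# $2 = string identifying the expected network, e.g. "192.168.21"
stale_routes() {
  awk -v prefix="$1" -v expected="$2" '
    # Blackhole routes are left alone, as in the workaround loop.
    $1 == "blackhole" { next }
    # Keep routes whose destination starts with the pod prefix
    # and whose line does not contain the expected network.
    index($1, prefix) == 1 && index($0, expected) == 0 { print }
  '
}
```

Usage: `ip route | stale_routes "10.41." "192.168.21"`. Unlike the grep pipeline, this matches the prefix only against the destination field, so a route that merely has a 10.41.x.x next hop is not listed.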

@antoinetran
Author

My analysis is that the calico-node pods are first created with the default autodetection method. I don't know why the first interface found is not the same on every node, but calico-node then has enough time to create IP routes on the wrong interface. After the correct interface is forced through the environment variable (see above), calico-node does not overwrite the existing routes, even though their interface differs from the newly detected one.

Proper solution: provide a way in rancher-cluster.yaml to configure Calico's interface autodetection method from the start. A good configuration value would be "kubernetes-internal-ip" (https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection#autodetection-methods).

I looked at the RKE1 code for the Calico CNI: there is currently no way to configure the network interface.

@antoinetran
Author

Related to #711
