GCP ILB support / support scope local routes to be configured #4109
Comments
Azure and Oracle internal load balancers use DSR too. One option to solve it is to assign the VIP address on all control plane nodes manually (IP alias). For Talos to do this automatically, it would need to perform the equivalent of:

    ifconfig lo:0 inet VIP netmask 255.255.255.255 up
    echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce

with a machine config parameter like:

    network:
      interfaces:
        - interface: eth0
          vip:
            dsr: true
            ip: <IP>

All control plane nodes would then hold the VIP address at the same time, and the external LB would work fine (after health checks, of course).
Priority of the issue is contingent on customer negotiations.
Feature Request
Talos should mimic the google-agent (watch the metadata for forwardedIPs and add them to the routing table with scope local) in order to support ILB, at least for the Kubernetes API endpoint.
The Problem
The only way in GCP to create a load balancer without a public internet address is to use an Internal Load Balancer (ILB). An ILB does not proxy TCP connections; it forwards the packet directly to the backend, unmodified, and expects the backend to accept the packet and use direct server return (DSR).
This creates a few problems for Talos or any other VM which has no google-agent running.
The Solution
The load balancer IPs whose packets the ILB can deliver to the Talos node are listed in the GCE metadata (forwardedIPs). Talos could watch this field and add or remove the corresponding routes on the interface with scope local.
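To make the watch-and-reconcile idea concrete, here is a minimal Go sketch of the logic part only. It assumes the metadata server returns the forwarded IPs as a JSON array of strings (my understanding is that a GET on `http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/forwarded-ips/?recursive=true` with the `Metadata-Flavor: Google` header does this, but treat that as an assumption). The function names `parseForwardedIPs` and `diffRoutes` are hypothetical and are not taken from Talos or from the linked commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/netip"
)

// parseForwardedIPs decodes a JSON array of addresses (for example
// ["10.0.0.5"]) into prefixes. A bare address becomes a host prefix
// (/32 for IPv4, /128 for IPv6). The response shape is an assumption.
func parseForwardedIPs(body []byte) ([]netip.Prefix, error) {
	var raw []string
	if err := json.Unmarshal(body, &raw); err != nil {
		return nil, fmt.Errorf("decode forwarded IPs: %w", err)
	}
	out := make([]netip.Prefix, 0, len(raw))
	for _, s := range raw {
		if p, err := netip.ParsePrefix(s); err == nil {
			out = append(out, p.Masked())
			continue
		}
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", s, err)
		}
		out = append(out, netip.PrefixFrom(addr, addr.BitLen()))
	}
	return out, nil
}

// diffRoutes returns the prefixes to add and the prefixes to remove so
// that the installed set ends up equal to the desired set.
func diffRoutes(installed, desired []netip.Prefix) (add, remove []netip.Prefix) {
	have := make(map[netip.Prefix]bool, len(installed))
	for _, p := range installed {
		have[p] = true
	}
	want := make(map[netip.Prefix]bool, len(desired))
	for _, p := range desired {
		if !have[p] && !want[p] {
			add = append(add, p)
		}
		want[p] = true
	}
	for _, p := range installed {
		if !want[p] {
			remove = append(remove, p)
		}
	}
	return add, remove
}

func main() {
	// Sample metadata response; a real watcher would fetch this over HTTP.
	desired, err := parseForwardedIPs([]byte(`["10.0.0.5", "10.0.0.9"]`))
	if err != nil {
		panic(err)
	}
	installed := []netip.Prefix{
		netip.MustParsePrefix("10.0.0.9/32"),
		netip.MustParsePrefix("10.0.0.7/32"),
	}
	add, remove := diffRoutes(installed, desired)
	fmt.Println("add:", add, "remove:", remove)
}
```

A watcher would call this on every metadata change and hand the two slices to whatever installs and deletes the scope local routes. Keeping the diff separate from the HTTP fetch and the route programming makes it easy to test without a GCE instance.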
Conditions
This solution should only be active when:
Tried workarounds
After discussing this with @rsmitty and @smira in the #support slack channel we came up with a workaround using:
This works with non-strict firewall rules. However, it creates asymmetric traffic, as the return traffic has the DHCP address as its source IP instead of the load balancer address. (This currently blocks my whole project from going forward.)
Using the Google cloud controller manager to create a LoadBalancer service in Kubernetes, and thereby an ILB for the Kubernetes API, only addresses the source IP problem: the CNI (Cilium in my case) now takes care of the source IP, so the packet is accepted by the node. However, because kube-apiserver runs on the host network, return traffic does not flow back through the CNI, and the source port and source IP are not rewritten. This again creates asymmetric traffic, because the source IP and the source port (6443 instead of the NodePort used by the GCE cloud controller manager) cannot be matched by the stateful firewall.
Another solution would be to create a YAML patch with the route, just like the one the google-agent creates dynamically. However, there is currently no way to create a route with scope local in Talos. (If there were, it would really help me out here.)
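For reference, this is roughly the route in question, sketched as a small Go helper that only builds the `ip route` arguments and does not execute anything. As far as I recall, the google-agent installs a route of type local with scope host and protocol number 66 for each forwarded IP; treat the exact argument order and the protocol number as assumptions. The helper names `localRouteArgs` and `formatCommand` are hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// localRouteArgs builds the iproute2 arguments for adding or deleting a
// local route for cidr on dev. The "proto 66" tag is an assumption about
// what the google-agent uses to mark the routes it manages.
func localRouteArgs(action, cidr, dev string) ([]string, error) {
	if action != "add" && action != "del" {
		return nil, fmt.Errorf("unsupported action %q", action)
	}
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, fmt.Errorf("parse %q: %w", cidr, err)
	}
	return []string{
		"route", action, "to", "local", prefix.Masked().String(),
		"scope", "host", "dev", dev, "proto", "66",
	}, nil
}

// formatCommand renders the arguments as a shell-style command line.
func formatCommand(args []string) string {
	return "ip " + strings.Join(args, " ")
}

func main() {
	args, err := localRouteArgs("add", "10.0.0.5/32", "eth0")
	if err != nil {
		panic(err)
	}
	fmt.Println(formatCommand(args))
}
```

In Talos itself this would presumably go through the netlink-based route handling rather than the `ip` binary; the sketch is only meant to pin down which route attributes (type local, scope host) the feature needs to be able to express.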
Example code
To show how much I need this feature, I started writing some code. Although I can write simple Go code, I do not feel comfortable adding the hooks where they are needed.
It also needs scope local route support in the Talos route functions, but it does provide all the metadata and logic parts. See nberlee@d5f6cad
However, feel free to rewrite it completely.