Standalone Egress NAT #299
@p-strusiewiczsurmacki-mobica
Thank you for checking it. It would be better if we could run e2e tests for standalone egress mode.
Hi, @p-strusiewiczsurmacki-mobica! Thank you for waiting. The main changes look good to me.
66abdf2 to df10c75
I agree this would be great. However, I need to think about how this test should be done. The problem with kindnet is that it requires a different CNI config on each node, so I would need to add a script that gets the node subnet from the kind cluster's nodes after the cluster is created and uses that data to update the CNI config after
Yes, it isn't easy to configure the cluster to use kindnet and coil. OK, for this PR, it's fine to have no e2e test for egress-only mode.
@terassyi I've added egress-only e2e tests to
59227cc to 2eb29ee
2c074c1 to 8958f4b
@terassyi Just one important thing - it turned out that
@p-strusiewiczsurmacki-mobica
In this PR, it's ok to turn on When I tried to turn out But, I think
Could you update the
@terassyi Done. :)
It seems that controller-gen only accepts directories for manifest generation, so I had to add a workaround for that (copying
When I try to run the e2e test, it fails with the following error.
I ran the following commands:
$ make -C .. manifests
$ WITH_KINDNET=true TEST_IPV6=false make start
$ make install-coil-egress-v4
I believe you're missing a file So, you'd have to run
Thanks! It seems the procedures for running e2e tests are getting complex, so could you update
It seems the webhook-related tests in the small test fail.
Lastly, I'd like you to update CI to run e2e tests for all combinations, such as egress-only, egress-only with IPv6, egress+IPAM, etc. When you want to run CI, please mention me.
a058344 to 9ed43c5
@terassyi The small test should be fixed and actions were added to the CI. Should I squash the commits before it's merged?
Thanks! It passes all CIs.
Yes, please.
9ed43c5 to 09d9a0e
@terassyi It's squashed now :)
Thanks! I found that the small test seems to be flaky.
b065851 to 7483dee
@terassyi I've fixed the warnings and added some fixes for the CI jobs.
But I believe that this might be out of our reach here, as it seems to be a somewhat known issue with controller-runtime's cache. I've tried disabling the cache altogether, but for some reason it did not work for me. I believe those were also not introduced by my changes, as I can see them in other CI runs, e.g.: https://github.com/cybozu-go/coil/actions/runs/10804727159/job/29970647406 If you see anything else I could fix, just let me know.
Hi @terassyi,
Hi, I'm checking this so that it can be merged.
I re-reviewed all the changes, and I want to discuss the need for I think it's enough to handle these flags in coild. Currently, we check the features we want to use through values given by the coil CNI plugin. The only concern I have now is the following code. But we can solve this by moving this error handling to coild. If we can keep the CNI configuration simple and unchanged, it's better for all users.
@terassyi
v2/runners/coild_server.go
Outdated
@@ -88,7 +92,8 @@ func (n natSetup) Hook(l []GWNets, log *zap.Logger) func(ipv4, ipv6 net.IP) erro
 }

 // NewCoildServer returns an implementation of cnirpc.CNIServer for coild.
-func NewCoildServer(l net.Listener, mgr manager.Manager, nodeIPAM ipam.NodeIPAM, podNet nodenet.PodNetwork, setup NATSetup, logger *zap.Logger) manager.Runnable {
+func NewCoildServer(l net.Listener, mgr manager.Manager, nodeIPAM ipam.NodeIPAM, podNet nodenet.PodNetwork, setup NATSetup, cfg *config.Config, logger *zap.Logger,
+	aliasFunc func(interfaces map[string]bool, conf *nodenet.PodNetConf, logger *zap.Logger, pod *corev1.Pod) error) manager.Runnable {
Why did you add the callback function to add the interface alias?
So, in egress-only mode we need to set the interface alias on the Add call. However, the function that sets the alias depends on the existence of the pod's veth pair, which is not created for simple-test. I've added the callback function so that a nil-returning function can be passed here, which makes it possible to run simple-test for egress-only mode.
At least that's the best I was able to come up with.
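The callback approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the `server`, `podNetConf`, `realAlias`, and `noopAlias` names are hypothetical stand-ins, and the real `aliasFunc` takes more parameters (see the diff above).

```go
package main

import (
	"errors"
	"fmt"
)

// podNetConf is a hypothetical stand-in for the pod network configuration.
type podNetConf struct {
	PodUID string
}

// aliasFunc is the injected callback that sets the interface alias.
type aliasFunc func(conf *podNetConf) error

// server is a hypothetical stand-in for the coild CNI server.
type server struct {
	egressOnly bool
	setAlias   aliasFunc
}

// Add mimics the Add call: in egress-only mode it invokes the injected callback.
func (s *server) Add(conf *podNetConf) error {
	if !s.egressOnly {
		return nil
	}
	if err := s.setAlias(conf); err != nil {
		return fmt.Errorf("failed to set alias: %w", err)
	}
	return nil
}

// realAlias stands in for the production callback; here it always fails,
// simulating an environment where the pod's veth pair does not exist.
func realAlias(conf *podNetConf) error {
	return errors.New("veth pair not found")
}

// noopAlias is the nil-returning function a test can inject instead.
func noopAlias(conf *podNetConf) error { return nil }

func main() {
	conf := &podNetConf{PodUID: "example-uid"}
	prod := &server{egressOnly: true, setAlias: realAlias}
	test := &server{egressOnly: true, setAlias: noopAlias}
	fmt.Println("with real callback:", prod.Add(conf))
	fmt.Println("with no-op callback:", test.Add(conf))
}
```

Injecting the function keeps the server code identical in both environments; only the caller decides whether the alias step does real work.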
Thanks.
It's ok to me!
@p-strusiewiczsurmacki-mobica I added some comments :)
dbaeacb to a5be1f8
Hi @terassyi - sorry you had to wait so long. I was on leave for some time. Anyway - I've fixed the issues you found and added an answer to your question regarding the callback function. I'll test the changes tomorrow, but can I ask you to trigger the test workflow anyway?
Hi, @p-strusiewiczsurmacki-mobica I approved the CI run 👍
Well, I reworked those tests a bit. Let's see now. EDIT: It works on my fork: https://github.com/p-strusiewiczsurmacki-mobica/coil/actions/runs/12769527188
a5be1f8 to 2f91b00
Thank you for updating!
I reviewed all the changes and added some small comments.
Once these are addressed, we can merge this!
Signed-off-by: Patryk Strusiewicz-Surmacki <[email protected]> Co-authored-by: Tomoya Terashima <[email protected]>
3d7c727 to edae8ae
@p-strusiewiczsurmacki-mobica Before merging, I'm testing compatibility. We also have to consider how to migrate to the separated controllers. I'm not sure yet whether I'll work on that in another PR.
This PR introduces Standalone Egress NAT as discussed in #274.
- coil-controller is now divided into coil-ipam-controller and coil-egress-controller.
- coild now has configuration flags to disable/enable egress and/or IPAM features.
- coil now has capabilities fields that can be used to disable/enable IPAM/Egress.
- setup.md.
- veth aliases will now use the pod's UUID instead of the container's ID (I couldn't get the e2e test working using the container's ID; I believe the container ID changes during pod restart, but as IPAM is disabled it is not updated as required).

PR was tested using egress-related E2E tests with both Kindnet and Calico, and the tests provided in the repository passed.
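As a rough illustration of the enable/disable flags mentioned in the description, the sketch below parses two boolean feature flags with Go's standard `flag` package. The flag names `enable-ipam` and `enable-egress`, the `features` type, and the rule that at least one feature must stay enabled are assumptions made for this example; they are not taken from the PR and may differ from coild's actual flags.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// features records which parts of the daemon are switched on.
type features struct {
	IPAM   bool
	Egress bool
}

// parseFeatures parses hypothetical feature flags from args.
// Both features default to enabled; disabling both is rejected
// (an assumption of this sketch, not a documented coild rule).
func parseFeatures(args []string) (features, error) {
	var f features
	fs := flag.NewFlagSet("coild-sketch", flag.ContinueOnError)
	fs.BoolVar(&f.IPAM, "enable-ipam", true, "enable IPAM (hypothetical flag name)")
	fs.BoolVar(&f.Egress, "enable-egress", true, "enable egress NAT (hypothetical flag name)")
	if err := fs.Parse(args); err != nil {
		return features{}, err
	}
	if !f.IPAM && !f.Egress {
		return features{}, errors.New("at least one of IPAM or egress must be enabled")
	}
	return f, nil
}

func main() {
	// Standalone egress: IPAM off, egress left at its default (on).
	f, err := parseFeatures([]string{"-enable-ipam=false"})
	fmt.Printf("ipam=%t egress=%t err=%v\n", f.IPAM, f.Egress, err)
}
```

Validating the combination at startup means a misconfiguration surfaces once, in the daemon, rather than on every CNI call.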