fix: reuse existing client endpoint when configuring server interface #3157
base: master
Conversation
Hi @xDev789. Thanks for your PR! I am @adamjensenbot.
Make sure this PR appears in the liqo changelog, adding one of the following labels:
/rebase
Force-pushed 312ef30 to b76e8e0
/rebase test=true
Hi @xDev789, I've checked your PR and something is not clear to me. You are forcing the wg interface to always use the same peer endpoint, but what happens if it changes? I tested it with Cilium and noticed that this field is populated with an IP which is the one assigned to the
/rebase test=true
Reuse existing client endpoint when configuring server interface.
b76e8e0 to
1800b80
Compare
I tried moving pods from one node to another, keeping the wrong IP in the server, and it seems that everything is working. It seems that the peer's endpoint is set in server mode but its value is ignored. I need to run some additional tests.
Hi @cheina97! Thank you for reviewing my PR. You are right to be sceptical about reusing the same peer endpoint, but it is only explicitly set when the configureDevice function is called (i.e. when the PublicKey custom resource gets reconciled). If the peer endpoint changes, WireGuard will update it through its automatic endpoint discovery (roaming). This is the same mechanism Liqo currently relies on, except that without this PR, reconciling the PublicKey CR interrupts the connection because the discovered peer endpoint gets erased. As for testing, I've tested it on two Kind clusters with two nodes each and it worked as expected. We also use the stable version of Liqo with these patches applied to establish tunnels between the control and worker clusters.
I've tried it, and it seems that WireGuard was not able to automatically update the IP, since it is set forcefully by the controller. Without the explicit setup, that IP is updated correctly. I'm not against this change, but I would prefer to wait a little and test it properly. Can you share all the scenarios you have tried? I would like to know which provider you tried and which CNI you were using (also with the Kind clusters).
I totally agree with you; I strongly believe in shipping a quality product and I'm not trying to rush this change either. We use RKE2 with Cilium in kube-proxy replacement mode for both the server and client clusters. With the Kind clusters, I also used Cilium. I've tried rescheduling the gateway client pod on another node. I've also tried the HA scenario where another pod obtains the lease and becomes the active gateway. Both scenarios resulted in the connection being reestablished as expected. Could you share more details on the problematic case? You said it worked initially but some other tests helped you spot an issue.
Thanks for your help; we just need to test it properly with other CNIs. I just need some time to test it and to process
Reuse existing client endpoint when configuring server interface.
Description
The WireGuard container now preserves the peer's endpoint when configuring the server interface, which allows for uninterrupted connectivity in the event of a PublicKey resource reconciliation.
Fixes #3090
How Has This Been Tested?