Conversation


@xDev789 xDev789 commented Nov 28, 2025

Reuse existing client endpoint when configuring server interface.

Description

The WireGuard container now preserves the peer's endpoint when configuring the server interface, which allows for uninterrupted connectivity in the event of a PublicKey resource reconciliation (see the sketch below).

Fixes #3090
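
A minimal sketch of the idea, assuming the wgctrl library (golang.zx2c4.com/wgctrl) to drive WireGuard from Go; the package, function names, and the ReplacePeers-based flow are illustrative, not the exact Liqo code path:

```go
package wgserver

import (
	"net"

	"golang.zx2c4.com/wgctrl"
	"golang.zx2c4.com/wgctrl/wgtypes"
)

// currentEndpoint returns the endpoint WireGuard has already discovered
// for the given peer on iface, or nil if the peer is not present yet.
func currentEndpoint(c *wgctrl.Client, iface string, peer wgtypes.Key) *net.UDPAddr {
	dev, err := c.Device(iface)
	if err != nil {
		return nil
	}
	for _, p := range dev.Peers {
		if p.PublicKey == peer {
			return p.Endpoint
		}
	}
	return nil
}

// configurePeer re-applies the peer configuration (e.g. on a PublicKey
// reconciliation) while carrying over the previously discovered endpoint,
// so the tunnel is not interrupted by the update.
func configurePeer(c *wgctrl.Client, iface string, peer wgtypes.Key, allowedIPs []net.IPNet) error {
	return c.ConfigureDevice(iface, wgtypes.Config{
		// Replacing peers would otherwise drop the endpoint the kernel
		// has learned from the client's packets.
		ReplacePeers: true,
		Peers: []wgtypes.PeerConfig{{
			PublicKey:  peer,
			Endpoint:   currentEndpoint(c, iface, peer), // reuse the discovered endpoint
			AllowedIPs: allowedIPs,
		}},
	})
}
```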

How Has This Been Tested?

  • make unit
  • liqoctl install on Kind clusters

@adamjensenbot
Collaborator

Hi @xDev789. Thanks for your PR!

I am @adamjensenbot.
You can interact with me by issuing a slash command in the first line of a comment.
Currently, I understand the following commands:

  • /rebase: Rebase this PR onto the master branch (you can add the option test=true to launch the tests when the rebase operation is completed)
  • /merge: Merge this PR into the master branch
  • /build: Build Liqo components
  • /test: Launch the E2E and Unit tests
  • /hold, /unhold: Add/remove the hold label to prevent merging with /merge

Make sure this PR appears in the liqo changelog by adding one of the following labels:

  • feat: 🚀 New Feature
  • fix: 🐛 Bug Fix
  • refactor: 🧹 Code Refactoring
  • docs: 📝 Documentation
  • style: 💄 Code Style
  • perf: 🐎 Performance Improvement
  • test: ✅ Tests
  • chore: 🚚 Dependencies Management
  • build: 📦 Builds Management
  • ci: 👷 CI/CD
  • revert: ⏪ Reverts Previous Changes

cheina97 previously approved these changes Dec 16, 2025
@cheina97
Member

/rebase

@cheina97
Member

/rebase test=true

@cheina97
Member

Hi @xDev789, I've checked your PR and something is not clear to me. You are forcing the wg interface to always use the same peer endpoint, but what happens if it changes? I tested it with Cilium and noticed that this field is populated with the IP assigned to the cilium_host@cilium_net interface on the node. What happens if the gateway is rescheduled on another node? How did you test this PR?

@cheina97
Member

/rebase test=true

Reuse existing client endpoint when configuring server interface.
@cheina97
Member

I tried moving pods from one node to another, keeping the wrong IP in the server, and it seems that everything is working. It seems that the peer's endpoint is set in server mode but its value is ignored. I need to run some additional tests.

@xDev789
Author

xDev789 commented Dec 16, 2025

Hi @cheina97! Thank you for reviewing my PR. You are right to be sceptical about reusing the same peer endpoint, but it is only explicitly set when the configureDevice function gets called (i.e. when the PublicKey custom resource gets reconciled). If the peer endpoint changes afterwards, WireGuard will update it through its automatic endpoint roaming. This is the same mechanism Liqo currently relies on, except that without this PR, reconciling the PublicKey CR results in a connection interruption because the discovered peer endpoint gets erased. As for the testing part, I've tested it on two Kind clusters with two nodes each and it worked as expected. We also use the stable version of Liqo with these patches applied to establish tunnels between the control and worker clusters.
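
To make the roaming behaviour above concrete, here is a hypothetical check with wgctrl (the interface name is made up for illustration) that prints the endpoint WireGuard currently holds for each peer; after a client moves, this value changes on its own, without any call to configureDevice:

```go
package main

import (
	"fmt"
	"log"

	"golang.zx2c4.com/wgctrl"
)

func main() {
	client, err := wgctrl.New()
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// "liqo-tunnel" is an illustrative interface name, not necessarily
	// the one Liqo actually creates.
	dev, err := client.Device("liqo-tunnel")
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range dev.Peers {
		// Endpoint reflects the source address of the last authenticated
		// packet from the peer, i.e. where WireGuard has roamed to.
		fmt.Printf("peer %s endpoint=%v last-handshake=%v\n",
			p.PublicKey, p.Endpoint, p.LastHandshakeTime)
	}
}
```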

@cheina97 cheina97 self-requested a review December 16, 2025 15:34
@cheina97 cheina97 dismissed their stale review December 16, 2025 15:34

Noticed a potential issue

@cheina97
Member

> Hi @cheina97! Thank you for reviewing my PR. You are right to be sceptical about reusing the same peer endpoint, but it is only explicitly set when the configureDevice function gets called (i.e. when the PublicKey custom resource gets reconciled). If the peer endpoint changes afterwards, WireGuard will update it through its automatic endpoint roaming. This is the same mechanism Liqo currently relies on, except that without this PR, reconciling the PublicKey CR results in a connection interruption because the discovered peer endpoint gets erased. As for the testing part, I've tested it on two Kind clusters with two nodes each and it worked as expected. We also use the stable version of Liqo with these patches applied to establish tunnels between the control and worker clusters.

I've tried it, and it seems that WireGuard was not able to automatically reconfigure the IP, since it is set forcefully by the controller. Without the explicit setup, that IP is updated correctly instead. I'm not against this change, but I would prefer to wait a little bit and test it properly.

Can you share all the scenarios you have tried? I would like to know which provider you used and which CNI you were running (also with the Kind clusters).

@xDev789
Author

xDev789 commented Dec 17, 2025

> I've tried it, and it seems that WireGuard was not able to automatically reconfigure the IP, since it is set forcefully by the controller. Without the explicit setup, that IP is updated correctly instead. I'm not against this change, but I would prefer to wait a little bit and test it properly.
>
> Can you share all the scenarios you have tried? I would like to know which provider you used and which CNI you were running (also with the Kind clusters).

I totally agree with you; I strongly believe in shipping a quality product and I'm not trying to rush this change either. We use RKE2 with Cilium in kube-proxy replacement mode for both the server and client clusters. With the Kind clusters, I also used Cilium. I've tried rescheduling the gateway client pod on another node. I've also tried the HA scenario where another pod obtains the lease and becomes the active gateway. Both scenarios resulted in the connection being re-established as expected. Could you share more details on the problematic case? You said it worked initially but some other tests helped you spot an issue.

@cheina97
Copy link
Member

> > I've tried it, and it seems that WireGuard was not able to automatically reconfigure the IP, since it is set forcefully by the controller. Without the explicit setup, that IP is updated correctly instead. I'm not against this change, but I would prefer to wait a little bit and test it properly.
> > Can you share all the scenarios you have tried? I would like to know which provider you used and which CNI you were running (also with the Kind clusters).
>
> I totally agree with you; I strongly believe in shipping a quality product and I'm not trying to rush this change either. We use RKE2 with Cilium in kube-proxy replacement mode for both the server and client clusters. With the Kind clusters, I also used Cilium. I've tried rescheduling the gateway client pod on another node. I've also tried the HA scenario where another pod obtains the lease and becomes the active gateway. Both scenarios resulted in the connection being re-established as expected. Could you share more details on the problematic case? You said it worked initially but some other tests helped you spot an issue.

Thanks for your help; we just need to test it properly with other CNIs. I just need some time to run the tests and process the results.


Development

Successfully merging this pull request may close these issues.

The connection between k8s clusters reconnects about every 10 hours
