Skip to content

Comments

egressgw: Let the EGW manager relax rp_filter on egress device#7

Open
MitchLewis930 wants to merge 1 commit intopr_047_beforefrom
pr_047_after
Open

egressgw: Let the EGW manager relax rp_filter on egress device#7
MitchLewis930 wants to merge 1 commit intopr_047_beforefrom
pr_047_after

Conversation

@MitchLewis930
Copy link

@MitchLewis930 MitchLewis930 commented Jan 30, 2026

User description

PR_047


PR Type

Enhancement, Bug fix


Description

  • Relax rp_filter on egress interfaces to fix reply packet drops

  • Add interface name tracking to gateway configuration

  • Refactor netdevice functions to return interface names

  • Add comprehensive rp_filter testing and validation


Diagram Walkthrough

flowchart LR
  A["Policy Configuration"] -->|derives gateway config| B["Gateway Config"]
  B -->|tracks interface name| C["Interface Name"]
  C -->|used to relax| D["rp_filter Setting"]
  E["Manager Reconciliation"] -->|calls| F["relaxRPFilter"]
  F -->|applies sysctl| D
  G["Netdevice Functions"] -->|refactored to return| C
Loading

File Walkthrough

Relevant files
Enhancement
netdevice.go
Refactor netdevice functions to return interface names     

pkg/datapath/linux/netdevice/netdevice.go

  • Refactored TestForIfaceWithIPv4Address to extract private
    getIfaceWithIPv4Address function
  • Added public GetIfaceWithIPv4Address function that returns interface
    name instead of just error
  • Updated return types to include interface name for all three function
    variants
+13/-4   
manager.go
Add rp_filter relaxation to egress gateway manager             

pkg/egressgateway/manager.go

  • Added sysctl field to Manager struct for system control operations
  • Injected sysctl.Sysctl dependency through Params struct
  • Implemented relaxRPFilter() method to set rp_filter to 2 on egress
    interfaces
  • Integrated relaxRPFilter() call into reconciliation flow after gateway
    config regeneration
+38/-0   
policy.go
Track interface name and gateway configuration status       

pkg/egressgateway/policy.go

  • Added ifaceName field to gatewayConfig struct to track interface name
  • Added localNodeConfiguredAsGateway boolean flag to indicate local node
    gateway role
  • Updated deriveFromPolicyGatewayConfig to populate interface name in
    all code paths
  • Changed from TestForIfaceWithIPv4Address to GetIfaceWithIPv4Address to
    capture interface name
+15/-2   
Tests
manager_privileged_test.go
Add rp_filter validation and testing infrastructure           

pkg/egressgateway/manager_privileged_test.go

  • Added rpFilterSetting struct to represent rp_filter test assertions
  • Injected sysctl.Sysctl dependency into test suite and helper functions
  • Implemented ensureRPFilterIsEnabled to verify rp_filter is set to 1 on
    test interfaces
  • Added assertRPFilter and tryAssertRPFilterSettings functions to
    validate rp_filter values
  • Updated createTestInterface to initialize rp_filter on test interfaces
  • Added test assertion for rp_filter relaxation after policy addition
+59/-4   

Pods running on the Egress GW node fail to communicate with an external
endpoint through the Egress GW due to the rp_filter in an environment
where egress IP is assigned to a different interface than the one with
the default route. The reply packets from the external endpoints are
dropped by the rp_filter

- A request from a local pod hits eth0 with the default route.
  It matches an IEGP, gets masqueraded & bpf-redirected to eth1 with Egress IP.
- Replies hit eth1, are revSNATed, and passed on to the stack.
  rp-filter complains that they are received on eth1, when the route doesn't point towards eth1.

This PR fixes this issue by relaxing rp_filter on interfaces with Egress IP.

Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
@qodo-code-review
Copy link

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Anti-spoofing weakened

Description: The new relaxRPFilter() unconditionally sets net.ipv4.conf..rp_filter=2 for egress
interfaces derived from policy config, which weakens kernel anti-spoofing protections and
may also be abused to relax rp_filter on broader scopes (e.g., all/default) if a policy
can influence ifaceName, potentially increasing exposure to source-address spoofing on the
node.
manager.go [602-627]

Referred Code
func (manager *Manager) relaxRPFilter() error {
	var sysSettings []tables.Sysctl
	ifSet := make(map[string]struct{})

	for _, pc := range manager.policyConfigs {
		if !pc.gatewayConfig.localNodeConfiguredAsGateway {
			continue
		}

		ifaceName := pc.gatewayConfig.ifaceName
		if _, ok := ifSet[ifaceName]; !ok {
			ifSet[ifaceName] = struct{}{}
			sysSettings = append(sysSettings, tables.Sysctl{
				Name:      fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifaceName),
				Val:       "2",
				IgnoreErr: false,
			})
		}
	}

	if len(sysSettings) == 0 {


 ... (clipped 5 lines)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
Non-idiomatic naming: The new test helper struct uses unclear and inconsistent identifiers (e.g., iFaceName and
redundant rpFilterSetting.rpFilterSetting) which reduces readability and intent clarity.

Referred Code
type rpFilterSetting struct {
	iFaceName       string
	rpFilterSetting string
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Error dropped silently: When relaxRPFilter() fails during reconciliation, the code retries but does not log or
otherwise surface the underlying error, making production debugging difficult.

Referred Code
if err := manager.relaxRPFilter(); err != nil {
	manager.reconciliationTrigger.TriggerWithReason("retry after error")
	return
}

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Sysctl change unlogged: The PR changes kernel rp_filter sysctl settings without emitting any audit log event that
would allow operators to reconstruct when/what was changed and whether it succeeded.

Referred Code
func (manager *Manager) relaxRPFilter() error {
	var sysSettings []tables.Sysctl
	ifSet := make(map[string]struct{})

	for _, pc := range manager.policyConfigs {
		if !pc.gatewayConfig.localNodeConfiguredAsGateway {
			continue
		}

		ifaceName := pc.gatewayConfig.ifaceName
		if _, ok := ifSet[ifaceName]; !ok {
			ifSet[ifaceName] = struct{}{}
			sysSettings = append(sysSettings, tables.Sysctl{
				Name:      fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifaceName),
				Val:       "2",
				IgnoreErr: false,
			})
		}
	}

	if len(sysSettings) == 0 {


 ... (clipped 5 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Sysctl key injection: The sysctl name is constructed with fmt.Sprintf("net.ipv4.conf.%s.rp_filter",
ifaceName) using ifaceName derived from policy/config without visible validation, which
may allow unexpected sysctl targets depending on how sysctl.ApplySettings maps names to
/proc/sys.

Referred Code
ifaceName := pc.gatewayConfig.ifaceName
if _, ok := ifSet[ifaceName]; !ok {
	ifSet[ifaceName] = struct{}{}
	sysSettings = append(sysSettings, tables.Sysctl{
		Name:      fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifaceName),
		Val:       "2",
		IgnoreErr: false,
	})

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Consider reverting rp_filter upon policy removal

The current implementation modifies the rp_filter setting for egress interfaces
but fails to revert it when the corresponding policy is removed. It is suggested
to track and restore the original rp_filter value to avoid leaving the system in
a modified state.

Examples:

pkg/egressgateway/manager.go [602-627]
func (manager *Manager) relaxRPFilter() error {
	var sysSettings []tables.Sysctl
	ifSet := make(map[string]struct{})

	for _, pc := range manager.policyConfigs {
		if !pc.gatewayConfig.localNodeConfiguredAsGateway {
			continue
		}

		ifaceName := pc.gatewayConfig.ifaceName

 ... (clipped 16 lines)

Solution Walkthrough:

Before:

// pkg/egressgateway/manager.go

func (manager *Manager) relaxRPFilter() error {
	var sysSettings []tables.Sysctl
	ifSet := make(map[string]struct{})

	// Collect interfaces from currently active policies
	for _, pc := range manager.policyConfigs {
		// ...
		ifaceName := pc.gatewayConfig.ifaceName
		if _, ok := ifSet[ifaceName]; !ok {
			ifSet[ifaceName] = struct{}{}
			// Only sets rp_filter to 2
			sysSettings = append(sysSettings, tables.Sysctl{
				Name: fmt.Sprintf("net.ipv4.conf.%s.rp_filter", ifaceName),
				Val:  "2",
			})
		}
	}

	return manager.sysctl.ApplySettings(sysSettings)
}

After:

// pkg/egressgateway/manager.go

type Manager struct {
    // ...
    // Track interfaces that had rp_filter modified
    rpFilterRelaxedIfaces map[string]string // ifaceName -> original value
}

func (manager *Manager) reconcileRPFilter() error {
	activeIfaces := make(map[string]struct{})
	// Collect interfaces from currently active policies
	for _, pc := range manager.policyConfigs {
		// ...
		activeIfaces[pc.gatewayConfig.ifaceName] = struct{}{}
	}

	// Relax rp_filter for active interfaces
	for iface := range activeIfaces {
		// ... read original value, store it, and set to "2"
	}

	// Revert rp_filter for interfaces no longer in use
	for iface, originalValue := range manager.rpFilterRelaxedIfaces {
		if _, ok := activeIfaces[iface]; !ok {
			// ... revert to originalValue and remove from manager.rpFilterRelaxedIfaces
		}
	}
	return nil
}
Suggestion importance[1-10]: 8

__

Why: The suggestion correctly identifies a design flaw where system-level rp_filter settings are modified but not reverted, leading to permanent configuration changes that should be ephemeral.

Medium
General
Use a polling utility for test stability

Replace the manual retry loop and time.Sleep in ensureRPFilterIsEnabled with
testutils.WaitUntil to create a more robust and less flaky test.

pkg/egressgateway/manager_privileged_test.go [531-549]

 func ensureRPFilterIsEnabled(tb testing.TB, sysctl sysctl.Sysctl, iface string) {
 	rpFilterSetting := fmt.Sprintf("net.ipv4.conf.%s.rp_filter", iface)
 
-	for i := 0; i < 10; i++ {
+	// There can be a delay between interface creation and the sysctl being available.
+	err := testutils.WaitUntil(func() bool {
 		if err := sysctl.Enable(rpFilterSetting); err != nil {
-			tb.Fatal(err)
+			// This can fail if the sysctl file is not yet created by the kernel.
+			return false
 		}
+		val, err := sysctl.Read(rpFilterSetting)
+		return err == nil && val == "1"
+	}, 5*time.Second)
 
-		time.Sleep(100 * time.Millisecond)
-
-		if val, err := sysctl.Read(rpFilterSetting); err == nil {
-			if val == "1" {
-				return
-			}
-		}
+	if err != nil {
+		tb.Fatalf("Failed to enable rp_filter on %s: %v", iface, err)
 	}
-
-	tb.Fatal("failed to enable rp_filter")
 }
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out that using a fixed sleep in a retry loop in tests is a bad practice. Replacing it with a polling utility like testutils.WaitUntil improves test robustness and reliability.

Medium
Log error on rp_filter relaxation failure

Log the error when manager.relaxRPFilter() fails before triggering a
reconciliation retry to improve debuggability.

pkg/egressgateway/manager.go [727-736]

 	manager.regenerateGatewayConfigs()
 
 	if err := manager.relaxRPFilter(); err != nil {
-		manager.reconciliationTrigger.TriggerWithReason("retry after error")
+		log.WithError(err).Warning("Failed to relax rp_filter, will retry reconciliation")
+		manager.reconciliationTrigger.TriggerWithReason("retry after rp_filter relaxation error")
 		return
 	}
 
 	// The order of the next 2 function calls matters, as by first adding missing policies and

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that an error from manager.relaxRPFilter() is not logged, which hinders debugging. Adding logging is a good practice for observability.

Low
wrap sysctl ApplySettings error

Wrap the error returned from manager.sysctl.ApplySettings in relaxRPFilter to
provide more context on failure.

pkg/egressgateway/manager.go [626]

-return manager.sysctl.ApplySettings(sysSettings)
+if err := manager.sysctl.ApplySettings(sysSettings); err != nil {
+    return fmt.Errorf("failed to apply rp_filter settings: %w", err)
+}
+return nil
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly proposes to wrap the error returned by ApplySettings to provide more context, which is a good practice for error handling and makes debugging easier.

Low
add context to rp_filter assertion errors

Improve error messages in tryAssertRPFilterSettings to include the interface
name, expected value, and actual value for better test failure diagnostics.

pkg/egressgateway/manager_privileged_test.go [680-684]

 if val, err := sysctl.Read(fmt.Sprintf("net.ipv4.conf.%s.rp_filter", setting.iFaceName)); err != nil {
-    return fmt.Errorf("failed to read rp_filter")
+    return fmt.Errorf("failed to read rp_filter for iface %s: %v", setting.iFaceName, err)
 } else if val != setting.rpFilterSetting {
-    return fmt.Errorf("mismatched rp_filter iface: %s rp_filter: %s", setting.iFaceName, val)
+    return fmt.Errorf("expected rp_filter=%s on iface %s, got %s", setting.rpFilterSetting, setting.iFaceName, val)
 }
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: This suggestion improves the error messages in a test helper function, making test failures easier to diagnose by providing more context.

Low
clarify missing-interface error

Improve the error message in getIfaceWithIPv4Address to more clearly state that
no interface was found with the specified IPv4 address.

pkg/datapath/linux/netdevice/netdevice.go [70]

-return "", fmt.Errorf("no interface with %s IPv4 assigned to", ip)
+return "", fmt.Errorf("no interface found with IPv4 address %s", ip)
  • Apply / Chat
Suggestion importance[1-10]: 4

__

Why: This is a minor but useful improvement to an error message, making it clearer and more grammatically correct, which can help with debugging.

Low
  • More

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants