Skip to content

Comments

Network simulation, integration tests and improved startup#13

Merged
datagutt merged 32 commits intomainfrom
refactor/crate
Feb 12, 2026
Merged

Network simulation, integration tests and improved startup#13
datagutt merged 32 commits intomainfrom
refactor/crate

Conversation

@datagutt
Copy link
Member

@datagutt datagutt commented Feb 11, 2026

Summary by CodeRabbit

  • New Features

    • Added a network-simulation test framework with impairments and scenario-driven generators.
    • New smoothing filter (EWMA) for RTT estimation and exposed RTT baseline metrics.
  • Improvements

    • Enhanced mode now requires ~10% improvement before switching (was ~2%) for stabler routing.
    • Reduced connection switch interval to align with packet flush timing.
    • Added min/fast RTT metrics to reported connection stats.
  • Tests

    • Many new integration tests covering registration, forwarding, failures, impairments, and scenarios.

datagutt and others added 10 commits February 11, 2026 18:06
Increase SWITCH_THRESHOLD from 1.02 to 1.10 so a new connection must be
10% better before traffic is moved to it.  With the shorter 15ms cooldown,
hysteresis is now the primary stability mechanism; 2% was too small to
prevent noise-driven flip-flopping between connections with similar scores.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…iral

Before REG3, forward_via_connection() tracks data packets in packet_log,
creating phantom in-flight counts that never get ACKed. Early NAKs from
these packets also penalize quality scoring. This cascading penalty
starves connections of traffic, causing the system to get stuck at low
throughput after startup.

Reset packet_log, in_flight_packets, congestion state, and quality cache
when REG3 is received so every connection starts with a clean slate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eling

- Introduced a new workspace with a `network-sim` crate for simulating network conditions.
- Implemented `ImpairmentConfig` for configuring network impairments using `tc netem`.
- Added `GemodelConfig` for modeling bursty packet loss with Gilbert-Elliott model.
- Created `Scenario` and `ScenarioConfig` for generating deterministic random-walk impairment scenarios.
- Implemented `Namespace` management for Linux network namespaces, including veth link creation.
- Added tests for impairment application and scenario generation to ensure functionality.
- Introduced a new `harness.rs` module for managing integration tests.
- Added common utilities for integration tests in `mod.rs`.
- Implemented basic connectivity tests in `netns_basic.rs` to validate registration and data forwarding.
- Created failure and recovery tests in `netns_failure.rs` to ensure link failure detection and recovery.
- Developed impairment tests in `netns_impairment.rs` to validate adaptation to network conditions.
- Added scenario-driven tests in `netns_scenario.rs` to assess stability under evolving impairments.
- Updated `Cargo.toml` and `Cargo.lock` to include `tempfile` dependency.
- Enhanced `lib.rs` to expose new test harness functionalities.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 11, 2026

Warning

Rate limit exceeded

@datagutt has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 26 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

Adds a new network-sim crate (namespaces, impairment, scenarios, harness), refactors RTT tracking to use EWMA with dual sliding windows, consolidates startup-grace configuration, increases enhanced-mode hysteresis to 10%, exposes queued packet counts, updates keepalive timestamping, and adds many integration tests and test utilities.

Changes

Cohort / File(s) Summary
Configuration & repo
\.coderabbit.yaml, Cargo.toml, README.md
Adds CodeRabbit config; adds crates/network-sim workspace member and dev-dep; documents hysteresis change (2%→10%).
New crate: network-sim (manifests)
crates/network-sim/Cargo.toml, Cargo.toml (workspace)
Introduces new crate manifest and workspace entry; bump toolchain and add dev-dep.
Network-sim public API & modules
crates/network-sim/src/lib.rs, crates/network-sim/src/{topology,impairment,scenario,harness,test_util}.rs
New crate modules: Namespace RAII, tc netem/TBF impairment application, deterministic scenario generator, integration test harness for netns-based tests, and test utilities; public types and functions re-exported.
Impairment model & tc orchestration
crates/network-sim/src/impairment.rs
Adds ImpairmentConfig and GemodelConfig, netem/TBF assembly, tc invocation helpers, and unit tests for impairment application.
Test harness & integration tests support
crates/network-sim/src/harness.rs, tests/common/mod.rs, tests/*.rs
Adds NamespaceProcess, SrtlaTestTopology/Stack orchestration, UDP injection helpers, build/skip utilities, and multiple integration test modules exercising registration, forwarding, impairments, and scenarios.
Scenario generator
crates/network-sim/src/scenario.rs
Deterministic seeded random-walk scenario producing per-link ImpairmentConfig frames; unit tests for determinism and bounds.
Topology / netns manager
crates/network-sim/src/topology.rs
Namespace RAII, commands-in-namespace helpers, veth creation and IP assignment, Drop cleanup, and tests.
EWMA filter & RTT refactor
src/ewma.rs, src/connection/rtt.rs, src/lib.rs, src/main.rs
New Ewma module; RttTracker now uses EWMA for smooth/fast/avg-delta and dual sliding windows for rtt_min computation; updated update/reset/is_stable logic and tests.
Connection & batching changes
src/connection/batch_send.rs, src/connection/mod.rs, src/connection/packet_io.rs, src/connection/reconnection.rs
Added BatchSender::queued_count(), include queued in score denom, expose STARTUP_GRACE_MS (moved to parent as pub(crate) and increased to 5000), added clear_pre_registration_state() and invoke it on Reg3.
Selection, hysteresis & cooldown
src/sender/selection/{mod.rs,enhanced.rs,rtt_threshold.rs}, src/tests/*
MIN_SWITCH_INTERVAL_MS reduced from 500→15 ms; SWITCH_THRESHOLD increased 1.02→1.10; code path simplifications and test timing updates.
Sender & housekeeping
src/sender/housekeeping.rs, src/sender/mod.rs
Use consolidated STARTUP_GRACE_MS for grace deadline; adjust reconnection timeout logging levels; include mode in startup log.
Protocol / timestamping
src/protocol/builders.rs
Replace chrono timestamp bytes construction with crate::utils::now_ms() and to_be_bytes() copy.
Stats & instrumentation
src/stats.rs
Add rtt_min_ms and fast_rtt_ms to LinkStats and populate from connection accessors.
Small refactors & tests
src/config.rs, src/sender/packet_handler.rs, src/connection/rtt.rs tests, other minor files
Import reorderings, control-flow simplifications (combined if-let), tests updated to use Ewma update API, and related small edits across modules.

Sequence Diagram(s)

sequenceDiagram
  participant TestRunner as Test Runner
  participant Topology as Namespace Manager (Namespace)
  participant Process as srtla_send / srtla_rec / srt-live-transmit
  participant Tc as tc/netem (via sudo)
  participant Injector as UDP Injector
  participant Collector as Output Collector

  TestRunner->>+Topology: create namespaces, veths, assign IPs
  Topology-->>-TestRunner: namespaces ready
  TestRunner->>+Process: start processes inside namespaces
  Process-->>-TestRunner: processes started (stdout/stderr streams)
  TestRunner->>+Tc: apply impairment (apply_impairment)
  Tc-->>-TestRunner: tc qdisc applied
  TestRunner->>+Injector: inject UDP packets/stream
  Injector-->>-Process: packets arrive -> processing
  Process->>+Collector: emit logs, metrics
  Collector-->>-TestRunner: collect outputs (on stop)
  TestRunner->>Topology: cleanup namespaces (Drop)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 Hopped into namespaces, I build and I test,

EWMA hums softly to smooth every rest,
Two windows keep minima, steady and true,
Ten percent to switch — choose better, not new,
I nibble at packets and dance with the net, hooray!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: introduction of network simulation infrastructure (new crates/network-sim crate with topology, impairment, and test harness modules), integration tests (netns_basic, netns_failure, netns_impairment, netns_scenario), and startup improvements (STARTUP_GRACE_MS increase, RTT tracking enhancements with EWMA and dual-window logic, hysteresis threshold adjustment).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/crate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@datagutt datagutt changed the title feat: implement dual-window RTT tracking and expose metrics in LinkStats @coderabbitai title Feb 11, 2026
@datagutt
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@datagutt datagutt changed the title @coderabbitai title Network simulation, integration tests and improved startup Feb 11, 2026
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sender/selection/mod.rs (1)

16-16: ⚠️ Potential issue | 🟡 Minor

Stale doc comment: hysteresis is now 10%, not 2%.

Per the PR objectives, SWITCH_THRESHOLD was changed from 1.02 to 1.10. This line still says "Minimal hysteresis (2%)".

Proposed fix
-//! - Minimal hysteresis (2%) to prevent flip-flopping
+//! - Hysteresis (10%) to prevent flip-flopping
🤖 Fix all issues with AI agents
In `@crates/network-sim/src/harness.rs`:
- Around line 170-197: kill() currently signals the sudo wrapper
(self.pid()/self.child) which may not forward SIGTERM to the inner process;
change kill() to either send the signal to the entire process group or discover
and signal the inner PID before falling back to killing the sudo wrapper: locate
the inner PID from /proc/<sudo_pid> (e.g., inspect /proc/<pid>/task/*/children
or /proc/<pid>/status NSpid field) and call sudo kill on that PID, or send to
the process group by using a negative PID (sudo kill -TERM -- -<pgid>) so the
inner process receives it; ensure the same approach is used for the force-kill
path and keep behavior compatible with Namespace::drop / SrtlaTestStack::drop
cleanup.
- Around line 543-544: Guard against division by zero on packets_per_sec before
computing interval_us: check packets_per_sec (the value used to compute
interval_us) and return an error or handle the zero case (e.g., bail with a
clear message or treat zero as invalid) instead of performing 1_000_000 /
packets_per_sec; alternatively use checked_div on 1_000_000 and handle the None
result. Ensure the change updates the function that computes interval_us so
callers get a clear error when packets_per_sec == 0.
- Around line 537-556: The Python one-liner built in inject_udp_stream creates a
while loop whose body lines are not indented, causing an IndentationError;
update the formatted script (the script string passed to ns.exec_checked in
inject_udp_stream) so that the lines for s.sendto(...), i+=1 and time.sleep(...)
are indented (e.g., prefix with consistent spaces) inside the while block and
keep s.close() and the print statement placed after the loop if intended;
regenerate the script string accordingly so python3 -c receives a syntactically
valid, properly indented multi-line script.

In `@crates/network-sim/src/impairment.rs`:
- Around line 74-78: The build fails due to use of let-chains (e.g., the
if-let-and condition in crates/network-sim/src/impairment.rs around the
self.jitter_ms check and similar at the later block), so update the workspace
Cargo.toml's rust-version field from "1.87" to "1.88" (or any later stable
version) so the let-chain syntax is supported; you can keep edition = "2024"
as-is—just change the rust-version key in the workspace Cargo.toml so the
project compiles with stable Rust 1.88+.

In `@crates/network-sim/src/scenario.rs`:
- Around line 80-82: The calculation of total_steps in frames() can panic when
self.cfg.step is Duration::ZERO (division by zero producing infinity and casting
to u64), so add an explicit guard at the start of frames(): assert or
debug_assert that self.cfg.step != Duration::ZERO and, for runtime safety,
handle the zero-case by returning an empty Vec (or panicking with a clear
message) before computing total_steps; update references to cfg.step and
total_steps in the function (frames(), cfg.step, cfg.duration, total_steps)
accordingly so the function no longer attempts the division when step is zero.

In `@crates/network-sim/src/test_util.rs`:
- Around line 25-33: The function unique_ns_name currently truncates the entire
formatted name which can cut off the uniqueness suffix; change it so the suffix
is always preserved: compute seq via NS_COUNTER.fetch_add(1, Ordering::Relaxed)
and pid, build the uniqueness suffix (e.g., format!("_{pid:x}_{seq}")), then if
you must enforce a max length (or keep the existing 15-char limit) truncate only
the prefix to max_len - suffix.len() and concat prefix + suffix; alternatively
remove the 15-char truncation here and let callers that need IFNAMSIZ-sized
names (e.g., Namespace::new usage in harness.rs) perform interface-specific
truncation. Ensure the function unique_ns_name always returns a name that
includes the full _{pid:x}_{seq} suffix.

In `@src/connection/mod.rs`:
- Around line 343-365: clear_pre_registration_state currently clears packet_log,
in_flight_packets, highest_acked_seq, congestion, and quality_cache but does not
reset the batch queue; update clear_pre_registration_state to call
self.batch_sender.reset() (similar to reset_state's use of batch_sender.reset())
— place the call alongside the other resets (e.g., after self.packet_log.clear()
or with self.congestion.reset()) to ensure queued batches from pre-registration
are discarded.

In `@tests/common/mod.rs`:
- Line 2: The module-level attribute #![allow(dead_code)] conflicts with the
#[expect(dead_code)] on the inject_stream item causing an
unfulfilled_lint_expectations warning; fix by removing the redundancy—either
delete the module-level #![allow(dead_code)] and mark specific items with
#[allow(dead_code)] as needed, or remove the #[expect(dead_code)] on
inject_stream (or change it to #[allow(dead_code)]) so only one suppression
remains; locate the attributes in tests/common/mod.rs (module-level
#![allow(dead_code)] and the #[expect(dead_code)] on inject_stream) and apply
one of those changes.

In `@tests/netns_scenario.rs`:
- Around line 91-95: The loop over frame.configs currently swallows errors from
stack.impair_link by only eprintln!-ing them; change this so failures are
surfaced to the test: capture errors (e.g., accumulate into a Vec or increment a
failure counter) when calling stack.impair_link(i, cfg.clone()) and after the
loop assert that there were zero failures (or call panic!/assert!(false) with a
joined error message) so the test fails if any impairment failed; update the
code around frame.configs.iter().enumerate() / stack.impair_link to record and
report the concrete errors rather than continuing silently.
🧹 Nitpick comments (13)
crates/network-sim/Cargo.toml (1)

11-11: Inconsistent rand version across workspace.

The root crate depends on rand = "0.9" while network-sim uses rand = "0.10". This pulls two separate versions of rand into the dependency tree. Since network-sim is dev-only, it won't bloat the release binary, but aligning versions would reduce compile times and avoid potential type incompatibilities if rand types are ever passed across crate boundaries.

.coderabbit.yaml (1)

78-80: base_branches may be redundant if main is already the default branch.

Per the schema, base_branches is for branches other than the default branch. If main is the repository's default branch, this entry has no effect. Consider removing it or adding only non-default target branches.

src/ewma.rs (1)

24-31: No validation on alpha range in constructors.

Passing alpha <= 0 or alpha > 1 silently produces incorrect smoothing behavior (e.g., alpha = 0 freezes after the first sample, negative alpha diverges). Since this is a shared utility, a debug_assert! would catch misuse early without runtime cost.

🛡️ Optional: add debug assertions
 pub fn new(alpha: f64) -> Self {
+    debug_assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0.0, 1.0]");
     Self {
         value: 0.0,
         alpha_up: alpha,

Similarly for asymmetric():

 pub fn asymmetric(alpha_up: f64, alpha_down: f64) -> Self {
+    debug_assert!(alpha_up > 0.0 && alpha_up <= 1.0, "alpha_up must be in (0.0, 1.0]");
+    debug_assert!(alpha_down > 0.0 && alpha_down <= 1.0, "alpha_down must be in (0.0, 1.0]");
     Self {
src/sender/mod.rs (1)

59-70: Use SchedulingMode's Display implementation to simplify the log formatting.

SchedulingMode implements Display (src/mode.rs:64-72), which already produces the exact same strings ("classic", "enhanced", "rtt-threshold"). This is consistent with how stats.rs formats the mode (line 140: format!("{}", config.mode)). The manual match is redundant.

♻️ Simplify with Display
     info!(
-        "starting srtla_send: local_srt_port={}, receiver={}:{}, ips_file={}, mode={}",
-        local_srt_port,
-        receiver_host,
-        receiver_port,
-        ips_file,
-        match config.mode() {
-            crate::mode::SchedulingMode::Classic => "classic",
-            crate::mode::SchedulingMode::Enhanced => "enhanced",
-            crate::mode::SchedulingMode::RttThreshold => "rtt-threshold",
-        }
+        "starting srtla_send: local_srt_port={}, receiver={}:{}, ips_file={}, mode={}",
+        local_srt_port,
+        receiver_host,
+        receiver_port,
+        ips_file,
+        config.mode()
     );
tests/netns_failure.rs (2)

47-51: Panic detection via string search is fragile.

Checking for "PANIC" / "panic" in stderr won't catch all failure modes (e.g., a segfault or silent abort), and could false-positive on legitimate log messages containing the word "panic" in a descriptive context (e.g., "anti-panic guard"). Consider additionally asserting on the process exit code (e.g., that srtla_send exited with status 0) and, where feasible, checking for positive evidence of failover/recovery (e.g., log lines indicating link switchover or re-registration).

Also applies to: 93-97


54-98: test_link_recovery doesn't inject traffic before or during the failure phase.

Unlike test_link_failure_failover, this test kills link 0 without any prior data flow. If the intent is to verify recovery under load, consider injecting packets before and during the impairment phase so the sender has in-flight state to reconcile on recovery. If the intent is purely to test idle recovery, a brief comment clarifying that would help.

tests/netns_impairment.rs (1)

60-95: Test name overpromises relative to assertions.

test_loss_triggers_window_reduction implies it verifies that congestion window reduction occurs, but the only assertion is the absence of "panic" in stderr. Consider either renaming to something like test_loss_no_crash or adding an assertion that checks log output for window reduction indicators (e.g., a log line mentioning window decrease).

tests/netns_scenario.rs (1)

56-84: Duplicated UDP injection logic bypasses the network_sim helper.

The inline python3 one-liner duplicates the injection logic from network_sim::inject_udp_packets (see crates/network-sim/src/harness.rs, lines 524-533). If the packet format or injection method changes, this code path won't be updated.

Consider making the Namespace type cheaply clonable (or providing a from_name constructor) so the injection helper can be used from the spawned thread, or extract a function in network_sim that accepts a namespace name string.

crates/network-sim/src/topology.rs (2)

32-44: Optional: extract shared argument-building logic.

Both exec and exec_checked construct full_args identically. A small private helper would reduce duplication.

♻️ Example extraction
+    fn ns_args<'a>(&'a self, cmd: &'a str, args: &'a [&'a str]) -> Vec<&'a str> {
+        let mut full_args = vec!["ip", "netns", "exec", &self.name, cmd];
+        full_args.extend_from_slice(args);
+        full_args
+    }
+
     pub fn exec(&self, cmd: &str, args: &[&str]) -> Result<Output> {
-        let mut full_args = vec!["ip", "netns", "exec", &self.name, cmd];
-        full_args.extend_from_slice(args);
-        sudo(&full_args).with_context(|| format!("exec '{cmd}' in ns '{}'", self.name))
+        sudo(&self.ns_args(cmd, args))
+            .with_context(|| format!("exec '{cmd}' in ns '{}'", self.name))
     }
 
     pub fn exec_checked(&self, cmd: &str, args: &[&str]) -> Result<Output> {
-        let mut full_args = vec!["ip", "netns", "exec", &self.name, cmd];
-        full_args.extend_from_slice(args);
-        sudo_checked(&full_args)
+        sudo_checked(&self.ns_args(cmd, args))
             .with_context(|| format!("exec '{cmd}' in ns '{}'", self.name))
     }

46-104: Consider validating interface name length.

The doc on Line 49 states interface names must be ≤ 15 chars, but there's no enforcement here. Callers like SrtlaTestTopology::new truncate, but a defensive check would prevent silent failures from the ip commands.

🛡️ Proposed guard
     pub fn add_veth_link(
         &self,
         peer: &Namespace,
         local_iface: &str,
         peer_iface: &str,
         local_ip: &str,
         peer_ip: &str,
     ) -> Result<()> {
+        if local_iface.len() > 15 || peer_iface.len() > 15 {
+            bail!("interface names must be <= 15 chars (got '{}', '{}')", local_iface, peer_iface);
+        }
+
         // Clean up stale veth (idempotent)
crates/network-sim/src/impairment.rs (1)

186-196: Unused _iface parameter in tc_checked.

The _iface parameter is never used in the function body. Either remove it or incorporate it into the error message for better diagnostics.

♻️ Option A: Remove the parameter
-fn tc_checked(ns: &Namespace, _iface: &str, args: &[&str], ctx: &str) -> Result<()> {
+fn tc_checked(ns: &Namespace, args: &[&str], ctx: &str) -> Result<()> {

Update all call sites accordingly.

♻️ Option B: Use it in the error message
-fn tc_checked(ns: &Namespace, _iface: &str, args: &[&str], ctx: &str) -> Result<()> {
+fn tc_checked(ns: &Namespace, iface: &str, args: &[&str], ctx: &str) -> Result<()> {
     let output = ns.exec("tc", args)?;
     if !output.status.success() {
         bail!(
-            "{ctx}: tc {}\n{}",
+            "{ctx} on {iface}: tc {}\n{}",
             args.join(" "),
             String::from_utf8_lossy(&output.stderr).trim()
         );
     }
crates/network-sim/src/harness.rs (2)

508-515: Drop delegates to NamespaceProcess::drop — subtle but correct.

.take() moves the NamespaceProcess out of the Option and the returned value is immediately dropped, which triggers NamespaceProcess::dropkill(). Worth adding a brief comment for clarity.

📝 Add explanatory comment
 impl Drop for SrtlaTestStack {
     fn drop(&mut self) {
-        // Ensure all processes are killed even if stop() wasn't called
+        // Ensure all processes are killed even if stop() wasn't called.
+        // Each .take() drops the NamespaceProcess, triggering its Drop → kill().
         self.srtla_send.take();
         self.srtla_rec.take();
         self.srt_server.take();
     }
 }

562-595: Binary discovery relies on unwrap() for parent directory traversal.

Lines 568-569 and 574-575 call .parent().unwrap().parent().unwrap(). If CARGO_MANIFEST_DIR has an unexpected structure (e.g., root path), these will panic. Since this is test infrastructure and CARGO_MANIFEST_DIR is always set by cargo to a well-structured path, the risk is low — but a .context() chain would be more consistent with the rest of the codebase's error-handling style.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/network-sim/src/harness.rs`:
- Around line 553-558: The generated Python script string in the `script`
binding (used by `inject_udp_stream`) contains a literal backslash followed by a
source newline (`"\\\n"`), producing an invalid Python line-continuation;
replace that sequence so the Python code actually contains a newline.
Concretely, locate the fragment inside the format! where `d=b'\\x00'*188\\\n    
nstart=time.time(); i=0` is produced and change the `"\\\n         nstart"` to a
proper newline escape like `"\nstart"` (i.e.,
`d=b'\\x00'*188\nstart=time.time(); i=0`) so the Python code has a real newline
between the `*188` and `start`. Ensure the surrounding format! placeholders
(`{dur_secs}`, `{target_ip}`, `{port}`, `{interval_us}`) remain unchanged.
🧹 Nitpick comments (7)
src/registration/probing.rs (1)

154-175: Note: test accessors use #[cfg(test)] instead of #[cfg(feature = "test-internals")].

Per the coding guidelines, internal fields for testing should be exposed via #[cfg(feature = "test-internals")]. These accessors use #[cfg(test)] which limits them to unit tests within the crate. If integration tests or other crates need access, consider switching to the feature flag. Since this is pre-existing code, this can be addressed separately.

As per coding guidelines, "Use conditional compilation with #[cfg(feature = "test-internals")] to expose internal fields for testing".

tests/netns_scenario.rs (2)

57-84: Background injector errors are silently swallowed.

Both the Command output (line 66) and the thread join result (line 99) are discarded with let _ = .... If python3 isn't available or the namespace name becomes stale, zero packets are injected and the test still passes — it just validates the impairment framework, not srtla_send's behavior under load.

Consider at minimum unwrapping the join() to surface thread panics:

Suggested change
-    let _ = inject_handle.join();
+    inject_handle.join().expect("injection thread panicked");

117-193: Test validates crash-freedom but not convergence.

The test name test_step_change_convergence suggests it validates that srtla_send adapts to changed link conditions, but the only assertion is the absence of "panic" in stderr. Consider adding a TODO or renaming to test_step_change_no_crash to set expectations, or adding a lightweight convergence check in a follow-up (e.g., verify traffic eventually flows on the non-degraded link).

crates/network-sim/src/scenario.rs (1)

188-204: Bounds test doesn't cover delay.

values_stay_within_bounds validates rate and loss bounds but doesn't verify that delay stays within [1, base_delay_ms + delay_jitter_ms]. Consider adding a delay assertion for completeness.

Suggested addition
                 let loss = config.loss_percent.unwrap();
                 assert!(loss >= 0.0, "negative loss");
                 assert!(loss <= link_cfg.max_loss_percent, "loss {loss} > max");
+
+                let delay = config.delay_ms.unwrap();
+                assert!(delay >= 1, "delay {delay} < 1");
+                let max_delay = link_cfg.base_delay_ms + link_cfg.delay_jitter_ms;
+                assert!(delay <= max_delay, "delay {delay} > max {max_delay}");
             }
crates/network-sim/src/harness.rs (3)

149-168: stdout_lines() and stderr_lines() silently return empty on second call.

Both methods use .take() on the child's pipes, so a second invocation always returns vec![] with no warning. This is fine given current usage (called once in stop()), but it's a footgun for future callers.

Consider either documenting this as /// **Note:** consumes the pipe; returns vec![] on subsequent calls. or buffering the lines internally.


511-518: SrtlaTestStack::drop relies on implicit NamespaceProcess::drop for cleanup.

The .take() calls extract the Option<NamespaceProcess> values without binding them, so the NamespaceProcess values are dropped immediately, triggering their Drop::drop → kill(). This works but is subtle — a comment would help future readers understand the intent.

Suggested clarification
 impl Drop for SrtlaTestStack {
     fn drop(&mut self) {
-        // Ensure all processes are killed even if stop() wasn't called
-        self.srtla_send.take();
-        self.srtla_rec.take();
-        self.srt_server.take();
+        // Ensure all processes are killed even if stop() wasn't called.
+        // Dropping NamespaceProcess triggers its Drop impl which calls kill().
+        drop(self.srtla_send.take());
+        drop(self.srtla_rec.take());
+        drop(self.srt_server.take());
     }
 }

24-31: which may not be available on all Linux systems.

Minimal containers or some distros don't ship which. command -v is POSIX-mandated and more portable, but requires invoking a shell. For a test-only utility this is low-risk.

More portable alternative
 pub fn check_binary(name: &str) -> Option<PathBuf> {
-    Command::new("which")
-        .arg(name)
+    Command::new("sh")
+        .args(["-c", &format!("command -v {name}")])
         .output()
         .ok()
         .filter(|o| o.status.success())
         .map(|o| PathBuf::from(String::from_utf8_lossy(&o.stdout).trim().to_string()))
 }

@datagutt datagutt merged commit 6a00f93 into main Feb 12, 2026
5 of 6 checks passed
@datagutt datagutt deleted the refactor/crate branch February 14, 2026 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant