Problems with the previous approach:
- All documentation, zero implementation
- Hard-coupled to Zenoh (unusable without it, untestable)
- Inconsistent API (conflicting patterns, mismatched param names)
- Premature feature comparisons against Temporal.io/AWS Step Functions

New direction:
- Library, not framework (you call it, it doesn't restructure your code)
- Transport-agnostic core with Adapter protocol (Zenoh as optional extra)
- Testable by default via LocalAdapter (no hardware/network needed)
- Function-first API with @task decorator and @task.undo compensation
- Sequence composition with automatic reverse-order compensation
- Configurable retry policies (linear, exponential, fibonacci backoff)
- Working implementation with 25 passing tests

https://claude.ai/code/session_01RbxGJBcM18D1NXVUq3TZ1T
Examples that expose where the library breaks:
- dual_arm_assembly: needs parallel, nesting, loops
- bin_emptying: needs loops, conditionals, partial success
- force_insertion: needs guards, result passing, sensor waits
- palletizing: needs parallel, sequence timeout, cancellation
- inspection_sort: needs branching, error discrimination, hooks
- emergency_recovery: needs interrupts, compensation retry, human-in-loop

Critical gaps fixed (50 tests, all passing):
- Parallel: concurrent execution with coordinated compensation
- Nesting: Sequence usable as a step inside another Sequence
- Branch: route execution based on runtime selector
- Guard: precondition check before running a step
- Hooks: lifecycle callbacks (on_step_start/end/error, on_compensate)
- Compensation retry: undo_retry parameter for safety-critical tasks
- Step protocol: common interface for BoundTask/Parallel/Guard/Branch/Sequence

Gaps documented honestly in README Known Limitations table.

https://claude.ai/code/session_01RbxGJBcM18D1NXVUq3TZ1T
Replace the entire Python codebase with a zero-dependency Rust library.

All 8 previously-documented gaps are now first-class features:
- Loops: Loop combinator with condition + max_iterations safety cap
- Cancellation: CancellationToken with cooperative checking at step boundaries
- Sequence-level timeout: Sequence::timeout() using tokio::select!
- Persistence/crash recovery: Journal trait + MemoryJournal, skip completed steps
- Partial success: ErrorStrategy::Skip continues past non-critical failures
- Resource locking: ResourceLock + Locked wrapper with shared async mutex
- Error discrimination: ErrorStrategy enum (Compensate/Skip/Escalate) per step
- Sensor streaming: Adapter::subscribe() returns mpsc::Receiver<Value>

Architecture: Step trait with async-trait for Box<dyn Step> dynamic dispatch, Context with Arc<RwLock<HashMap>> for safe parallel sharing, composable combinators (Sequence, Parallel, Guard, Branch, Loop) that all impl Step.

27 integration tests, 6 examples covering realistic robotics scenarios.

https://claude.ai/code/session_01RbxGJBcM18D1NXVUq3TZ1T
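The "skip completed steps" behaviour described for the journal can be illustrated with a small synchronous sketch. Everything here is an assumption made for illustration: the trait methods `is_done` and `record`, the `StepFn` alias, and `run_with_journal` are hypothetical names, and the real `Journal` trait is async and may look quite different.

```rust
use std::collections::HashSet;

// Hypothetical, synchronous stand-in for a journal; not the crate's actual trait.
trait Journal {
    fn is_done(&self, step: &str) -> bool;
    fn record(&mut self, step: &str);
}

#[derive(Default)]
struct MemoryJournal {
    done: HashSet<String>,
}

impl Journal for MemoryJournal {
    fn is_done(&self, step: &str) -> bool {
        self.done.contains(step)
    }
    fn record(&mut self, step: &str) {
        self.done.insert(step.to_string());
    }
}

type StepFn = fn() -> Result<(), String>;

/// Runs each named step unless the journal already lists it as completed.
/// Returns the names of the steps that actually executed in this run.
fn run_with_journal(
    steps: &[(&str, StepFn)],
    journal: &mut dyn Journal,
) -> Result<Vec<String>, String> {
    let mut executed = Vec::new();
    for &(name, run) in steps {
        if journal.is_done(name) {
            continue; // finished in an earlier run, so skip it
        }
        run()?;
        journal.record(name); // record only after the step succeeds
        executed.push(name.to_string());
    }
    Ok(executed)
}

fn ok() -> Result<(), String> {
    Ok(())
}

fn crash() -> Result<(), String> {
    Err("power lost".to_string())
}

/// First run fails at "place"; the second run reuses the journal and resumes there.
fn demo() -> (Result<Vec<String>, String>, Result<Vec<String>, String>) {
    let mut journal = MemoryJournal::default();
    let first = [("home", ok as StepFn), ("pick", ok as StepFn), ("place", crash as StepFn)];
    let first_run = run_with_journal(&first, &mut journal);
    let second = [("home", ok as StepFn), ("pick", ok as StepFn), ("place", ok as StepFn)];
    let second_run = run_with_journal(&second, &mut journal);
    (first_run, second_run)
}

fn main() {
    let (first_run, second_run) = demo();
    println!("first run:  {:?}", first_run);
    println!("second run: {:?}", second_run);
}
```

In this sketch the journal entry is written only after a step succeeds, so a step interrupted midway is re-run on recovery; whether the library records intent before execution as well is not stated in the commit message.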
Add 22 adversarial robotics stress tests modeled after CNC, surgical, warehouse, autonomous vehicle, semiconductor, space probe, and other real-world scenarios. These tests exposed 9 bugs in the saga engine.

Bugs fixed:
- Timeout/cancellation now compensate completed steps (via shared Arc<Mutex<Vec>> between execute and timeout/cancel branches)
- Loop self-compensates all completed iterations on body failure (saga pattern: composite steps own their internal compensation)
- Branch only compensates the taken path, not all branches
- Parallel cancels running siblings on first failure (AtomicBool flag)
- Retry respects ErrorStrategy (Escalate/Skip break immediately)
- Compensation runs in "compensation mode": cancellation checks are skipped so adapter calls succeed during undo (new Context flag)
- Compensation has configurable per-step timeout (default 30s)
- Duplicate step names handled via count-based journal dedup
- Error::is_cancelled/is_timeout recurse through wrapper types

All 49 tests pass (27 integration + 22 stress), zero warnings.

https://claude.ai/code/session_01RbxGJBcM18D1NXVUq3TZ1T
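The last fix, predicates that recurse through wrapper types, can be sketched as follows. The variant names (`StepFailed` in particular) and field layout are hypothetical and chosen only to show the recursion; the crate's actual `Error` enum is not reproduced here.

```rust
// Hypothetical error type: `StepFailed` stands in for any variant that wraps another error.
#[allow(dead_code)]
#[derive(Debug)]
enum Error {
    Cancelled,
    Timeout,
    TaskFailed(String),
    StepFailed { step: &'static str, source: Box<Error> },
}

impl Error {
    /// True if this error, or any error it wraps, is a cancellation.
    fn is_cancelled(&self) -> bool {
        match self {
            Error::Cancelled => true,
            Error::StepFailed { source, .. } => source.is_cancelled(),
            _ => false,
        }
    }

    /// True if this error, or any error it wraps, is a timeout.
    fn is_timeout(&self) -> bool {
        match self {
            Error::Timeout => true,
            Error::StepFailed { source, .. } => source.is_timeout(),
            _ => false,
        }
    }
}

fn main() {
    // A cancellation wrapped twice, as it might be after passing through nested steps.
    let nested = Error::StepFailed {
        step: "outer",
        source: Box::new(Error::StepFailed {
            step: "inner",
            source: Box::new(Error::Cancelled),
        }),
    };
    println!("cancelled: {}, timeout: {}", nested.is_cancelled(), nested.is_timeout());
}
```

Without the recursive arm, a caller checking `is_cancelled()` on the outer error would see `false` and could misclassify a cancellation as an ordinary step failure.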
Summary
Complete rewrite of relentless from a Python framework to a Rust library for composing async tasks with automatic compensation on failure; its only runtime dependencies are tokio and async-trait. The library is built on the saga pattern and designed for robotics workflows where actions need structured undo semantics.
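As a rough illustration of the saga pattern mentioned above, the following synchronous sketch runs steps in order and, on the first failure, undoes the completed steps in reverse order. The `Step` struct, `run_sequence`, and the robot actions are hypothetical names for this sketch only; the library itself is async and its real types are not shown here.

```rust
// Hypothetical synchronous sketch of the saga pattern; names are illustrative,
// not the crate's actual API.
struct Step {
    name: &'static str,
    run: fn(&mut Vec<String>) -> Result<(), String>,
    undo: fn(&mut Vec<String>),
}

/// Runs steps in order. On the first failure, calls `undo` on every step that
/// already completed, newest first, then returns the error.
fn run_sequence(steps: &[Step], log: &mut Vec<String>) -> Result<(), String> {
    let mut completed: Vec<&Step> = Vec::new();
    for step in steps {
        match (step.run)(log) {
            Ok(()) => completed.push(step),
            Err(e) => {
                for done in completed.iter().rev() {
                    (done.undo)(log);
                }
                return Err(format!("{} failed: {}", step.name, e));
            }
        }
    }
    Ok(())
}

fn grip(log: &mut Vec<String>) -> Result<(), String> {
    log.push("grip".into());
    Ok(())
}
fn release(log: &mut Vec<String>) {
    log.push("release".into());
}
fn lift(log: &mut Vec<String>) -> Result<(), String> {
    log.push("lift".into());
    Ok(())
}
fn lower(log: &mut Vec<String>) {
    log.push("lower".into());
}
fn insert(_log: &mut Vec<String>) -> Result<(), String> {
    Err("force limit exceeded".into())
}
fn retract(log: &mut Vec<String>) {
    log.push("retract".into());
}

/// Grip and lift succeed, insert fails, so lift and grip are undone in that order.
fn demo() -> (Result<(), String>, Vec<String>) {
    let steps = [
        Step { name: "grip", run: grip, undo: release },
        Step { name: "lift", run: lift, undo: lower },
        Step { name: "insert", run: insert, undo: retract },
    ];
    let mut log = Vec::new();
    let result = run_sequence(&steps, &mut log);
    (result, log)
}

fn main() {
    let (result, log) = demo();
    println!("{:?}", result);
    println!("{:?}", log);
}
```

The failed step's own `undo` is not called in this sketch, on the assumption that a step which returned an error left nothing to compensate.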
Key Changes
- Language & Architecture: Migrated from Python + Zenoh to pure Rust with `tokio` + `async-trait` as the only runtime dependencies
- Core Abstraction: Introduced the `Step` trait as the fundamental building block, with implementations for:
  - `FnTask`: Closure-based steps for inline task definition
  - `Sequence`: Linear execution with reverse-order compensation
  - `Parallel`: Concurrent execution with sibling compensation on failure
  - `Guard`: Conditional execution with optional fallback
  - `Branch`: Runtime selection among multiple steps
  - `Loop`: Repeated execution with condition checking
  - `Locked`: Resource coordination via shared mutex
- Error Handling: Comprehensive `Error` enum with variants for task failures, timeouts, cancellation, and compensation tracking
- Retry & Timeout: Configurable `RetryPolicy` with exponential/linear/Fibonacci backoff and per-step timeout support
- Cancellation: `CancellationToken` for cooperative graceful shutdown with automatic compensation
- Context & State: `Context` provides shared execution state, adapter I/O, hooks, and journaling
- Adapter Pattern: `Adapter` trait for pluggable I/O backends; `LocalAdapter` for testing
- Journaling: `Journal` trait for write-ahead logging and crash recovery
- Hooks: Lifecycle callbacks (`on_step_start`, `on_step_end`, `on_step_error`, `on_compensate`)
- Value System: Dynamic `Value` enum for type-flexible adapter communication

Notable Implementation Details
- Shared types are `Send + Sync` and use interior mutability (`Arc<RwLock<…>>`) to safely share context across parallel branches
- `ErrorStrategy` variants (`Compensate`, `Skip`, `Escalate`) allow per-step failure discrimination

Migration Notes
This is a complete rewrite; the Python API is not preserved. Users should refer to the new Rust examples and documentation for usage patterns.
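For readers comparing the three backoff strategies named for `RetryPolicy`, here is a minimal sketch of how the delay for each might be computed. The `Backoff` enum and `delay` function are hypothetical and are not taken from the crate; the real `RetryPolicy` may also add caps or jitter.

```rust
use std::time::Duration;

// Hypothetical backoff kinds mirroring the strategies named in this PR.
#[derive(Clone, Copy)]
enum Backoff {
    Linear,
    Exponential,
    Fibonacci,
}

/// Delay before retry number `attempt` (1-based), as a multiple of `base`.
fn delay(kind: Backoff, base: Duration, attempt: u32) -> Duration {
    let factor: u32 = match kind {
        // 1, 2, 3, 4, ...
        Backoff::Linear => attempt,
        // 1, 2, 4, 8, ... (shift capped so the factor cannot overflow)
        Backoff::Exponential => 1u32 << attempt.saturating_sub(1).min(16),
        // 1, 1, 2, 3, 5, ...
        Backoff::Fibonacci => {
            let (mut a, mut b) = (1u32, 1u32);
            for _ in 1..attempt {
                let next = a.saturating_add(b);
                a = b;
                b = next;
            }
            a
        }
    };
    base * factor
}

fn main() {
    let base = Duration::from_millis(100);
    for attempt in 1..=5 {
        println!(
            "attempt {}: linear {:?}, exponential {:?}, fibonacci {:?}",
            attempt,
            delay(Backoff::Linear, base, attempt),
            delay(Backoff::Exponential, base, attempt),
            delay(Backoff::Fibonacci, base, attempt),
        );
    }
}
```

Fibonacci backoff grows faster than linear but more gently than exponential, which can suit hardware that needs a little settling time without long waits on later attempts.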
https://claude.ai/code/session_01RbxGJBcM18D1NXVUq3TZ1T