Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@vercel
Copy link

vercel bot commented Jan 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
rivetkit-serverless Error Error Jan 2, 2026 11:34pm
3 Skipped Deployments
Project Deployment Review Updated (UTC)
rivet-cloud Ignored Ignored Jan 2, 2026 11:34pm
rivet-inspector Ignored Ignored Preview Jan 2, 2026 11:34pm
rivet-site Ignored Ignored Preview Jan 2, 2026 11:34pm

Copy link
Member Author

NathanFlurry commented Jan 2, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Jan 2, 2026

PR Review: Add support for find_workflows

This PR adds a batch version of find_workflow that allows finding multiple workflows in a single database transaction, which is a good performance optimization.

✅ Strengths

  1. Consistent API Design: The new find_workflows method follows the same pattern as the existing find_workflow method and is properly exposed across all context types (Activity, Operation, Standalone, Test).

  2. Performance Optimization: Using a single transaction for multiple queries is the right approach - this will significantly reduce database round-trips compared to calling find_workflow in a loop.

  3. Code Reuse: Good reuse of the existing find_workflow_inner helper method, avoiding code duplication.

  4. Proper Instrumentation: Includes tracing instrumentation with appropriate span names.

  5. Metrics: Uses the existing FIND_WORKFLOWS_DURATION metric (though with a generic "batch" label).

🔍 Issues & Suggestions

1. Potential Transaction Semantics Issue (Medium Priority)

The implementation executes all queries within a single transaction, but then runs them concurrently using try_join_all:

.run(|tx| async move {
    let futures = queries.iter().map(|(workflow_name, tags)| {
        self.find_workflow_inner(workflow_name, tags, &tx)
    });
    try_join_all(futures).await
})

Issue: The futures are created eagerly but executed concurrently. This might cause transaction isolation issues depending on how UniversalDB handles concurrent operations within a single transaction. If the transaction expects sequential access, this could lead to:

  • Race conditions
  • Inconsistent snapshot reads
  • Potential deadlocks

Recommendation: Verify that UniversalDB supports concurrent operations within a single transaction. If not, consider either:

  • Sequential execution: for (name, tags) in queries { ... }
  • Or document that this is intentional and tested

2. Metrics Granularity Loss (Low Priority)

The current implementation uses a hardcoded "batch" label for metrics:

metrics::FIND_WORKFLOWS_DURATION.record(dt, &[KeyValue::new("workflow_name", "batch")]);

Issue: This loses visibility into which workflows are being queried in batch operations. The single-query version tracks individual workflow names.

Recommendation: Consider one of:

  • Add a batch_size metric attribute
  • Add a separate metric for batch operations
  • Include the count of queries in the metric tags

3. Missing Documentation (Low Priority)

While the method has a doc comment, it could be more descriptive:

/// Finds the first incomplete workflow for each (name, tags) pair in a single batch transaction.

Recommendation: Expand the documentation to clarify:

  • What happens if some queries match and others don't (partial results)
  • Order preservation: does the result vector match the input query order?
  • Performance characteristics vs. calling find_workflow in a loop
  • Example usage

For example:

/// Finds the first incomplete workflow for each (name, tags) pair in a single batch transaction.
///
/// This is more efficient than calling `find_workflow` multiple times as it uses a single
/// database transaction for all queries.
///
/// # Returns
/// A vector of optional workflow IDs in the same order as the input queries. Returns `None`
/// for queries that don't match any workflow.
///
/// # Example
/// ```no_run
/// let queries = [
///     ("workflow_a", json\!({"env": "prod"})),
///     ("workflow_b", json\!({"env": "dev"})),
/// ];
/// let results = ctx.find_workflows(&queries).await?;
/// // results[0] corresponds to workflow_a query
/// // results[1] corresponds to workflow_b query
/// ```

4. Type Signature Consideration (Low Priority)

The method signature uses &[(&str, serde_json::Value)]:

pub async fn find_workflows(
    &self,
    queries: &[(&str, serde_json::Value)],
) -> Result<Vec<Option<Id>>>

Observation: This requires serde_json::Value to be cloned for each query (since it's not a reference). For large batch operations, this could add overhead.

Recommendation: Consider if &[(impl AsRef<str>, &serde_json::Value)] or similar would be more efficient. This is likely a minor concern unless batch sizes are very large.

5. Error Handling Clarity (Low Priority)

When using try_join_all, if any query fails, the entire batch fails. This is reasonable but should be documented.

Recommendation: Document the error behavior - does it short-circuit on first error, or collect all errors?

🧪 Testing Suggestions

Consider adding tests for:

  1. Empty query list: What happens with find_workflows(&[])?
  2. Mixed results: Some queries match, others don't
  3. Duplicate queries: Same (name, tags) pair multiple times
  4. Large batches: Performance testing with many queries
  5. Error cases: Invalid tags in one of multiple queries

📋 Minor Notes

  1. Line 1012 in db/kv/mod.rs: The comment says "Retrieves" but the DB trait comment says "Retrieves" - both should consistently use either "Retrieves" or "Finds" to match the method name.

🎯 Verdict

This is a solid addition that provides a useful performance optimization. The main concern is verifying transaction semantics with concurrent operations. The other suggestions are minor improvements to documentation and observability.

Recommendation:

  • ✅ Approve with minor suggestions
  • Verify UniversalDB transaction behavior with concurrent queries
  • Consider adding tests for edge cases

This review follows the patterns and conventions from CLAUDE.md

@jog1t jog1t force-pushed the 12-23-chore_docs_document_abort_signal branch from ee5f7b8 to 9f0dc33 Compare January 2, 2026 23:33
@jog1t jog1t force-pushed the 01-02-chore_gas_add_support_for_find_workflows_ branch from d3bc69d to 79307ef Compare January 2, 2026 23:33
@NathanFlurry NathanFlurry force-pushed the 12-23-chore_docs_document_abort_signal branch from 9f0dc33 to ee5f7b8 Compare January 2, 2026 23:59
@NathanFlurry NathanFlurry force-pushed the 01-02-chore_gas_add_support_for_find_workflows_ branch from 79307ef to d3bc69d Compare January 2, 2026 23:59
@jog1t jog1t force-pushed the 12-23-chore_docs_document_abort_signal branch from ee5f7b8 to 9f0dc33 Compare January 3, 2026 00:05
@jog1t jog1t force-pushed the 01-02-chore_gas_add_support_for_find_workflows_ branch from d3bc69d to 79307ef Compare January 3, 2026 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants