
fix(riva): preserve long-dictation leading segments on interim divergence #5

Merged
rbright merged 1 commit into main from fix/riva-audio-boundary-segments
Feb 27, 2026

Conversation

@rbright (Owner) commented Feb 27, 2026

Summary

  • track audio_processed for interim Riva responses in stream state
  • preserve divergent one-shot interim segments when audio progression indicates a real boundary
  • keep existing chain/stability/punctuation boundary rules and add regression coverage for both commit and replace paths

Test Plan

  • go test ./apps/sotto/internal/riva
  • just ci-check
  • nix build 'path:.#sotto'

Notes

  • this is aimed at long dictation with pauses where early segments were being dropped
  • heuristic added: commit one-shot divergent interim when audio advanced by >=0.75s and prior interim has >=3 words
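
The heuristic in the last note can be sketched as follows. This is a minimal illustration, not the code from the PR: the constant and function names are taken from the review walkthrough, but the function signature, the use of float64 seconds, and the whitespace-based word count are all assumptions.

```go
package main

import "strings"

// Thresholds from the PR notes; the names match those listed in the
// walkthrough, the types are assumed.
const (
	interimBoundaryAudioAdvanceSeconds = 0.75
	minInterimWordsForAudioBoundary    = 3
)

// advancedAudioBoundary reports whether a divergent one-shot interim should be
// committed based on audio progress alone. The signature is hypothetical.
// prevText is the prior interim transcript; prevAudio and currAudio are the
// audio_processed values (in seconds) of the prior and current interims.
func advancedAudioBoundary(prevText string, prevAudio, currAudio float64) bool {
	// Very short interims are treated as noise and never sealed by this rule.
	if len(strings.Fields(prevText)) < minInterimWordsForAudioBoundary {
		return false
	}
	// Only a meaningful advance in processed audio signals a real boundary.
	return currAudio-prevAudio >= interimBoundaryAudioAdvanceSeconds
}
```

Under these assumptions, a three-word interim followed by a divergent interim one second later would be committed, while the same interim followed by a rewrite 0.5 s later would be replaced.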

Summary by CodeRabbit

  • Improvements
    • Refined interim transcript handling during speech recognition to improve real-time text segmentation accuracy.
    • Enhanced transcript boundary detection to better account for audio processing progress during transcription.

Track interim audio_processed progress and use it when deciding whether to seal a divergent one-shot interim segment. This keeps early long-dictation chunks without reintroducing stale carry-over from quick rewrites.

Add regression coverage for both large-audio-advance commit behavior and small-audio-advance replacement behavior.

@coderabbitai (bot) commented Feb 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8307e65 and 8028b0c.

📒 Files selected for processing (4)
  • apps/sotto/internal/riva/client.go
  • apps/sotto/internal/riva/client_test.go
  • apps/sotto/internal/riva/stream_receive.go
  • apps/sotto/internal/riva/transcript_segments.go

Walkthrough

The changes enhance interim boundary detection in the Riva audio stream processing by introducing audio progress tracking. A new field tracks the last processed audio position, enabling the boundary detection function to consider meaningful audio advancement alongside existing chain updates and stability metrics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Stream State Enhancement: apps/sotto/internal/riva/client.go | Added lastInterimAudioProcessed field to Stream struct to track the audio position of the most recent interim transcript. |
| Boundary Detection Logic: apps/sotto/internal/riva/transcript_segments.go | Expanded shouldCommitInterimBoundary signature to accept audio processing parameters; added advancedAudioBoundary helper function and new threshold constants (interimBoundaryAudioAdvanceSeconds, minInterimWordsForAudioBoundary) to enforce boundaries based on meaningful audio progress and word count. |
| Stream Processing Integration: apps/sotto/internal/riva/stream_receive.go | Integrated audio progress tracking by resetting lastInterimAudioProcessed on final results and updating it during interim processing; passes both previous and current audio values to shouldCommitInterimBoundary. |
| Test Coverage: apps/sotto/internal/riva/client_test.go | Added two new tests (TestRecordResponseCommitsOneShotInterimOnAudioAdvance, TestRecordResponseKeepsOneShotInterimWhenAudioAdvanceIsSmall) to validate interim boundary behavior; updated existing test calls to accommodate the expanded shouldCommitInterimBoundary signature. |
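
The stream-state flow described in the walkthrough (reset on final results, update on interims, compare previous and current audio positions) might look roughly like the sketch below. Everything here is a simplified stand-in: streamSketch and recordResult are hypothetical names, divergence is approximated as "the new interim does not extend the previous one", and the existing chain, stability, and punctuation rules mentioned in the PR are omitted.

```go
package main

import "strings"

// Threshold values from the PR notes; these identifiers are illustrative.
const (
	audioAdvanceSeconds = 0.75
	minPriorWords       = 3
)

// streamSketch is a hypothetical, reduced stand-in for the Stream struct.
type streamSketch struct {
	segments                  []string // committed transcript segments
	lastInterimText           string   // most recent interim transcript
	lastInterimAudioProcessed float64  // audio position (s) of that interim
}

// recordResult applies one recognition result to the stream state.
func (s *streamSketch) recordResult(text string, audioProcessed float64, isFinal bool) {
	if isFinal {
		// A final result seals the segment and resets interim tracking.
		s.segments = append(s.segments, text)
		s.lastInterimText = ""
		s.lastInterimAudioProcessed = 0
		return
	}
	prev := s.lastInterimText
	// Assumed divergence test: the new interim is not a continuation of prev.
	diverged := prev != "" && !strings.HasPrefix(text, prev)
	if diverged &&
		len(strings.Fields(prev)) >= minPriorWords &&
		audioProcessed-s.lastInterimAudioProcessed >= audioAdvanceSeconds {
		// Large audio advance: keep the earlier chunk instead of dropping it.
		s.segments = append(s.segments, prev)
	}
	// Small advance or short interim: the new interim simply replaces prev.
	s.lastInterimText = text
	s.lastInterimAudioProcessed = audioProcessed
}
```

The two branches correspond to the two regression cases named in the PR: a large audio advance commits the prior interim, and a small one lets the new interim replace it.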

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through audio streams,
Tracking time with sonic dreams,
Three words and 0.75 seconds more,
Mark boundaries we can't ignore!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly describes the main change: preserving segments during interim divergence in the Riva service, which aligns with the PR's objective to retain long-dictation leading segments when audio progression indicates a real boundary. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


@rbright merged commit 44b2d3b into main on Feb 27, 2026
2 checks passed
@rbright deleted the fix/riva-audio-boundary-segments branch on February 27, 2026, 22:26