…ality
The previous np.interp-based resampling caused spectral aliasing when downsampling (e.g. 48 kHz → 16 kHz client audio to worker rate), degrading transcription accuracy. Replace it with libsoxr, which applies a proper anti-aliasing low-pass filter and polyphase resampling.
- Add soxr>=0.3.0 dependency to realtime-sdk
- Use soxr.resample() with HQ quality preset in AudioBuffer._resample_if_needed
- Add unit tests verifying alias attenuation and frequency preservation
https://claude.ai/code/session_017AKfpaH2v1TG2oxCo4dVSp
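The aliasing described in this commit can be reproduced with NumPy alone. The sketch below is illustrative and not taken from the PR: `linear_resample` and `dominant_frequency` are hypothetical helper names, and the 20 kHz test tone is an arbitrary choice. Because nothing removes content above the new Nyquist limit (8 kHz), the tone folds down to 20 kHz − 16 kHz = 4 kHz.

```python
import numpy as np


def linear_resample(samples, src_rate, dst_rate):
    """Naive resampling by linear interpolation; no anti-aliasing filter is applied."""
    n_out = int(len(samples) * dst_rate / src_rate)
    src_t = np.arange(len(samples)) / src_rate  # timestamps of input samples
    dst_t = np.arange(n_out) / dst_rate         # timestamps of output samples
    return np.interp(dst_t, src_t, samples)


def dominant_frequency(samples, rate):
    """Return the frequency (Hz) of the strongest bin in a windowed spectrum."""
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples))))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return float(freqs[np.argmax(spectrum)])


src_rate, dst_rate = 48_000, 16_000
t = np.arange(src_rate) / src_rate          # one second of audio
tone = np.sin(2 * np.pi * 20_000 * t)       # 20 kHz, above the 8 kHz output Nyquist

downsampled = linear_resample(tone, src_rate, dst_rate)
alias_hz = dominant_frequency(downsampled, dst_rate)
print(f"strongest component after np.interp downsampling: {alias_hz:.0f} Hz")
```

A resampler with a low-pass stage, such as `soxr.resample(tone, 48_000, 16_000, quality="HQ")`, is expected to attenuate the tone instead of folding it into the speech band.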
- Add configurable resample_quality parameter ("fast", "balanced", "high")
mapping to soxr MQ/HQ/VHQ presets for CPU/latency tradeoff control
- Wire through SessionConfig, AudioBuffer, and connection params parsing
- Add DALSTON_REALTIME_RESAMPLE_QUALITY env var for server-wide default
- Add dalston_realtime_resample_duration_seconds histogram metric to
track resampler latency per sample-rate conversion path
- Add parametrized tests verifying all profiles attenuate aliases and
preserve in-band content
https://claude.ai/code/session_017AKfpaH2v1TG2oxCo4dVSp
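A minimal sketch of how the profile names might resolve to soxr presets. The function name `resolve_resample_quality`, the precedence (per-session value over the environment variable), and the "balanced" fallback are assumptions for illustration; only the profile names, the preset names, and `DALSTON_REALTIME_RESAMPLE_QUALITY` come from the commit message.

```python
import os

# Profile names from the commit message, mapped to soxr quality preset names.
QUALITY_PRESETS = {"fast": "MQ", "balanced": "HQ", "high": "VHQ"}
DEFAULT_PROFILE = "balanced"  # assumption: fallback when nothing is configured
ENV_VAR = "DALSTON_REALTIME_RESAMPLE_QUALITY"


def resolve_resample_quality(requested=None, env=None):
    """Pick a soxr preset name from a per-session value or the server-wide default.

    Hypothetical helper: an explicit per-session value wins, then the
    environment variable, then DEFAULT_PROFILE.
    """
    env = os.environ if env is None else env
    profile = requested or env.get(ENV_VAR) or DEFAULT_PROFILE
    profile = profile.strip().lower()
    if profile not in QUALITY_PRESETS:
        supported = ", ".join(sorted(QUALITY_PRESETS))
        raise ValueError(f"unsupported resample_quality {profile!r}; expected one of: {supported}")
    return QUALITY_PRESETS[profile]
```

The returned string could then be passed as the `quality` argument of `soxr.resample()`, which accepts preset names such as `"HQ"`.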
- Migrate LiteProfile from (str, Enum) to StrEnum (ruff UP042)
- Use `import alembic.command` instead of `from alembic import command` (mypy attr-defined)
https://claude.ai/code/session_017AKfpaH2v1TG2oxCo4dVSp
The specific `type: ignore[misc,assignment]` was not being recognized by CI's mypy version. Use an unqualified `type: ignore` for the Redis fallback.
- Move soxr import from module level to _resample_if_needed() so tests that transitively import session.py don't fail when soxr isn't installed (it's in the realtime-sdk extra, not dev)
- Fix ruff format in base.py (long line wrapping)
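The deferred import described above might look roughly like this. It is a standalone sketch, not the actual `AudioBuffer._resample_if_needed` method: the free function `resample_if_needed` and the error message are hypothetical. The point is that importing the module never touches soxr; only an actual rate conversion does.

```python
import numpy as np


def resample_if_needed(samples, src_rate, dst_rate, quality="HQ"):
    """Resample only when the rates differ, importing soxr lazily.

    Hypothetical stand-in for the method named in the commit message.
    """
    if src_rate == dst_rate:
        return samples  # nothing to do, and soxr is never imported

    try:
        import soxr  # deferred so that importing this module does not require soxr
    except ImportError as exc:
        raise RuntimeError(
            "soxr is required for resampling; install the realtime-sdk extra"
        ) from exc

    return soxr.resample(samples, src_rate, dst_rate, quality=quality)


# Pass-through path works even in an environment without soxr installed.
chunk = np.zeros(160, dtype=np.float32)
same = resample_if_needed(chunk, 16_000, 16_000)
```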
soxr is in the realtime-sdk extra, not dev. Mark the entire test module as skip when soxr is unavailable so CI tests pass.
soxr is needed to run resampling tests in CI. Add it to the dev extra rather than skipping the tests when it's missing.
Summary
Replace the naive linear interpolation resampling in AudioBuffer with libsoxr's polyphase resampling filter. This eliminates the spectral aliasing that occurs when downsampling (e.g., 48 kHz → 16 kHz) and provides configurable quality profiles for different use cases.
Changes
- Replace np.interp() with soxr.resample() in AudioBuffer._resample_if_needed() to use anti-aliased polyphase resampling instead of linear interpolation
- Add a resample_quality parameter with three profiles:
  - "fast" (MQ): Low CPU, suitable for telephony-grade input
  - "balanced" (HQ): Good default for real-time paths
  - "high" (VHQ): Broadcast quality, higher CPU cost
- Add resample_quality to SessionConfig and AudioBuffer.__init__(), with environment variable support (DALSTON_REALTIME_RESAMPLE_QUALITY)
- Add observe_realtime_resample_duration() to track resampling performance with labels for source/target sample rates
- Add validation in RealtimeSession to ensure resample_quality is one of the supported profiles
- Add soxr>=0.3.0 to pyproject.toml
Testing
- tests/unit/test_realtime_audio_buffer.py: pytest
Related Issues
Resolves the TODO comment about replacing linear interpolation with proper polyphase resampling to avoid spectral aliasing in downsampling scenarios.
https://claude.ai/code/session_017AKfpaH2v1TG2oxCo4dVSp