@Camier Camier commented Nov 16, 2025

Summary

  • clarify that stable functionality is file-based conversion while live microphone support remains experimental (Phase 1)
  • update performance expectations to note 500–700 ms latency for the current live pipeline
  • adjust project status to reflect Phase 1 microphone support alongside stable file-based conversion

Testing

  • Not run (documentation-only change)

Codex Task

Copilot AI review requested due to automatic review settings November 16, 2025 11:10

Copilot AI left a comment


Pull Request Overview

This PR updates the README to accurately reflect the current state of the RWC voice conversion system, clarifying that file-based conversion is the stable functionality while live microphone support is experimental. The documentation is revised to set appropriate expectations around latency and system capabilities.

  • Repositioned live microphone support as experimental Phase 1 rather than production-ready
  • Updated performance expectations to reflect a realistic 500–700 ms latency for the live pipeline
  • Removed user-specific system details to make documentation more generic



- Inference (voice conversion): 2-5 minutes per 30 seconds of audio
- Real-time conversion: 30-50ms latency (live microphone input)
- File-based conversion: depends on model size and hardware; batch processing is recommended for throughput.

Copilot AI Nov 16, 2025


[nitpick] The description is somewhat vague. Consider adding approximate time ranges or examples (e.g., 'typically 2-5 minutes per 30 seconds of audio on GPU') to help users better understand expected performance, similar to the specificity provided in the removed content.

Suggested change
- File-based conversion: depends on model size and hardware; batch processing is recommended for throughput.
- File-based conversion: typically takes 2–5 minutes per 30 seconds of audio on a modern GPU (e.g., NVIDIA RTX 3060); actual performance depends on model size and hardware. Batch processing is recommended for throughput.
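For reference, the "2–5 minutes per 30 seconds of audio" figure quoted in this suggestion corresponds to a real-time factor of roughly 4x to 10x. The arithmetic can be sketched as follows; the helper name `real_time_factor` is hypothetical and is not part of the RWC codebase.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return seconds of processing spent per second of input audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Figures taken from the README text under review: 2-5 minutes per 30 s clip.
low = real_time_factor(2 * 60, 30)
high = real_time_factor(5 * 60, 30)
print(f"{low:.0f}x to {high:.0f}x slower than real time")
```

A factor above 1 means conversion is slower than playback, which is consistent with the README recommending batch processing for throughput.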

- Sample rate: 48000 Hz
- Max channels: 2
- PyAudio and PortAudio are required for microphone capture.
- Device compatibility is unverified; expect limited hardware coverage during this phase.

Copilot AI Nov 16, 2025


[nitpick] The phrase 'Device compatibility is unverified' could be interpreted as no testing has been done. Consider rephrasing to something like 'Device compatibility may vary; comprehensive hardware testing is ongoing' to be more specific about the current state.

Suggested change
- Device compatibility is unverified; expect limited hardware coverage during this phase.
- Device compatibility may vary; comprehensive hardware testing is ongoing. Expect limited hardware coverage during this phase.
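To put the numbers in this hunk in context, the 500–700 ms live latency from the PR summary can be converted into audio frames at the 48000 Hz sample rate listed above. This is only a sketch of the conversion: the helper names and the 1024-frame buffer size are assumptions for illustration, not values taken from the RWC configuration.

```python
SAMPLE_RATE_HZ = 48_000  # sample rate listed in the README hunk above


def ms_to_frames(latency_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Convert a latency in milliseconds to a whole number of audio frames."""
    return round(latency_ms * sample_rate_hz / 1000)


def frames_to_ms(frames: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a frame count to the time it spans, in milliseconds."""
    return frames * 1000 / sample_rate_hz


# Latency range stated in the PR summary for the current live pipeline.
print(ms_to_frames(500), ms_to_frames(700))

# Hypothetical capture buffer size (an assumption, not from the README).
print(round(frames_to_ms(1024), 2))
```

Under these assumptions, the stated latency spans 24000 to 33600 frames per channel, far more than a single capture buffer of the assumed size.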
