Open
Labels
deferred, documentation (Improvements or additions to documentation)
Description
📄 Plan: docs/plans/podcast-audio.md
Summary
This PRD (docs/plans/podcast-audio.md) outlines the technical and creative requirements for generating podcast audio with the Gemini TTS API while keeping full control over the script.
Key topics covered:
- Target: 35-minute episodes at ~5,200 words
- Text-first approach: Script with embedded TTS directives -> Gemini TTS
- Model configuration: `gemini-2.5-flash-preview-tts` with Alnilam voice
- Script structure with directive syntax (VOICE, PACE, PAUSE, EMPHASIS, TRANSITION)
- Voice identity mapping from VOICE-IDENTITY.md
- Audio generation pipeline with section splitting and stitching
- Implementation code for `generate_audio_tts.py`
- Quality assurance checklists
- Cost estimation (~$0.10-0.20 per episode)
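The pacing target and the section-splitting step above can be sanity-checked with a small sketch. This is not the plan's `generate_audio_tts.py`; the function names, the paragraph-based splitting rule, and the 250-word budget in the usage note are assumptions made for illustration. Only the 35-minute and ~5,200-word targets come from the summary.

```python
import re

# Targets quoted in the summary above.
TARGET_MINUTES = 35
TARGET_WORDS = 5200


def words_per_minute(words: int = TARGET_WORDS, minutes: float = TARGET_MINUTES) -> float:
    """Speaking rate implied by the episode targets (about 149 wpm)."""
    return words / minutes


def estimate_minutes(text: str, wpm: float) -> float:
    """Rough duration estimate from a whitespace word count."""
    return len(text.split()) / wpm


def split_into_sections(script: str, max_words: int) -> list[str]:
    """Greedily pack paragraphs into sections of at most max_words words.

    Hypothetical splitting rule: paragraphs are separated by blank lines,
    and a single paragraph longer than max_words becomes its own section.
    """
    sections, current, count = [], [], 0
    for para in re.split(r"\n\s*\n", script.strip()):
        if not para.strip():
            continue  # skip empty input or stray blank paragraphs
        n = len(para.split())
        if current and count + n > max_words:
            sections.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        sections.append("\n\n".join(current))
    return sections
```

For example, `split_into_sections(script, max_words=250)` keeps each TTS request to roughly 100 seconds of audio at the implied rate; the actual per-request limit should be taken from the plan or the API documentation, not from this sketch.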
Review Needed
This plan needs review and potential updates after recent episode creator improvements (Waves 1-5 implemented).
Key sections that may need revision:
- Architecture Overview (lines 23-83) - Verify pipeline matches current implementation
- TTS API Configuration (lines 89-138) - Check if model IDs and voices are still current
- Script Structure Template (lines 250-412) - Compare with current `content_plan.md` template
- Audio Generation Pipeline (lines 416-444) - Verify Phase 9 entry requirements match current workflow
- Implementation code (lines 459-613) - Check if `generate_audio_tts.py` was created and is current
- File Structure (lines 664-679) - Verify alignment with current episode directory structure
- Environment Setup (lines 683-710) - Confirm dependencies and API key handling
Questions to address:
- Is Gemini TTS the actual audio generation method being used, or is NotebookLM still primary?
- Has `generate_audio_tts.py` been implemented and tested?
- Do the directive categories match what Gemini TTS actually supports?
- How does this integrate with the Wave 1-5 improvements to content planning?
- What is the relationship between `script.md` and `content_plan.md`?
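As a possible aid for the directive question above, a reviewer could inventory which directives a script actually uses before checking them against what the model honors. The sketch below assumes a hypothetical bracket syntax such as `[PAUSE: 1s]`; the real syntax is defined in the plan and may differ. It makes no claim about which directives Gemini TTS supports; it only counts occurrences by category.

```python
import re
from collections import Counter

# Directive categories named in the issue summary.
DIRECTIVES = ("VOICE", "PACE", "PAUSE", "EMPHASIS", "TRANSITION")

# Hypothetical syntax: [NAME] or [NAME: value]. Adjust to the plan's real syntax.
DIRECTIVE_PATTERN = re.compile(r"\[([A-Z_]+)(?::\s*([^\]]*))?\]")


def audit_directives(script: str) -> tuple[Counter, Counter]:
    """Count directives in a script, split into listed and unlisted categories."""
    known, unknown = Counter(), Counter()
    for match in DIRECTIVE_PATTERN.finditer(script):
        name = match.group(1)
        if name in DIRECTIVES:
            known[name] += 1
        else:
            unknown[name] += 1
    return known, unknown


sample = "[VOICE: warm] Welcome back. [PAUSE: 1s] [WHISPER] Today... [PAUSE: 2s]"
known, unknown = audit_directives(sample)
print(dict(known))    # directives from the listed categories
print(dict(unknown))  # anything outside the five categories
```

Any name reported in the second counter is a candidate for either a typo in the script or a category missing from the plan.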