Review & Update: Podcast Audio Engine PRD #9

@valorengels

📄 Plan: docs/plans/podcast-audio.md


Summary

This PRD (docs/plans/podcast-audio.md) outlines the technical and creative requirements for generating podcast audio with the Gemini TTS API while keeping full control over the script.

Key topics covered:

  • Target: 35-minute episodes at ~5,200 words
  • Text-first approach: Script with embedded TTS directives -> Gemini TTS
  • Model configuration: gemini-2.5-flash-preview-tts with Alnilam voice
  • Script structure with directive syntax (VOICE, PACE, PAUSE, EMPHASIS, TRANSITION)
  • Voice identity mapping from VOICE-IDENTITY.md
  • Audio generation pipeline with section splitting and stitching
  • Implementation code for generate_audio_tts.py
  • Quality assurance checklists
  • Cost estimation (~$0.10-0.20 per episode)
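The targets and the split-and-stitch step listed above can be illustrated with a small standard-library sketch. It is not the plan's generate_audio_tts.py: the `## ` section marker, the function names, and the 24 kHz, 16-bit, mono PCM format are all assumptions made for illustration and should be checked against the plan and the actual TTS output.

```python
import io
import re
import wave

TARGET_MINUTES = 35   # from the PRD summary
TARGET_WORDS = 5200   # from the PRD summary


def words_per_minute(words=TARGET_WORDS, minutes=TARGET_MINUTES):
    """Speaking rate implied by the episode targets."""
    return words / minutes


def split_sections(script):
    """Split a script at lines starting with '## ' (assumed section marker)."""
    parts = re.split(r"(?m)^(?=## )", script)
    return [part.strip() for part in parts if part.strip()]


def stitch_pcm(chunks, target, rate=24000, sample_width=2, channels=1):
    """Write raw PCM chunks, in order, into a single WAV file or file object.

    The default rate, sample width, and channel count are assumptions;
    every chunk must share the same format for the result to be valid.
    """
    with wave.open(target, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        for chunk in chunks:
            wf.writeframes(chunk)


if __name__ == "__main__":
    print(f"Implied pace: {words_per_minute():.1f} words per minute")
    demo = "Cold open\n## Segment one\nBody text\n## Segment two\nMore text"
    for section in split_sections(demo):
        print(repr(section))
    buffer = io.BytesIO()
    stitch_pcm([b"\x00\x00" * 24000, b"\x00\x00" * 12000], buffer)
    print(f"Stitched WAV size: {len(buffer.getvalue())} bytes")
```

At the stated targets the implied pace is roughly 149 words per minute, which gives a quick sanity check on whether a draft script is likely to fit a 35-minute episode.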

Review Needed

This plan needs to be reviewed, and updated where necessary, now that the recent episode creator improvements (Waves 1-5) have been implemented.

Key sections that may need revision:

  1. Architecture Overview (lines 23-83) - Verify pipeline matches current implementation
  2. TTS API Configuration (lines 89-138) - Check if model IDs and voices are still current
  3. Script Structure Template (lines 250-412) - Compare with current content_plan.md template
  4. Audio Generation Pipeline (lines 416-444) - Verify Phase 9 entry requirements match current workflow
  5. Implementation code (lines 459-613) - Check if generate_audio_tts.py was created and is current
  6. File Structure (lines 664-679) - Verify alignment with current episode directory structure
  7. Environment Setup (lines 683-710) - Confirm dependencies and API key handling
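Because the list above refers to sections by line range, a reviewer may want to confirm that each range still starts where expected before comparing content. The following is a minimal sketch of that check. The ranges are copied from the list, while the helper names are hypothetical and nothing here reflects the plan file's actual content.

```python
from pathlib import Path

# Line ranges exactly as cited in the review list (1-indexed, inclusive).
REVIEW_RANGES = {
    "Architecture Overview": (23, 83),
    "TTS API Configuration": (89, 138),
    "Script Structure Template": (250, 412),
    "Audio Generation Pipeline": (416, 444),
    "Implementation code": (459, 613),
    "File Structure": (664, 679),
    "Environment Setup": (683, 710),
}


def extract_section(lines, start, end):
    """Return lines start..end (1-indexed, inclusive) from a list of lines."""
    if start < 1 or end < start:
        raise ValueError(f"invalid range {start}-{end}")
    return lines[start - 1:end]


def first_lines(plan_text):
    """Map each section title to the first line of its cited range.

    A value of None means the file is shorter than the cited range,
    which suggests the line numbers in the issue are stale.
    """
    lines = plan_text.splitlines()
    result = {}
    for title, (start, end) in REVIEW_RANGES.items():
        chunk = extract_section(lines, start, end)
        result[title] = chunk[0] if chunk else None
    return result


if __name__ == "__main__":
    plan = Path("docs/plans/podcast-audio.md")
    if plan.exists():
        for title, line in first_lines(plan.read_text(encoding="utf-8")).items():
            print(f"{title}: {line!r}")
    else:
        print(f"{plan} not found; run from the repository root")
```

If the first line of a range is not the expected heading, the plan has probably been edited since the issue was written and the ranges should be re-derived.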

Questions to address:

  • Is Gemini TTS the actual audio generation method being used, or is NotebookLM still primary?
  • Has generate_audio_tts.py been implemented and tested?
  • Do the directive categories match what Gemini TTS actually supports?
  • How does this integrate with the Wave 1-5 improvements to content planning?
  • What is the relationship between script.md and content_plan.md?
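Two of these questions (whether generate_audio_tts.py exists, and how the API key is handled) can be partly answered mechanically. The sketch below is only an illustration: the function names are hypothetical, and the environment variable names `GEMINI_API_KEY` and `GOOGLE_API_KEY` are assumptions, since the issue does not say which variable the project reads.

```python
import os
from pathlib import Path


def find_script(repo_root, name="generate_audio_tts.py"):
    """Return every path under repo_root whose filename equals name."""
    return sorted(Path(repo_root).rglob(name))


def api_key_present(env=None, names=("GEMINI_API_KEY", "GOOGLE_API_KEY")):
    """Report whether any of the assumed key variables is set and non-empty."""
    env = os.environ if env is None else env
    return any(env.get(name) for name in names)


if __name__ == "__main__":
    matches = find_script(".")
    if matches:
        for path in matches:
            print(f"found: {path}")
    else:
        print("generate_audio_tts.py not found under the current directory")
    print("API key set:", api_key_present())
```

Finding the file only shows that it exists. Whether it is current and tested still needs a manual comparison against the plan's implementation section.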

Metadata

Assignees

No one assigned

Labels

deferred, documentation (Improvements or additions to documentation)
