Overview
This issue proposes adding a new speech transcription and audio processing extension to the Cycod CLI toolset, tentatively named cycodsp (cycod speech).
Motivation
Inspired by the recent addition of the ai speech transcribe command in Azure AI CLI (Fast Transcription API), this feature would enable workflows like:
- Download audio from YouTube videos or podcasts
- Transcribe audio using Azure Speech services
- Analyze transcriptions for insights using AI
- Example use case: Process "You Are Not So Smart" podcast episodes to extract insights about cognitive biases and intellectual humility
Existing Foundation
We already have relevant code in personal repos:
- robch/ytd: YouTube downloader + transcriber using Azure Speech SDK
- robch/searchy: Web search and content extraction tool
Proposed Features
Core Commands
# Basic transcription
cycodsp transcribe --file audio.wav
# YouTube workflow
cycodsp youtube --url "https://youtube.com/watch?v=VIDEO_ID" --transcribe
# Podcast processing
cycodsp podcast --url "podcast-episode.mp3" --transcribe
# AI analysis
cycodsp analyze --transcript "episode.txt" --prompt "Extract key insights"
Key Capabilities
- Integration with Azure Speech Fast Transcription API
- Multiple output formats (text, SRT, VTT, JSON)
- Speaker diarization support
- AI-powered content analysis
- Integration with existing Cycod chat features
Implementation Approach
- Phase 1: Basic transcription using Azure Speech Fast Transcription API
- Phase 2: Integrate YouTube download from existing ytd repo
- Phase 3: Add AI analysis and Cycod chat integration
- Phase 4: Advanced features (batch processing, RSS monitoring)
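As a rough idea of what Phase 1 involves, the Python sketch below assembles (but does not send) a multipart request for the Fast Transcription REST endpoint. The endpoint path, the `api-version` value, the `audio` and `definition` form fields, and the `diarization` option reflect my reading of the public Azure documentation and should be checked against the current API reference before use. The helper names `build_definition` and `build_request` are hypothetical.

```python
import json
import urllib.request
import uuid
from typing import List, Optional

# Assumption: API version string as I recall it from the public docs.
API_VERSION = "2024-11-15"


def build_definition(locales: List[str], max_speakers: Optional[int] = None) -> dict:
    """Build the JSON 'definition' part; diarization is optional."""
    definition = {"locales": list(locales)}
    if max_speakers is not None:
        definition["diarization"] = {"enabled": True, "maxSpeakers": max_speakers}
    return definition


def build_request(region: str, key: str, audio: bytes, filename: str,
                  definition: dict) -> urllib.request.Request:
    """Assemble a multipart/form-data POST without sending it."""
    boundary = uuid.uuid4().hex
    audio_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + audio + b"\r\n"
    definition_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="definition"\r\n'
        "Content-Type: application/json\r\n\r\n"
    ).encode() + json.dumps(definition).encode() + b"\r\n"
    closing = f"--{boundary}--\r\n".encode()

    # Assumption: regional endpoint and path as described in the public docs.
    url = (f"https://{region}.api.cognitive.microsoft.com"
           f"/speechtotext/transcriptions:transcribe?api-version={API_VERSION}")
    return urllib.request.Request(
        url,
        data=audio_part + definition_part + closing,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    req = build_request("eastus", "<your-key>", b"RIFF...", "audio.wav",
                        build_definition(["en-US"], max_speakers=2))
    print(req.get_method(), req.full_url)
    # Sending is left out on purpose; with real credentials it would be roughly:
    # with urllib.request.urlopen(req) as resp:
    #     result = json.load(resp)
```

Separating request construction from sending keeps the Phase 1 logic testable offline, which also helps when the later phases add YouTube download and analysis steps around it.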
Documentation
Detailed proposal and technical design: todo/speech-transcription-ideas.md in branch robch/2512-dec05-speech-transcription-ideas
Branch: https://github.com/robch/cycod/tree/robch/2512-dec05-speech-transcription-ideas
Documentation: speech-transcription-ideas.md
Questions for Discussion
- Should this be a separate CLI tool or integrated into main cycod?
- What is the preferred approach to licensing compliance for YouTube and podcast content?
- How should this integrate with existing Cycod infrastructure?
- How should the phases be prioritized?
Related Links
- Azure AI CLI Speech Transcribe: https://github.com/Azure/azure-ai-cli
- Existing ytd repo: https://github.com/robch/ytd
- Existing searchy repo: https://github.com/robch/searchy