This project is a Dockerized Telegram bot written in Go, designed to handle speech transcription and synthesis via a Python backend. The backend code is directly included in this repository, and is based on components from two of my earlier projects.
The Python backend in this repo reuses and adapts functionality from:
These repositories provided the foundation.
- Telegram bot built with Go (go-telegram-bot-api)
- Two main actions triggered via buttons:
  - Transcribe: Upload a small audio or video file and receive a transcript
  - Synthesize: Upload a voice sample, enter text, choose an output language, and receive speech audio
- Python backend handles:
  - File validation
  - Whisper transcription
  - Language detection and voice cloning
- Dockerized and runnable via docker-compose
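
The file validation step listed above could look roughly like the following. This is a minimal sketch, not the repository's actual code: the function name `validate_upload` and its parameters are hypothetical, and the default limits simply mirror the sample `MAX_FILE_MB` and `MAX_FILE_DURATION_SEC` values from the configuration.

```python
def validate_upload(size_bytes: int, duration_sec: float,
                    max_mb: int = 20, max_duration_sec: int = 360) -> list:
    """Return a list of problems with an upload; an empty list means it passed.

    Hypothetical helper: the defaults mirror the sample MAX_FILE_MB and
    MAX_FILE_DURATION_SEC settings, not values confirmed from the backend code.
    """
    problems = []
    if size_bytes <= 0:
        problems.append("file is empty")
    # Compare in bytes so the limit is exact (1 MB taken as 1024 * 1024 bytes).
    if size_bytes > max_mb * 1024 * 1024:
        problems.append(f"file is larger than {max_mb} MB")
    if duration_sec > max_duration_sec:
        problems.append(f"media is longer than {max_duration_sec} seconds")
    return problems


if __name__ == "__main__":
    # A 5 MB, two-minute clip passes; a 25 MB, ten-minute one fails both checks.
    print(validate_upload(5 * 1024 * 1024, 120))
    print(validate_upload(25 * 1024 * 1024, 600))
```

Returning a list of problems rather than raising on the first one lets the bot report every issue to the user in a single reply.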
Set your Telegram bot token:
TELEGRAM_BOT_TOKEN=your_token_here

Configure backend settings:
PORT=5000
MAX_FILE_MB=20
MAX_FILE_DURATION_SEC=360
CORS_ORIGIN=http://localhost:8080

Requirements:
- Docker
- Docker Compose

Build and start the services:
docker-compose up --build
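
For reference, the backend settings shown earlier (`PORT`, `MAX_FILE_MB`, `MAX_FILE_DURATION_SEC`, `CORS_ORIGIN`) might be read on the Python side along these lines. This is a sketch under assumptions: the `BackendSettings` class and `load_settings` function are hypothetical names, and the fallback defaults are just the sample values from this README, not confirmed backend behavior.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendSettings:
    """Hypothetical container for the backend's environment settings."""
    port: int
    max_file_mb: int
    max_file_duration_sec: int
    cors_origin: str


def load_settings(env=None) -> BackendSettings:
    """Read settings from a mapping (os.environ by default).

    Fallback values are assumptions taken from the sample configuration.
    """
    env = os.environ if env is None else env
    return BackendSettings(
        port=int(env.get("PORT", "5000")),
        max_file_mb=int(env.get("MAX_FILE_MB", "20")),
        max_file_duration_sec=int(env.get("MAX_FILE_DURATION_SEC", "360")),
        cors_origin=env.get("CORS_ORIGIN", "http://localhost:8080"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting an optional mapping instead of always reading `os.environ` makes the loader easy to exercise in tests without touching the real environment.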