BubbleVoice 🎧

A Telegram bot that converts voice messages to text using a local Whisper model and ffmpeg, with an optional Deepgram cloud backend. The interface is available in English, Russian, and Ukrainian (EN / RU / UK).

Features

🎙️ Voice message transcription
🌍 Multilingual interface (English / Русский / Українська)
🗣️ Language selection via /start and /language commands
🧠 Local Whisper model (no external APIs)
☁️ Optional Deepgram cloud transcription backend
🔊 Audio conversion via ffmpeg (OGG/MP3/MP4 → WAV 16 kHz)
⚙️ Configurable via environment variables
📝 Structured logging

Requirements

Python 3.10+
ffmpeg
Telegram Bot Token

Setup

1. Clone the repository

git clone https://github.com/machinatororis/telegram-voice2text-bot.git
cd telegram-voice2text-bot

2. Create virtual environment

python -m venv .venv
source .venv/bin/activate  # Linux / macOS

or

.venv\Scripts\activate     # Windows

3. Install dependencies

pip install -r requirements.txt

Configuration (.env)

Create a .env file in the project root (it is ignored by git).

You can use .env.example as a starting point.

Transcription backends

BubbleVoice supports multiple speech-to-text backends. The active backend is selected via environment variables.

Local Whisper (default)

By default, the bot uses local Whisper for transcription:

TRANSCRIBER_BACKEND=whisper

This mode:

Runs fully locally.
Does not require external APIs.
Requires ffmpeg and Whisper to be available on the system.

Deepgram (cloud backend)

Deepgram can be used as an alternative cloud-based transcription backend.

Requirements:

A Deepgram account.
A valid Deepgram API key.

Environment variables:

TRANSCRIBER_BACKEND=deepgram
DG_API_KEY=your_deepgram_api_key
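
The backend selection described above can be sketched roughly as follows. This is an illustration, not the project's actual code: the helper name `select_backend` and its error handling are assumptions, and only the variable names `TRANSCRIBER_BACKEND` and `DG_API_KEY` and the values `whisper` and `deepgram` come from this README.

```python
import os

SUPPORTED_BACKENDS = {"whisper", "deepgram"}


def select_backend(env=None):
    """Return the backend name chosen via TRANSCRIBER_BACKEND (hypothetical helper)."""
    env = os.environ if env is None else env
    # Whisper is the documented default when the variable is unset.
    backend = env.get("TRANSCRIBER_BACKEND", "whisper").strip().lower()
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unsupported TRANSCRIBER_BACKEND: {backend!r}")
    # Assumption: a missing API key is treated as a configuration error.
    if backend == "deepgram" and not env.get("DG_API_KEY"):
        raise ValueError("DG_API_KEY is required when TRANSCRIBER_BACKEND=deepgram")
    return backend
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.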

Required variables

BOT_TOKEN=your_telegram_bot_token

Optional variables

LOG_LEVEL=INFO
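
As a rough sketch of how required and optional variables might be read at startup: the function name `load_settings`, the returned dictionary, and the choice to fail on an unknown level name are all assumptions made for illustration. Only `BOT_TOKEN` and `LOG_LEVEL` are taken from this README.

```python
import logging
import os

# Standard level names accepted by this sketch.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def load_settings(env=None):
    """Read BOT_TOKEN (required) and LOG_LEVEL (optional, defaults to INFO)."""
    env = os.environ if env is None else env
    token = env.get("BOT_TOKEN", "").strip()
    if not token:
        raise RuntimeError("BOT_TOKEN is not set")
    level_name = env.get("LOG_LEVEL", "INFO").strip().upper()
    if level_name not in LEVELS:
        raise RuntimeError(f"Unknown LOG_LEVEL: {level_name!r}")
    return {"bot_token": token, "log_level": LEVELS[level_name]}
```

The numeric level can then be passed to `logging.basicConfig(level=...)`.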

FFMPEG_PATH (optional)

By default, the app runs ffmpeg from the system PATH.

If you deploy to a server or use a custom ffmpeg installation, you can manually specify the path to the executable:

Linux:

FFMPEG_PATH=/usr/local/bin/ffmpeg

Windows:

FFMPEG_PATH=C:\ffmpeg\bin\ffmpeg.exe

If FFMPEG_PATH is set but does not point to a valid executable, the app falls back to looking up ffmpeg on the system PATH.
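
The lookup order and the conversion step can be sketched like this. The helper names `resolve_ffmpeg` and `build_convert_command` are hypothetical, and the exact ffmpeg arguments the project uses are not shown in this README. The flags below (`-ar 16000` for the sample rate, `-ac 1` for mono, `-y` to overwrite) are standard ffmpeg options chosen to match the 16 kHz WAV target.

```python
import os
import shutil


def resolve_ffmpeg(env=None):
    """Return a path to ffmpeg, preferring FFMPEG_PATH, else the system PATH."""
    env = os.environ if env is None else env
    custom = env.get("FFMPEG_PATH", "").strip()
    # Use the custom path only if it points to an existing file.
    if custom and os.path.isfile(custom):
        return custom
    # Fall back to a PATH lookup; this returns None when ffmpeg is not found.
    return shutil.which("ffmpeg")


def build_convert_command(ffmpeg, src, dst):
    """Build an ffmpeg command that converts src to 16 kHz mono audio at dst."""
    return [ffmpeg, "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. With a `.wav` destination, ffmpeg infers the output container from the file extension.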

Run the bot (Local development — polling)

python main.py

The main.py file is the application entry point and initializes logging, configuration, and bot handlers.

This mode uses Telegram long polling and is intended for local development and debugging.

Run in cloud (Webhook mode with FastAPI)

For cloud deployments, BubbleVoice can run in webhook mode using FastAPI. In this mode, Telegram sends updates to the bot via HTTP POST requests instead of long polling.

Start the webhook server

Run the FastAPI application using uvicorn:

uvicorn webapp:app --host 0.0.0.0 --port 8000

The server exposes the following endpoints:

POST /webhook — receives Telegram updates sent by the Telegram API
GET /health — health check endpoint for cloud platforms

This mode is recommended for:

cloud deployments
containerized environments (Docker, PaaS)
platforms with limited CPU resources where long polling is inefficient

Webhook security

For additional security, use a secret webhook path. This keeps random HTTP requests from reaching the webhook endpoint.

Example:

WEBHOOK_SECRET=my-super-secret-token

Webhook URL example:

https://your-domain.com/webhook/my-super-secret-token

Register this URL with Telegram as the webhook so that updates are delivered to the secret path only.
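
A minimal sketch of the secret-path idea, using only the standard library. All three helper names are hypothetical, and how the project actually wires the secret into its FastAPI route is not shown in this README. The `setWebhook` method is part of the Telegram Bot API; the constant-time comparison is a suggested precaution, not something the README states.

```python
import hmac
from urllib.parse import quote, urlencode


def webhook_url(base_url, secret):
    """Build the public webhook URL: <base_url>/webhook/<secret>."""
    return f"{base_url.rstrip('/')}/webhook/{quote(secret, safe='')}"


def is_valid_secret(path_secret, expected):
    """Compare the secret from the request path with WEBHOOK_SECRET in constant time."""
    return hmac.compare_digest(path_secret.encode(), expected.encode())


def set_webhook_request(bot_token, url):
    """Return the Bot API setWebhook URL that registers `url` with Telegram."""
    query = urlencode({"url": url})
    return f"https://api.telegram.org/bot{bot_token}/setWebhook?{query}"
```

A request handler would call `is_valid_secret` with the last path segment and respond with an error status when it returns `False`.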

Notes

.env is intentionally excluded from git.
Use .env.example for reference.
For server deployments, set FFMPEG_PATH if ffmpeg is not on the system PATH.
With the Deepgram backend, audio is sent to Deepgram as 16 kHz mono WAV.
The Deepgram backend uses the nova-3-general model with automatic language detection.
If Deepgram returns an error or times out, the bot logs the error and automatically falls back to local Whisper without crashing.
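
The fallback behavior in the last note can be sketched as a small wrapper. This is an assumption about the shape of the logic, not the project's code: `transcribe_with_fallback`, the logger name, and the callables `primary` and `fallback` are hypothetical stand-ins for the Deepgram and Whisper transcribers.

```python
import logging

logger = logging.getLogger("bubblevoice.sketch")


def transcribe_with_fallback(wav_path, primary, fallback):
    """Try `primary` (e.g. Deepgram); on any error, log it and use `fallback` (e.g. Whisper)."""
    try:
        return primary(wav_path)
    except Exception:
        # Log the full traceback, then continue with the local backend.
        logger.exception("Primary transcription failed, falling back to local backend")
        return fallback(wav_path)
```

Catching `Exception` here covers both API errors and timeouts, provided the timeout is raised as an exception by the client in use.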
