
🎙️ Whiscribe (Audio → Text Converter)

Whiscribe is a lightweight, CPU-only speech-to-text app powered by faster-whisper. Run it in your browser with a single Python script — no GPU, no cloud, no account required.


✨ Features

  • Runs entirely on CPU — no GPU, MPS or external services required
  • Lightweight and minimal — single Python script, no infrastructure stack (no Nginx, no databases, no Go services)
  • Model flexibility — supports all standard (major and multilingual) and distilled faster-whisper models
  • Fully configurable — adjust VAD sensitivity, segment limits and beam size
  • Handles large files — supports audio files up to 100 MB
  • Docker-ready — non-root container with health check and XSRF protection
  • MIT licensed — free to use, modify and distribute

📸 Screenshots

*(Two screenshots of the Whiscribe browser UI.)*


🆚 How It Compares

| Tool | CPU-only | Browser UI | Setup complexity | Status |
| --- | --- | --- | --- | --- |
| Whiscribe | ✅ | Streamlit | Single Python script | ✅ Active |
| Whishper | ✅ | Svelte | Docker Compose + Go backend | ⚠ v3 frozen (v4 in progress) |
| Whisper-WebUI | ⚠ Manual config | ✅ Gradio | CUDA 12.8 default | ✅ Active |
| Scriberr | ✅ | React | Go + SQLite + Docker | ⚠ Inactive since Dec 2024 |

Whiscribe's niche: the lowest-friction local transcription option — no multi-service stack, no GPU assumption, no database.


🖥️ Requirements

| Requirement | Notes |
| --- | --- |
| Python 3.9+ | Tested with 3.9 and 3.12 |
| Packages | `faster-whisper`, `streamlit`, `torch` |

🚀 Installation

```bash
# 1. Clone the repo
git clone https://github.com/sungurerdim/whiscribe.git
cd whiscribe

# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# 4. Run the app
streamlit run whiscribe.py
```
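To confirm the dependencies installed correctly before launching, a quick check like the following can help. This is a sketch, not part of Whiscribe — `missing_modules` is a helper written here for illustration:

```python
import importlib.util

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# The three packages from requirements (import names, not pip names)
missing = missing_modules(["faster_whisper", "streamlit", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies found")
```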

⚙️ Usage

  • Upload an audio file (opus, mp3, wav, flac, m4a, aac, mp4, ogg, webm, mov, 3gp, aiff, aif) up to 100 MB
  • Adjust segment duration, VAD threshold or beam size if needed
  • Click "Transcribe" and wait for processing to finish
  • View, copy or download the transcript (as .txt)
  • Click Reset to start over with a new file
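The upload constraints above (allowed extensions, 100 MB cap) can be checked before a file is submitted. The sketch below is illustrative only — it is not Whiscribe's actual validation code, and `validate_upload` is a hypothetical helper:

```python
from pathlib import Path
from typing import Optional

# Formats and size limit taken from the usage notes above
ALLOWED_EXTENSIONS = {
    "opus", "mp3", "wav", "flac", "m4a", "aac", "mp4",
    "ogg", "webm", "mov", "3gp", "aiff", "aif",
}
MAX_FILE_SIZE = 100 * 1024 * 1024  # 100 MB

def validate_upload(name: str, size_bytes: int) -> Optional[str]:
    """Return an error message, or None if the file is acceptable."""
    ext = Path(name).suffix.lstrip(".").lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"Unsupported format: .{ext}"
    if size_bytes > MAX_FILE_SIZE:
        return "File exceeds the 100 MB limit"
    return None
```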

Default Settings

| Setting | Default |
| --- | --- |
| Model | `faster-whisper-small` |
| Min. speech | 250 ms |
| Max. speech | 30 s |
| VAD threshold | 0.15 |
| Beam size | 5 |
| Max file size | 100 MB |
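The same defaults, expressed as code. Note this dataclass is a hypothetical mirror of the table, not Whiscribe's `TranscriptionConfig` — of these field names, only `vad_threshold` and `beam_size` are confirmed by the Programmatic Usage section below:

```python
from dataclasses import dataclass

@dataclass
class DefaultSettings:
    """Hypothetical mirror of Whiscribe's default settings table."""
    model: str = "Systran/faster-whisper-small"
    min_speech_ms: int = 250      # minimum speech segment, in milliseconds
    max_speech_s: int = 30        # maximum speech segment, in seconds
    vad_threshold: float = 0.15   # matches TranscriptionConfig
    beam_size: int = 5            # matches TranscriptionConfig
    max_file_size_mb: int = 100
```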

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `LOG_LEVEL` | `INFO` | Log verbosity: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
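Reading `LOG_LEVEL` from the environment typically looks like the following. This is a sketch of the usual stdlib pattern, not Whiscribe's exact code:

```python
import logging
import os

# Fall back to INFO when LOG_LEVEL is unset or not a valid level name
level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)

logging.basicConfig(level=level)
logger = logging.getLogger("whiscribe")
logger.info("Log level set to %s", logging.getLevelName(level))
```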

Programmatic Usage

The transcription engine can be used independently of the Streamlit UI:

```python
from config import TranscriptionConfig
from transcriber import load_model, transcribe_bytes

# Load a model (cached for 1 hour)
model = load_model("Systran/faster-whisper-small")

# Read audio file
with open("audio.mp3", "rb") as f:
    audio_bytes = f.read()

# Transcribe with custom config
config = TranscriptionConfig(vad_threshold=0.2, beam_size=3)
text, elapsed = transcribe_bytes(audio_bytes, model, config)
print(f"Transcript ({elapsed:.1f}s): {text}")
```
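Building on that API, transcribing a whole folder could be wrapped as below. This is a sketch: `transcribe_folder` is a helper defined here, not part of Whiscribe, and the real engine would be injected via `transcribe_fn` so the helper itself stays model-agnostic:

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

def transcribe_folder(
    folder: str,
    transcribe_fn: Callable[[bytes], str],
    extensions: Tuple[str, ...] = (".mp3", ".wav", ".flac"),
) -> Dict[str, str]:
    """Apply transcribe_fn to every matching file; return {filename: transcript}."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in extensions:
            results[path.name] = transcribe_fn(path.read_bytes())
    return results

# With the real engine, transcribe_fn would wrap the call shown above, e.g.:
#   lambda b: transcribe_bytes(b, model, config)[0]
```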

🐳 Docker

```bash
# Build
docker build -t whiscribe .

# Run
docker run -p 8501:8501 whiscribe
```

📝 License

This project is licensed under the MIT License. See the LICENSE file for details. © 2025 Sungur Zahid Erdim

🤝 Contributing

Bug reports and pull requests are welcome!

Development Setup

```bash
# Clone and setup
git clone https://github.com/sungurerdim/whiscribe.git
cd whiscribe
make setup          # Windows
make setup-unix     # Linux/macOS

# Run locally
make run            # Windows
make run-unix       # Linux/macOS

# Lint
make lint           # Windows
make lint-unix      # Linux/macOS

# Test
make test           # Windows
make test-unix      # Linux/macOS
```

Code Style

  • Python 3.9+ compatible
  • Linted with ruff (config in pyproject.toml)
  • Type annotations on all function signatures

Troubleshooting

| Problem | Cause | Solution |
| --- | --- | --- |
| "No speech detected" | Audio too quiet or wrong format | Lower the VAD threshold (e.g. 0.05); verify the file plays correctly |
| Model download fails | Network issue or disk full | Check internet connection and disk space, or set a custom cache folder |
| Out of memory | Model too large for available RAM | Use a smaller model (`tiny` or `base`) |
| Slow transcription | Large file + large model on CPU | Use a smaller model or reduce beam size |
| Import error on startup | Missing or incompatible dependency | Run `pip install -r requirements.txt` in your virtual environment |
