Lost your place in an audiobook? Paste a sentence. Get the exact timestamp.
PageMatch transcribes your audiobook once using NVIDIA's Parakeet model running locally on your Apple Silicon GPU via MLX. After that, finding any moment in a 20-hour book takes under a second — just paste a sentence from the text.
No internet. No API keys. No subscriptions. Runs entirely on your machine.
Most audiobook tools use Whisper — an autoregressive model that processes audio chunk-by-chunk and accumulates timestamp drift over hours of audio. PageMatch takes a different approach:
| | Whisper | Parakeet + MLX |
|---|---|---|
| Architecture | Autoregressive | Non-autoregressive (CTC/TDT) |
| Timestamps | Per-chunk, drift accumulates | Absolute, sentence-level |
| Hardware (Apple Silicon) | CPU or slow MPS | Native GPU via unified memory |
| Long files (2 GB+) | Needs manual chunking | Built-in chunk_duration param |
| Speed | 1–4× realtime | 8–20× realtime on M-series |
| Deterministic | No (beam search) | Yes |
- Paste-to-timestamp — paste any sentence, get `HH:MM:SS.mmm` back instantly
- Apple Silicon GPU acceleration — Parakeet runs on the M-series GPU via MLX and unified memory
- One-time indexing — transcribe once (~5–10 min for a 10-hour book), search forever in milliseconds
- Near-zero timestamp drift — non-autoregressive model produces absolute timestamps, no per-chunk error
- Top-3 ranked results — confidence scores, matched audio text, single- and cross-segment search
- Drift correction — fine-tune with an offset spinner; correction is stored permanently in the index
- One-click VLC playback — jump straight to the moment in a running VLC instance, no new windows
- Drag & drop GUI — dark PyQt6 interface with live progress and speed readout
- Batch indexer — pre-index an entire library overnight with one command
- Multilingual — switch to `parakeet-tdt-0.6b-v3` for 25 European languages
PageMatch uses uv for dependency management — it handles the parakeet-mlx native dependencies cleanly and is dramatically faster than pip.
curl -LsSf https://astral.sh/uv/install.sh | sh

git clone https://github.com/yourusername/pagematch.git
cd pagematch
uv sync

`uv sync` creates a virtualenv and installs everything from `pyproject.toml`: parakeet-mlx, PyQt6, rapidfuzz, faster-whisper, and dev tooling (ruff, black, pre-commit).

uv run python src/gui.py

- Drop your audiobook (`.m4b`, `.mp3`, `.m4a`, `.flac`, `.wav`) onto the drop zone
- Build Index — Parakeet transcribes locally on your GPU; the first run downloads the model (~600 MB), subsequent runs start immediately
- Paste any sentence from the book into the text box
- Find Timestamp — top 3 matches appear with timestamps and confidence scores
- Click Play in VLC to jump there instantly (requires VLC)
# First-time index (downloads model on first run)
uv run python src/find_my_place.py --audio "book.m4b" --index
# Specific model (multilingual)
uv run python src/find_my_place.py --audio "book.m4b" --index --model parakeet-tdt-0.6b-v3
# Bake in a drift correction
uv run python src/find_my_place.py --audio "book.m4b" --index --offset 2.0
# Force rebuild
uv run python src/find_my_place.py --audio "book.m4b" --index --reindex

uv run python src/find_my_place.py --audio "book.m4b" --text "The morning brought no relief"

uv run python src/find_my_place.py --audio "book.m4b"

uv run python src/index_folder.py /path/to/audiobooks/
# Force rebuild all
uv run python src/index_folder.py /path/to/audiobooks/ --reindex

Parakeet uses absolute timestamps, so drift should be near zero for most files. But if your audiobook has a silent intro or publisher bumper that shifts everything:
- Search for a sentence you know the exact position of in your player
- Measure the gap between what PageMatch reports and the actual audio time
- Enter that value (seconds) in the Drift Correction spinner in the GUI
- Click Rebuild — the offset is stored in the index permanently
Via CLI: pass --offset 3.0 during --index and it's saved for all future searches automatically.
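As a rough illustration of what a stored offset does to a reported position, here is a minimal sketch. The helper names `format_timestamp` and `apply_offset` are hypothetical, and the sign convention (the offset is added to the raw time) is an assumption, not something confirmed by the PageMatch source.

```python
def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as HH:MM:SS.mmm."""
    # Work in whole milliseconds to avoid float formatting surprises.
    total_ms = round(max(0.0, seconds) * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def apply_offset(raw_seconds: float, offset_seconds: float) -> float:
    # Assumption: the stored offset is added to every reported time,
    # clamped so a negative correction never goes below zero.
    return max(0.0, raw_seconds + offset_seconds)


# A match reported at 3722.5 s with a +3.0 s correction.
print(format_timestamp(apply_offset(3722.5, 3.0)))  # 01:02:05.500
```

If your measured gap points the other way (PageMatch reports a time later than the real position), a negative offset would play the same role under this convention.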
| Model | Languages | Speed | Notes |
|---|---|---|---|
| `parakeet-tdt-0.6b-v2` | English only | ⚡ Fastest | Default |
| `parakeet-tdt-0.6b-v3` | 25 European languages | Slightly slower | Non-English books |
Models are downloaded automatically from mlx-community/ on HuggingFace on first run and cached locally.
pagematch/
├── src/
│ ├── find_my_place.py # Core: MLX Parakeet transcription, SQLite index, fuzzy search
│ ├── gui.py # PyQt6 GUI — drag & drop, indexing, search, VLC control
│ └── index_folder.py # Batch indexer CLI
├── pyproject.toml # uv / PEP 517 project config
├── README.md
└── LICENSE
- macOS with Apple Silicon (M1 / M2 / M3 / M4) — required for MLX / Parakeet GPU acceleration
- Python 3.10+
- ffmpeg — required for audio decoding; install with `brew install ffmpeg`
- VLC (optional) — for one-click playback at timestamp
Intel Mac / Linux / Windows? `faster-whisper` is already in `pyproject.toml`. Open an issue if you'd like the non-MLX code path wired up in the GUI.
Indexing — PageMatch feeds your audiobook through Parakeet via parakeet-mlx with chunk_duration=120s and overlap_duration=15s. Because Parakeet is non-autoregressive, it processes each chunk in parallel on the GPU and returns absolute sentence-level timestamps. These are stored in a lightweight SQLite .abfinder.db file sitting next to your audio file — no external database, nothing to configure.
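To make the index idea concrete, here is a minimal sketch of storing sentence-level timestamps and a drift offset in SQLite. The table and column names (`segments`, `meta`, `start_s`, `end_s`) are hypothetical and may not match the actual `.abfinder.db` layout. It uses an in-memory database and hand-written sentences in place of real transcription output.

```python
import sqlite3

# Hypothetical schema: one row per transcribed sentence, plus a key/value
# table for settings such as the drift offset.
SCHEMA = """
CREATE TABLE IF NOT EXISTS segments (
    id      INTEGER PRIMARY KEY,
    start_s REAL NOT NULL,
    end_s   REAL NOT NULL,
    text    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS meta (
    name  TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
"""


def build_index(conn, sentences, offset_s=0.0):
    """Replace the stored segments with (start, end, text) rows and save the offset."""
    conn.executescript(SCHEMA)
    conn.execute("DELETE FROM segments")
    conn.executemany(
        "INSERT INTO segments (start_s, end_s, text) VALUES (?, ?, ?)", sentences
    )
    conn.execute(
        "INSERT OR REPLACE INTO meta (name, value) VALUES ('offset_s', ?)",
        (str(offset_s),),
    )
    conn.commit()


def load_segments(conn):
    """Return segments in playback order with the stored offset applied."""
    row = conn.execute("SELECT value FROM meta WHERE name = 'offset_s'").fetchone()
    offset_s = float(row[0])
    rows = conn.execute(
        "SELECT start_s, end_s, text FROM segments ORDER BY start_s"
    ).fetchall()
    return [(start + offset_s, end + offset_s, text) for start, end, text in rows]


conn = sqlite3.connect(":memory:")
build_index(
    conn,
    [(0.0, 2.5, "Chapter one."), (2.5, 6.0, "The morning brought no relief.")],
    offset_s=2.0,
)
print(load_segments(conn))
```

Keeping the offset in the same file as the segments is one way to make a correction persist across searches without a separate config file.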
Searching — Your pasted query is matched against all stored segments using rapidfuzz partial ratio scoring. Both single-segment and two-segment sliding window matches are evaluated, catching queries that straddle a transcript boundary. The top 3 results are returned ranked by confidence score.
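The single-segment plus two-segment window search can be sketched as follows. To stay dependency-free, this example swaps in the standard library's `difflib.SequenceMatcher` for rapidfuzz; a real implementation would more likely call `rapidfuzz.fuzz.partial_ratio`. The function names, the tie-break rule, and the sample segments are illustrative assumptions.

```python
from difflib import SequenceMatcher


def partial_score(query: str, text: str) -> float:
    """Best similarity (0-100) of query against any query-length window of text.

    A stdlib stand-in for a partial-ratio scorer, not rapidfuzz itself.
    """
    q, t = query.lower(), text.lower()
    if not q or not t:
        return 0.0
    if len(t) <= len(q):
        return 100.0 * SequenceMatcher(None, q, t, autojunk=False).ratio()
    best = 0.0
    for i in range(len(t) - len(q) + 1):
        window = t[i:i + len(q)]
        best = max(best, SequenceMatcher(None, q, window, autojunk=False).ratio())
    return 100.0 * best


def search(query, segments, top_k=3):
    """segments: list of (start_seconds, text) in playback order."""
    candidates = []
    for i, (start, text) in enumerate(segments):
        # Single-segment candidate.
        candidates.append((partial_score(query, text), start, text))
        # Two-segment window, for queries that straddle a boundary.
        if i + 1 < len(segments):
            joined = text + " " + segments[i + 1][1]
            candidates.append((partial_score(query, joined), start, joined))
    # Highest score first; on ties prefer the shorter (more specific) match.
    candidates.sort(key=lambda c: (-c[0], len(c[2])))
    return candidates[:top_k]


segments = [
    (0.0, "It was a cold night in the city."),
    (4.0, "The morning brought no relief."),
    (7.5, "She opened the window and"),
    (9.0, "looked out at the empty street."),
]
for score, start, text in search("opened the window and looked out", segments):
    print(f"{score:5.1f}  {start:6.1f}s  {text}")
```

The tie-break matters because a sentence that sits entirely inside one segment also scores perfectly in the two-segment windows on either side of it, and the earlier window would otherwise report the wrong start time.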
VLC integration — PageMatch either launches VLC with --start-time set to the matched position, or sends a seek command to an already-running VLC instance via its built-in HTTP interface on localhost:9090 — so you never get a second VLC window opening.
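Seeking a running VLC instance over its HTTP interface can be sketched like this. It assumes VLC was started with the HTTP interface enabled on port 9090 and a password set; VLC's HTTP interface uses basic auth with an empty username. The function names and the `password` parameter are hypothetical, not taken from the PageMatch source.

```python
import base64
import urllib.parse
import urllib.request


def build_seek_request(seconds, host="localhost", port=9090, password=""):
    """Build the HTTP request that asks VLC to seek to an absolute position."""
    # VLC's status endpoint accepts command=seek with val in whole seconds.
    query = urllib.parse.urlencode({"command": "seek", "val": int(seconds)})
    url = f"http://{host}:{port}/requests/status.xml?{query}"
    # Basic auth with an empty username and the configured password.
    token = base64.b64encode(f":{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def seek_vlc(seconds, **kwargs):
    """Send the seek command; returns True if VLC answered with HTTP 200."""
    request = build_seek_request(seconds, **kwargs)
    with urllib.request.urlopen(request, timeout=2) as response:
        return response.status == 200


request = build_seek_request(3725.5, password="secret")
print(request.full_url)
```

Calling `seek_vlc(3725.5, password="secret")` would send the request; it raises `urllib.error.URLError` if no VLC instance is listening, which a caller could treat as the signal to launch VLC with `--start-time` instead.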
PRs and issues welcome. Some ideas on the roadmap:
- Non-MLX code path for Linux / Windows (faster-whisper backend already in deps)
- Export results to `.srt` or chapter markers
- Chapter-aware search (show chapter name alongside timestamp)
- Apple Shortcuts / Raycast extension
- Sentence highlighting in transcript preview
Please open an issue before starting large changes.
MIT © 2025 — see LICENSE for details.
Made for people who listen at 3× speed and still lose their place.