4nkitd/vaani

Vaani

Vaani is a minimalist, high-performance Text-to-Speech (TTS) desktop application for macOS. It provides a seamless "Paste & Read Aloud" experience with global shortcuts and an elegant "Ethereal Glass" design.

✨ Features

  • Apple Silicon Optimized: Leverages mlx-audio and the Kokoro-82M model for high-quality, on-device inference.
  • Seamless Flow: Intelligent text chunking with a 150 ms audio overlap between chunks for natural, gapless speech.
  • Ethereal Glass UI: A minimalist, translucent interface inspired by modern macOS aesthetics.
  • Global Shortcuts: Summon the app instantly from anywhere with a configurable hotkey (default: Cmd+Shift+V).
  • Smart Caching: Reuses audio for unchanged sentences to provide near-instant playback.
  • Privacy First: 100% offline. Your text never leaves your machine.
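
As a rough illustration of the chunking, overlap, and caching ideas listed above, here is a minimal Python sketch. It is not the app's actual code: the names `split_sentences`, `cache_key`, `AudioCache`, and `overlap_samples` are hypothetical, the sentence splitter is deliberately naive, and the 24 kHz sample rate used in the usage note is an assumption.

```python
import hashlib
import re

# Naive sentence boundary: ., ! or ? followed by whitespace (assumption).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentence-sized chunks for synthesis."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def cache_key(sentence, model):
    """Stable key: the same sentence and model always map to the same audio."""
    return hashlib.sha256(f"{model}\n{sentence}".encode("utf-8")).hexdigest()


def overlap_samples(sample_rate_hz, overlap_ms=150):
    """Number of samples shared by two consecutive chunks."""
    return sample_rate_hz * overlap_ms // 1000


class AudioCache:
    """In-memory cache so unchanged sentences are not synthesized again."""

    def __init__(self, synthesize):
        self._synthesize = synthesize  # hypothetical callable(sentence, model) -> bytes
        self._store = {}
        self.misses = 0

    def get(self, sentence, model):
        key = cache_key(sentence, model)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._synthesize(sentence, model)
        return self._store[key]
```

With an assumed 24 kHz output, `overlap_samples(24000)` gives 3600 samples of overlap. Including the model name in the key means changing the model in Settings would not reuse stale audio.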

🛠️ Tech Stack

  • Framework: Wails v2 (Go + Svelte)
  • Frontend: Svelte, Vite, Lucide Icons
  • Audio Engine: mlx-audio (Kokoro-82M-bf16 model)
  • Native Integration: Gin (REST API), robotn/gohook (Global Shortcuts)

🚀 Getting Started

Prerequisites

  • macOS (Apple Silicon strongly recommended)
  • Go 1.21+
  • Node.js & npm
  • Python 3.10+
  • espeak-ng: Required for phonemization.
    brew install espeak-ng
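
To confirm the tools above are on your PATH before building, a small check like the following may help. This is an illustrative sketch, not part of the project: `missing_tools` is a hypothetical helper, the executable names (`go`, `node`, `npm`, `python3`, `espeak-ng`) are assumed, and it does not verify the minimum versions listed above.

```python
import shutil

# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = ["go", "node", "npm", "python3", "espeak-ng"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```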

Installation

Homebrew (Recommended)

Installing via Homebrew handles all system dependencies automatically, including espeak-ng, python, and mlx-audio.

brew tap 4nkitd/vaani
brew install --cask vaani

Manual Installation

  1. Clone the repository:

    git clone https://github.com/4nkitd/vaani.git
    cd vaani
  2. Set up the Audio Engine: We recommend installing it as a managed tool with uv:

    uv tool install mlx-audio
    # Note: The app expects ~/.local/bin/mlx_audio.server to be available
  3. Install Frontend Dependencies:

    cd ui/frontend
    npm install
  4. Run in Development Mode:

    cd ..  # back to ui/ (step 3 left you in ui/frontend)
    wails dev
  5. Build for Production:

    wails build
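
If playback does not start after a manual install, one thing worth checking is whether the audio server sits where step 2 says the app expects it. The sketch below is a hypothetical diagnostic, not something the project ships; `server_path` and `server_available` are illustrative names, and the only detail taken from this README is the `~/.local/bin/mlx_audio.server` location.

```python
import os
from pathlib import Path


def server_path(home=None):
    """Expected location of the audio server under the given home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".local" / "bin" / "mlx_audio.server"


def server_available(path):
    """True if the path is an existing file that the current user can execute."""
    return path.is_file() and os.access(path, os.X_OK)


if __name__ == "__main__":
    path = server_path()
    status = "found" if server_available(path) else "not found or not executable"
    print(f"{path}: {status}")
```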

⚙️ Configuration

Open the Settings (gear icon) in the app to configure:

  • Model: Change the underlying MLX model.
  • Global Shortcut: Define your own trigger keys.
  • Emoji Handling: Choose between ignoring or reading out emojis.
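
To make the two emoji modes concrete, here is a minimal sketch of what "ignore" and "read out" could look like as a text preprocessing step. It is an assumption about the behavior, not the app's implementation: `handle_emojis` and the mode names are hypothetical, and the character ranges are a rough approximation rather than a complete emoji definition.

```python
import re
import unicodedata

# Rough emoji ranges plus the invisible joiners used in emoji sequences.
_EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")
_INVISIBLE = {"\uFE0F", "\u200D"}  # variation selector-16, zero width joiner


def _spoken(match):
    """Replace one emoji with its lowercase Unicode name, if it has one."""
    char = match.group()
    if char in _INVISIBLE:
        return ""
    name = unicodedata.name(char, "")
    return f" {name.lower()} " if name else ""


def handle_emojis(text, mode="ignore"):
    """Drop emojis ("ignore") or replace them with their names ("read")."""
    if mode == "ignore":
        cleaned = _EMOJI.sub("", text)
    elif mode == "read":
        cleaned = _EMOJI.sub(_spoken, text)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Collapse the extra whitespace left behind by removed or replaced emojis.
    return " ".join(cleaned.split())
```

For example, `handle_emojis("Well done \U0001F389 team", mode="read")` yields `"Well done party popper team"`, while the default mode yields `"Well done team"`.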

📄 License

MIT © Ankit Yadav
