A floating voice assistant with a cute blob companion. Built with an Electron + React frontend and a Pipecat + Gemini Live backend.
- Dashboard: Minimal, earthy UI with integration placeholders and agent controls
- Floating Blob: Cute coral-colored blob that follows you across windows/desktops
  - Changes color based on state (coral = idle, green = listening, purple = speaking)
  - Cute blinking eyes
  - Particle effects and ripples on voice activity
- Global shortcuts: ⌘⇧S Start · ⌘⇧P Pause · ⌘⇧X Stop
- Real-time voice: Speech-to-speech via Gemini Live with Daily WebRTC
- Backend: Pipecat (Python) with Gemini Live (native speech-to-speech)
- Frontend: Electron + React + Pipecat Client SDK
- Transport: Daily WebRTC for real-time audio
├── backend/                      # Python Pipecat voice agent
│   ├── bot.py                    # Pipecat pipeline (Gemini Live S2S)
│   ├── server.py                 # FastAPI server for room management
│   ├── requirements.txt
│   ├── start.sh                  # Setup venv & run server
│   └── .env.example
│
├── frontend/                     # Electron + React app
│   ├── electron/                 # Main process & preload
│   ├── src/
│   │   ├── components/
│   │   │   ├── Dashboard.tsx     # Main control panel
│   │   │   └── Blob.tsx          # Floating blob assistant
│   │   ├── context/
│   │   │   └── AgentContext.tsx  # Pipecat connection state
│   │   └── hooks/
│   │       └── usePipecatAgent.ts
│   ├── package.json
│   └── start.sh
- Python 3.10+ (not 3.14)
- Node.js 18+
- Google API Key for Gemini Live
- Daily API Key for WebRTC (free tier available)
cd backend
# Copy and configure environment
cp .env.example .env.local
# Edit .env.local with your API keys:
# DAILY_API_KEY=your-daily-api-key
# GOOGLE_API_KEY=your-google-api-key
# Run the setup script (creates venv, installs deps, starts server)
./start.sh

The server starts at http://localhost:8080 with:
- `POST /connect`: Creates a voice session (returns Daily room URL + token)
- `GET /health`: Health check
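
To check that the server is reachable, the two endpoints can be exercised from Python's standard library. This is a minimal sketch and not part of the project: the helper names are hypothetical, the empty JSON body for `/connect` is an assumption, and both responses are assumed to be JSON and are returned as plain dicts because the exact field names are not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default server address from the setup steps


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the GET /health request."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def build_connect_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST /connect request.

    Assumption: the endpoint accepts an empty JSON object as its body.
    """
    return urllib.request.Request(
        f"{base_url}/connect",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# With the backend running:
#   send(build_health_request())
#   send(build_connect_request())  # should describe a Daily room URL and token
```

Building the request separately from sending it keeps the sketch easy to inspect without a running server.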
cd frontend
# Copy and configure environment
cp .env.example .env.local
# Edit .env.local:
# VITE_PIPECAT_MODE=local
# VITE_PIPECAT_CONNECT_ENDPOINT=http://localhost:8080/connect
# Install dependencies
npm install
# Run the app
npm run dev

┌─────────────────────────────────────────────────────────────────┐
│                       Frontend (Electron)                       │
│  ┌─────────────┐    ┌──────────────────┐    ┌───────────────┐   │
│  │  Dashboard  │    │   AgentContext   │    │   Floating    │   │
│  │   (React)   │───▶│  (Pipecat SDK)   │───▶│     Blob      │   │
│  └─────────────┘    └────────┬─────────┘    └───────────────┘   │
│                              │                                  │
└──────────────────────────────┼──────────────────────────────────┘
                               │ WebRTC (Daily)
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                          Daily.co SFU                           │
│                     (Real-time audio relay)                     │
└──────────────────────────────┬──────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Backend (Python/Pipecat)                     │
│  ┌─────────────┐    ┌──────────────────┐    ┌───────────────┐   │
│  │  server.py  │───▶│      bot.py      │───▶│  Gemini Live  │   │
│  │  (FastAPI)  │    │    (Pipecat)     │    │   (S2S LLM)   │   │
│  └─────────────┘    └──────────────────┘    └───────────────┘   │
└─────────────────────────────────────────────────────────────────┘
Using Gemini Live for native speech-to-speech:
User Voice → Daily WebRTC → Pipecat → Gemini Live (STT+LLM+TTS) → Daily WebRTC → Speaker
Gemini Live handles everything in one low-latency service:
- STT: Listens to user speech
- LLM: Generates response
- TTS: Speaks the response
- User clicks Start → Frontend calls the `/connect` endpoint
- Server creates Daily room → Launches bot subprocess to join
- Frontend joins same room → Via Pipecat Client SDK + Daily Transport
- Voice pipeline runs → Real-time conversation begins
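
The server-side part of the steps above can be sketched as a single function. This is an illustration under stated assumptions, not the code in `server.py`: `handle_connect`, the three injected callables, and the `room_url` / `token` response keys are all hypothetical, and issuing separate tokens for the bot and the client is a guess about how the room is secured.

```python
from typing import Callable, Dict


def handle_connect(
    create_room: Callable[[], str],
    create_token: Callable[[str], str],
    launch_bot: Callable[[str, str], None],
) -> Dict[str, str]:
    """Create a room, start the bot in it, and return what the client needs to join."""
    # Server creates a Daily room.
    room_url = create_room()
    # Server launches the bot so that it joins the room (token per participant is assumed).
    launch_bot(room_url, create_token(room_url))
    # The frontend uses this payload to join the same room.
    return {"room_url": room_url, "token": create_token(room_url)}
```

Passing the room, token, and bot-launch steps in as callables keeps the ordering visible and lets the flow be exercised with fakes, without a Daily account or a running bot.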
| Variable | Required | Description |
|---|---|---|
| `GOOGLE_API_KEY` | Yes | Google API key for Gemini Live |
| `DAILY_API_KEY` | Yes | Daily.co API key for WebRTC |
| `PORT` | No | Server port (default: 8080) |
| `ENV` | No | Set to 'local' for development |
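
One way to read and validate these backend variables at startup is sketched below. It assumes the values are already present in the process environment (how the project loads `.env.local` is not shown here), and `BackendConfig` and `load_backend_config` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = ("GOOGLE_API_KEY", "DAILY_API_KEY")


@dataclass(frozen=True)
class BackendConfig:
    google_api_key: str
    daily_api_key: str
    port: int = 8080           # documented default
    env: Optional[str] = None  # 'local' for development


def load_backend_config(environ: Mapping[str, str] = os.environ) -> BackendConfig:
    """Build a config from environment variables, failing early on missing keys."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return BackendConfig(
        google_api_key=environ["GOOGLE_API_KEY"],
        daily_api_key=environ["DAILY_API_KEY"],
        port=int(environ.get("PORT", "8080")),
        env=environ.get("ENV"),
    )
```

Failing at startup with the names of the missing keys tends to be easier to debug than a later authentication error from Gemini Live or Daily.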
| Variable | Description |
|---|---|
| `VITE_PIPECAT_MODE` | Set to 'local' |
| `VITE_PIPECAT_CONNECT_ENDPOINT` | Backend URL (default: http://localhost:8080/connect) |
| Shortcut | Action |
|---|---|
| ⌘⇧S / Ctrl+Shift+S | Start agent |
| ⌘⇧P / Ctrl+Shift+P | Pause/Resume |
| ⌘⇧X / Ctrl+Shift+X | Stop agent |
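
The behaviour implied by the shortcut table can be modelled as a three-state machine. The sketch below is written in Python for illustration only (the real handlers live in the Electron app); the state and action names are hypothetical, the keys use Electron's accelerator syntax as an assumption about how the shortcuts are registered, and treating Start as a no-op unless the agent is stopped is a guess.

```python
from enum import Enum


class AgentState(Enum):
    STOPPED = "stopped"
    RUNNING = "running"
    PAUSED = "paused"


# Shortcut -> action name (action names are hypothetical).
SHORTCUTS = {
    "CommandOrControl+Shift+S": "start",
    "CommandOrControl+Shift+P": "toggle_pause",
    "CommandOrControl+Shift+X": "stop",
}


def next_state(state: AgentState, action: str) -> AgentState:
    """Return the agent state after applying one shortcut action."""
    if action == "start":
        # Assumption: Start only has an effect when the agent is stopped.
        return AgentState.RUNNING if state is AgentState.STOPPED else state
    if action == "toggle_pause":
        if state is AgentState.RUNNING:
            return AgentState.PAUSED
        if state is AgentState.PAUSED:
            return AgentState.RUNNING
        return state  # nothing to pause while stopped
    if action == "stop":
        return AgentState.STOPPED
    raise ValueError(f"Unknown action: {action}")
```

Keeping the transition logic in one pure function means the dashboard buttons and the global shortcuts can share it.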