Lingo 🗣️

A powerful, lightweight web application for text-to-speech (TTS) and speech recognition. No web development knowledge is required: download the project files and open lingo.html in Chrome to get started. Lingo provides an intuitive interface for reading text aloud and converting speech to text, all running directly in your browser with no dependencies or build step.

Rant to Developers: This project was very intentionally designed to be free of any specific framework (Vue, React, etc.) so it's easily understandable to all web developers and no porting from one framework to another is required. I also wanted this app to be runnable by non-developers simply by opening an HTML page locally. It seems that nearly ALL AI projects struggle endlessly with TTS/STT and never get it right. Gemini Voice sucks, OpenAI Voice sucks, GitHub Copilot Voice absolutely sucks, etc. And yes, voice can be a bit tricky to get right, but trust me, this JS has it perfected. You guys no longer have any excuses! This code shows how easy it truly is.

✨ Features

🔊 Text-to-Speech (TTS)

Smart Reading: Read selected text, from cursor position, or the entire document
Cursor Position Reading: Place your cursor anywhere in the text to start reading from that point
Pause & Resume: Pause speech at any time and resume where you left off
Voice Selection: Choose from all available system voices with language indicators
Speed Control: Adjustable speaking rates from slow (0.85x) to ludicrous (1.35x)
Persistent Settings: Voice and speed preferences automatically saved
Cross-Browser Compatible: Works in Chrome, Firefox, Safari, and other modern browsers
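
The speed range and persistent settings above could fit together roughly as follows. This is a minimal sketch, assuming settings are stored as JSON under a single key; the key name `lingo.settings` and the helpers `clampRate`, `saveSettings`, and `loadSettings` are hypothetical and not taken from lingo.js. The storage object is passed in, so in a browser you would hand it `window.localStorage`.

```javascript
// Hypothetical names for illustration; lingo.js may organize this differently.
const SETTINGS_KEY = "lingo.settings";
const MIN_RATE = 0.85; // "slow" in the feature list
const MAX_RATE = 1.35; // "ludicrous" in the feature list

// Keep a requested rate inside the documented range.
function clampRate(rate) {
  const n = typeof rate === "number" ? rate : parseFloat(rate);
  if (!Number.isFinite(n)) return 1.0; // fall back to normal speed
  return Math.min(MAX_RATE, Math.max(MIN_RATE, n));
}

// `storage` is anything with getItem/setItem, e.g. window.localStorage.
function saveSettings(storage, settings) {
  const clean = {
    voiceName: settings.voiceName ?? null,
    rate: clampRate(settings.rate),
  };
  storage.setItem(SETTINGS_KEY, JSON.stringify(clean));
  return clean;
}

function loadSettings(storage) {
  const defaults = { voiceName: null, rate: 1.0 };
  try {
    const raw = storage.getItem(SETTINGS_KEY);
    if (!raw) return defaults;
    const parsed = JSON.parse(raw);
    return { voiceName: parsed.voiceName ?? null, rate: clampRate(parsed.rate) };
  } catch {
    return defaults; // corrupt JSON or storage unavailable
  }
}
```

Clamping on both save and load means a hand-edited or stale stored value cannot push the speaking rate outside the supported range.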

🎤 Speech Recognition

Continuous Dictation: Real-time speech-to-text conversion
Auto-Restart: Seamlessly continues listening after pauses
Smart Insertion: Text appears at cursor position, preserving existing content
Visual Feedback: Textarea highlights when actively listening
Chrome Optimized: Full functionality in Chrome/Chromium browsers
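
Continuous dictation with auto-restart can be sketched with the standard recognition events. This is an illustration under assumptions, not the code in lingo.js: the function names `getRecognitionCtor` and `startDictation` are hypothetical, and the global object is passed in (normally `window`) so the sketch can be exercised outside a browser.

```javascript
// Find the recognition constructor; Chrome exposes the webkit-prefixed name.
function getRecognitionCtor(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// Start continuous dictation that restarts itself when the browser ends a session.
// `onText` receives each finalized transcript.
function startDictation(globalObj, onText) {
  const Ctor = getRecognitionCtor(globalObj);
  if (!Ctor) return null; // unsupported browser: caller can disable the mic button

  const rec = new Ctor();
  rec.continuous = true;
  rec.interimResults = false;
  let wanted = true; // becomes false once the user stops dictation

  rec.onresult = (event) => {
    for (let i = event.resultIndex; i < event.results.length; i++) {
      if (event.results[i].isFinal) onText(event.results[i][0].transcript);
    }
  };
  // The browser may end a session after a silence; restart unless the user stopped.
  rec.onend = () => {
    if (wanted) rec.start();
  };
  rec.start();

  return {
    stop() {
      wanted = false;
      rec.stop();
    },
  };
}
```

In a browser, `startDictation(window, (text) => console.log(text))` would log each finalized phrase until `stop()` is called on the returned handle.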

⌨️ Keyboard Shortcuts

Ctrl/Cmd + Enter: Start/stop text reading
Ctrl/Cmd + M: Toggle microphone dictation
Escape: Stop all active operations (TTS or speech recognition)
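
One way to wire up shortcuts like these is a small function that maps a keydown event to an action. The sketch below is an assumption about structure, not the handler in lingo.js; `shortcutAction` and the action labels are hypothetical names.

```javascript
// Map a keydown event to an action label, or null if it is not a shortcut.
function shortcutAction(event) {
  const mod = event.ctrlKey || event.metaKey; // Ctrl on Windows/Linux, Cmd on macOS
  if (event.key === "Escape") return "stop-all";
  if (mod && event.key === "Enter") return "toggle-read";
  if (mod && event.key.toLowerCase() === "m") return "toggle-mic";
  return null;
}

// Browser wiring (skipped when no document is available).
if (typeof document !== "undefined") {
  document.addEventListener("keydown", (event) => {
    const action = shortcutAction(event);
    if (action) {
      event.preventDefault(); // keep the browser from handling the combo itself
      console.log("shortcut:", action);
    }
  });
}
```

Keeping the mapping separate from the listener makes the shortcut table easy to check without simulating real key presses.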

🔗 URL Parameters

?mic=on: Automatically start mic dictation when the page loads
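
A query parameter like this can be read with the standard `URLSearchParams` class. The helper name `shouldAutoStartMic` is hypothetical; how lingo.js actually parses the URL is not shown in this README.

```javascript
// True when the query string contains mic=on, e.g. "?mic=on".
function shouldAutoStartMic(search) {
  return new URLSearchParams(search).get("mic") === "on";
}
```

In a browser the argument would be `window.location.search`.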

🎨 User Interface

Dark Theme: Easy-on-the-eyes default dark interface
Responsive Design: Optimized for both desktop and mobile devices
Real-time Status: Live feedback on current operations
Accessible: Full keyboard navigation and screen reader support
Simple Architecture: Clean separation of HTML, CSS, and JavaScript

🚀 Quick Start

Simplest Method -- No Web Server required

The easiest way to use Lingo is to simply open the HTML file directly:

Download lingo.html, lingo.css, and lingo.js to the same folder on your computer
Double-click lingo.html or right-click → Open with → Chrome (or any browser that supports Web Speech APIs)
Start using - no server setup required!

Note: Chrome/Chromium browsers provide the best experience with full TTS and speech recognition support.

Running with Local Server (Recommended)

You only need the web server if you want to avoid repeated microphone permission prompts. Browsers ask for microphone permission every time the page is opened from a file:// URL, but remember your choice when it is served from http://localhost.

For this setup:

Clone or download the project files
Run the startup script:
```
./run.sh
```
Your browser will automatically open to http://localhost:8009/lingo.html

Manual Server Setup (Alternative)

If you prefer to run it manually:

```
# Start a local HTTP server
python3 -m http.server 8009

# Open your browser to:
# http://localhost:8009/lingo.html
```

📋 How to Use

Text-to-Speech

Type or paste text into the main textarea
Drag and drop text from other applications directly into the textarea (especially handy on Linux - simply select text from any app and drag it in)
Position your cursor where you want reading to begin, or select specific text to read only that portion
Click "🔊 Read" or press Ctrl/Cmd + Enter
Choose your preferred voice and speaking speed from the dropdowns
Click "⏸️ Pause" to pause reading, then "▶️ Resume" to continue from where you left off
Click "⏹️ Stop" or press Escape to stop reading completely

Tip: If your cursor is at the very end of the text, clicking "🔊 Read" starts from the beginning.
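
The selection, cursor, and end-of-text rules described above can be expressed as one small function. This is a sketch of the described behavior, not code copied from lingo.js; `pickTextToRead` is a hypothetical name, and the offsets correspond to a textarea's `selectionStart` and `selectionEnd`.

```javascript
// Decide which slice of the textarea to speak.
// Returns { start, text }, where start is the offset of text within fullText.
function pickTextToRead(fullText, selectionStart, selectionEnd) {
  if (selectionEnd > selectionStart) {
    // A selection wins: read only the highlighted portion.
    return { start: selectionStart, text: fullText.slice(selectionStart, selectionEnd) };
  }
  if (selectionStart >= fullText.length) {
    // Cursor at the very end: start over from the beginning.
    return { start: 0, text: fullText };
  }
  // Otherwise read from the cursor to the end of the document.
  return { start: selectionStart, text: fullText.slice(selectionStart) };
}
```

Returning the start offset alongside the text lets a caller map speech progress back to a position in the textarea.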

Speech Recognition

Click "🎤 Mic" or press Ctrl/Cmd + M
Speak clearly - your words will appear in the textarea
The app continues listening until you stop it manually
Click "⏹️ Stop" or press Escape to stop dictation
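
Inserting dictated words at the cursor without disturbing the surrounding text can be sketched as a pure string operation. The helper `insertAtCursor` and its spacing rule are assumptions for illustration; lingo.js may handle spacing differently.

```javascript
// Insert dictated text at the cursor and report where the cursor should land.
function insertAtCursor(text, cursor, dictated) {
  const before = text.slice(0, cursor);
  const after = text.slice(cursor);
  // Add a space when the dictated words would otherwise touch the previous word.
  const needsSpace = before.length > 0 && !/\s$/.test(before);
  const inserted = (needsSpace ? " " : "") + dictated;
  return { text: before + inserted + after, cursor: before.length + inserted.length };
}
```

In a browser, the result would be written back with `textarea.value = result.text` followed by `textarea.setSelectionRange(result.cursor, result.cursor)`.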

Tip: Add ?mic=on to the URL to automatically start mic dictation when the page loads (e.g., http://localhost:8009/lingo.html?mic=on).

🛠️ Technical Details

Architecture

Three-File Structure: Clean separation of HTML, CSS, and JavaScript
No Build Process: Runs directly in browser without compilation
Zero Dependencies: Uses only native Web APIs
Portable: Copy lingo.html, lingo.css, and lingo.js to any folder and it works

Browser Compatibility

| Browser | TTS Support | Speech Recognition |
| --- | --- | --- |
| Chrome/Chromium | ✅ Full | ✅ Full |
| Firefox | ✅ Full | ❌ Not supported |
| Safari | ✅ Full | ❌ Not supported |
| Edge | ✅ Full | ✅ Full |
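
Rather than sniffing the browser name, support can be checked at runtime by looking for the relevant globals. The function `detectSpeechSupport` below is a hypothetical helper, shown as a sketch; the global object is passed in (normally `window`).

```javascript
// Report which speech features the current environment exposes.
function detectSpeechSupport(env) {
  return {
    tts: "speechSynthesis" in env && "SpeechSynthesisUtterance" in env,
    recognition: "SpeechRecognition" in env || "webkitSpeechRecognition" in env,
  };
}
```

A page could call `detectSpeechSupport(window)` on load and disable the mic button when `recognition` is false.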

Web APIs Used

SpeechSynthesis (part of the Web Speech API): For text-to-speech functionality
SpeechRecognition (part of the Web Speech API): For speech recognition (webkit-prefixed as webkitSpeechRecognition in Chrome)
localStorage: For persistent settings
File API: For handling text input/output

📁 Project Structure

lingo/
├── lingo.html         # Main HTML structure
├── lingo.css          # Styles and theming
├── lingo.js           # Application logic
├── run.sh             # Startup script for local development
├── kill.sh            # Stop the local server
└── README.md          # This documentation

🔧 The Run Script

The run.sh script provides a convenient way to start the application:

What it does:

Port Management: Checks for existing servers on port 8009 and terminates them
Server Startup: Launches Python's built-in HTTP server
Browser Launch: Automatically opens your default browser to the app
Process Management: Keeps the server running and handles graceful shutdown

Why we need it:

Security: Browsers reserve some Web APIs for secure contexts (HTTPS or localhost)
Cross-Origin: Direct file access (file://) blocks certain features
Port 8009: Chosen to avoid conflicts with common development ports

🎯 Use Cases

Content Review: Listen to written content for proofreading
Accessibility: Assist users with reading difficulties
Multitasking: Consume text content while doing other activities
Note Taking: Quickly dictate thoughts and ideas
Language Learning: Hear proper pronunciation of text
Voice Memos: Convert speech to text for documentation

🔍 Development Features

Console Helpers

For testing and debugging, Lingo exposes utility functions:

```
// Speak any text directly
window.__tts.speakNow("Hello world");

// Stop current speech
window.__tts.cancel();
```
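
For readers curious how helpers of this shape can be built, here is a minimal sketch on top of the standard `speechSynthesis` API. It is not the implementation behind `window.__tts`; the factory name `makeTtsHelpers` and the `rate` parameter are assumptions, and the environment object is passed in (normally `window`).

```javascript
// Build speakNow/cancel style helpers on top of the standard speechSynthesis API.
function makeTtsHelpers(env) {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) return null; // TTS unsupported

  return {
    speakNow(text, rate = 1.0) {
      synth.cancel(); // drop anything already queued
      const utterance = new Utterance(text);
      utterance.rate = rate;
      synth.speak(utterance);
      return utterance;
    },
    pause() { synth.pause(); },
    resume() { synth.resume(); },
    cancel() { synth.cancel(); },
  };
}
```

In a browser console, `makeTtsHelpers(window).speakNow("Hello world")` would speak the text with the default voice.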

Error Handling

Graceful Degradation: Features disable cleanly when unsupported
Auto-Recovery: Speech recognition restarts automatically after interruptions
User Feedback: Clear status messages for all operations

🤝 Contributing

Lingo follows a simple three-file architecture for easy maintenance. When contributing:

Keep files organized - HTML in lingo.html, styles in lingo.css, logic in lingo.js
Test across browsers - especially Chrome vs Firefox
Maintain responsive design - mobile and desktop compatibility
Preserve accessibility - keyboard navigation and screen readers

📝 License

This project is open source. Feel free to use, modify, and distribute as needed.

🔮 Future Enhancements

Language detection and automatic voice matching
Export functionality for dictated text
Custom voice pitch and volume controls
Batch processing for multiple text files
Integration with cloud speech services for enhanced recognition

Lingo - Bringing voice to your text and text to your voice! 🎙️✨

Clay-Ferguson/lingo
