A lightweight alternative to Windows + H (Windows Speech Recognition) that provides more flexibility and language support using OpenAI's Whisper model.
This tool is designed to replace the built-in Windows + H speech recognition feature, offering enhanced capabilities for multilingual speech-to-text transcription. It's particularly useful for users who need accurate transcription of speech in multiple languages (English, Spanish, and others).
- 🎤 Simple one-click recording interface
- 🌍 Multi-language support (English, Spanish, and more)
- 📝 Real-time transcription
- 🎯 High accuracy using OpenAI's Whisper model
- 🖥️ Clean, simple interface
Here's a screenshot of the application interface:
- Go to the Releases page
- Download the latest version of whisper-gui.exe
- Run the executable
- Clone the repository:

      git clone https://github.com/elpargo/whisper-windows-gui.git
      cd whisper-windows-gui

- Create and activate a virtual environment:

      python -m venv venv
      .\venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Build the executable:

      .\build.ps1
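Before installing dependencies or building, it can help to confirm that the virtual environment from the steps above is actually active. The following is a minimal sketch using only the standard library; the helper name `in_virtualenv` is hypothetical and not part of this project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run .\\venv\\Scripts\\activate first.")
```

Running this script with the `python` on your PATH shows which environment `pip install -r requirements.txt` would install into.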
- Run the application
- Click the microphone button (or press Space/Enter) to start recording
- Speak in your desired language
- Click the button again (or press Space/Enter) to stop recording
- The transcription will appear in the text area
- Click "Save" to save the transcription to a file
Note: You can use either the Space or Enter key interchangeably to start/stop recording.
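The start/stop behavior described above can be sketched as a small toggle that treats Space and Enter the same way. This is an illustrative sketch, not the application's actual code: the class `RecordingToggle`, the key names in `TOGGLE_KEYS`, and the `events` log are all assumptions made for the example, and no audio is captured.

```python
from dataclasses import dataclass, field

# Assumed key names; a real GUI toolkit would supply its own key codes.
TOGGLE_KEYS = {"space", "enter"}


@dataclass
class RecordingToggle:
    """Tracks whether recording is on and logs each start/stop transition."""

    recording: bool = False
    events: list[str] = field(default_factory=list)

    def toggle(self) -> bool:
        """Flip the recording state, as a click on the microphone button would."""
        self.recording = not self.recording
        self.events.append("start" if self.recording else "stop")
        return self.recording

    def on_key(self, key: str) -> bool:
        """Toggle on Space or Enter, ignore other keys, and return the current state."""
        if key.lower() in TOGGLE_KEYS:
            self.toggle()
        return self.recording


if __name__ == "__main__":
    toggle = RecordingToggle()
    for key in ["Space", "a", "Enter"]:
        state = "recording" if toggle.on_key(key) else "idle"
        print(f"{key!r}: {state}")
```

Routing both the button click and the key press through the same `toggle` method keeps the two input paths from drifting apart.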
While Windows + H provides basic speech recognition, it has limitations:
- Limited language support
- Requires an internet connection
- Less accurate for non-English languages
- No easy way to save transcriptions
Whisper GUI addresses these issues by:
- Supporting multiple languages
- Working offline
- Providing higher accuracy
- Offering easy transcription saving
- Built with Python and PyQt6
- Uses OpenAI's Whisper model for transcription
- Compiled with PyInstaller for easy distribution
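For readers curious about the transcription step, here is a minimal sketch of calling Whisper from Python. It assumes the `openai-whisper` package and ffmpeg are installed; the function names `transcribe_options` and `transcribe_file`, the `"base"` model size, and the choice to force FP32 are assumptions for illustration and may differ from what this project does.

```python
from typing import Optional


def transcribe_options(language: Optional[str] = None) -> dict:
    """Build keyword arguments for Whisper's transcribe() call."""
    # fp16=False requests FP32 inference, which is what CPU-only machines use.
    options: dict = {"fp16": False}
    if language:
        # e.g. "en" or "es"; when omitted, Whisper detects the language itself.
        options["language"] = language
    return options


def transcribe_file(
    audio_path: str,
    language: Optional[str] = None,
    model_name: str = "base",
) -> str:
    """Transcribe an audio file and return the recognized text."""
    # Imported lazily so the helper above works without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **transcribe_options(language))
    return result["text"].strip()
```

A call such as `transcribe_file("recording.wav", language="es")` would return the Spanish transcription as a string, where `recording.wav` is a placeholder file name. Loading the model is the slow part, so an interactive application would typically load it once and reuse it across recordings.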
This project is released under the MIT License. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- OpenAI Whisper for the speech recognition model
- PyQt6 for the GUI framework
- 💾 Save transcriptions to text files
- Invoke via a global OS keybinding (i.e., replace Windows + H entirely)
- Output directly to the focused text input field (also a Windows + H feature)
