Alyssum is an offline translator that combines the power of Argos Translate with Tesseract OCR and faster-whisper.
Translate text, documents, books, and even on-screen content — all without an internet connection.
Privacy-friendly and designed for quick everyday use.
Alyssum is licensed under the GNU Affero General Public License v3.0.
You are free to use, modify, and distribute it under the terms of the AGPL-3.0-or-later.
This software also contains third-party components released under various open-source licenses. Their license texts are provided in the licenses/ directory.
- Translate text, documents, books, and on-screen content completely offline.
- Supports all languages available in Argos Translate.
- Integrated OCR with Tesseract for capturing text from images and PDFs.
- Voice input powered by faster-whisper for quick speech-to-translation.
- Configurable shortcuts for all main actions — launch OCR, translate, clear windows, copy results, start voice input and translate files.
- File translation for .txt, .odt, .odp, .docx, .pptx, .epub, .html, .srt, and .pdf.
- Browser extension for translating selectable text without manual copy-paste.
- Internal PDF viewer based on PDF.js, for translating text inside PDFs via the browser extension.
- GPU acceleration support for faster translation on compatible NVIDIA cards.
Get the latest Alyssum release on the GitHub Releases page.
⚠️ The archive is large and split into three parts (.7z.001, .7z.002, .7z.003). Download all three parts into the same folder, then extract only the first part with 7-Zip; the remaining parts are picked up automatically.
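Extraction fails if any part is missing, so it can help to check the download folder first. The sketch below is only an illustration: the function name and the base name "Alyssum" are assumptions, so use the actual file names from the Releases page.

```python
from pathlib import Path


def missing_archive_parts(folder, base_name, parts=3):
    """Return the expected .7z.NNN part names that are absent from folder."""
    folder = Path(folder)
    # Parts are numbered 001, 002, 003, matching the suffixes listed above.
    expected = [f"{base_name}.7z.{i:03d}" for i in range(1, parts + 1)]
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # "Alyssum" is a hypothetical base name used only for this example.
    missing = missing_archive_parts(".", "Alyssum")
    print("All parts present" if not missing else f"Missing: {missing}")
```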
After extraction:
- Total size: ~6 GB
- Run Alyssum.exe
- Go to Settings to download Argos Translate packages for the languages you want.
- If using voice input, download any Whisper model (larger models give better transcription).
- Getting Started
- Optional: Building .EXE
- Translation Packages
- Tesseract Models
- Voice input
- GPU Acceleration
- Browser Extension
- Registry entries (Windows)
- Linux issues
- Acknowledgments, Licenses and Third-Party Software
⚠️ If you downloaded the compiled release, skip the installation steps below — just extract the archive and run Alyssum.exe.
Note: Alyssum is mainly developed and tested on Windows.
While it should work on Linux, I have limited time to test extensively, so some features may require additional setup or adjustments. Contributions and fixes are welcome.
- Python 3.12
- Git
- Windows (primary), Linux (see Linux issues)
- NVIDIA GPU with CUDA 12.6 support (optional, for GPU acceleration)
- Clone the repository:
  git clone https://github.com/icosane/Alyssum.git
- Navigate to the folder and create a virtual environment:
  python -m venv .
- Activate the virtual environment:
  .\Scripts\activate
- Install requirements:
  pip install -r requirements.txt
- Download Tesseract Portable or Tesseract and place it into:
  ./AlyssumResources/tesseract
  Recommended structure:
  tesseract/
  ├── bin/
  ├── include/
  ├── lib/
  └── share/
      ├── man/
      └── tessdata/
  If needed, adjust the TesseractManager search paths in config.py.
  Tip: You can also open the folder in Visual Studio Code or VSCodium, install the Python extension, then press Ctrl+Shift+P → Python: Create Environment → .venv → select requirements.txt.
- Enable UTF-8 support in Windows (recommended for files with non-Latin characters):
  - Settings → Time & language → Language & region → Administrative language settings → Change system locale
  - Check Beta: Use Unicode UTF-8 for worldwide language support
  - Reboot for changes to apply.
- Install PyInstaller:
pip install pyinstaller
- Run:
pyinstaller build.spec
(The build.spec file is included in the repository.)
Download Argos Translate packages via the Settings page, or install them manually.
For a manual install, extract the package folder into:
AlyssumResources/models/argostranslate/data/argos-translate/packages
Example structure:
AlyssumResources
└── models
    └── argostranslate
        └── data
            └── argos-translate
                └── packages
                    ├── translate-en_fr-1.9
                    │   ├── model
                    │   ├── stanza
                    │   ├── metadata.json
                    │   ├── README.md
                    │   └── config.json
                    └── en_de
                        ├── model
                        ├── stanza
                        ├── metadata.json
                        ├── README.md
                        └── config.json
Folder naming: langfrom_langto or translate-langfrom_langto-version.
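To double-check a manual install, the naming rule and layout above can be sketched as a small script. This is not how Alyssum itself scans packages; the function names, the regular expressions, and the choice to require metadata.json are assumptions made for illustration.

```python
import re
from pathlib import Path

# The two folder-name forms described above (patterns are an assumption):
#   translate-langfrom_langto-version   e.g. translate-en_fr-1.9
#   langfrom_langto                     e.g. en_de
VERSIONED = re.compile(r"^translate-([A-Za-z]+)_([A-Za-z]+)-([0-9][0-9A-Za-z.]*)$")
PLAIN = re.compile(r"^([A-Za-z]+)_([A-Za-z]+)$")


def parse_package_folder(name):
    """Return (source, target, version), or None if the name fits neither form."""
    match = VERSIONED.match(name)
    if match:
        return match.group(1), match.group(2), match.group(3)
    match = PLAIN.match(name)
    if match:
        return match.group(1), match.group(2), None
    return None


def list_packages(packages_dir):
    """Map each recognised package folder that contains metadata.json to its parsed name."""
    found = {}
    for entry in sorted(Path(packages_dir).iterdir()):
        parsed = parse_package_folder(entry.name)
        if entry.is_dir() and parsed and (entry / "metadata.json").is_file():
            found[entry.name] = parsed
    return found
```

For example, parse_package_folder("translate-en_fr-1.9") gives ("en", "fr", "1.9"), while a folder that was extracted one level too deep will simply not show up in list_packages().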
Download the Tesseract language models (.traineddata files) you need and place them into:
AlyssumResources/tesseract/share/tessdata
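A quick way to confirm which OCR languages are in place is to list the .traineddata files in that folder. The helper below is a hypothetical sketch, not part of Alyssum; it assumes the script runs from the project root so the relative path resolves.

```python
from pathlib import Path


def installed_tesseract_languages(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file in tessdata_dir."""
    tessdata = Path(tessdata_dir)
    if not tessdata.is_dir():
        return []  # folder not created yet, so no models are installed
    # Tesseract names each model <lang>.traineddata, e.g. eng.traineddata.
    return sorted(p.stem for p in tessdata.glob("*.traineddata"))


if __name__ == "__main__":
    print(installed_tesseract_languages("AlyssumResources/tesseract/share/tessdata"))
```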
Select your preferred Whisper model in Settings — options include tiny, base, small, medium, large-v1, large-v2, large-v3, large, and large-v3-turbo.
.en models are hidden (English-only), and distil models are excluded due to performance issues during testing.
Downloaded models are stored in: AlyssumResources/models/whisper.
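The filtering rule for the model list can be illustrated with a few lines of Python. This is a sketch of the rule as described above, not the app's actual code; the function name and the sample list are assumptions.

```python
def selectable_models(names):
    """Keep multilingual, non-distil models: drop '.en' and 'distil' variants."""
    return [
        name
        for name in names
        if not name.endswith(".en") and "distil" not in name
    ]


if __name__ == "__main__":
    # Sample names only; the real list comes from the installed faster-whisper version.
    sample = ["tiny", "tiny.en", "base", "distil-large-v3", "large-v3-turbo"]
    print(selectable_models(sample))
```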
If CUDA is available, the app will automatically detect and use it for faster translation.
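To see whether CUDA is visible from your Python environment, a check along these lines may help. It assumes PyTorch (listed among the dependencies) is what reports GPU availability; how Alyssum performs its own detection is not shown here, and pick_device is a hypothetical name.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        # Imported lazily so the sketch still runs where PyTorch is not installed.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Translation would run on: {pick_device()}")
```

If this prints cpu on a machine with a compatible NVIDIA card, the driver or CUDA runtime is a likely place to look.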
Note: By default, this extension is Manifest V2 (MV2), which is no longer supported in modern Chrome. I wasn’t able to fully port it to MV3, so the MV3 version is provided separately. In MV3, the extension only supports opening PDFs manually; automatic PDF detection won't work.
- Enable Developer mode in chrome://extensions/.
- Click Load unpacked and select the alyssum-ext folder.
⚠️ If you try to install the .crx directly, Chrome may block it because it is not from the Chrome Web Store. Use Load unpacked instead.
(Optional: For Ungoogled Chromium or Supermium, you can enable chrome://flags/#extension-mime-request-handling → Always prompt for install, then drag the .crx file into Chrome.)
- Open about:config.
- Set xpinstall.signatures.required to false.
- Open Add-ons Manager → Settings → Install Add-on From File.
- Select firefox.xpi.
After installation, go to the app settings, copy the API key, and paste it into the extension settings.
Note: The API key only needs to be set once.
⚠️ Important: If you use uBlock Origin, make sure to disable the Block Outsider Intrusion into LAN filter. Otherwise, the extension will not be able to communicate with the local server.
- On web pages: Select any text and click the floating popup button to translate it directly in the popup window.
- When viewing a PDF in Chrome or Firefox: Click the extension button to open the internal PDF viewer with translation capabilities. You can translate any selectable text just like on regular web pages.
- When no PDF is open: Clicking the extension button will still open the internal PDF viewer. You can drag and drop any local PDF file into it to view and translate.
The application saves the window size, position, and API key in the system registry. To clear these settings, simply delete the following registry key:
HKEY_CURRENT_USER\Software\icosane\Alyssum
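If you prefer a script to regedit, the key can also be removed with Python's standard winreg module. This is a minimal sketch: clear_saved_settings is a hypothetical helper, and it assumes the key has no subkeys (winreg.DeleteKey refuses to delete a key that still contains subkeys).

```python
import sys

# Path under HKEY_CURRENT_USER, as given above.
REGISTRY_PATH = r"Software\icosane\Alyssum"


def clear_saved_settings():
    """Delete the settings key; return True if removed, False if there was nothing to remove."""
    if sys.platform != "win32":
        return False  # the registry only exists on Windows
    import winreg  # Windows-only standard library module

    try:
        winreg.DeleteKey(winreg.HKEY_CURRENT_USER, REGISTRY_PATH)
    except FileNotFoundError:
        return False  # the app has not stored any settings yet
    return True


if __name__ == "__main__":
    print("Settings cleared" if clear_saved_settings() else "Nothing to clear")
```

Close Alyssum before running it, since the app may write the key again on exit.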
The in-app translation and package installation work without issues, but there may be problems with OCR and audio input (pyaudio).
- OCR requires a screenshot utility. You may try gnome-screenshot, but in testing (Fedora 42 on Wayland) it did not work reliably.
- Audio input with pyaudio may fail depending on your distribution.
sudo apt-get install python3.12 python3-pyaudio gcc python3.12-dev gnome-screenshot tesseract
(On Fedora use the equivalent dnf or rpm package names.)
When creating a virtual environment in VSCode/VSCodium, make sure to select Python 3.12 explicitly from:
/usr/bin/python3.12
This project uses the following libraries and components, which may be licensed under open-source or proprietary terms.
Full license texts for each component are included in the licenses/ directory.
- Argos Translate - Machine translation library, licensed under the MIT License
- Tesseract OCR Engine - an OCR engine, licensed under the Apache License 2.0
- QFluentWidgets - a fluent design widgets library, licensed under the GNU General Public License v3.0
- argos-translate-files - File translation via Argos Translate, licensed under the GNU Affero General Public License v3.0
- PyQt5 - Python bindings for Qt v5, licensed under the GNU General Public License v3.0
- Flask - Micro web framework, licensed under the BSD 3-Clause License
- faster-whisper - audio to text transcription, licensed under the MIT License
- langdetect - Language detection, licensed under the Apache License 2.0
- pytesseract - a Python wrapper for Google Tesseract, licensed under the Apache License 2.0
- opencv-python - a library for computer vision and image processing, licensed under the MIT License
- pyautogui - GUI automation, licensed under the BSD 3-Clause License
- pillow - Python Imaging Library, licensed under the MIT-CMU License
- nvidia-cuda-runtime - GPU runtime, licensed under the NVIDIA EULA
- nvidia-cudnn - Deep learning acceleration, licensed under the NVIDIA EULA
- nvidia-cublas - GPU BLAS library, licensed under the NVIDIA EULA
- PyInstaller - bundles a Python application and all its dependencies into a single package. Licensed under the GPL 2.0 License and the Apache License 2.0
- waitress - WSGI server, licensed under the Zope Public License (ZPL) 2.1
- jsonify - CSV-to-JSON converter, licensed under the MIT License
- pytorch - Deep learning framework, licensed under the pytorch License
- pyaudio - audio I/O library, licensed under the MIT License
- psutil - process and system monitoring, licensed under the BSD 3-Clause License
- Tesseract portable
- Letter T icons by Luch Phou – Flaticon
- Sl-Alex for ShortcutEdit
This software contains source code provided by NVIDIA Corporation.
NOTE: This software depends on packages that may be licensed under different open-source or proprietary licenses.



