GitHub - icosane/Alyssum: Translate text, speech, books, and documents fully offline with OCR and Whisper.

Alyssum is an offline translator that combines the power of Argos Translate with Tesseract OCR and faster-whisper.
Translate text, documents, books, and even on-screen content — all without an internet connection.
Privacy-friendly and designed for quick everyday use.

License

Alyssum is licensed under the GNU Affero General Public License v3.0.
You are free to use, modify, and distribute it under the terms of the AGPL-3.0-or-later.

This software also contains third-party components released under various open-source licenses. Their license texts are provided in the licenses/ directory.

Features

Translate text, documents, books, and on-screen content completely offline.
Supports all languages available in Argos Translate.
Integrated OCR with Tesseract for capturing text from images and PDFs.
Voice input powered by faster-whisper for quick speech-to-translation.
Configurable shortcuts for all main actions — launch OCR, translate, clear windows, copy results, start voice input and translate files.
File translation for .txt, .odt, .odp, .docx, .pptx, .epub, .html, .srt, and .pdf.
Browser extension for translating selectable text without manual copy-paste.
Internal PDF viewer based on PDFjs, for translating text inside PDFs via browser extension.
GPU acceleration support for faster translation on compatible NVIDIA cards.
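
As an illustration of the file-type list above, here is a minimal sketch of an extension check you could run before handing a file to the translator. The helper name `is_supported` is hypothetical and not part of Alyssum; only the extensions themselves come from this README.

```python
from pathlib import Path

# Extensions copied from the feature list in this README.
SUPPORTED_EXTENSIONS = {
    ".txt", ".odt", ".odp", ".docx", ".pptx", ".epub", ".html", ".srt", ".pdf",
}


def is_supported(path):
    """Return True if the file extension is in the documented list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("book.EPUB")` is true, while `is_supported("notes.md")` is false.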

Screenshots

Main Window

Settings

OCR in action

Download compiled release

Get the latest Alyssum release on the GitHub Releases page.

⚠️ The archive is large and split into three parts (.7z.001, .7z.002, .7z.003). Download all parts and extract only the first part with 7-Zip — the rest will combine automatically.

After extraction:

Total size: ~6 GB
Run Alyssum.exe
Go to Settings to download Argos Translate packages for the languages you want.
If using voice input, download any Whisper model (larger models give better transcription).
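
Because extraction fails if any part of the split archive is missing, it can help to check the download folder first. The sketch below assumes the parts are named `<base>.7z.001` to `<base>.7z.003`; the base name `Alyssum` used in the example is an assumption, so substitute the actual file name from the Releases page.

```python
from pathlib import Path


def missing_parts(folder, base_name, part_count=3):
    """Return the names of expected .7z.NNN parts that are not present in folder."""
    folder = Path(folder)
    expected = [f"{base_name}.7z.{i:03d}" for i in range(1, part_count + 1)]
    return [name for name in expected if not (folder / name).is_file()]
```

Calling `missing_parts("Downloads", "Alyssum")` returns an empty list when all three parts are in place.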

Getting Started

⚠️ If you downloaded the compiled release, skip the installation steps below — just extract the archive and run Alyssum.exe.

Note: Alyssum is mainly developed and tested on Windows.
While it should work on Linux, I have limited time to test extensively, so some features may require additional setup or adjustments. Contributions and fixes are welcome.

Prerequisites

Python 3.12
Git
Windows (primary), Linux (see Linux issues)
NVIDIA GPU with CUDA 12.6 support (optional, for GPU acceleration)

Installation

Clone the repository:

git clone https://github.com/icosane/Alyssum.git

Navigate to the folder and create a virtual environment:
```
python -m venv .
```
Activate the virtual environment:
```
.\Scripts\activate
```
Install requirements:
```
pip install -r requirements.txt
```
Download Tesseract Portable or Tesseract and place it into:
```
./AlyssumResources/tesseract
```
Recommended structure:
```
tesseract/
├── bin/
├── include/
├── lib/
└── share/
    ├── man/
    └── tessdata/
```
If needed, adjust the TesseractManager search paths in config.py.
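
To confirm that Tesseract landed in the expected place, a small check like the following may be useful. It is only a sketch that mirrors the recommended tree above; it does not reproduce the search logic of TesseractManager, and the function name is hypothetical.

```python
from pathlib import Path

# Subdirectories from the recommended structure above.
EXPECTED_SUBDIRS = ["bin", "include", "lib", "share/man", "share/tessdata"]


def check_tesseract_layout(root="AlyssumResources/tesseract"):
    """Return the expected subdirectories that are missing under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]
```

An empty result means the layout matches the recommended structure; any returned entries are the folders to look for.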

Tip: You can also open the folder in Visual Studio Code or VSCodium, install the Python extension, then press Ctrl+Shift+P → Python: Create Environment → .venv → select requirements.txt.

Enable UTF-8 support in Windows (recommended for files with non-Latin characters):
- Settings → Time & language → Language & region → Administrative language settings → Change system locale
- Check Beta: Use Unicode UTF-8 for worldwide language support
- Reboot for changes to apply.

Optional: Building .EXE

Install PyInstaller:
```
pip install pyinstaller
```
Run:
```
pyinstaller build.spec
```

(The build.spec file is included in the repository.)

Translation Packages

Download Argos Translate packages via the Settings page, or manually from here.

For a manual install, extract the package folder into:

AlyssumResources/models/argostranslate/data/argos-translate/packages

Example structure:

AlyssumResources
└── models
    └── argostranslate
        └── data
            └── argos-translate
                └── packages
                    ├── translate-en_fr-1.9
                    │   ├── model
                    │   ├── stanza
                    │   ├── metadata.json
                    │   ├── README.md
                    │   └── config.json
                    └── en_de
                        ├── model
                        ├── stanza
                        ├── metadata.json
                        ├── README.md
                        └── config.json

Folder naming: langfrom_langto or translate-langfrom_langto-version.
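
The two naming patterns can be expressed as a short parser. This is a sketch for checking your own folder names, not the code Alyssum or Argos Translate uses; it assumes language codes consist of lowercase letters only.

```python
import re


def parse_package_dir(name):
    """Split a package folder name into (lang_from, lang_to, version).

    Accepts 'translate-langfrom_langto-version' or 'langfrom_langto'.
    Returns None if the name matches neither pattern.
    """
    match = re.fullmatch(r"translate-([a-z]+)_([a-z]+)-(.+)", name)
    if match:
        return match.group(1), match.group(2), match.group(3)
    match = re.fullmatch(r"([a-z]+)_([a-z]+)", name)
    if match:
        return match.group(1), match.group(2), None
    return None
```

With the example structure above, `translate-en_fr-1.9` parses to `("en", "fr", "1.9")` and `en_de` parses to `("en", "de", None)`.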

Tesseract Models

Get models from:

Place them into:

AlyssumResources/tesseract/share/tessdata

Voice input

Select your preferred Whisper model in Settings — options include tiny, base, small, medium, large-v1, large-v2, large-v3, large, and large-v3-turbo.

.en models are hidden (English-only), and distil models are excluded due to performance issues during testing.

Downloaded models are stored in: AlyssumResources/models/whisper.
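
The filtering rule described above (hide `.en` models, exclude distil models) could look roughly like this. The function and the sample model names in the usage note are illustrative assumptions, not Alyssum's actual implementation.

```python
def selectable_models(all_models):
    """Drop English-only (.en) and distil models from a list of model names."""
    return [
        name
        for name in all_models
        if not name.endswith(".en") and not name.startswith("distil")
    ]
```

Given `["tiny", "tiny.en", "distil-large-v3", "large-v3"]`, the function keeps only `tiny` and `large-v3`.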

GPU Acceleration

If CUDA is available, the app will automatically detect and use it for faster translation.
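
If you want to check what the app is likely to detect, the following sketch shows one common way to test for CUDA from Python using PyTorch (listed among the dependencies below). Whether Alyssum performs the check this way is an assumption; the function name is hypothetical.

```python
def pick_device():
    """Return 'cuda' if PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

Running `pick_device()` inside the project's virtual environment should return `cuda` on a machine with a compatible NVIDIA card and driver.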

Browser Extension

Chromium-based browsers

Note: By default, this extension is Manifest V2 (MV2), which is no longer supported in modern Chrome. I wasn’t able to fully port it to MV3, so the MV3 version is provided separately. In MV3, the extension only supports opening PDFs manually; automatic PDF detection won't work.

Enable Developer mode in chrome://extensions/.
Click Load unpacked and select the alyssum-ext folder.

⚠️ If you try to install the .crx directly, Chrome may block it because it is not from the Chrome Web Store. Use Load unpacked instead.

(Optional: For Ungoogled Chromium or Supermium, you can enable chrome://flags/#extension-mime-request-handling → Always prompt for install, then drag the .crx file into Chrome.)

Firefox-based browsers (or Firefox ESR, Developer Edition, Nightly)

Open about:config.
Set xpinstall.signatures.required to false.
Open Add-ons Manager → Settings → Install Add-on From File.
Select firefox.xpi.

After installation, go to the app settings, copy the API key, and paste it into the extension settings.

Note: The API key only needs to be set once.

⚠️ Important: If using uBlock Origin, make sure to disable the Block Outsider Intrusion into LAN filter. Otherwise, the extension will not be able to communicate with the local server.

Usage:

On web pages: Select any text and click the floating popup button to translate it directly in the popup window.
When viewing a PDF in Chrome or Firefox: Click the extension button to open the internal PDF viewer with translation capabilities. You can translate any selectable text just like on regular web pages.
When no PDF is open: Clicking the extension button will still open the internal PDF viewer. You can drag and drop any local PDF file into it to view and translate.

Registry entries (Windows)

The application saves the window size, position, and API key in the system registry. To clear these settings, simply delete the following registry key:

HKEY_CURRENT_USER\Software\icosane\Alyssum
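
Instead of using the Registry Editor, the key can also be removed with Python's standard `winreg` module. The sketch below is one possible way to do it and is not shipped with Alyssum; it deletes the key and any subkeys without asking, so the stored window geometry and API key are lost.

```python
import sys

KEY_PATH = r"Software\icosane\Alyssum"


def _delete_tree(root, path):
    """Delete a registry key and all of its subkeys (Windows only)."""
    import winreg

    with winreg.OpenKey(root, path) as key:
        while True:
            try:
                child = winreg.EnumKey(key, 0)  # always take the first remaining subkey
            except OSError:
                break  # no subkeys left
            _delete_tree(root, path + "\\" + child)
    winreg.DeleteKey(root, path)


def clear_alyssum_settings():
    """Remove the Alyssum key; return True if something was deleted."""
    if sys.platform != "win32":
        return False
    import winreg

    try:
        _delete_tree(winreg.HKEY_CURRENT_USER, KEY_PATH)
    except FileNotFoundError:
        return False  # key did not exist
    return True
```

Close Alyssum before running `clear_alyssum_settings()`, since the app may write the settings again on exit.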

Linux issues

The in-app translation and package installation work without issues, but there may be problems with OCR and audio input (pyaudio).

OCR requires a screenshot utility. You may try gnome-screenshot, but in testing (Fedora 42 on Wayland) it did not work reliably.
Audio input with pyaudio may fail depending on your distribution.

Minimum required packages

sudo apt-get install python3.12 python3-pyaudio gcc python3.12-dev gnome-screenshot tesseract-ocr

(On Fedora use the equivalent dnf or rpm package names.)
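
To see which of the external tools are actually on your PATH after installing the packages, a quick check with the standard library is enough. The tool list here is an assumption based on the packages named above.

```python
import shutil

# Executables that the packages above are expected to provide.
REQUIRED_TOOLS = ["gnome-screenshot", "tesseract"]


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}
```

Any entry that maps to `None` in the result of `find_tools()` points to a package that still needs to be installed.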

Development notes

When creating a virtual environment in VSCode/VSCodium, make sure to select Python 3.12 explicitly from:

/usr/bin/python3.12

Acknowledgments, Licenses and Third-Party Software

This project uses the following libraries and components, which may be licensed under open-source or proprietary terms.
Full license texts for each component are included in the licenses/ directory.

Core Libraries

Argos Translate - Machine translation library, licensed under the MIT License
Tesseract OCR Engine - an OCR engine, licensed under the Apache License 2.0
QFluentWidgets - a fluent design widgets library, licensed under the GNU General Public License v3.0
argos-translate-files - File translation via Argos Translate, licensed under the GNU Affero General Public License v3.0
PyQt5 - Python bindings for Qt v5, licensed under the GNU General Public License v3.0
Flask - Micro web framework, licensed under the BSD 3-Clause License
faster-whisper - audio to text transcription, licensed under the MIT License

Supporting Libraries & Tools

langdetect - Language detection, licensed under the Apache License 2.0
pytesseract - a Python wrapper for Google Tesseract, licensed under the Apache License 2.0
opencv-python - a library for computer vision and image processing, licensed under the MIT License
pyautogui - GUI automation, licensed under the BSD 3-Clause License
pillow - Python Imaging Library, licensed under the MIT-CMU License
nvidia-cuda-runtime - GPU runtime, licensed under the NVIDIA EULA
nvidia-cudnn - Deep learning acceleration, licensed under the NVIDIA EULA
nvidia-cublas - GPU BLAS library, licensed under the NVIDIA EULA
PyInstaller - bundles a Python application and all its dependencies into a single package, licensed under the GPL 2.0 License and the Apache License 2.0
waitress - WSGI server, licensed under the Zope Public License (ZPL) 2.1
jsonify - CSV-to-JSON converter, licensed under the MIT License
pytorch - Deep learning framework, licensed under the pytorch License
pyaudio - audio I/O library, licensed under the MIT License
psutil - process and system monitoring, licensed under the BSD 3-Clause License

Resources & References

This software contains source code provided by NVIDIA Corporation.

NOTE: This software depends on packages that may be licensed under different open-source or proprietary licenses.

