Organize old emails like a library.
Easy Mail Librarian (EML) is a lightweight and fast searching & viewing system for .eml archives, featuring FTS5-based full-text search, incremental indexing, and on-demand email expansion from original files.
Refer to docs/design.md for the design rationale of this project.
This app is designed for users who need to manage or rediscover their archived emails:
- Running out of IMAP storage? Free up space by moving old, untouched emails to local archives.
- Changed jobs or lost access? Keep a portable, searchable backup of emails from previous positions.
- Proactive about digital hygiene? Create organized, local backups of your important correspondence.
- Unsure what to delete? Safely archive emails locally first, then decide what to keep without the pressure of limited server storage.
- ...and anyone who wants instant, offline access to their email history.

The workflow has two steps:
- Backup: Export your emails as `.eml` files using your favorite email client (such as Outlook, Thunderbird, or Apple Mail) and save them to your local hard drive.
- Search & View: Use this app to instantly search through your archive and view any email in detail, which is ideal when you urgently need to find an old attachment, conversation, or piece of information.
- Simple functionality: update the library, then search your contents.
- Full-text search powered by SQLite FTS5: search across subject, sender, recipients, and body content.
- Explicit indexing model: clean separation between:
  - structured metadata (`emails` table)
  - search index (`emails_fts`)
  - original `.eml` files (source of truth)
- Fast, local, dependency-light: no internet connection needed
  - SQLite (no external DB)
  - Python backend (FastAPI)
  - Vanilla JS frontend
- On-demand email expansion:
  - Search results are lightweight
  - Full email body is loaded only when a result is clicked
- Research-friendly architecture:
  - Deterministic behavior
  - Inspectable SQL
  - Reproducible indexing
  - No hidden caching layers
- Flexible mobility: accessible from phones and tablets
Modern information tools increasingly assume cloud connectivity, centralized services, and opaque automation.
This project started from a simple question:
Can a personal information system be powerful, searchable, and user-friendly without giving up local control, transparency, or simplicity?
Rather than building another platform, this project explores a different direction: a small, self-contained system that does one thing well, remains understandable, and respects the user's autonomy.
This project intentionally avoids:
- Heavy frontend frameworks
- Opaque indexing layers
- Implicit caching
- Email client abstractions
Instead, it emphasizes:
- Traceability
- Determinism
- Minimal state
- Research reproducibility
Ideal for:
- Academic email corpora
- NLP / IR experiments
- Tooling for inspection and analysis
- Systems research and prototyping
```bash
pip install -r requirements.txt
```
(SQLite with FTS5 is required; most Python builds include it.)
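If you want to confirm that your Python build ships an FTS5-enabled SQLite, a small probe like the one below may help. It is only a sketch: the function name `fts5_available` and the throwaway table name are illustrative and are not part of this project.

```python
import sqlite3


def fts5_available() -> bool:
    """Return True if this Python's SQLite can create an FTS5 table."""
    conn = sqlite3.connect(":memory:")
    try:
        # Creating a throwaway virtual table raises OperationalError
        # when the FTS5 module is not compiled into SQLite.
        conn.execute("CREATE VIRTUAL TABLE fts5_probe USING fts5(txt)")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite version:", sqlite3.sqlite_version)
    print("FTS5 available:", fts5_available())
```

If the probe reports `False`, installing a Python distribution bundled with a newer SQLite is usually the simplest fix.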
In `backend/config.py`:
```python
from pathlib import Path

# MAIL_ROOT: the path to your .eml archive. The app will recursively traverse all .eml files
MAIL_ROOT = Path("/path/to/eml/archive")
# DB_PATH: the path where you want to store the database
DB_PATH = Path("/path/to/eml/database.db")
```
Simply run:
```
# In Windows command prompt
.\scripts\run.bat
# In Linux
chmod +x ./scripts/run.sh
./scripts/run.sh
```
Simply open http://localhost:8000 in your browser. Click Update Library to initialize the database if you have not already completed step 3. Once done, you are ready to enjoy fast search and convenient viewing.
- Choose a supported field:
  - Full text (`all`)
  - Subject
  - Sender
  - Recipients
  - Body
- Enter the keyword to search
- Click the `Search` button
- Click any result to expand the full text
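One way a field choice could be translated into an FTS5 query is with a column filter (`column : query`). The sketch below is an assumption about how this might be done, not the project's actual `search.py`. The helper names `build_match`, `search`, and `demo_connection` are hypothetical, the demo uses a plain (not external-content) FTS5 table for brevity, and the keyword is treated as a quoted phrase, which is stricter than an unquoted multi-word query.

```python
import sqlite3

SEARCHABLE_COLUMNS = ("subject", "sender", "recipients", "body")


def build_match(field: str, keyword: str) -> str:
    """Turn a (field, keyword) pair into an FTS5 MATCH expression."""
    # Quote the keyword as an FTS5 string; embedded quotes are doubled.
    phrase = '"' + keyword.replace('"', '""') + '"'
    if field == "all":
        return phrase  # no column filter: every indexed column is searched
    if field not in SEARCHABLE_COLUMNS:
        raise ValueError(f"unsupported field: {field!r}")
    return f"{field} : {phrase}"


def search(conn: sqlite3.Connection, field: str, keyword: str) -> list:
    """Run the query and return (rowid, subject) pairs, best match first."""
    sql = ("SELECT rowid, subject FROM emails_fts "
           "WHERE emails_fts MATCH ? ORDER BY rank")
    return conn.execute(sql, (build_match(field, keyword),)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory index with two made-up emails."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE emails_fts "
        "USING fts5(subject, sender, recipients, body)"
    )
    conn.executemany(
        "INSERT INTO emails_fts(subject, sender, recipients, body) "
        "VALUES (?, ?, ?, ?)",
        [
            ("Campus card renewal", "office@example.edu",
             "alice@example.edu", "Please renew your card before Friday."),
            ("Lunch on Friday", "bob@example.com",
             "alice@example.edu", "Meet near the campus card office?"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(search(conn, "subject", "campus card"))
    print(search(conn, "body", "campus card"))
    print(search(conn, "all", "campus card"))
```

Passing the expression as a bound parameter keeps user input out of the SQL text; the quoting step only guards the FTS5 query syntax itself.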
- Each search result is a clickable bar
- Clicking a result:
  - Loads the full email content from the original `.eml`
  - Expands inline
- Clicking again:
  - Folds the email body

This ensures:
- No duplication of large email bodies
- Excellent scalability
- Clear separation between indexing and presentation
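On-demand expansion can be sketched as a lookup of the stored `path` followed by a fresh parse of the original file. The code below is a minimal illustration using only the standard library; `load_full_body`, `expand`, and `demo` are hypothetical names, and the real backend may handle HTML parts, encodings, and errors differently.

```python
import sqlite3
import tempfile
from email import policy
from email.parser import BytesParser
from pathlib import Path


def load_full_body(eml_path: Path) -> str:
    """Parse the original .eml and return its plain-text body ('' if none)."""
    with eml_path.open("rb") as fh:
        msg = BytesParser(policy=policy.default).parse(fh)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


def expand(conn: sqlite3.Connection, email_id: int) -> str:
    """Resolve a search hit to its file path, then read the body from disk."""
    row = conn.execute(
        "SELECT path FROM emails WHERE id = ?", (email_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no email with id {email_id}")
    return load_full_body(Path(row[0]))


def demo() -> str:
    """Write one made-up message to a temp dir and expand it by id."""
    raw = (
        b"From: alice@example.com\r\n"
        b"To: bob@example.com\r\n"
        b"Subject: Hello\r\n"
        b"Content-Type: text/plain; charset=utf-8\r\n"
        b"\r\n"
        b"Hello from the archive\r\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        eml = Path(tmp) / "sample.eml"
        eml.write_bytes(raw)
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, path TEXT)")
        conn.execute("INSERT INTO emails (id, path) VALUES (1, ?)", (str(eml),))
        try:
            return expand(conn, 1)
        finally:
            conn.close()


if __name__ == "__main__":
    print(demo())
```

Because the body is read from the `.eml` each time, the file on disk stays the source of truth for what the viewer shows.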
```
.eml archive
     │
     ▼
[ indexer.py ]
     │  parses
     ▼
┌──────────────────┐
│ SQLite Database  │
│                  │
│ emails           │ ← structured storage
│ emails_fts       │ ← FTS5 index
└──────────────────┘
     ▲        │
     │        ▼
[ search.py ]  FTS MATCH
     │
     ▼
FastAPI backend
     │
     ▼
Minimal JS frontend
(click → expand → load full .eml)
```
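The indexing step at the top of the diagram might look roughly like the sketch below: walk the archive recursively, parse each new `.eml`, and store one row per file. This is not the project's `indexer.py`. The function names, the `UNIQUE` constraint on `path`, the lowercase sender normalization, and the "skip paths already in the table" rule are all assumptions made for illustration, and the FTS index is left out to keep the example short.

```python
import sqlite3
import tempfile
from email import policy
from email.parser import BytesParser
from email.utils import getaddresses, parseaddr
from pathlib import Path

SCHEMA = """
CREATE TABLE IF NOT EXISTS emails (
    id INTEGER PRIMARY KEY, path TEXT UNIQUE, subject TEXT,
    sender TEXT, recipients TEXT, body TEXT
)
"""


def parse_eml(path: Path) -> dict:
    """Extract the indexed fields from one .eml file."""
    with path.open("rb") as fh:
        msg = BytesParser(policy=policy.default).parse(fh)
    part = msg.get_body(preferencelist=("plain",))
    to_cc = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
    return {
        "path": str(path.resolve()),
        "subject": str(msg.get("Subject", "")),
        "sender": parseaddr(str(msg.get("From", "")))[1].lower(),
        "recipients": ", ".join(addr for _, addr in getaddresses(to_cc)),
        "body": part.get_content() if part is not None else "",
    }


def index_archive(conn: sqlite3.Connection, mail_root: Path) -> int:
    """Index files not seen before; return how many rows were added."""
    conn.execute(SCHEMA)
    known = {row[0] for row in conn.execute("SELECT path FROM emails")}
    added = 0
    for path in sorted(mail_root.rglob("*.eml")):
        if str(path.resolve()) in known:
            continue  # incremental: already indexed on an earlier run
        conn.execute(
            "INSERT INTO emails (path, subject, sender, recipients, body) "
            "VALUES (:path, :subject, :sender, :recipients, :body)",
            parse_eml(path),
        )
        added += 1
    conn.commit()
    return added


def demo() -> tuple:
    """Index a temp archive three times to show the incremental behavior."""
    text = "From: Alice <Alice@Example.com>\r\nTo: bob@example.com\r\nSubject: {s}\r\n\r\nBody of {s}\r\n"
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "2021").mkdir()
        (root / "a.eml").write_bytes(text.format(s="first").encode())
        (root / "2021" / "b.eml").write_bytes(text.format(s="second").encode())
        conn = sqlite3.connect(":memory:")
        first = index_archive(conn, root)
        second = index_archive(conn, root)
        (root / "c.eml").write_bytes(text.format(s="third").encode())
        third = index_archive(conn, root)
        conn.close()
    return first, second, third


if __name__ == "__main__":
    print(demo())
```

Comparing paths only detects new files; a real incremental indexer would probably also need to notice modified or deleted files, for example by storing a modification time or hash.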
| column | description |
|---|---|
| id | primary key |
| path | absolute path to .eml file |
| subject | decoded subject |
| sender | normalized sender email |
| recipients | comma-separated recipients |
| body | plain-text body (for indexing) |
- External content table (`content='emails'`)
- Indexed fields:
  - subject
  - sender
  - recipients
  - body
- Ranked using `bm25`
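The pieces listed above can be tried out in an in-memory database. The sketch below assumes `content_rowid='id'` and populates the index with the FTS5 `'rebuild'` command; the project may instead keep the index in sync with triggers or explicit inserts. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emails (
    id INTEGER PRIMARY KEY,
    path TEXT, subject TEXT, sender TEXT, recipients TEXT, body TEXT
);
-- External content table: the index stores no copy of the text and
-- reads column values back from `emails` when they are selected.
CREATE VIRTUAL TABLE emails_fts USING fts5(
    subject, sender, recipients, body,
    content='emails', content_rowid='id'
);
""")

conn.executemany(
    "INSERT INTO emails (path, subject, sender, recipients, body) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("/mail/1.eml", "Campus card renewal", "office@example.edu",
         "alice@example.edu",
         "Your campus card expires soon. Renew your campus card online."),
        ("/mail/2.eml", "Lunch on Friday", "bob@example.com",
         "alice@example.edu", "Meet at noon?"),
        ("/mail/3.eml", "Library hours", "library@example.edu",
         "alice@example.edu",
         "Bring your campus ID; the card reader is fixed."),
    ],
)

# With external content, the index is not filled automatically;
# 'rebuild' regenerates it from the rows currently in `emails`.
conn.execute("INSERT INTO emails_fts(emails_fts) VALUES('rebuild')")

# bm25() returns lower values for better matches, so sort ascending.
results = conn.execute(
    "SELECT rowid, subject, bm25(emails_fts) AS score "
    "FROM emails_fts WHERE emails_fts MATCH ? ORDER BY score",
    ("campus card",),
).fetchall()

for rowid, subject, score in results:
    print(rowid, subject, score)
```

An unquoted query such as `campus card` matches rows that contain both terms anywhere in the indexed columns, so the second sample email is not returned.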
This project adopts a local-first, developer-oriented approach to information management. All indexing, querying, and processing run entirely on the local machine, with no cloud services, external APIs, or accounts required. This ensures data sovereignty, predictable offline behavior, long-term viability, and privacy by default.
The system is intentionally built from simple, well-understood components rather than a heavyweight framework:
- SQLite + FTS5 for robust full-text search with minimal operational cost
- A lightweight Python backend for orchestration and cross-platform extensibility
- A browser-based UI as a thin interaction layer, not a dependency
This separation enables independent evolution of subsystems, easier debugging, and clear reasoning about system behavior.
Transparency and inspectability are favored over automation and hidden abstractions:
- Explicit workflows instead of opaque pipelines
- Inspectable data formats instead of black boxes
- Clear failure modes instead of silent errors
- Programmatic access as a first-class interface
The Web UI is optional; all core functionality remains directly accessible through code.
To maintain focus and reduce maintenance burden, the project deliberately excludes:
- User accounts and multi-user features
- Cloud synchronization
- Heavy customization frameworks
- Binary-only distribution
These constraints keep the system centered on correctness, clarity, and understandability rather than feature breadth.
Beyond utility, the codebase is designed to be readable as an engineering narrative. Architectural decisions are reflected in directory structure, module boundaries, naming, and documentation that explains why choices were made, not just what they do.
The result is a lean, compositional system that delivers practical functionality while remaining transparent, inspectable, and instructive by design.
```bash
# List the tables in the database
sqlite3 eml-search.db ".tables"
```
```bash
# Run a full-text query directly against the FTS index
sqlite3 eml-search.db \
"SELECT subject FROM emails_fts WHERE emails_fts MATCH 'campus card';"
```
```sql
-- Rebuild the FTS index from the content table
INSERT INTO emails_fts(emails_fts) VALUES('rebuild');
```
- HTML email rendering (`text/html`)
- Threading / conversation grouping
- Attachment indexing
- Advanced FTS ranking or custom scoring
- Scenario-based or semantic search integration
- Adding `frontend/favicon.ico`
- Updating the run scripts in `scripts/`

- Core functionality complete
- Stable indexing and search
- UI interaction fully working
This is a solid, extensible foundation, not a throwaway prototype. Contributions are welcome!
Made with ❤️ by Paradoxsolver (paradoxsolver@hotmail.com)