An intelligent OCR-based receipt analysis tool that extracts key information from receipt images.
- Extracts merchant name, date, amount, and currency
- Supports multiple receipt types (invoice, receipt, order)
- Handles multiple currency formats (USD, EUR, GBP, CNY, HKD, JPY)
- Provides confidence scores for extracted data
- Includes fuzzy matching for better accuracy
- Caches results for improved performance

Requirements:
- Python 3.7+
- doctr
- matplotlib
- fuzzywuzzy
- python-Levenshtein (optional, for better performance)

Installation:
- Clone the repository:

      git clone https://github.com/xer0bit/receipt-analyzer.git
      cd receipt-analyzer

- Install the dependencies:

      pip install -r requirements.txt

Usage:

    from analyzer import analyze_receipt
    result = analyze_receipt('path/to/receipt.jpg')
    print(result)

The analyzer returns a dictionary containing:
- merchant_name: Extracted business name
- bill_date: Date in YYYY-MM-DD format
- amount: Transaction amount
- currency: Detected currency code
- description: Transaction description
- type: Receipt type (invoice/receipt/order)
- confidence_score: Confidence level (0-1)
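
As a rough illustration of working with the returned dictionary, the sketch below flags low-confidence results for manual review. The sample values, the `REVIEW_THRESHOLD` cut-off, and the helper names `needs_review` and `summarize` are assumptions made for this example; only the dictionary keys come from the list above.

```python
# Hypothetical sample shaped like the documented return value;
# real values would come from analyze_receipt().
sample_result = {
    "merchant_name": "Example Cafe",
    "bill_date": "2024-01-15",
    "amount": 12.50,
    "currency": "USD",
    "description": "Coffee and pastry",
    "type": "receipt",
    "confidence_score": 0.62,
}

# Assumed cut-off, not part of the analyzer; tune it for your own data.
REVIEW_THRESHOLD = 0.8


def needs_review(result, threshold=REVIEW_THRESHOLD):
    """Return True when the confidence score is missing or below the threshold."""
    score = result.get("confidence_score")
    return score is None or score < threshold


def summarize(result):
    """Build a one-line summary from the documented fields."""
    return (
        f"{result['bill_date']} {result['merchant_name']}: "
        f"{result['amount']} {result['currency']} ({result['type']})"
    )


print(summarize(sample_result))
if needs_review(sample_result):
    print("Low confidence: check this receipt manually.")
```

In real use, pass the dictionary returned by `analyze_receipt` instead of `sample_result`.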
Customize the behavior by modifying config.py:
- Supported currencies
- Keywords for receipt classification
- Path configurations
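
The contents of config.py are not shown here, so the following is only a sketch of what such settings might look like. Every variable name and keyword list below is hypothetical; the currency codes and receipt types are taken from the feature list above. Check the actual file for the real names before editing.

```python
from pathlib import Path

# All names in this sketch are hypothetical; config.py may use different ones.

# Currency codes the analyzer should recognize (from the feature list).
SUPPORTED_CURRENCIES = ["USD", "EUR", "GBP", "CNY", "HKD", "JPY"]

# Keywords that might be used to classify a document by type (assumed values).
RECEIPT_TYPE_KEYWORDS = {
    "invoice": ["invoice", "bill to", "due date"],
    "receipt": ["receipt", "thank you", "change"],
    "order": ["order", "order number", "shipping"],
}

# Assumed location for cached results, relative to the working directory.
CACHE_DIR = Path("cache")
```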
The analyzer includes validation checks and will raise ValueError for invalid receipts. Always wrap the analyze_receipt call in a try-except block:

    from analyzer import analyze_receipt

    try:
        result = analyze_receipt('path/to/receipt.jpg')
    except ValueError as e:
        # Raised when validation rejects the receipt
        print(f"Error: {e}")

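
When processing several files, the same try-except pattern can collect failures without stopping the whole run. This is a minimal sketch, not part of the project: `analyze_many` and `fake_analyze` are hypothetical helpers, and `fake_analyze` merely stands in for `analyze_receipt` so the example runs on its own.

```python
def analyze_many(paths, analyze):
    """Run `analyze` on each path, collecting results and errors separately."""
    results, errors = {}, {}
    for path in paths:
        try:
            results[path] = analyze(path)
        except ValueError as exc:
            # Invalid receipt: record the reason and keep going.
            errors[path] = str(exc)
    return results, errors


def fake_analyze(path):
    """Hypothetical stand-in for analyze_receipt, used only for this demo."""
    if path.endswith(".txt"):
        raise ValueError("not a receipt image")
    return {"merchant_name": "Example Cafe", "confidence_score": 0.9}


results, errors = analyze_many(["a.jpg", "notes.txt"], fake_analyze)
print(results)
print(errors)
```

In real use, pass `analyze_receipt` as the second argument in place of `fake_analyze`.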
- Start the API server: