Callowlock/PineappleExpenseBackend

Overview

This project preprocesses receipt images to improve the quality of text extraction. The pipeline increases image clarity and contrast before OCR is run. Tesseract OCR was dropped in favor of AWS Textract, and the final text extraction pipeline is documented in the ReceiptProcessing repository.

Repository Contents

1. PreProcessFinal.ipynb

This notebook handles:

  • Loading and processing receipt images
  • Applying transformations to enhance text visibility
  • Saving processed images for further text extraction
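
The three steps above can be sketched without any third-party dependency. This is an illustration only, not the notebook's actual code: the README does not say which transformations or libraries the notebook uses, so the plain-text PGM format, contrast stretching, Otsu thresholding, and every function name below (`read_pgm`, `write_pgm`, `stretch_contrast`, `otsu_threshold`, `binarize`, `preprocess`) are assumptions chosen to keep the example self-contained.

```python
from pathlib import Path


def read_pgm(path):
    """Load a plain-text (P2) PGM file as a list of rows of grey levels (0-255 assumed)."""
    lines = Path(path).read_text().splitlines()
    # Drop '#' comments, then flatten the file into whitespace-separated tokens.
    tokens = [t for line in lines for t in line.split("#")[0].split()]
    if not tokens or tokens[0] != "P2":
        raise ValueError("expected a plain-text PGM (P2) file")
    width, height = int(tokens[1]), int(tokens[2])
    values = [int(t) for t in tokens[4:4 + width * height]]  # tokens[3] is maxval
    return [values[r * width:(r + 1) * width] for r in range(height)]


def write_pgm(path, image):
    """Save a list of rows of grey levels as a plain-text (P2) PGM file."""
    header = f"P2\n{len(image[0])} {len(image)}\n255\n"
    rows = [" ".join(str(v) for v in row) for row in image]
    Path(path).write_text(header + "\n".join(rows) + "\n")


def stretch_contrast(image):
    """Linearly rescale grey levels so the darkest pixel maps to 0 and the brightest to 255."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in image]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in image]


def otsu_threshold(image):
    """Return the grey level that maximises between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        variance = w_dark * w_light * (mean_dark - mean_light) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


def binarize(image, threshold):
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in image]


def preprocess(in_path, out_path):
    """Load an image, enhance it, save the result, and return the processed pixels."""
    stretched = stretch_contrast(read_pgm(in_path))
    result = binarize(stretched, otsu_threshold(stretched))
    write_pgm(out_path, result)
    return result
```

Calling `preprocess("receipt.pgm", "receipt_clean.pgm")` (hypothetical file names) would write a black-and-white version of the input. In practice an image library would normally replace the hand-written file handling and pixel loops.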

Key Highlights

  • Image preprocessing techniques for improved text extraction
  • Python-driven approach with automation capabilities
  • Transition from Tesseract OCR to AWS Textract for better results
  • See the ReceiptProcessing repository for the updated receipt text extraction pipeline
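
To give a rough idea of what the Textract side of the transition can look like, here is a minimal sketch. It assumes the `DetectDocumentText` operation called through `boto3`; this README does not say which Textract operation the ReceiptProcessing repository actually uses, and the helper names `extract_lines` and `detect_receipt_lines` are hypothetical.

```python
def extract_lines(response, min_confidence=0.0):
    """Collect the text of LINE blocks from a DetectDocumentText response dict.

    Textract reports Confidence on a 0-100 scale; min_confidence filters on it.
    """
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


def detect_receipt_lines(image_bytes, client=None, min_confidence=0.0):
    """Send image bytes to Textract and return the detected text lines.

    Assumes AWS credentials and a region are already configured.
    """
    if client is None:
        # Imported lazily so extract_lines stays usable without boto3 installed.
        import boto3
        client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response, min_confidence)
```

Keeping the response parsing in its own function means it can be exercised with a saved or hand-built response, without making a call to AWS.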
