Callowlock/ReceiptProcessing

Overview

This project automates receipt processing using AWS Textract and Step Functions, with a focus on comparing different machine learning models for data extraction. The repository includes Jupyter notebooks demonstrating:

  • Uploading receipts and processing them using AWS Textract
  • Implementing AWS Step Functions to orchestrate the workflow
  • Comparing multiple machine learning models for improved accuracy

Repository Contents

1. UploadReceiptsSaveTextract.ipynb

This notebook handles:

  • Uploading receipt images to an S3 bucket
  • Extracting text from receipts using AWS Textract
  • Saving the extracted data in a structured format for further processing
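
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the `bucket` and `key` arguments, and the JSON output layout are all assumptions. The AWS clients are passed in as arguments, so a caller would supply real ones such as `boto3.client("s3")` and `boto3.client("textract")`.

```python
import json


def upload_receipt(s3_client, local_path, bucket, key):
    """Upload one receipt image to S3 (bucket and key are caller-supplied)."""
    s3_client.upload_file(local_path, bucket, key)


def extract_lines(textract_client, bucket, key):
    """Run Textract text detection on an S3 object and keep the LINE blocks."""
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [
        {"text": block["Text"], "confidence": block["Confidence"]}
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


def save_lines(lines, out_path):
    """Write the extracted lines to a JSON file (hypothetical output layout)."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump({"lines": lines}, fh, indent=2)
```

Keeping only `LINE` blocks is one possible choice; Textract also returns `PAGE` and `WORD` blocks, and its expense analysis API may suit receipts better if structured fields such as totals are needed.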

2. ModelComparison.ipynb

This notebook focuses on:

  • Comparing different machine learning models for receipt data classification
  • Evaluating model performance using accuracy metrics
  • Visualizing model results
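
As a rough illustration of an accuracy-based comparison, the sketch below scores several models' predictions against the same ground-truth labels and ranks them. The label names, the model names, and the helper functions are hypothetical and are not taken from the notebook, which may use a library such as scikit-learn instead.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def compare_models(y_true, predictions):
    """Return (model_name, accuracy) pairs sorted best-first."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical labels and predictions, for illustration only.
y_true = ["grocery", "fuel", "dining", "grocery"]
predictions = {
    "model_a": ["grocery", "fuel", "dining", "dining"],
    "model_b": ["grocery", "fuel", "grocery", "dining"],
}
ranking = compare_models(y_true, predictions)
```

The ranked pairs could then be passed to a plotting library to produce a bar chart of accuracy per model.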

3. StepFunctionDescription.ipynb

This notebook explains:

  • The design and implementation of AWS Step Functions as a pipeline to process data
  • How users of the Android app interact with the Step Functions workflow through a REST API
  • The state transitions involved in processing receipts
  • How extracted data is stored and analyzed
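
To make the idea of state transitions concrete, here is a minimal sketch of a three-state pipeline written in Amazon States Language, plus a helper that starts an execution. The state names, the input fields, and the resource ARNs are assumptions for illustration; the actual state machine described in the notebook may differ. The Step Functions client is passed in, so a caller would supply `boto3.client("stepfunctions")`.

```python
import json


def build_definition(extract_arn, classify_arn, store_arn):
    """Build an Amazon States Language definition with three Task states."""
    return {
        "Comment": "Hypothetical receipt-processing pipeline",
        "StartAt": "ExtractText",
        "States": {
            "ExtractText": {
                "Type": "Task",
                "Resource": extract_arn,
                "Next": "ClassifyReceipt",
            },
            "ClassifyReceipt": {
                "Type": "Task",
                "Resource": classify_arn,
                "Next": "StoreResult",
            },
            # The final state ends the execution instead of naming a successor.
            "StoreResult": {"Type": "Task", "Resource": store_arn, "End": True},
        },
    }


def start_processing(sfn_client, state_machine_arn, bucket, key):
    """Start one execution for a receipt stored in S3; returns the execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"bucket": bucket, "key": key}),
    )
    return response["executionArn"]
```

A REST endpoint called by the Android app would ultimately trigger something equivalent to `start_processing`; how that endpoint is wired up is not shown here.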

Key Highlights

  • Integration of AWS Textract for automated text extraction from receipts
  • Use of AWS Step Functions to manage and orchestrate workflows
  • Application of machine learning techniques to classify and analyze extracted data
  • Python-driven implementation with structured, reusable Jupyter Notebook workflows
  • Data visualization techniques to compare and evaluate model performance
