This project automates receipt processing using AWS Textract and Step Functions, with a focus on comparing different machine learning models for data extraction. The repository includes Jupyter notebooks demonstrating:
- Uploading receipts and processing them using AWS Textract
- Implementing AWS Step Functions to orchestrate the workflow
- Comparing multiple machine learning models for improved accuracy

This notebook handles:
- Uploading receipt images to an S3 bucket
- Extracting text from receipts using AWS Textract
- Saving the extracted data in a structured format for further processing
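
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes boto3-style S3 and Textract clients are passed in, uses Textract's `detect_document_text` call (the notebook may use a different Textract API), and all function names, the bucket, the key, and the JSON output layout are hypothetical.

```python
import json
from pathlib import Path


def upload_receipt(s3_client, image_path, bucket, key):
    """Upload one local receipt image to S3 under the given key."""
    s3_client.upload_file(str(image_path), bucket, key)


def lines_from_response(response):
    """Keep only LINE blocks from a Textract response, with their confidence."""
    return [
        {"text": block["Text"], "confidence": block.get("Confidence")}
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def extract_receipt_lines(textract_client, bucket, key):
    """Run Textract text detection on an image already stored in S3."""
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return lines_from_response(response)


def save_lines(lines, out_path):
    """Write the extracted lines to a JSON file (assumed output format)."""
    Path(out_path).write_text(json.dumps(lines, indent=2), encoding="utf-8")
```

With boto3 installed and AWS credentials configured, the clients would typically be created with `boto3.client("s3")` and `boto3.client("textract")`. Passing them in as arguments keeps the parsing logic easy to test without calling AWS.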

This notebook focuses on:
- Comparing different machine learning models for receipt data classification
- Evaluating model performance using accuracy metrics
- Visualizing model results
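
As a rough illustration of the comparison described above, the sketch below scores several models by accuracy and ranks them. It is an assumption-laden stand-in: the model names and labels are made-up placeholders, the helper names are hypothetical, and the text bar chart merely substitutes for whatever plotting the notebook actually uses.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def compare_models(y_true, predictions):
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def text_bar_chart(ranking, width=20):
    """Render the ranking as a plain-text bar chart, one model per line."""
    lines = []
    for name, score in ranking:
        bar = "#" * round(score * width)
        lines.append(f"{name:<12} {bar} {score:.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder labels and predictions, for illustration only.
    y_true = ["grocery", "fuel", "grocery", "dining"]
    predictions = {
        "model_a": ["grocery", "fuel", "dining", "dining"],
        "model_b": ["grocery", "grocery", "grocery", "grocery"],
    }
    print(text_bar_chart(compare_models(y_true, predictions)))
```

Accuracy alone can mislead when receipt categories are imbalanced, so a real comparison may also want per-class metrics.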

This notebook explains:
- The design and implementation of AWS Step Functions as a pipeline to process data
- How users of the Android app interact with the Step Functions workflow through a REST API
- The state transitions involved in processing receipts
- How extracted data is stored and analyzed

Key features:
- Integration of AWS Textract for automated text extraction from receipts
- Use of AWS Step Functions to manage and orchestrate workflows
- Application of machine learning techniques to classify and analyze extracted data
- Python-driven implementation with structured, reusable Jupyter Notebook workflows
- Data visualization techniques to compare and evaluate model performance
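
To make the Step Functions orchestration and its state transitions more concrete, here is a minimal sketch of a linear pipeline written in Amazon States Language as a Python dict. The state names (`ExtractText`, `ClassifyReceipt`, `StoreResult`), the resource ARNs, and the input payload shape are all hypothetical; the project's real state machine may differ.

```python
import json


def build_definition(extract_arn, classify_arn, store_arn):
    """Return a States Language definition with three sequential Task states."""
    return {
        "Comment": "Hypothetical receipt-processing pipeline",
        "StartAt": "ExtractText",
        "States": {
            "ExtractText": {
                "Type": "Task",
                "Resource": extract_arn,
                "Next": "ClassifyReceipt",
            },
            "ClassifyReceipt": {
                "Type": "Task",
                "Resource": classify_arn,
                "Next": "StoreResult",
            },
            "StoreResult": {"Type": "Task", "Resource": store_arn, "End": True},
        },
    }


def transition_order(definition):
    """Follow Next pointers from StartAt; assumes a linear, acyclic machine."""
    order = []
    name = definition["StartAt"]
    while name is not None:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


def start_receipt_execution(sfn_client, state_machine_arn, bucket, key):
    """Start one execution, passing the receipt's S3 location as JSON input."""
    return sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"bucket": bucket, "key": key}),
    )
```

A REST endpoint called by the Android app would presumably end up triggering something like `start_receipt_execution`, with `sfn_client` created via `boto3.client("stepfunctions")`; how the project actually wires this up is not shown here.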