This project is an intelligent document processing pipeline, built locally, that automatically extracts key information from documents such as certificates and IDs. The goal is to reduce manual data entry, improve accuracy, and learn key skills across the full product development lifecycle.
Prerequisites:
- Python 3.8+
- An OpenAI API Key
- Clone the repository:
    git clone <your-repo-url>
    cd ai_doc_processor
- Create a virtual environment and install dependencies. You can use the automated setup script:
    bash setup.sh
  Or run the commands manually:
    python3 -m venv ai_doc_venv
    source ai_doc_venv/bin/activate
    pip install -r requirements.txt
- Set up your environment variables. Create a .env file in the root directory and add your API key:
    OPENAI_API_KEY='sk-...'
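This document does not say how the script reads the .env file; a common choice is the python-dotenv package, but that is an assumption about this project. As a rough illustration of what loading the file involves, here is a minimal standard-library sketch. The helper name `load_env_file` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE pairs from a .env-style file into os.environ.

    Hypothetical helper for illustration; not part of this project.
    Returns a dict of the keys found in the file and their effective values.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        # No file: nothing to load, leave the environment untouched.
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip()
        # Remove one layer of matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        # Values already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded
```

Calling `load_env_file()` from the project root before any API client is created would make `OPENAI_API_KEY` available through `os.environ`. The sketch handles only simple `KEY=VALUE` lines; it does not support multi-line values or variable expansion.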
To run the main analysis script:
python main.py
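Before running the script, it can help to confirm the two prerequisites this document lists: Python 3.8+ and an OpenAI API key. The following is a small sketch of such a preflight check, assuming the key is exposed as the `OPENAI_API_KEY` environment variable. The function `check_prerequisites` is hypothetical and not part of this repository.

```python
import os
import sys


def check_prerequisites(version_info=None, environ=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; not part of this project.
    Both arguments default to the running interpreter and environment.
    """
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ
    problems = []
    major_minor = tuple(version_info[:2])
    # This document lists Python 3.8+ as a requirement.
    if major_minor < (3, 8):
        problems.append("Python 3.8+ is required, found %d.%d" % major_minor)
    # The key counts as set only if it is present and not blank.
    if not environ.get("OPENAI_API_KEY", "").strip():
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current interpreter and environment; printing the returned list and exiting early when it is non-empty gives a clearer failure than an error raised partway through processing.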