A FastAPI-based service that automates the extraction and parsing of information from experience letters and relieving letters using OCR and Large Language Models.
- Document Processing: Supports both PDF and image formats
- OCR Integration: Uses doctr for accurate text extraction
- LLM Processing: Leverages Predibase's Mistral-7B model for intelligent information extraction
- Structured Output: Returns parsed information in consistent JSON format
- Secure API: Includes basic authentication for API endpoints
- Comprehensive Logging: Built-in logging system for debugging and monitoring
The service follows a modular architecture:
- FastAPI for RESTful API endpoints
- Document OCR processing using doctr
- LLM-based text parsing using Predibase
- JSON response formatting
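The modular layout above can be pictured as two swappable stages feeding a fixed response shape. The following is only a sketch: the `Pipeline` class and its `ocr` and `parse` callables are hypothetical names for illustration and are not taken from the repository. In the real service these stages would wrap doctr and Predibase.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Pipeline:
    """Hypothetical composition of the OCR and LLM parsing stages."""

    ocr: Callable[[bytes], str]              # raw file bytes -> extracted text
    parse: Callable[[str], Dict[str, str]]   # extracted text -> structured fields

    def run(self, filename: str, data: bytes) -> dict:
        text = self.ocr(data)
        fields = self.parse(text)
        # Mirrors the per-file response shape documented in this README.
        return {"filename": filename, "file_output": fields}
```

Keeping each stage behind a plain callable makes it possible to unit-test the response formatting with stub functions, without a GPU or an API token.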
- Python 3.8+
- CUDA-compatible GPU (for OCR processing)
- Required Python packages:
- fastapi
- uvicorn
- python-multipart
- python-doctr (the OCR library, imported as doctr)
- predibase
- python-jose[cryptography]
- Clone the repository:
git clone https://github.com/rohandhupar1996/paperface-llm.git
cd OCR_PREDIBASE
- Install dependencies:
pip install -r requirements.txt
- Configure the application:
- Update src/configs/config.json with your Predibase API token and model preferences
- Modify logging configuration if needed
- Start the server:
python main.py
The server will start on http://0.0.0.0:9001
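As a rough illustration of the configuration step, the sketch below loads the JSON config and lets an environment variable override the token. The key name `api_token` and the variable name `PREDIBASE_API_TOKEN` are assumptions made for this example; the actual schema of src/configs/config.json is not documented here.

```python
import json
import os
from pathlib import Path


def load_config(path="src/configs/config.json", env=None):
    """Load the JSON config; an environment variable, if set, overrides the token."""
    env = os.environ if env is None else env
    config = json.loads(Path(path).read_text(encoding="utf-8"))

    # Hypothetical names: "api_token" and PREDIBASE_API_TOKEN are not from the repo.
    token = env.get("PREDIBASE_API_TOKEN")
    if token:
        config["api_token"] = token

    if not config.get("api_token"):
        raise ValueError("No Predibase API token found in config file or environment")
    return config
```

Failing early on a missing token gives a clearer error at startup than a rejected request during the first upload.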
- API Endpoints:
- Health Check:
GET /health
- Document Upload and Processing:
POST /upload/
- Authentication:
- Default credentials:
- Username: admin
- Password: secret
Upload one or more experience/relieving letters for processing.
Request:
- Method: POST
- Authentication: Basic Auth
- Content-Type: multipart/form-data
- Body: List of files (PDF/Images)
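A request matching the description above could be assembled with the standard library alone, as in this sketch. The form field name `files` and the `localhost` URL are assumptions; check the endpoint signature in the source for the real field name. The function only builds the request and does not send it.

```python
import base64
import uuid
import urllib.request


def build_upload_request(files, url="http://localhost:9001/upload/",
                         username="admin", password="secret", field="files"):
    """Build a multipart/form-data POST with Basic Auth.

    files: list of (filename, content_bytes, content_type) tuples.
    field: form field name; "files" is an assumption, not confirmed by the README.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content, content_type in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # The closing boundary marks the end of the multipart body.
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")

    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the result to `urllib.request.urlopen` and decode the body with `json.load`. Several files can be uploaded in one call by adding more tuples to the list.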
Response:
[
  {
    "filename": "example.pdf",
    "file_output": {
      "FullName": "John Doe",
      "Employer": "Example Corp",
      "EmployeeId": "EMP123",
      "Designation": "Software Engineer",
      "Location": "New York",
      "DateOfJoining": "01/01/20",
      "LastDateOfWorking": "31/12/22"
    }
  }
]
The service includes comprehensive error handling for:
- File type validation
- OCR processing errors
- LLM processing failures
- Authentication failures
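File type validation, the first item above, is often done by inspecting the leading bytes of an upload instead of trusting its extension. The sketch below shows that approach; it is not the service's actual check, and the set of accepted image formats (PNG and JPEG here) is an assumption.

```python
# Leading "magic" bytes for each supported type (PNG and JPEG are assumed formats).
MAGIC_PREFIXES = {
    "application/pdf": (b"%PDF-",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
}


def sniff_file_type(data: bytes):
    """Return the MIME type of a supported upload, or None if unrecognized."""
    for mime, prefixes in MAGIC_PREFIXES.items():
        if data.startswith(prefixes):
            return mime
    return None
```

An endpoint could reject any file for which `sniff_file_type` returns `None` before spending GPU time on OCR.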
Logs are written to app.log and include:
- Request processing details
- OCR operation status
- LLM processing results
- Error traces for debugging
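A logging setup along these lines would produce the kind of entries listed above. This is a sketch using the standard `logging` module; the logger name `paperface` and the format string are illustrative choices, not the project's actual configuration.

```python
import logging


def build_logger(log_path="app.log", name="paperface"):
    """Return a logger that appends timestamped records to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `logger.exception(...)` inside an `except` block records the full traceback, which covers the error traces mentioned in the list.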
- Basic authentication required for API endpoints
- Configuration stored in separate config file
- Sensitive information should be managed through environment variables
- File type validation before processing
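One way to combine the Basic Auth and environment variable points above is to read the expected credentials from the environment and compare them in constant time. In this sketch the variable names `API_USERNAME` and `API_PASSWORD` are hypothetical, and the fallbacks are the documented defaults, which should be replaced in any real deployment.

```python
import os
import secrets


def credentials_ok(username: str, password: str, env=None) -> bool:
    """Check Basic Auth credentials against values from the environment."""
    env = os.environ if env is None else env
    # Hypothetical variable names; the fallbacks are the README's default credentials.
    expected_user = env.get("API_USERNAME", "admin")
    expected_pass = env.get("API_PASSWORD", "secret")

    # compare_digest avoids leaking match position through timing differences.
    user_ok = secrets.compare_digest(username.encode("utf-8"), expected_user.encode("utf-8"))
    pass_ok = secrets.compare_digest(password.encode("utf-8"), expected_pass.encode("utf-8"))
    return user_ok and pass_ok
```

Both comparisons run before the result is combined, so a wrong username takes the same path as a wrong password.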
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request