An intelligent document processing system that uses the Model Context Protocol (MCP) to extract, analyze, and route business documents automatically.
This project demonstrates how to use MCP to solve a real business challenge: automating document processing workflows. The system can:
- Classify incoming documents (invoices, contracts, emails)
- Extract relevant information using ML models
- Process documents according to their type
- Maintain context throughout the processing pipeline
- Expose functionality through a REST API
- Context Objects: Central to MCP, these objects (implemented in MCPContext) carry information between processing steps and maintain the document's state.
- Memory System: Stores context objects between processing steps, with pluggable backends.
- Protocols: Defines clear interfaces for processors and models, ensuring modularity.
- Router: Routes each document to the first registered processor that reports it can handle the document.
This solution addresses several business challenges:
- Reduced Manual Processing: Automates extraction of data from documents
- Consistency: Ensures consistent processing across document types
- Auditability: Maintains processing history and confidence scores
- Scalability: Modular design allows adding new document types easily
- Uses BERT-based models for classification and entity extraction
- T5 model for document summarization
- FastAPI for REST interface
- Pluggable architecture for easy extension
- Comprehensive logging and error handling
- React-based UI for interacting with the processing pipeline
The MCP Document Processor is designed to solve the common business challenge of processing various types of documents (invoices, contracts, emails, etc.) in a consistent and automated way. It utilizes the Model Context Protocol framework to manage information flow between different components of the system.
- Document Classification: Automatically identifies document types
- Information Extraction: Extracts key information from documents
- Document Routing: Routes documents to the appropriate processors
- Context Management: Maintains context throughout the processing pipeline
- API Interface: Provides a RESTful API for integration with other systems
The system is built around the Model Context Protocol (MCP), which provides:
- Context Objects: Carry information across processing steps

      # Example of MCPContext usage
      context = MCPContext(
          document_id=document_id,
          raw_text=text,
          metadata=metadata
      )

      # Adding extracted data with confidence scores
      context.add_extracted_data("invoice_number", "INV-12345", confidence=0.95)

      # Tracking processing history
      context.add_to_history(
          processor_name="InvoiceProcessor",
          status="completed",
          details={"processing_time": "0.5s"}
      )

- Memory System: Stores context objects between API calls

      # Storing context in memory
      memory.store(document_id, context)

      # Retrieving context from memory
      context = memory.retrieve(document_id)

- Protocols: Define interfaces for processors and models

      # Processor protocol example
      from abc import abstractmethod
      from typing import Protocol

      class Processor(Protocol):
          @abstractmethod
          def process(self, context: MCPContext) -> MCPContext:
              """Process the document and update the context."""
              pass

          @abstractmethod
          def can_handle(self, context: MCPContext) -> bool:
              """Determine if this processor can handle the given document."""
              pass

- Router: Routes documents to appropriate specialized processors

      # Router usage example
      processor = processor_router.route(context)
      if processor:
          processed_context = processor.process(context)

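To make the context object more concrete, here is a minimal sketch of a class that supports the calls used in the examples above. Only the constructor arguments and the method names `add_extracted_data` and `add_to_history` come from this README; the internal field names (`extracted_data`, `confidence_scores`, `processing_history`), the timestamp, and the range check are assumptions made for illustration, not the project's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class MCPContext:
    """Illustrative stand-in for the project's MCPContext (internals assumed)."""

    document_id: str
    raw_text: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    extracted_data: Dict[str, Any] = field(default_factory=dict)
    confidence_scores: Dict[str, float] = field(default_factory=dict)
    processing_history: List[Dict[str, Any]] = field(default_factory=list)

    def add_extracted_data(self, key: str, value: Any, confidence: float = 1.0) -> None:
        # Keep the value and its confidence side by side, keyed by field name.
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.extracted_data[key] = value
        self.confidence_scores[key] = confidence

    def add_to_history(
        self,
        processor_name: str,
        status: str,
        details: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Append one audit entry per processing step, in call order.
        self.processing_history.append(
            {
                "processor": processor_name,
                "status": status,
                "details": details or {},
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }
        )


context = MCPContext(document_id="doc-1", raw_text="Invoice INV-12345")
context.add_extracted_data("invoice_number", "INV-12345", confidence=0.95)
context.add_to_history("InvoiceProcessor", "completed", {"processing_time": "0.5s"})
```

Keeping confidences in a separate mapping is only one possible layout; storing value and confidence together in a single record per field would work equally well.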
Document Upload → MCPContext Creation → Memory Storage →
Document Processing → Router Selection → Specialized Processor →
Entity Extraction → Context Update → Memory Storage → API Response
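The flow above passes through Memory Storage twice, so the store has to keep a context alive between API calls. The sketch below shows one way an in-memory backend with a time-to-live could behave. The `store` and `retrieve` method names and the `default_ttl` parameter appear in this README; the class name `TTLMemoryStore`, the per-call `ttl` override, and the injectable `clock` are hypothetical additions for illustration and testing.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLMemoryStore:
    """Hypothetical in-memory backend: entries expire after a time-to-live."""

    def __init__(
        self,
        default_ttl: float = 3600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}

    def store(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self.default_ttl if ttl is None else ttl
        self._items[key] = (self._clock() + lifetime, value)

    def retrieve(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Drop expired entries lazily, on first access after expiry.
            del self._items[key]
            return None
        return value


# Usage with a fake clock so that time can be advanced by hand.
now = [0.0]
memory = TTLMemoryStore(default_ttl=10, clock=lambda: now[0])
memory.store("doc-1", {"raw_text": "hello"})
fresh = memory.retrieve("doc-1")    # still within the 10-second lifetime
now[0] = 11.0
expired = memory.retrieve("doc-1")  # lifetime has passed
```

Lazy expiry keeps the sketch short; a long-running service would probably also want periodic cleanup so that entries which are never read again do not accumulate.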
The Model Context Protocol implementation in this project offers several key advantages:
The MCPContext class maintains state throughout the document processing lifecycle:

    # Context is created during document upload
    @router.post("/documents/upload")
    async def upload_document(file: UploadFile, memory: MemoryInterface):
        # Create a context
        context = MCPContext(
            document_id=document_id,
            raw_text=text,
            metadata=metadata
        )
        # Store in memory for later retrieval
        memory.store(document_id, context)

The memory system is designed to be pluggable, allowing different storage backends:

    # Factory function in memory.py
    def get_memory_store(memory_type: str = "in_memory", **kwargs) -> MemoryInterface:
        if memory_type == "in_memory":
            return InMemoryStorage(default_ttl=kwargs.get("ttl", 3600))
        # Additional implementations can be added here
        raise ValueError(f"Unsupported memory type: {memory_type}")

MCP tracks confidence scores for all extracted data, enabling better decision-making:

    # In entity_extractor.py
    entity_data = {
        "text": text[current_entity["start"]:current_entity["end"]],
        "start": current_entity["start"],
        "end": current_entity["end"],
        "confidence": avg_confidence
    }

Each processing step is recorded in the context's history, providing auditability:

    # In router.py
    context.add_to_history(
        processor_name=processor.__class__.__name__,
        status="completed"
    )

The ProcessorRouter determines the appropriate processor for each document:

    # In router.py
    def route(self, context: MCPContext) -> Optional[Processor]:
        for processor in self.processors:
            if processor.can_handle(context):
                return processor
        return None

Adding new document types is straightforward by implementing the Processor protocol:

    # Example of adding a new processor
    class NewDocumentProcessor(BaseProcessor):
        def can_handle(self, context: MCPContext) -> bool:
            # Logic to determine if this processor can handle the document
            pass

        def process(self, context: MCPContext) -> MCPContext:
            # Document processing logic
            pass

The system includes specialized processors for different document types:
- Invoice Processor: Extracts vendor, customer, line items, totals, etc.
- Contract Processor: Extracts parties, key dates, terms, etc.
- Email Processor: Extracts sender, recipients, subject, body, etc.
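As an illustration of what a specialized processor might do, the sketch below extracts two invoice fields with regular expressions and records a confidence for each. It is not the project's Invoice Processor: the `SketchContext` class, the `document_type` metadata key, the patterns, and the fixed confidence values are all assumptions chosen to keep the example self-contained.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class SketchContext:
    """Tiny stand-in for MCPContext, just enough for this example."""

    raw_text: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    extracted: Dict[str, Tuple[Any, float]] = field(default_factory=dict)

    def add_extracted_data(self, key: str, value: Any, confidence: float) -> None:
        self.extracted[key] = (value, confidence)


class RegexInvoiceProcessor:
    """Hypothetical processor that pulls invoice fields out with regexes."""

    INVOICE_NUMBER = re.compile(r"\bINV-\d+\b")
    TOTAL = re.compile(r"\bTotal:?\s*\$?(\d+(?:\.\d{2})?)", re.IGNORECASE)

    def can_handle(self, context: SketchContext) -> bool:
        # Assumes an earlier classification step set metadata["document_type"].
        return context.metadata.get("document_type") == "invoice"

    def process(self, context: SketchContext) -> SketchContext:
        number = self.INVOICE_NUMBER.search(context.raw_text)
        if number:
            # Placeholder confidence; a real model would supply its own score.
            context.add_extracted_data("invoice_number", number.group(0), confidence=0.9)
        total = self.TOTAL.search(context.raw_text)
        if total:
            context.add_extracted_data("total", float(total.group(1)), confidence=0.8)
        return context


ctx = SketchContext("Invoice INV-12345\nTotal: $1250.00", {"document_type": "invoice"})
processor = RegexInvoiceProcessor()
if processor.can_handle(ctx):
    processor.process(ctx)
```

Contract and email processors would follow the same `can_handle` / `process` shape with different fields and extraction logic.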
Several ML models are used for different tasks:
- Document Classifier: BERT-based model for document type classification
- Entity Extractor: Named Entity Recognition model for extracting key information
- Summarizer: T5-based model for generating document summaries
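The entity extractor reports one confidence per entity, which suggests that token-level scores are averaged over each entity span. The sketch below shows one way to do that for BIO-tagged tokens. The input format (dicts with `label`, `score`, `start`, `end`) and the function name `merge_entities` are assumptions; only the output keys `text`, `start`, `end`, and `confidence` mirror the `entity_data` excerpt shown earlier in this README.

```python
from typing import Any, Dict, List


def merge_entities(text: str, tokens: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge BIO-tagged tokens into entities with an averaged confidence."""
    spans: List[Dict[str, Any]] = []
    previous_label = "O"
    for token in tokens:
        label = token["label"]
        if label != "O":
            prefix, entity_type = label.split("-", 1)
            # An I- tag extends the open span only if it directly follows
            # a tagged token of the same entity type.
            continues = (
                prefix == "I"
                and previous_label != "O"
                and bool(spans)
                and spans[-1]["type"] == entity_type
            )
            if continues:
                spans[-1]["end"] = token["end"]
                spans[-1]["scores"].append(token["score"])
            else:
                spans.append(
                    {
                        "type": entity_type,
                        "start": token["start"],
                        "end": token["end"],
                        "scores": [token["score"]],
                    }
                )
        previous_label = label
    return [
        {
            "type": span["type"],
            "text": text[span["start"]:span["end"]],
            "start": span["start"],
            "end": span["end"],
            "confidence": sum(span["scores"]) / len(span["scores"]),
        }
        for span in spans
    ]


sample_text = "Acme Corp signed on 2024-01-15"
sample_tokens = [
    {"label": "B-ORG", "score": 0.9, "start": 0, "end": 4},
    {"label": "I-ORG", "score": 0.8, "start": 5, "end": 9},
    {"label": "O", "score": 0.99, "start": 10, "end": 16},
    {"label": "O", "score": 0.99, "start": 17, "end": 19},
    {"label": "B-DATE", "score": 0.7, "start": 20, "end": 30},
]
entities = merge_entities(sample_text, sample_tokens)
```

Averaging is the simplest aggregation; taking the minimum token score instead would give a more conservative entity confidence.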
The MCP Document Processor includes a modern React-based user interface that provides an intuitive way to interact with the document processing system. The UI is built with Material-UI and offers the following features:
- Dashboard: Overview of processed documents with statistics and quick access to document details
- Document Upload: Drag-and-drop interface for uploading new documents
- Document Processing: Step-by-step workflow for processing documents
- Document Viewer: Detailed view of processed documents with extracted information
- Processing History: Timeline view of all processing steps for auditability
The frontend is built with:
- React: For building the user interface components
- Material-UI: For consistent, responsive design
- React Router: For navigation between different views
- Axios: For API communication with the backend
- Chart.js: For data visualization of document statistics
The frontend communicates with the backend through a RESTful API, with the following main endpoints:
- GET /api/documents: Retrieve all documents
- POST /api/documents/upload: Upload a new document
- POST /api/documents/{document_id}/process: Process a document
- GET /api/documents/{document_id}: Get document details
- DELETE /api/documents/{document_id}: Delete a document
The MCP Document Processor follows a layered architecture that integrates the frontend, API layer, processing components, and machine learning models:

    ┌─────────────────────────────────────────────────────────────────────────┐
    │                             Frontend Layer                              │
    │                                                                         │
    │   ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐   │
    │   │  Dashboard  │     │   Upload    │     │     Document Viewer     │   │
    │   └─────────────┘     └─────────────┘     └─────────────────────────┘   │
    │          │                   │                         │                │
    └──────────┼───────────────────┼─────────────────────────┼────────────────┘
               │                   │                         │
               │                   │                         │
               ▼                   ▼                         ▼
    ┌─────────────────────────────────────────────────────────────────────────┐
    │                                API Layer                                │
    │                                                                         │
    │   ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐   │
    │   │  Document   │     │  Document   │     │        Document         │   │
    │   │ Upload API  │     │ Process API │     │      Retrieval API      │   │
    │   └─────────────┘     └─────────────┘     └─────────────────────────┘   │
    │          │                   │                         │                │
    └──────────┼───────────────────┼─────────────────────────┼────────────────┘
               │                   │                         │
               │                   │                         │
               ▼                   ▼                         ▼
    ┌─────────────────────────────────────────────────────────────────────────┐
    │                           MCP Core Components                           │
    │                                                                         │
    │   ┌─────────────┐      ┌─────────────┐      ┌─────────────────────────┐ │
    │   │ MCPContext  │◄────►│   Memory    │◄────►│    Processor Router     │ │
    │   └─────────────┘      └─────────────┘      └─────────────────────────┘ │
    │          │                                            │                 │
    └──────────┼────────────────────────────────────────────┼─────────────────┘
               │                                            │
               │                                            │
               ▼                                            ▼
    ┌─────────────────────────────────────────────────────────────────────────┐
    │                           Document Processors                           │
    │                                                                         │
    │   ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐   │
    │   │   Invoice   │     │  Contract   │     │          Email          │   │
    │   │  Processor  │     │  Processor  │     │        Processor        │   │
    │   └─────────────┘     └─────────────┘     └─────────────────────────┘   │
    │          │                   │                         │                │
    └──────────┼───────────────────┼─────────────────────────┼────────────────┘
               │                   │                         │
               │                   │                         │
               ▼                   ▼                         ▼
    ┌─────────────────────────────────────────────────────────────────────────┐
    │                             ML Models Layer                             │
    │                                                                         │
    │   ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐   │
    │   │  Document   │     │   Entity    │     │       Summarizer        │   │
    │   │ Classifier  │     │  Extractor  │     │                         │   │
    │   └─────────────┘     └─────────────┘     └─────────────────────────┘   │
    │                                                                         │
    └─────────────────────────────────────────────────────────────────────────┘

The document processing workflow involves multiple steps across the system components:
- Document Upload:
  - User uploads a document through the UI
  - Frontend sends the document to the backend API
  - Backend creates an MCPContext object with document metadata
  - Context is stored in the Memory system
- Document Classification:
  - User initiates processing through the UI
  - Backend retrieves the document context from Memory
  - Document Classifier model determines document type
  - Context is updated with document type information
- Document Processing:
  - Processor Router selects the appropriate processor based on document type
  - Selected processor (Invoice, Contract, or Email) processes the document
  - Processor uses Entity Extractor to identify key information
  - Extracted data is added to the context with confidence scores
- Result Retrieval:
  - Updated context is stored back in Memory
  - UI retrieves and displays the processed document information
  - User can view extracted data, confidence scores, and processing history
- Audit and Review:
  - All processing steps are recorded in the context's processing history
  - UI provides visualization of confidence scores for extracted data
  - User can review the document text alongside extracted information
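The workflow steps above can be sketched end to end in a few functions. Everything here is a simplified stand-in rather than the project's code: the context is a plain dict, the Memory system is a module-level dict, `classify` is a keyword check in place of the BERT-based classifier, and the names `upload`, `process`, `process_invoice`, and `PROCESSORS` are hypothetical.

```python
from typing import Any, Callable, Dict

Context = Dict[str, Any]
memory: Dict[str, Context] = {}  # stand-in for the Memory system


def upload(document_id: str, text: str) -> Context:
    """Document Upload: create a context and store it."""
    context: Context = {
        "document_id": document_id,
        "raw_text": text,
        "metadata": {},
        "extracted": {},
        "history": [],
    }
    memory[document_id] = context
    return context


def classify(text: str) -> str:
    """Document Classification: keyword stand-in for the classifier model."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "agreement" in lowered:
        return "contract"
    return "email"


def process_invoice(context: Context) -> None:
    """Document Processing: record the first token that looks like an invoice number."""
    for word in context["raw_text"].split():
        if word.startswith("INV-"):
            context["extracted"]["invoice_number"] = {"value": word, "confidence": 0.9}
            break


# Only one document type is wired up in this sketch.
PROCESSORS: Dict[str, Callable[[Context], None]] = {"invoice": process_invoice}


def process(document_id: str) -> Context:
    context = memory[document_id]  # retrieve the context from memory
    doc_type = classify(context["raw_text"])
    context["metadata"]["document_type"] = doc_type
    context["history"].append({"processor": "classifier", "status": "completed"})
    handler = PROCESSORS.get(doc_type)
    if handler is None:
        context["history"].append({"processor": "router", "status": "no_processor"})
    else:
        handler(context)
        context["history"].append({"processor": handler.__name__, "status": "completed"})
    memory[document_id] = context  # Result Retrieval reads the updated context
    return context


upload("doc-1", "Invoice INV-12345 for consulting services")
result = process("doc-1")
```

After the call, `result["history"]` holds one entry per step, which is the data the Audit and Review step would display.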
- Python 3.8+
- Node.js 14+ and npm (for the frontend)
- Dependencies listed in requirements.txt
- Clone the repository

      git clone https://github.com/yourusername/mcp_document_processor.git
      cd mcp_document_processor

- Create and activate a virtual environment

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install backend dependencies

      pip install -r requirements.txt

- Create a data directory for document storage (if it doesn't exist)

      mkdir -p data

- Navigate to the frontend directory

      cd frontend

- Install frontend dependencies

      npm install

- From the root directory of the project (with the virtual environment activated):

      python app.py

  This will start the FastAPI server on http://localhost:8000.

- You can access the API documentation at http://localhost:8000/docs
- Open a new terminal window/tab
- Navigate to the frontend directory:

      cd /path/to/mcp_document_processor/frontend

- Start the React development server:

      npm start

  This will start the frontend on http://localhost:3000.

- Open your browser and navigate to http://localhost:3000
- Use the sidebar navigation to:
  - View the dashboard
  - Upload new documents
  - Process and view document details
- Upload a Document:
  - Click on "Upload Document" in the sidebar
  - Drag and drop a document (PDF, image, or text file)
  - Click the "Upload Document" button
- Process the Document:
  - After successful upload, click "Process Document"
  - Wait for processing to complete
- View Results:
  - View extracted data, confidence scores, and processing history
  - Navigate to the Dashboard to see all processed documents
You can also interact directly with the API:
- GET /api/documents: Retrieve all documents
- POST /api/documents/upload: Upload a new document
- POST /api/documents/{document_id}/process: Process a document
- GET /api/documents/{document_id}: Get document details
- DELETE /api/documents/{document_id}: Delete a document
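For scripted access, a small client built on the Python standard library could look like the sketch below. The endpoint paths and the base URL come from this README; the helper names are hypothetical, and the code assumes that responses are JSON, which this README does not state. The upload endpoint is left out because multipart form encoding is verbose with the standard library alone.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

BASE_URL = "http://localhost:8000"  # backend address given in this README


def build_request(method: str, path: str) -> urllib.request.Request:
    """Create a request object without sending it."""
    return urllib.request.Request(BASE_URL + path, method=method)


def document_path(document_id: str, suffix: str = "") -> str:
    """Build the path for one document, escaping the ID for use in a URL."""
    return "/api/documents/" + urllib.parse.quote(document_id, safe="") + suffix


def send(request: urllib.request.Request) -> Any:
    """Send the request and decode the body, assuming the API returns JSON."""
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read()
    return json.loads(body.decode("utf-8")) if body else None


def list_documents() -> Any:
    return send(build_request("GET", "/api/documents"))


def get_document(document_id: str) -> Any:
    return send(build_request("GET", document_path(document_id)))


def process_document(document_id: str) -> Any:
    return send(build_request("POST", document_path(document_id, "/process")))


def delete_document(document_id: str) -> Any:
    return send(build_request("DELETE", document_path(document_id)))
```

With the backend running, `process_document("some-id")` followed by `get_document("some-id")` would trigger processing and then fetch the result, where `"some-id"` is a placeholder for an ID returned by the upload endpoint.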
- Create a new processor class that inherits from BaseProcessor
- Implement the can_handle and process methods
- Add the processor to the router in api/routes.py
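The three steps for adding a processor can be sketched as follows. The `BaseProcessor` and `ProcessorRouter` classes here are simplified stand-ins written for this example, with a plain dict as the context; `ReceiptProcessor` and the `add_processor` method are hypothetical, and the real registration call in `api/routes.py` may look different.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

Context = Dict[str, Any]  # simplified stand-in for MCPContext


class BaseProcessor(ABC):
    """Stand-in for the project's BaseProcessor."""

    @abstractmethod
    def can_handle(self, context: Context) -> bool: ...

    @abstractmethod
    def process(self, context: Context) -> Context: ...


class ProcessorRouter:
    """Stand-in router: the first processor that accepts the context wins."""

    def __init__(self, processors: Optional[List[BaseProcessor]] = None) -> None:
        self.processors: List[BaseProcessor] = list(processors or [])

    def add_processor(self, processor: BaseProcessor) -> None:
        # Hypothetical registration hook.
        self.processors.append(processor)

    def route(self, context: Context) -> Optional[BaseProcessor]:
        for processor in self.processors:
            if processor.can_handle(context):
                return processor
        return None


class ReceiptProcessor(BaseProcessor):
    """Example of a new document type."""

    def can_handle(self, context: Context) -> bool:
        return "receipt" in context["raw_text"].lower()

    def process(self, context: Context) -> Context:
        context.setdefault("extracted", {})["document_kind"] = "receipt"
        return context


router = ProcessorRouter()
router.add_processor(ReceiptProcessor())
selected = router.route({"raw_text": "Receipt #42 from the hardware store"})
unmatched = router.route({"raw_text": "Hello team"})
```

Because routing stops at the first match, registration order matters when two processors could both accept the same document.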
- Create a new model class that implements the appropriate protocol
- Add configuration in config/config.yaml
- Integrate the model with the relevant processor
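A new model only needs to satisfy the protocol its processor expects. The sketch below defines a hypothetical `ClassifierModel` protocol and a keyword-based implementation, configured from a dict that stands in for settings loaded from `config/config.yaml`. The protocol name, the `predict` signature, and the configuration keys are assumptions; the project's actual protocols may differ.

```python
from typing import Dict, Protocol, Tuple, runtime_checkable


@runtime_checkable
class ClassifierModel(Protocol):
    """Hypothetical protocol: return a label and a confidence for a text."""

    def predict(self, text: str) -> Tuple[str, float]: ...


class KeywordClassifier:
    """Toy model that satisfies ClassifierModel without inheriting from it."""

    def __init__(self, keywords: Dict[str, str], default_label: str = "unknown") -> None:
        self.keywords = keywords
        self.default_label = default_label

    def predict(self, text: str) -> Tuple[str, float]:
        lowered = text.lower()
        for keyword, label in self.keywords.items():
            if keyword in lowered:
                return label, 1.0
        return self.default_label, 0.0


# Stand-in for values read from config/config.yaml; the keys are illustrative.
config = {
    "keywords": {"invoice": "invoice", "agreement": "contract"},
    "default_label": "email",
}
model = KeywordClassifier(**config)
label, confidence = model.predict("Service Agreement between A and B")
```

Structural typing means the model class does not have to import or subclass the protocol; a processor can accept any object with a matching `predict` method.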