OCR & PDF Text Extraction Microservice
The Personal AI Factory v1 - OCR & PDF Text Extraction Microservice is a production-grade,
serverless backend component designed to extract clean, machine-readable text from textual PDF documents.
Built using TypeScript and deployed as a Vercel Serverless Function, this
microservice exposes a single HTTP API endpoint that fetches a remote PDF file, parses its textual content
using pdf-parse, and returns structured JSON output.
This service is optimized for automation-first architectures, specifically downstream integration with n8n pipelines. Version 1 explicitly supports text-based PDFs only and does not perform OCR on scanned documents or images.
- Project Overview
- Objectives & Goals
- Acceptance Criteria
- Prerequisites
- Installation & Setup
- API Documentation
- UI / Frontend
- Status Codes
- Features
- Tech Stack & Architecture
- Workflow & Implementation
- Testing & Validation
- Validation Summary
- Verification Testing Tools
- Troubleshooting & Debugging
- Security & Secrets
- Deployment (Vercel)
- Quick-Start Cheat Sheet
- Usage Notes
- Performance & Optimization
- Enhancements & Features
- Maintenance & Future Work
- Key Achievements
- High-Level Architecture
- Folder Structure
- How to Demonstrate Live
- Summary, Closure & Compliance
This microservice functions as a stateless PDF text extraction API within the Personal AI Factory ecosystem.
- Accepts a publicly accessible PDF URL
- Downloads the PDF at runtime
- Parses textual content using pdf-parse
- Returns extracted text as structured JSON
- Designed for synchronous HTTP execution
- Provide a reliable text extraction layer for automation workflows
- Eliminate dependency on paid OCR services for textual PDFs
- Maintain fast cold-start and execution times
- Enable seamless integration with n8n HTTP Request nodes
- Serve as a foundational V1 component for future OCR and AI expansion
- HTTP 200 returned for valid textual PDFs
- Structured error responses for invalid input
- No OCR processing in Version 1
- Deployable on Vercel without custom infrastructure
- JSON output compatible with automation tools
- Node.js 18 or higher
- Vercel CLI (for deployment)
- Publicly accessible PDF URLs
- Basic REST API knowledge
- Clone the repository
- Install dependencies
- Verify Node.js version compatibility
- Review TypeScript configuration
- Prepare Vercel deployment
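
The "Verify Node.js version compatibility" step can also be checked in code. The following is a minimal sketch, assuming a hypothetical helper named `isSupportedNode` that is not part of the repository; at startup it would typically be called with `process.versions.node`.

```typescript
// Hypothetical helper (not from the repository): checks a Node.js version
// string such as "18.17.0" against the minimum major version named above.
const MIN_NODE_MAJOR = 18;

function isSupportedNode(version: string, minMajor: number = MIN_NODE_MAJOR): boolean {
  // Accept an optional leading "v", as printed by `node --version`.
  const [majorPart = ""] = version.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorPart, 10);
  // Unparseable input yields NaN, which is rejected here.
  return Number.isInteger(major) && major >= minMajor;
}
```
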
Endpoint: /api/ocr-summarize
Method: GET / POST
Input: Public PDF URL (fileURL)
Output: JSON with extracted text
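
Because the PDF URL is passed as a query value on GET requests, it needs to be percent-encoded. The sketch below shows one way to build the request URL. The endpoint path and the `fileURL` parameter come from the description above; the helper name `buildExtractionUrl` and the deployment base URL are placeholders.

```typescript
// Builds the GET request URL for the /api/ocr-summarize endpoint.
// `baseUrl` is a placeholder for your own Vercel deployment URL.
function buildExtractionUrl(baseUrl: string, fileURL: string): string {
  const url = new URL("/api/ocr-summarize", baseUrl);
  // searchParams.set encodes the PDF URL so it survives as a single query value,
  // even when the PDF URL has its own query string.
  url.searchParams.set("fileURL", fileURL);
  return url.toString();
}
```

The returned string can be passed to `fetch`, curl, or the URL field of an n8n HTTP Request node.
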
This project does not include a frontend or UI layer. It is designed for backend-to-backend and automation-based consumption via n8n, Postman, or curl.
| Status | Description |
|---|---|
| 200 | Successful extraction |
| 400 | Invalid or missing fileURL |
| 500 | Internal server error |
- Textual PDF parsing
- Serverless execution
- Automation-friendly JSON output
- No paid OCR dependencies
- Runtime: Vercel Serverless Functions (Node.js 18)
- Language: TypeScript
- PDF Parsing: pdf-parse
- HTTP Client: node-fetch
- Deployment: Vercel
Client / n8n
|
v
Vercel Serverless Function
|
v
pdf-parse
|
v
JSON Response
- Receive HTTP request
- Validate input parameters
- Fetch PDF from URL
- Parse text using pdf-parse
- Return structured JSON response
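
The five steps above can be sketched as a single function. This is an illustration under stated assumptions, not the repository's actual handler: the names `handleExtraction`, `ExtractionDeps`, and `isHttpUrl`, the response field names `text` and `error`, and the error wording are all hypothetical. The fetch and parse steps are injected so the flow can be run without network access.

```typescript
// Injected dependencies (hypothetical shape). In a deployment these would
// likely wrap node-fetch (download the bytes) and pdf-parse (whose result
// object includes a `text` field).
interface ExtractionDeps {
  fetchPdf: (url: string) => Promise<Uint8Array>;
  parsePdf: (data: Uint8Array) => Promise<{ text: string }>;
}

interface ExtractionResult {
  status: 200 | 400 | 500;
  body: { text: string } | { error: string };
}

// Accepts only strings that parse as http(s) URLs.
function isHttpUrl(value: unknown): value is string {
  if (typeof value !== "string") return false;
  try {
    const { protocol } = new URL(value);
    return protocol === "http:" || protocol === "https:";
  } catch {
    return false;
  }
}

async function handleExtraction(
  fileURL: unknown,
  deps: ExtractionDeps
): Promise<ExtractionResult> {
  // Steps 1-2: receive the request value and validate it.
  if (!isHttpUrl(fileURL)) {
    return { status: 400, body: { error: "Invalid or missing fileURL" } };
  }
  try {
    // Step 3: fetch the PDF from the URL.
    const data = await deps.fetchPdf(fileURL);
    // Step 4: parse the text content.
    const { text } = await deps.parsePdf(data);
    // Step 5: return structured JSON.
    return { status: 200, body: { text: text.trim() } };
  } catch (err) {
    // Any download or parse failure maps to an internal server error.
    const message = err instanceof Error ? err.message : "Unknown error";
    return { status: 500, body: { error: message } };
  }
}
```

Injecting the two dependencies keeps the validation and error-mapping logic testable with stubs, which fits the stateless, synchronous design described earlier.
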
| ID | Area | Command | Expected Output | Explanation |
|---|---|---|---|---|
| T-01 | API | GET with valid PDF | 200 + text | Valid textual PDF |
| T-02 | API | Missing fileURL | 400 error | Validation check |
- Input validation enforced
- Error handling implemented
- Automation compatibility verified
- Ensure PDF is publicly accessible
- Confirm PDF is text-based
- Check Vercel function logs
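
Two of these checks can be approximated in code. The sketch below uses hypothetical helpers that are not part of the repository. `hasPdfSignature` tests for the `%PDF-` signature that PDF files normally begin with, which catches URLs that actually return an HTML error page. `looksScanned` is only a heuristic, and its character threshold is an arbitrary assumption.

```typescript
// Returns true when the downloaded bytes start with the ASCII signature "%PDF-".
function hasPdfSignature(bytes: Uint8Array): boolean {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d]; // "%PDF-"
  return signature.every((byte, i) => bytes[i] === byte);
}

// Heuristic only: an image-only (scanned) PDF usually yields little or no
// extractable text. The default threshold is an assumption; tune as needed.
function looksScanned(extractedText: string, minChars: number = 20): boolean {
  return extractedText.replace(/\s+/g, "").length < minChars;
}
```
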
- Deploy to Vercel
- Copy endpoint URL
- Provide public PDF URL
- Receive extracted text
- Lightweight dependencies
- Fast cold-start execution
- Performance dependent on PDF size
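
Since execution time grows with PDF size, one optional safeguard is to reject oversized files before parsing. The source does not specify such a limit, so both the `exceedsSizeLimit` helper and the 10 MiB default below are assumptions for illustration.

```typescript
// Hypothetical limit; the project does not document one.
const MAX_PDF_BYTES = 10 * 1024 * 1024;

// Returns true when a Content-Length header value reports a body larger than
// the limit. A missing or malformed header returns false, so a caller that
// needs a hard cap should still enforce it while reading the body.
function exceedsSizeLimit(
  contentLength: string | null,
  maxBytes: number = MAX_PDF_BYTES
): boolean {
  if (contentLength === null) return false;
  const size = Number(contentLength);
  return Number.isFinite(size) && size > maxBytes;
}
```
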
- OCR for scanned PDFs
- AI summarization layer
- Chunking and vector storage
- Production-ready serverless microservice
- Zero-cost alternative for textual PDF extraction
- Automation-first design
The service acts as an independent extraction node within the Personal AI Factory, feeding structured text into downstream automation and AI systems.
ocr-summarizer-microservice/
├── api/
│   └── ocr-summarize.ts
├── types/
│   └── pdf-parse.d.ts
├── node_modules/
├── package.json
├── tsconfig.json
├── README.md
└── .gitignore
- Deploy to Vercel
- Send GET request with PDF URL
- Observe extracted text response