CVBot Backend is an AI-powered personal assistant backend that gives access to your professional profile through document retrieval, context-aware responses, and proactive question suggestions.
- Intelligent Agent (LangGraph) – Retrieves relevant information from your documents and alerts you when knowledge gaps are detected.
- Suggestions Engine – Generates follow-up or suggested questions to guide more effective interactions.
- Document Processing – Advanced OCR and text pipeline supporting multiple file formats with smart chunking.
- Vector Search – Efficient semantic retrieval with ChromaDB.
- Notifications – Automatic email alerts sent via SMTP when missing or incomplete information is found.
- File Management – Full CRUD support for documents in the vector database.
- API Tracing & Monitoring – Powered by Opik for observability and debugging.
- Framework: FastAPI
- Agent: LangGraph
- Database: PostgreSQL + ChromaDB (vector storage)
- Cache / Rate Limiting: Redis
- Authentication: Clerk
- OCR: Docling + EasyOCR
- Monitoring: Opik
- Language: Python
- Email Notifications: SMTP
- File Upload – PDF files processed with Docling for OCR extraction.
- Markdown Conversion – Unified content representation.
- Smart Chunking – Content split by headings and intelligently chunked with an LLM.
- Context Enhancement – Each chunk enriched with surrounding context.
- Vector Embedding – Enhanced chunks stored in ChromaDB for semantic retrieval.
- Query Processing – User queries handled by the LangGraph agent.
- Vector Search – Relevant embeddings retrieved from ChromaDB.
- Response Generation – Professional, contextually accurate responses produced.
- Suggestions Node – Asynchronous node that generates question suggestions to improve user interaction.
- Notification System – Missing information triggers email notifications via SMTP.
flowchart TD
U[User Query] --> Q[LangGraph Agent]
Q --> R[Vector Search in ChromaDB]
R --> S[Response Generation]
S --> T[User Response]
Q --> SG[Suggestions Node]
SG --> T
Q --> N[Missing Info Detected]
N --> EML[Email Notification via SMTP]
The CVBot Backend exposes modular routes under /api/v1:
- /vector-store – Manages documents in the vector DB.
  - POST /vector-store/store-files: Upload and process documents.
  - POST /vector-store/semantic-search: Perform semantic search.
  - GET /vector-store/retrieve-files: List processed files.
  - GET /vector-store/retrieve-embeddings?filename=...: Retrieve embeddings for a file.
  - DELETE /vector-store/delete-files: Remove files.
- /projects – Manages portfolio projects.
  - GET /projects: List all projects.
  - POST /projects: Add a new project.
  - DELETE /projects/{project_id}: Delete a project by ID.
- /chatbot – AI chatbot interaction.
  - POST /chatbot/invoke: Send a message to the AI agent (rate limited: 20 req/min per user).
  - GET /chatbot/history?session_id=...: Retrieve session history.
The /vector-store and /projects routes are secured with Clerk authentication.
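As a rough illustration of calling the chatbot routes, the sketch below builds the requests without sending them. The base URL assumes the default Docker setup on host port 8081, and the JSON field names `message` and `session_id` in the request body are hypothetical; check the service's actual request schema before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the API is reached through NGINX on the default host port.
BASE_URL = "http://localhost:8081/api/v1"


def build_invoke_request(message: str, session_id: str) -> Request:
    """Build (but do not send) a POST /chatbot/invoke request."""
    # Hypothetical body fields; the real schema may differ.
    body = json.dumps({"message": message, "session_id": session_id}).encode("utf-8")
    return Request(
        f"{BASE_URL}/chatbot/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_history_url(session_id: str) -> str:
    """Build the URL for GET /chatbot/history with an encoded session ID."""
    return f"{BASE_URL}/chatbot/history?{urlencode({'session_id': session_id})}"
```

A request built this way can be sent with `urllib.request.urlopen(req)`. The Clerk-protected routes additionally need credentials; how they are passed is not specified here.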
- Tracing & monitoring with Opik for reliable performance and debugging.
- PDF: Processed with Docling (OCR support).
- Markdown: Directly processed.
Chunking Strategy (Inspired by Anthropic's Contextual Retrieval)
- Heading-based Split – Structural splitting.
- LLM-powered Chunking – Optimized chunk sizes.
- Context Addition – Enriched with surrounding text.
- Semantic Embedding – Converted to vectors for similarity search.
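The first and third steps above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the `Chunk` class and both function names are hypothetical, the LLM-powered chunking step is omitted, and a fixed template stands in for the LLM-generated context that Contextual Retrieval describes.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading: str
    body: str


def split_by_headings(markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per ATX heading (lines starting with '#')."""
    chunks: list[Chunk] = []
    heading, lines = "", []

    def flush() -> None:
        # Emit the section collected so far, skipping an empty preamble.
        if heading or any(line.strip() for line in lines):
            chunks.append(Chunk(heading, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        if re.match(r"#{1,6}\s", line):
            flush()
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


def add_context(doc_title: str, chunk: Chunk) -> str:
    """Prefix a chunk with a line situating it in its document.

    Assumption: a simple template replaces the LLM-written context.
    """
    return f"[{doc_title} > {chunk.heading or 'Preamble'}]\n{chunk.body}"
```

The enriched string returned by `add_context`, rather than the raw chunk body, is what would be embedded. Note that this sketch does not skip `#` lines inside fenced code blocks.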
The LangGraph-powered agent provides:
- Semantic search through professional documents.
- Contextual answers about experience & skills.
- Identification of knowledge gaps with SMTP email alerts.
- Conversation context maintenance for seamless dialogue.
- Professional, tailored responses.
- Suggested follow-up questions for improved engagement.
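A knowledge-gap alert of the kind listed above could be built and sent with Python's standard library roughly as follows. The function names, subject line, and message wording are hypothetical, and the use of STARTTLS with a login is an assumption about the SMTP server rather than something the project states.

```python
import smtplib
from email.message import EmailMessage


def build_gap_alert(question: str, sender: str, recipient: str) -> EmailMessage:
    """Create the alert email for a question the agent could not answer."""
    msg = EmailMessage()
    msg["Subject"] = "Knowledge gap detected"  # hypothetical subject line
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        "The assistant could not answer this question from the stored documents:\n\n"
        f"{question}\n"
    )
    return msg


def send_alert(msg: EmailMessage, host: str, port: int, username: str, password: str) -> None:
    """Send the alert over SMTP. Assumes the server supports STARTTLS and login."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Keeping message construction separate from sending makes the alert content easy to test without a live mail server.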
- Authentication: Clerk-based authentication.
- API Security: Redis-based rate limiting & strict input validation.
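To make the rate-limiting idea concrete, here is a minimal fixed-window counter matching the stated limit of 20 requests per minute per user. It is a sketch under assumptions: the project's actual algorithm is not described, the class name is hypothetical, and an in-memory dict stands in for Redis, where the same pattern is usually an `INCR` on a per-window key followed by an `EXPIRE`.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(self, limit: int = 20, window_seconds: int = 60, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be controlled in tests
        # (user_id, window index) -> request count; stand-in for Redis keys.
        self.counts: dict[tuple[str, int], int] = {}

    def allow(self, user_id: str) -> bool:
        """Count this request and report whether it is within the limit."""
        window_index = int(self.clock() // self.window)
        key = (user_id, window_index)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit
```

Unlike Redis keys with an expiry, entries in this dict are never removed, so the sketch is only suitable for illustration. A fixed window also permits short bursts across a window boundary; a sliding window avoids that at the cost of more bookkeeping.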
- Navigate to the deployment folder.
- Copy .env.example → .env and configure environment variables (incl. Redis & SMTP).
- Run docker compose up to start services (includes Redis).
- Containers initialize; NGINX exposes port 80 → host 8081 (configurable).
- Access via localhost:8081 or a reverse proxy.
Originally built by Kaloyan Stefanov.