About | Features | Technologies | Architecture | Requirements | Starting | License | Contact
RAG AI-Agent is an intelligent question-answering system built on Retrieval-Augmented Generation (RAG) technology, with LangChain and vector databases at its core. The system enables users to interact conversationally with their documents by uploading files (PDFs, TXTs, etc.), which are semantically parsed, embedded, and indexed for natural language queries. When a user asks a question, the system dynamically retrieves the most relevant information from the vector database and leverages advanced language models to generate accurate, contextual responses.
A key feature of this project is its integration with LangChain's ZERO_SHOT_REACT_DESCRIPTION agent type, which empowers the Agent mode. This agent can reason step-by-step, use external tools (like calculators, web search, and date/time), and transparently show its thought process—making the system highly extensible and explainable. The vector database (ChromaDB) ensures efficient semantic search and retrieval, forming the backbone of the RAG pipeline.
The project is full-stack: it combines a FastAPI backend (for API, document processing, vector storage, and agent orchestration) with a modern React frontend (for chat UI, file upload, and conversation management), delivering a seamless and interactive user experience.
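To make Agent mode concrete, the following is a minimal sketch of how a ZERO_SHOT_REACT_DESCRIPTION agent is typically assembled in LangChain. The date/time tool and model name below are illustrative placeholders, not this project's actual code.

```python
# Minimal sketch of a ZERO_SHOT_REACT_DESCRIPTION agent in LangChain.
# The tool implementation and model name are illustrative placeholders.
from datetime import datetime

from langchain.agents import AgentType, Tool, initialize_agent
from langchain_openai import ChatOpenAI

def current_datetime(_: str) -> str:
    """Return the current date and time as an ISO-8601 string."""
    return datetime.now().isoformat()

tools = [
    Tool(
        name="datetime",
        func=current_datetime,
        description="Returns the current date and time.",
    ),
    # Calculator and web-search tools would be registered the same way.
]

agent = initialize_agent(
    tools,
    ChatOpenAI(model="gpt-4o-mini", temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # print each Thought -> Action -> Observation step
)

print(agent.run("What is today's date?"))
```

With verbose=True, the agent prints its ReAct trace (Thought, Action, Observation), which is the same step-by-step reasoning the chat UI surfaces in Agent mode.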
- Document Processing: Upload and process PDF, TXT, and DOCX files
- Semantic Search: Find relevant information within your documents using natural language queries
- RAG-based Responses: Generate contextual answers by combining retrieved information with AI capabilities
- Chat Interface: Intuitive conversation-based UI with chat history management (rename, delete, and save conversations)
- Step-by-step Reasoning: View the agent's thought process and steps taken to answer complex questions
- Tool Integration: Access to calculator, current time, and web search tools
- Transparent Decision Making: See exactly how the agent formulates responses
- Document Upload: Upload documents directly in chat conversations
- Vector Storage: Efficient retrieval using semantic vector embeddings
- Context Preservation: Documents are linked to specific conversations
- Chat History: Save and load conversation history
- Chat Management: Create, rename and delete conversations
- User-friendly Interface: Clean, responsive design with intuitive controls
- Dark Mode: Toggle between light and dark themes for better readability
The following tools and frameworks were used in this project:
Backend:
- FastAPI - Modern, high-performance web framework for building APIs
- LangChain - Framework for developing applications powered by language models
- ChromaDB - Vector database for storing and retrieving embeddings
- SQLite - Database for storing chat and message information
- OpenAI - Language models and embedding generation
- SQLAlchemy - SQL toolkit and Object-Relational Mapping
Frontend:
- React - A JavaScript library for building user interfaces
- React Bootstrap - Bootstrap components built with React
- React Router - Routing for React applications
- Axios - Promise-based HTTP client
- React Icons - Popular icons for React projects
- React Toastify - Toast notifications for React
- Vite - Next generation frontend tooling
Document Processing:
- PyPDF - PDF processing
- Unstructured - Document parsing (markdown, etc.)
- LangChain Document Loaders - Various document loading utilities
The application follows a client-server architecture with the following components (a sketch of how they might fit together appears after this list):
Frontend (React):
- Chat interface with conversation management
- File upload functionality
- Agent mode toggle
- Reasoning step visualization
Backend (FastAPI):
- API endpoints for chat and message management
- Document processing pipeline
- Vector database integration
- Agent execution logic
Vector Database (ChromaDB):
- Stores document embeddings
- Enables semantic search for relevant content
Agent System:
- Retrieves relevant context from the vector database
- Uses tools like calculator, web search, and date-time
- Generates responses with reasoning steps
Data Storage:
- SQLite database for storing chat history and messages
- File system for document storage
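To make the division of responsibilities concrete, here is a hypothetical sketch of a backend endpoint wiring these pieces together. The endpoint path, request/response schemas, and stubbed helper functions are illustrative assumptions, not code from the project.

```python
# Hypothetical sketch of a FastAPI endpoint tying the components together.
# The route, schemas, and stubbed helpers are illustrative, not the
# project's actual source.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str
    agent_mode: bool = False

class QueryResponse(BaseModel):
    answer: str
    reasoning_steps: list[str] = []

def retrieve_context(chat_id: int, question: str) -> str:
    """Stub: would run a semantic search in ChromaDB, scoped to this chat."""
    return ""

def generate_answer(question: str, context: str, agent_mode: bool) -> tuple[str, list[str]]:
    """Stub: would call the RAG chain, or the tool-using agent in agent mode."""
    return "...", []

@app.post("/chats/{chat_id}/messages", response_model=QueryResponse)
async def post_message(chat_id: int, request: QueryRequest) -> QueryResponse:
    # 1. Retrieve relevant document chunks for this conversation.
    context = retrieve_context(chat_id, request.question)
    # 2. Generate the answer, with optional reasoning steps and tools.
    answer, steps = generate_answer(request.question, context, request.agent_mode)
    # 3. Persisting the exchange to SQLite would happen here.
    return QueryResponse(answer=answer, reasoning_steps=steps)
```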
The pipeline for processing user queries is as follows. First, a user uploads a document; the system parses it, splits it into chunks, embeds each chunk, and stores the embeddings in the vector database. A minimal sketch of this ingestion step is shown below.
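As a hedged illustration, the ingestion step is commonly implemented with LangChain and ChromaDB along these lines; the file path, chunk sizes, and persist directory are assumptions for this example, not values from the project's vector_database.py.

```python
# Illustrative document-ingestion sketch using LangChain + ChromaDB.
# The path, chunk sizes, and persist directory are example values.
from langchain_chroma import Chroma
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load and parse the uploaded file.
documents = PyPDFLoader("uploads/example.pdf").load()

# 2. Split the text into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

# 3. Embed each chunk and persist the vectors in ChromaDB.
vector_store = Chroma.from_documents(
    chunks,
    embedding=OpenAIEmbeddings(),
    persist_directory="chroma_db",
)
```

From there, each user query moves through the following steps: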
- User Input: User submits a question or query through the chat interface.
- Document Retrieval: The system retrieves relevant documents from the vector database based on the query.
- Context Generation: The retrieved documents are processed to generate context for the query.
- Agent Execution: The agent uses the context to generate a response, optionally using reasoning steps and tools.
- Response Generation: The agent generates a response based on the context and reasoning steps.
- Response Display: The response is displayed in the chat interface, along with any reasoning steps taken by the agent.
- User Interaction: The user can continue the conversation, ask follow-up questions, or upload new documents.
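The query side of the pipeline can be sketched as follows, again with hedged assumptions: the prompt wording, model name, and number of retrieved chunks (k) are illustrative, not the project's actual settings.

```python
# Hedged sketch of the query side: retrieve relevant chunks, then answer
# strictly from that context. Prompt, model, and k are example choices.
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

vector_store = Chroma(
    persist_directory="chroma_db",
    embedding_function=OpenAIEmbeddings(),
)

question = "What does the uploaded report say about Q3 revenue?"

# Document Retrieval: fetch the chunks most similar to the query.
docs = vector_store.similarity_search(question, k=4)

# Context Generation: concatenate the retrieved chunks.
context = "\n\n".join(doc.page_content for doc in docs)

# Response Generation: answer using only the retrieved context.
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
response = llm.invoke(
    f"Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)
print(response.content)  # Response Display: shown in the chat interface
```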
Before starting, ensure you have the following installed:
- Python 3.11
- Node.js 16+ and npm
- OpenAI API key
- Git
# Clone this project
$ git clone https://github.com/romanyn36/RAG-AI-Agent.git
# Navigate to the project directory
$ cd RAG-AI-Agent
# Create a virtual environment
$ python -m venv venv
# Activate the virtual environment
$ source venv/bin/activate # For Linux/Mac
$ venv\Scripts\activate # For Windows
# Install backend dependencies
$ pip install -r requirements.txt
# Create a .env file with your OpenAI API key
$ echo "OPENAI_API_KEY=your_api_key_here" > .env
# Start the backend server
$ uvicorn app:app --reload
# In a separate terminal, navigate to the frontend directory
$ cd agent-frontend
# Install frontend dependencies
$ npm install
# Set the environment variable for the backend URL
$ echo "VITE_API_URL=http://127.0.0.1:8000" > .env
# Start the development server
$ npm run dev
# The frontend will be available at http://localhost:5173
# The backend API will be available at http://localhost:8000

You can adjust the following settings:
In vector_database.py:
- Change chunk_size and chunk_overlap for document splitting
- Adjust similarity_threshold for relevance filtering

In agent.py:
- Modify PROMPT_TEMPLATE to change how responses are generated
- Add or remove tools from the agent
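For orientation, here is a hypothetical excerpt showing what settings of this kind can look like; the actual layout of vector_database.py may differ, and the values are illustrative defaults.

```python
# Hypothetical excerpt of tunable retrieval settings; the real
# vector_database.py may be organized differently.
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

chunk_size = 1000            # characters per chunk
chunk_overlap = 200          # characters shared by consecutive chunks
similarity_threshold = 0.75  # minimum relevance score to keep a chunk

splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size, chunk_overlap=chunk_overlap
)

vector_store = Chroma(
    persist_directory="chroma_db", embedding_function=OpenAIEmbeddings()
)

# Keep only chunks whose relevance score clears the threshold.
results = vector_store.similarity_search_with_relevance_scores("example query", k=8)
relevant = [doc for doc, score in results if score >= similarity_threshold]
```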
RAG AI-Agent is actively under development, and I’m working on exciting new features to make it even more powerful:
- Login & Authentication: Secure user access with robust authentication.
- User Registration: Onboard new users seamlessly with a registration system.
- Personalized Chat History: Store and manage conversation history for each user.
- Expanded File Support: Add compatibility for more file types to enhance document processing capabilities.
This project is licensed under the MIT License. For more details, see the LICENSE file.
Made by Romani – an AI Engineer and Backend Developer. Feel free to reach out for collaborations, questions, or new projects! You can contact me via email: romani.nasrat@gmail.com
You can also find me on: