
RAG Chatbot System

A Retrieval Augmented Generation (RAG) chatbot system that allows you to ask questions about your documents using local LLM models (Ollama) or cloud models (OpenRouter).

Features

  • 🤖 LLM Support: Ollama (local) and OpenRouter (cloud) models
  • 📚 Document Processing: PDF, TXT, MD and other text formats
  • 🔗 PDF Linking: Clickable PDF links with navigation to specific pages
  • 🗂️ Vector Stores: Management of multiple document collections
  • 💬 Web Interface: Complete Gradio interface with sidebar for sources
  • 🔍 Source Citations: References to original documents with metadata
  • 🚀 REST API: FastAPI backend with comprehensive documentation
  • 💭 Conversation Memory: Automatic conversation history tracking
  • 📊 System Monitoring: Real-time status dashboard

Quick Installation

1. Environment Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Configuration

Create .env file:

# OpenRouter (for cloud models)
OPENROUTER_API_KEY=your_api_key_here

# Ollama (for local models)
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_LOCAL_MODEL=llama3
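These variables can be read at startup with a small helper. The sketch below uses only the standard library and mirrors the variable names from the `.env` above; the function itself is illustrative and is not the project's actual configuration module (which may use python-dotenv or pydantic settings):

```python
import os

def load_settings() -> dict:
    """Read RAG settings from the environment, falling back to the
    same defaults shown in the .env example."""
    return {
        "openrouter_api_key": os.getenv("OPENROUTER_API_KEY", ""),
        "ollama_base_url": os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"),
        "default_local_model": os.getenv("DEFAULT_LOCAL_MODEL", "llama3"),
    }
```

Note that a real `.env` file is not read automatically by `os.getenv`; it must first be exported into the environment or loaded with a library such as python-dotenv.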

3. Ollama Setup (Optional)

# Install Ollama from https://ollama.ai
ollama pull llama3
ollama pull nomic-embed-text  # For embeddings

Startup

Automatic Method (Recommended)

source venv/bin/activate
python start_system.py

Access:

Manual Method

# Terminal 1 - API
python start_api.py

# Terminal 2 - UI
python start_ui.py

Usage

1. Create Vector Store

  • Go to "Vector Store Management"
  • Create a new store with a descriptive name

2. Upload Documents

  • Go to "Document Management"
  • Select file and vector store
  • Click "Upload Document"

3. Chat

  • Go to "Chat"
  • Select vector store and model type
  • Start asking questions!

4. File Linking

When asking questions about documents:

  • Sources appear in the right sidebar
  • File links are clickable and open the document

API Endpoints

Core Endpoints

  • POST /api/v1/chat - Chat with documents
  • POST /api/v1/upload-document - Upload document
  • GET /api/v1/documents/file/{file_path} - Serve PDF file (with optional page parameter)
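A chat request can be issued from any HTTP client. The sketch below uses only the standard library; the base URL (FastAPI's common default port 8000) and the payload field names (`question`, `vector_store`, `model_type`) are assumptions — check the interactive docs at `/docs` for the exact request schema:

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # assumed default FastAPI port

def build_chat_payload(question: str, store: str, model_type: str = "local") -> dict:
    """Assemble an illustrative JSON body for POST /api/v1/chat.
    Field names are assumptions; verify against the API docs."""
    return {"question": question, "vector_store": store, "model_type": model_type}

def chat(question: str, store: str) -> dict:
    """Send the question to the chat endpoint and return the parsed response."""
    req = urllib.request.Request(
        f"{API_BASE}/api/v1/chat",
        data=json.dumps(build_chat_payload(question, store)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```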

Vector Store Management

  • GET /api/v1/vector-stores - List vector stores
  • POST /api/v1/vector-stores - Create vector store
  • DELETE /api/v1/vector-stores/{name} - Delete vector store
  • GET /api/v1/vector-stores/{name}/info - Get store information and status

System & History

  • GET /api/v1/conversation/history - Retrieve conversation history
  • DELETE /api/v1/conversation/history - Clear conversation history
  • GET /api/v1/status - System status and component health

Testing

Test Runner (Recommended)

# Run all available tests
python run_tests.py

Individual Tests

# Basic unit tests
python -m pytest tests/test_basic.py -v

# System component tests
python tests/test_real_system.py

# End-to-end API tests (requires running API)
python tests/test_end_to_end.py

Troubleshooting

Ollama not connected:

ollama serve  # Start Ollama
ollama list   # Check models

Embeddings error:

ollama pull nomic-embed-text

Vector store not found:

  • Create the vector store first in the interface

Project Structure

├── app/
│   ├── api/          # FastAPI endpoints
│   ├── core/         # Configuration and settings
│   ├── models/       # Data schemas and models
│   ├── services/     # Business logic and services
│   └── ui/           # Gradio web interface
├── tests/            # Automated tests
├── vector_stores/    # Vector databases (created automatically)
├── documents/        # Saved PDF files (created automatically)
└── requirements.txt  # Python dependencies

Additional Features

Web Interface

  • System Status Tab: Monitor system components and configurations
  • Conversation History: View and manage chat history with automatic tracking
  • Performance Metrics: Real-time system status and component health

Document Processing

  • Intelligent Text Splitting: Advanced text chunking for optimal retrieval
  • Multiple Format Support: PDF, TXT, MD and other text formats
  • Automatic Metadata: Extraction of document metadata and source information
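The idea behind intelligent text splitting is to cut documents into chunks small enough for embedding while overlapping neighbors so context that straddles a boundary survives in at least one chunk. A minimal character-level sketch of the general technique (not the project's actual splitter, which likely uses LangChain's text splitters):

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content near a
    chunk boundary is repeated at the start of the next chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Production splitters additionally prefer to cut on paragraph and sentence boundaries rather than at fixed character offsets.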
