
🚀 Multilingual Agentic RAG System

A production-ready Retrieval-Augmented Generation (RAG) system with multilingual support and an agentic architecture built on open-source LLMs.

Demo video: https://github.com/AdityaJ9801/Multilingual-Agentic-RAG/blob/5c18ef9cf0da327051cd6a106a7e4f29762abb14/Agentic_RAG_demo_video.webm

✨ Key Features

🌍 Multilingual Support

  • ✅ 5 languages: English, Spanish, French, Chinese, Arabic
  • ✅ Automatic language detection
  • ✅ Multilingual embeddings
  • ✅ Responses in query language
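To illustrate the idea behind automatic language detection, here is a deliberately simplified sketch. The real system most likely uses a detection library rather than this Unicode-range and stop-word heuristic; the function below only shows how a query could be mapped to one of the five supported language codes.

```python
def detect_language(text: str) -> str:
    """Best-effort ISO 639-1 code for a query (illustrative heuristic only)."""
    # Chinese: any CJK Unified Ideograph.
    if any("\u4e00" <= ch <= "\u9fff" for ch in text):
        return "zh"
    # Arabic: any character in the basic Arabic block.
    if any("\u0600" <= ch <= "\u06ff" for ch in text):
        return "ar"
    words = text.lower().split()
    # Crude stop-word cues for the Latin-script languages.
    if any(w in words for w in ("el", "la", "qué", "es", "los")):
        return "es"
    if any(w in words for w in ("le", "est", "quoi", "les", "une")):
        return "fr"
    return "en"

# detect_language("什么是机器学习")            -> "zh"
# detect_language("What is machine learning?") -> "en"
```

A production detector would handle mixed scripts and short queries far more robustly; this is only meant to make the routing idea concrete.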

🤖 Agentic Architecture

  • Router Agent: Routes queries to specialized handlers
  • Retrieval Agent: Vector search and document retrieval
  • Synthesis Agent: Generates responses using LLM
  • Validation Agent: Fact-checking and quality validation
  • Orchestrator pattern for agent collaboration
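The orchestrator pattern above can be sketched as follows. The agent names match this README, but the class interfaces, method names, and wiring are assumptions for illustration, not the project's actual code.

```python
class RouterAgent:
    def route(self, query: str) -> str:
        # A real router would classify the query; this sketch has one handler.
        return "rag"

class RetrievalAgent:
    def retrieve(self, query: str, top_k: int = 5) -> list[str]:
        # Stand-in for vector search against the document store.
        return [f"doc chunk relevant to: {query}"][:top_k]

class SynthesisAgent:
    def synthesize(self, query: str, context: list[str]) -> str:
        # Stand-in for prompting the LLM with retrieved context.
        return f"Answer to '{query}' based on {len(context)} chunk(s)."

class ValidationAgent:
    def validate(self, answer: str, context: list[str]) -> bool:
        # A real validator would fact-check the answer against the context.
        return bool(answer and context)

class Orchestrator:
    """Coordinates the four agents for a single query."""
    def __init__(self) -> None:
        self.router, self.retriever = RouterAgent(), RetrievalAgent()
        self.synthesizer, self.validator = SynthesisAgent(), ValidationAgent()

    def answer(self, query: str) -> str:
        _ = self.router.route(query)                 # 1. route
        context = self.retriever.retrieve(query)     # 2. retrieve
        answer = self.synthesizer.synthesize(query, context)  # 3. synthesize
        if not self.validator.validate(answer, context):      # 4. validate
            return "Unable to produce a validated answer."
        return answer
```

The point of the pattern is that the orchestrator owns the control flow while each agent stays single-purpose and independently replaceable.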

📦 Production Ready

  • ✅ Fully tested (5/5 tests passed)
  • ✅ Docker containerized
  • ✅ Streamlit web interface
  • ✅ REST API with FastAPI
  • ✅ Vector database (Qdrant)
  • ✅ Local LLM (Ollama)

📋 Prerequisites

  • ✅ Docker & Docker Compose (v20.10+)
  • ✅ Python 3.9+
  • ✅ 8GB RAM minimum (16GB recommended)
  • ✅ 20GB disk space
  • ✅ Linux/macOS or Windows with WSL2

🚀 Quick Start (5 Minutes)

Step 1: Clone Project

git clone https://github.com/AdityaJ9801/Multilingual-Agentic-RAG.git
cd Multilingual-Agentic-RAG

Step 2: Start Services

docker-compose up -d
sleep 60  # Wait for services to initialize
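Instead of a fixed sleep, you can poll the API's health endpoint until the stack responds. The endpoint path is the one used elsewhere in this README; treating any JSON body with a truthy "status" field as healthy is an assumption about the response shape.

```python
import json
import time
from urllib import error, request

HEALTH_URL = "http://localhost:8000/api/v1/health"

def is_healthy(body: bytes) -> bool:
    """Assume any JSON response with a truthy 'status' field means healthy."""
    try:
        return bool(json.loads(body).get("status"))
    except (ValueError, AttributeError):
        return False

def wait_for_api(timeout: float = 120.0, interval: float = 5.0) -> bool:
    """Poll the health endpoint until it answers or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with request.urlopen(HEALTH_URL, timeout=5) as resp:
                if is_healthy(resp.read()):
                    return True
        except (error.URLError, OSError):
            pass  # API not up yet; keep polling
        time.sleep(interval)
    return False

# wait_for_api()  # returns True once the stack answers on port 8000
```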

Step 3: Ingest Sample Data

bash scripts/ingest_sample_data.sh

Step 4: Install Streamlit

pip install -r streamlit_requirements.txt

Step 5: Launch Application

streamlit run streamlit_app.py

Step 6: Access Application

Open http://localhost:8501 in your browser. The REST API is available at http://localhost:8000.

📖 Usage

Via Streamlit Interface (Recommended)

  1. Open http://localhost:8501
  2. Go to Query tab
  3. Enter your query in any language
  4. Click Submit
  5. View results with sources

Via API

Query the System:

curl -X POST "http://localhost:8000/api/v1/query" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is machine learning?",
    "language": "en",
    "top_k": 5,
    "include_sources": true
  }'
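The same query can be issued from Python using only the standard library. The endpoint and payload fields mirror the curl example above; the response is assumed to be JSON.

```python
import json
from urllib import request

API_URL = "http://localhost:8000/api/v1/query"

def build_payload(query: str, language: str = "en", top_k: int = 5,
                  include_sources: bool = True) -> bytes:
    """Encode the JSON body expected by the /api/v1/query endpoint."""
    return json.dumps({
        "query": query,
        "language": language,
        "top_k": top_k,
        "include_sources": include_sources,
    }).encode("utf-8")

def ask(query: str, **kwargs) -> dict:
    """POST a query to the running API and return the parsed JSON response."""
    req = request.Request(
        API_URL,
        data=build_payload(query, **kwargs),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires the stack to be running
        return json.loads(resp.read())

# result = ask("What is machine learning?")
```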

Upload Documents:

curl -X POST "http://localhost:8000/api/v1/ingest" \
  -F "file=@document.txt"

List Documents:

curl http://localhost:8000/api/v1/documents

Check Health:

curl http://localhost:8000/api/v1/health

Configuration

Edit .env file to customize:

  • OLLAMA_MODEL: LLM model to use (mistral, llama2, etc.)
  • OLLAMA_TEMPERATURE: Response creativity (0.0-1.0)
  • CHUNK_SIZE: Document chunk size in characters
  • SUPPORTED_LANGUAGES: Comma-separated language codes
  • EMBEDDING_MODEL: Multilingual embedding model
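A .env along these lines would exercise all of the settings above. The variable names come from the list; the values are only illustrative defaults, not the project's shipped configuration.

```ini
OLLAMA_MODEL=mistral
OLLAMA_TEMPERATURE=0.7
CHUNK_SIZE=1000
SUPPORTED_LANGUAGES=en,es,fr,zh,ar
EMBEDDING_MODEL=paraphrase-multilingual-MiniLM-L12-v2
```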

Architecture

┌──────────────────────────────────────────────────────────┐
│                     FastAPI Gateway                      │
│              (REST API, Request Validation)              │
└────────────────────────────┬─────────────────────────────┘
                             │
            ┌────────────────┴────────────────┐
            │                                 │
     ┌──────▼──────┐                 ┌────────▼───────┐
     │  Ingestion  │                 │  Query Engine  │
     │  Pipeline   │                 │ (Orchestrator) │
     └──────┬──────┘                 └────────┬───────┘
            │                                 │
            │               ┌─────────────────┼─────────────────┐
            │               │                 │                 │
     ┌──────▼──────┐   ┌────▼─────┐     ┌─────▼─────┐     ┌─────▼─────┐
     │  Document   │   │  Router  │     │ Retrieval │     │ Synthesis │
     │  Processor  │   │  Agent   │     │   Agent   │     │   Agent   │
     └──────┬──────┘   └──────────┘     └───────────┘     └─────┬─────┘
            │                                                   │
     ┌──────▼─────────────┐                           ┌─────────▼────┐
     │  Embeddings        │                           │  Validation  │
     │  (Sentence-Trans)  │                           │  Agent       │
     └──────┬─────────────┘                           └──────────────┘
            │
     ┌──────▼────────────┐
     │  Vector Database  │
     │  (Qdrant)         │
     └───────────────────┘

     ┌──────────────────┐
     │  LLM Service     │
     │  (Ollama)        │
     └──────────────────┘

Supported File Formats

  • PDF: .pdf (via pdfplumber and PyPDF2)
  • Text: .txt (UTF-8, Latin-1, CP1252)
  • Markdown: .md
  • JSON: .json
  • CSV: .csv
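Dispatching on file extension over the formats above can be sketched with the standard library. The project's actual loaders (e.g. pdfplumber and PyPDF2 for PDF) are not reproduced here, and the function names are illustrative, not the project's API.

```python
import csv
import json
from pathlib import Path

def load_text(path: Path) -> str:
    """Try the encodings listed above; latin-1 last, since it never fails."""
    for enc in ("utf-8", "cp1252", "latin-1"):
        try:
            return path.read_text(encoding=enc)
        except UnicodeDecodeError:
            continue
    raise ValueError(f"Undecodable file: {path}")

def load_document(path: str) -> str:
    """Extract plain text from a supported file, keyed on its extension."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix in (".txt", ".md"):
        return load_text(p)
    if suffix == ".json":
        # Round-trip to normalize whitespace and validate the JSON.
        return json.dumps(json.loads(load_text(p)), ensure_ascii=False)
    if suffix == ".csv":
        with p.open(newline="") as f:
            return "\n".join(" ".join(row) for row in csv.reader(f))
    raise ValueError(f"Unsupported format: {suffix}")
```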

API Endpoints

Method   Endpoint                  Description
POST     /api/v1/ingest            Upload and process documents
POST     /api/v1/query             Submit queries and get responses
GET      /api/v1/documents         List ingested documents
DELETE   /api/v1/documents/{id}    Remove a document
GET      /api/v1/health            Health check
GET      /api/v1/agents/status     Agent status

πŸ› Troubleshooting

Port Already in Use

docker-compose down
docker-compose up -d

Services Not Starting

docker-compose logs
docker-compose restart

Streamlit Connection Error

# Verify API is running
curl http://localhost:8000/api/v1/health

# Check Streamlit logs in terminal

No Documents Found

# Re-ingest sample data
bash scripts/ingest_sample_data.sh

Slow Responses

  • Check Docker resources: docker stats
  • Verify Ollama is running: docker-compose ps
  • Reduce top_k parameter in queries

πŸ“ Project Structure

multi_agentic_rag/
├── app/                    # Application code
├── scripts/                # Helper scripts
├── sample_data/            # Sample documents
├── streamlit_app.py        # Streamlit frontend
├── docker-compose.yml      # Docker configuration
├── requirements.txt        # Dependencies
├── INSTALLATION_GUIDE.md   # Installation steps
├── ARCHITECTURE.md         # System design
├── API_DOCS.md             # API documentation
└── README.md               # This file

🛑 Stopping Services

# Stop all services
docker-compose down

# Stop and remove volumes
docker-compose down -v
