
🏥 Clinical Trial RAG Evaluation System

A comprehensive RAG (Retrieval-Augmented Generation) evaluation system with real-time metrics, document chat, and performance analysis using Google Gemini API and DeepEval.

Features

  • 🤖 Google Gemini 2.0 Flash-Lite: Fast, lightweight Gemini model for low-latency responses
  • 📊 Real-time Evaluation: DeepEval metrics with comprehensive analysis
  • 📄 Document Processing: Upload and chat with PDF documents
  • 🧭 Multi-page Navigation: Separate sections for chat, logs, and settings
  • 📈 Performance Analytics: Track and compare evaluation metrics over time
  • 🎯 Top-K Optimization: Find optimal retrieval parameters
  • 💬 Chat Memory: Maintains conversation history and context
  • 📚 Source References: Shows document chunks used for answers

Quick Start

  1. Install dependencies:

    pip install -r requirements.txt
  2. Run the application:

    streamlit run app.py
  3. Get your Gemini API key:

    • Go to Google AI Studio
    • Create a new API key
    • Enter it in the sidebar of the app
  4. Start using the system:

    • Navigate between Chat, Evaluation Logs, and Settings
    • Upload PDF documents for analysis
    • Enable real-time evaluation to track performance
    • Ask questions and view comprehensive metrics

Navigation System

💬 Chat & Evaluation

  • Interactive chat interface with document Q&A
  • Real-time evaluation metrics (Answer Relevancy, Faithfulness, etc.)
  • Source attribution and chunk analysis
  • Configurable Top-K retrieval settings

📊 Evaluation Logs

  • Comprehensive performance analytics
  • Historical evaluation data with trends
  • Top-K performance comparison
  • CSV export functionality
  • Performance recommendations

⚙️ Settings

  • System configuration overview
  • Data management (clear history, logs, documents)
  • API status monitoring
  • System information and statistics

How It Works

  1. Document Processing: PDFs are processed into text chunks with embeddings
  2. Smart Retrieval: Finds relevant chunks using semantic similarity
  3. Response Generation: Gemini generates context-aware responses
  4. Real-time Evaluation: DeepEval metrics assess response quality
  5. Performance Tracking: Logs and analyzes evaluation results over time
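
For orientation, the sketch below compresses this pipeline into one short script. It is a minimal illustration, not the repo's actual code: the helper names, the naive fixed-size chunking, and the prompt template are assumptions, while the libraries and model names are the ones listed under Technical Stack.

```python
# Minimal RAG pipeline sketch -- illustrative only, not the app's actual code.
# Assumes the Technical Stack libraries are installed:
#   pip install PyPDF2 sentence-transformers qdrant-client google-generativeai
from PyPDF2 import PdfReader
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
import google.generativeai as genai

genai.configure(api_key="YOUR_GEMINI_API_KEY")  # from Google AI Studio
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # 384-dim
client = QdrantClient(":memory:")  # in-memory Qdrant, as in the stack below

def chunk_pdf(path: str, size: int = 500) -> list[str]:
    """Step 1: extract PDF text and split it into fixed-size chunks."""
    text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
    return [text[i:i + size] for i in range(0, len(text), size)]

def index(chunks: list[str]) -> None:
    """Embed each chunk and store it alongside its text."""
    client.create_collection("docs", vectors_config=VectorParams(size=384, distance=Distance.COSINE))
    points = [PointStruct(id=i, vector=v.tolist(), payload={"text": c})
              for i, (v, c) in enumerate(zip(embedder.encode(chunks), chunks))]
    client.upsert("docs", points)

def answer(question: str, top_k: int = 5) -> str:
    """Steps 2-3: retrieve the top-k chunks, then generate a grounded answer."""
    hits = client.search("docs", query_vector=embedder.encode(question).tolist(), limit=top_k)
    context = "\n".join(h.payload["text"] for h in hits)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return genai.GenerativeModel("gemini-2.0-flash-lite").generate_content(prompt).text
```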

Evaluation Metrics

  • Answer Relevancy: Measures how relevant the response is to the question
  • Faithfulness: Checks if the response aligns with the provided context
  • Contextual Relevancy: Evaluates context relevance to the question
  • Contextual Recall: Measures context completeness (auto-generated expected output)
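
For a concrete sense of how these scores are produced, the DeepEval snippet below builds one test case and runs two of the metrics on it. The strings are placeholder data; note also that DeepEval judges with an OpenAI model by default, so reproducing this app's Gemini-judged setup would additionally need a custom model wrapper that is omitted here.

```python
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric

# Placeholder data: a question, the generated answer, and the retrieved chunks.
test_case = LLMTestCase(
    input="What was the trial's primary endpoint?",
    actual_output="The primary endpoint was progression-free survival at 12 months.",
    retrieval_context=[
        "The study's primary endpoint was progression-free survival (PFS) at 12 months."
    ],
)

# Each metric returns a 0-1 score plus a natural-language reason.
for metric in (AnswerRelevancyMetric(threshold=0.7), FaithfulnessMetric(threshold=0.7)):
    metric.measure(test_case)
    print(type(metric).__name__, metric.score, metric.reason)
```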

Configuration

  • API Key: Enter your Gemini API key in the sidebar
  • Document Upload: Upload PDFs for analysis
  • Evaluation Settings: Enable/disable metrics and set thresholds
  • Top-K Settings: Adjust the number of retrieved chunks (1-20)

Technical Stack

  • LLM: Google Gemini 2.0 Flash-Lite
  • Evaluation: DeepEval framework
  • Embeddings: sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
  • Vector Database: Qdrant (in-memory)
  • Framework: Streamlit with custom CSS
  • Document Processing: PyPDF2
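
A quick standalone check (not the repo's code) showing how two of these pieces fit together: the MiniLM model emits 384-dimensional vectors, which is exactly the size the in-memory Qdrant collection must be configured with.

```python
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vector = embedder.encode("clinical trial inclusion criteria")
assert vector.shape == (384,)  # matches the 384 dimensions listed above

# The in-memory Qdrant collection must be sized to the model's output.
client = QdrantClient(":memory:")
client.create_collection("demo", vectors_config=VectorParams(size=384, distance=Distance.COSINE))
```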

Project Structure

├── app.py                   # Main Streamlit application
├── document_processor.py    # PDF processing and text extraction
├── embedding_generator.py   # Embedding generation utilities
├── rag_evaluator.py         # DeepEval integration and metrics
├── rate_limiter.py          # API rate limiting
├── vector_database.py       # Qdrant vector database operations
├── requirements.txt         # Python dependencies
└── README.md                # This file

Troubleshooting

Common Issues

  1. API Overload (503 errors): The Gemini API is experiencing high load

    • Wait 2-3 minutes and try again
    • Use fewer evaluation metrics
    • Try during off-peak hours
  2. Invalid API Key (400 errors): Check your Gemini API key

    • Verify the key is correct in the sidebar
    • Ensure the key has proper permissions
  3. Rate Limiting (429 errors): Too many requests

    • Built-in rate limiting should prevent this (see the sketch after this list)
    • Wait before making more requests
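
The built-in limiter lives in rate_limiter.py; its exact contents aren't reproduced here, but a client-side limiter for this purpose is typically a small sliding window like the sketch below. The 15-requests-per-minute budget is an assumption for illustration, not the repo's actual setting.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Block until a request slot frees up inside the rolling window."""

    def __init__(self, max_requests: int = 15, window_s: float = 60.0):
        # 15 requests/minute is an assumed free-tier budget, not the repo's setting.
        self.max_requests = max_requests
        self.window_s = window_s
        self.calls: deque[float] = deque()

    def acquire(self) -> None:
        while True:
            now = time.monotonic()
            # Drop timestamps that have aged out of the window.
            while self.calls and now - self.calls[0] >= self.window_s:
                self.calls.popleft()
            if len(self.calls) < self.max_requests:
                self.calls.append(now)
                return
            # Sleep until the oldest call leaves the window, then re-check.
            time.sleep(self.window_s - (now - self.calls[0]))

limiter = SlidingWindowLimiter()
limiter.acquire()  # call before each Gemini request to stay under 429s
```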

Performance Tips

  • Start with fewer evaluation metrics to test the system
  • Use Top-K values between 3 and 7 for optimal performance
  • Upload smaller documents initially to test functionality
  • Monitor the Evaluation Logs page for performance insights

Advanced Features

  • Auto-generated Expected Output: For contextual recall evaluation
  • Performance Comparison: Compare different Top-K values
  • Export Functionality: Download evaluation logs as CSV
  • 3D UI Effects: Modern, professional interface design
  • Dual Color System: Light navigation, dark content for optimal UX
