A RAG-based DSA code generator built with LangChain, Gemini/Ollama, and FAISS, using hybrid search (a BM25 + FAISS ensemble). Upload DSA PDFs, index them, and generate structured solutions (Brute Force → Improved → Optimal).
- Python 3.10+
- Streamlit - Web UI
- LangChain - RAG framework
- Ollama (Qwen / Mistral / Llama) - Local LLM inference
- Google Gemini API (optional) - Cloud LLM support
- FAISS - Vector database for semantic search
- BM25 (rank_bm25) - Keyword-based search
- Ensemble Retriever - Hybrid search combining BM25 and FAISS
- Automated Evaluation Framework - Retrieval & Generation scoring pipeline
- Hybrid Search: Combines keyword-based (BM25) and semantic-based (FAISS) retrieval for improved accuracy (see the sketch after this list)
- Multi-Language Support: Supports C++, Java, and Python DSA problems
- Structured Solutions: Generates three approaches - Brute Force, Improved, and Optimal
- RAG Pipeline: Uses relevant context from DSA PDFs to generate grounded solutions
- Local LLM Support: Fully offline inference using Ollama models
- Automated Evaluation: Built-in benchmarking system to measure retrieval and generation quality
- Streamlit UI: Clean and intuitive web interface
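To make the hybrid search concrete, here is a minimal sketch of how the two indexes can be built over the same chunks and merged into a single retriever. The function name, chunk sizes, weights, and embedding model are illustrative assumptions rather than what `app/indexer.py` and `app/rag_engine.py` actually use, and LangChain import paths vary across versions.

```python
# Hypothetical hybrid-retriever setup (names and parameters are assumptions).
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever

def build_hybrid_retriever(pdf_path: str, k: int = 5) -> EnsembleRetriever:
    # Load the DSA PDF and split it into overlapping chunks.
    docs = PyPDFLoader(pdf_path).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    chunks = splitter.split_documents(docs)

    # Semantic index (FAISS) and keyword index (BM25) over the same chunks.
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    faiss_retriever = FAISS.from_documents(chunks, embeddings).as_retriever(search_kwargs={"k": k})
    bm25_retriever = BM25Retriever.from_documents(chunks)
    bm25_retriever.k = k

    # Hybrid search: merge both ranked lists with equal weights.
    return EnsembleRetriever(retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5])
```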
This project includes a fully automated benchmarking pipeline to measure real RAG performance instead of manual inspection.
1️⃣ Retrieval Evaluation
- Dataset of 50+ DSA problems
- Hybrid retriever returns top-k context
- Checks if important algorithmic keywords exist in retrieved chunks
- Produces Retrieval Hit Rate
Result: 📌 Retrieval Accuracy ≈ 79%
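For illustration, a hit rate of this kind can be computed roughly as below, assuming each entry in `evaluator/dataset.json` carries a `question` and a list of expected `keywords` (those field names are assumptions; `retrieval_eval.py` may differ).

```python
# Hypothetical hit-rate check (dataset field names are assumptions).
import json

def retrieval_hit_rate(retriever, dataset_path: str = "evaluator/dataset.json", k: int = 5) -> float:
    with open(dataset_path) as f:
        dataset = json.load(f)

    hits = 0
    for item in dataset:
        # A query counts as a "hit" if any expected algorithmic keyword
        # (e.g. "two pointers", "hash map") appears in the top-k chunks.
        chunks = retriever.get_relevant_documents(item["question"])[:k]
        text = " ".join(c.page_content for c in chunks).lower()
        if any(kw.lower() in text for kw in item["keywords"]):
            hits += 1
    return hits / len(dataset)
```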
2️⃣ Generation Evaluation
A separate judge LLM evaluates generated answers on:
- Relevance to the question
- Faithfulness to retrieved context
- Completeness of explanation
- Clarity of response
Each metric is scored from 1 → 5
Average Scores
| Metric | Score |
|---|---|
| Relevance | 4.16 / 5 |
| Faithfulness | 4.17 / 5 |
| Completeness | 4.03 / 5 |
| Clarity | 4.19 / 5 |
📌 Overall grounded response quality ≈ 78%
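A minimal sketch of the judge step, assuming an Ollama-served judge model that replies with JSON; the actual prompt wording, model, and parsing in `generation_eval.py` may differ.

```python
# Hypothetical LLM-as-judge scoring (prompt, model, and parsing are assumptions).
import json
from langchain_community.chat_models import ChatOllama

JUDGE_PROMPT = """You are grading a DSA answer.
Question: {question}
Retrieved context: {context}
Answer: {answer}

Score each criterion from 1 to 5 and reply with JSON only:
{{"relevance": _, "faithfulness": _, "completeness": _, "clarity": _}}"""

def judge_answer(question: str, context: str, answer: str) -> dict:
    judge = ChatOllama(model="mistral", temperature=0)  # judge model is an assumption
    reply = judge.invoke(JUDGE_PROMPT.format(question=question, context=context, answer=answer))
    return json.loads(reply.content)  # e.g. {"relevance": 4, "faithfulness": 5, ...}
```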
Dataset → Retriever → Context → Student LLM → Generated Answer → Judge LLM → Scoring → Report
This allows repeatable and objective RAG benchmarking.
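A compact sketch of that flow, wiring a retriever, a student LLM, and a judge function (such as `judge_answer` from the sketch above) into one loop; `run_all.py` may be structured differently.

```python
# Hypothetical end-to-end benchmark loop (structure is an assumption).
from statistics import mean

def run_benchmark(retriever, student_llm, dataset, judge_fn) -> dict:
    scores = []
    for item in dataset:
        chunks = retriever.get_relevant_documents(item["question"])
        context = "\n\n".join(c.page_content for c in chunks)
        answer = student_llm.invoke(f"Context:\n{context}\n\nQuestion: {item['question']}").content
        scores.append(judge_fn(item["question"], context, answer))

    # Average each metric across the dataset into a report.
    return {m: mean(s[m] for s in scores)
            for m in ("relevance", "faithfulness", "completeness", "clarity")}
```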
- Python 3.10 or higher
- Ollama installed (recommended)
- (Optional) Google Gemini API key
```bash
git clone https://github.com/Devamsingh09/AI-DSA-Assistant.git
cd AI-DSA-Assistant
pip install -r requirements.txt
```

(Optional, for Gemini) set your API key, e.g. in a `.env` file:

```bash
GOOGLE_API_KEY=your_api_key_here
```

Run the app:

```bash
streamlit run app/main.py
```

The app will be available at http://localhost:8501
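If you use Ollama for local inference, pull a model before starting the app. The exact model name depends on what `app/setup.py` is configured to use; `mistral` below is only an example.

```bash
ollama pull mistral
```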
```
AI-DSA-Assistant/
├── app/
│   ├── indexer.py           # Creates FAISS and BM25 indexes
│   ├── rag_engine.py        # Hybrid search implementation
│   ├── setup.py             # Configuration and constants
│   └── main.py              # Streamlit UI
│
├── evaluator/
│   ├── dataset.json         # 50+ DSA benchmark questions
│   ├── retrieval_eval.py    # Retrieval quality scoring
│   ├── generation_eval.py   # LLM judge scoring
│   └── run_all.py           # Complete evaluation pipeline
│
├── data/pdfs/               # DSA documents
├── faiss_indexes/           # Generated vector indexes
├── requirements.txt
└── README.md
```
1. Indexing
   - PDFs split into chunks
   - Stored in FAISS (semantic) + BM25 (keyword)
2. Hybrid Retrieval
   - Ensemble Retriever combines both methods
3. Context Generation
   - Top relevant chunks retrieved
4. Code Generation (see the sketch after this list)
   - Local LLM generates a structured solution
5. Evaluation (Research Mode)
   - Retriever accuracy measured
   - Generated answers judged by a separate LLM
   - Produces an automated performance report
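A minimal sketch of the code-generation step (step 4), assuming a prompt that asks for the three approaches in order; the prompt wording and model name are assumptions and `rag_engine.py` may differ.

```python
# Hypothetical structured-solution generation (prompt and model are assumptions).
from langchain_community.chat_models import ChatOllama

SOLUTION_PROMPT = """You are a DSA tutor. Using ONLY the context below, solve the problem in {language}.
Give three approaches, in this order: Brute Force, Improved, Optimal.
For each approach include the idea, the code, and time/space complexity.

Context:
{context}

Problem:
{question}"""

def generate_solution(retriever, question: str, language: str = "Python") -> str:
    # Retrieve grounding context, then ask the local LLM for the three approaches.
    chunks = retriever.get_relevant_documents(question)
    context = "\n\n".join(c.page_content for c in chunks)
    llm = ChatOllama(model="qwen2.5-coder", temperature=0.2)  # model name is an assumption
    return llm.invoke(SOLUTION_PROMPT.format(language=language, context=context, question=question)).content
```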
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- LangChain for the RAG framework
- Google for the Gemini API
- Ollama for local LLM inference
- FAISS for efficient vector search
- Streamlit for the web interface