Re:You is an AI-powered semantic search and code understanding engine that lets developers recall past implementations, retrieve code snippets, explore commit history, and understand features across repositories — all through natural language queries.
It acts as a long-term memory layer for developers and teams.
- 🔍 Semantic Search Across Repos — Search by meaning, not just keywords.
- 🧠 Contextual Q&A — Ask “How does login work?” and get structured code answers.
- 📄 Code Snippet Retrieval with file paths and metadata.
- 🕒 Commit Insights — Understand how features changed over time.
- 📚 Embeddings-Based Indexing of functions, classes, and commits.
- ⚡ RAG Pipeline using Groq LLM + Chroma vector store.
- GitHub repo cloning (local for MVP)
- Code extraction using AST (Python) & regex (JS)
- Commit extraction using Git
- Chunking functions/classes with metadata
- Embeddings via MiniLM (current)
- ChromaDB as vector store
- Metadata stored alongside chunks
- Hybrid retrieval & reranking (future)
- Retrieval-Augmented Generation (RAG)
- LLM: DeepSeek LLaMA model served through the Groq API
- Structured answers with citations
- CLI demo interface (frontend coming soon)
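
The AST-based extraction and chunking steps listed above could look roughly like the sketch below. It is not the project's actual extractor: the function name `extract_chunks`, the metadata keys, and the sample source are assumptions made for illustration, using only Python's standard `ast` module.

```python
import ast


def extract_chunks(source: str, file_path: str) -> list[dict]:
    """Return one chunk per function or class definition found in `source`."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                # Exact source text of the definition, later fed to the embedder.
                "text": ast.get_source_segment(source, node),
                # Hypothetical metadata layout; the real schema may differ.
                "metadata": {
                    "file_path": file_path,
                    "name": node.name,
                    "kind": "class" if isinstance(node, ast.ClassDef) else "function",
                    "start_line": node.lineno,
                    "end_line": node.end_lineno,
                },
            })
    return chunks


# Made-up sample file contents, used only to exercise the function.
SAMPLE = '''\
def login(user, password):
    return check(user, password)


class Session:
    def refresh(self):
        pass
'''

chunks = extract_chunks(SAMPLE, "auth/login.py")
for chunk in chunks:
    meta = chunk["metadata"]
    print(meta["kind"], meta["name"], meta["start_line"], meta["end_line"])
```

Because `ast.walk` visits nested nodes too, methods such as `refresh` become their own chunks in addition to the enclosing class. Whether that overlap is desirable depends on how the chunks are embedded and retrieved.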
    python -m venv venv
    source venv/Scripts/activate

If using Linux/macOS:

    source venv/bin/activate

Once your virtual environment is active:

    pip install -r requirements.txt

Create a .env file in the project root:

    GROQ_API_KEY=your_key_here
    GITHUB_ACCESS_TOKEN=your_github_access_token

Generate the embeddings, then run the Q&A service:

    python embeddings/store_embeddings.py
    python qa/qa_service.py

devmemory/
│── extraction/ # Code + commit extraction
│── embeddings/ # Chunking + vector generation
│── qa/ # RAG pipeline / answer generation
│── retrieval/ # Retrieval logic (vector search)
│── ingestion/ # Repo ingestion + parsing
│── vector_store/ # Auto-generated embeddings DB
│── data/repo/ # Your cloned GitHub repo
│── README.md
│── requirements.txt
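
To give a feel for how the `retrieval/` and `qa/` stages fit together, here is a minimal, dependency-free sketch of vector search followed by prompt assembly with numbered citations. It is an illustration under stated assumptions, not the project's code: the three-dimensional vectors are toy stand-ins for MiniLM embeddings, the in-memory `INDEX` list stands in for the ChromaDB collection, and `top_k` and `build_prompt` are hypothetical helper names. The resulting prompt is what would then be sent to the LLM.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k index entries most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Number each retrieved chunk so the answer can cite it as [n]."""
    sources = "\n\n".join(
        f"[{i}] {hit['file_path']}\n{hit['text']}" for i, hit in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n\n"
        f"{sources}\n\nQuestion: {question}"
    )


# Toy index: made-up chunks and vectors, standing in for the vector store.
INDEX = [
    {"file_path": "auth/login.py", "text": "def login(user, password): ...", "vector": [0.9, 0.1, 0.0]},
    {"file_path": "billing/invoice.py", "text": "def create_invoice(order): ...", "vector": [0.0, 0.2, 0.9]},
    {"file_path": "auth/session.py", "text": "class Session: ...", "vector": [0.7, 0.6, 0.1]},
]

query_vec = [1.0, 0.2, 0.0]  # stand-in for the embedded question
hits = top_k(query_vec, INDEX, k=2)
prompt = build_prompt("How does login work?", hits)
print(prompt)
```

Keeping the file path next to each numbered source is what lets the final answer point back to concrete files rather than paraphrased code.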