KnowledgeRAG

A modern document knowledge base with conversational AI capabilities, built using a Retrieval-Augmented Generation (RAG) architecture.

Features

  • 📄 Document ingestion (PDF)
  • 🔍 Semantic search across all documents
  • 💬 Conversational interface for natural queries
  • 🧠 Context-aware responses with citations
  • 🚀 FastAPI backend for high performance

Tech Stack

  • Backend: FastAPI, Python 3.11+
  • RAG Framework: LangChain
  • Vector Database: FAISS
  • LLM Integration: OpenAI API (configurable for other providers)
  • Frontend: Streamlit
  • Deployment: Docker, optionally with docker-compose
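
The pieces above fit together as a fairly standard RAG pipeline: PDFs are split into chunks, embedded, and stored in FAISS; at query time the most relevant chunks are retrieved and passed to the LLM as context for a cited answer. The sketch below is an illustrative outline of that flow, not the project's actual code: it assumes the split LangChain packages (langchain-openai, langchain-community, langchain-text-splitters), a placeholder PDF path, and an arbitrary OpenAI model choice, all of which may differ from what this repository uses.

    # Illustrative RAG pipeline sketch -- not the project's actual code.
    # Assumes: pip install langchain-openai langchain-community langchain-text-splitters pypdf faiss-cpu
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_community.vectorstores import FAISS
    from langchain_openai import OpenAIEmbeddings, ChatOpenAI

    # 1. Ingest: load a PDF (placeholder path) and split it into overlapping chunks.
    pages = PyPDFLoader("uploaded_files/example.pdf").load()
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(pages)

    # 2. Index: embed the chunks and store them in a FAISS vector index.
    vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

    # 3. Retrieve: find the chunks most similar to the user's question.
    question = "What is the main topic of this document?"
    context_docs = vectorstore.similarity_search(question, k=4)
    context = "\n\n".join(doc.page_content for doc in context_docs)

    # 4. Generate: answer using only the retrieved context, citing sources.
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # model choice is illustrative
    answer = llm.invoke(
        "Answer the question using only the context below, and cite the source pages.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    print(answer.content)
    print("Sources:", [doc.metadata.get("page") for doc in context_docs])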

Quick Start

  1. Clone the repository

    git clone https://github.com/gtraskas/knowledge-rag.git
    cd knowledge-rag
  2. Set up the Python environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Configure environment variables

    Create a .env file in the root directory and add the following variables (a quick way to verify they are loaded is sketched after these steps):

    OPENAI_API_KEY=your_openai_api_key
  4. Run the application

    bash run.sh
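
If the application fails to start because the API key is not found, a small sanity check like the one below can confirm that the .env file from step 3 is being read. This is a hypothetical helper, not part of the repository, and it assumes the python-dotenv package is installed; the project itself may load the variable differently.

    # check_env.py -- hypothetical helper, not part of the repository.
    # Assumes: pip install python-dotenv
    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads key=value pairs from a .env file in the current directory

    key = os.getenv("OPENAI_API_KEY")
    if key:
        print(f"OPENAI_API_KEY loaded (ends with ...{key[-4:]})")
    else:
        print("OPENAI_API_KEY is missing -- check the .env file in the project root.")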

Usage Instructions

Supported File Types

The application supports the following file types for document ingestion:

  • PDF

Uploading Files

  1. Navigate to the upload section of the application.
  2. Drag and drop your files or use the file picker to upload documents.
  3. The application will process and index the documents for semantic search.
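
The documented upload path is the UI flow above. If you prefer to script uploads against the FastAPI backend instead, the request usually looks like the sketch below; note that the endpoint path (/upload), the port, and the response shape are assumptions for illustration, so check the routes defined in main.py for the actual values.

    # Hypothetical scripted upload -- endpoint path, port, and response shape are assumptions.
    import requests

    BACKEND_URL = "http://localhost:8000"  # assumed FastAPI default port

    with open("report.pdf", "rb") as f:  # placeholder file name
        response = requests.post(
            f"{BACKEND_URL}/upload",  # assumed route; see main.py for the real one
            files={"file": ("report.pdf", f, "application/pdf")},
        )

    response.raise_for_status()
    print(response.json())  # response shape depends on the backend implementation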

Querying the Knowledge Base

  1. Use the conversational interface to ask natural language questions.
  2. Example queries:
    • "What is the main topic of the document titled 'Project Plan'?"
    • "Summarize the key points from the uploaded PDF."
    • "Find all references to 'machine learning' in the documents."
  3. The application will provide context-aware responses with citations to the source documents.
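
Queries can also be scripted against the backend rather than typed into the chat interface. As with uploads, the route name (/query) and the response fields below are illustrative assumptions; the actual API is defined in main.py.

    # Hypothetical scripted query -- route and response fields are assumptions.
    import requests

    BACKEND_URL = "http://localhost:8000"  # assumed FastAPI default port

    response = requests.post(
        f"{BACKEND_URL}/query",  # assumed route; see main.py for the real one
        json={"question": "Summarize the key points from the uploaded PDF."},
    )
    response.raise_for_status()

    payload = response.json()
    print(payload.get("answer"))   # answer text, if the backend returns one
    print(payload.get("sources"))  # citations to source documents, if returned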

Project Structure

knowledge-rag/
├── Dockerfile
├── frontend.py
├── index.html
├── LICENSE
├── main.py
├── README.md
├── requirements.txt
├── run.sh
├── TODO.md
├── static/
│   └── favicon.ico
└── uploaded_files/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Running with Docker

To run the application using Docker:

  1. Build and start the containers:

    docker-compose up --build
  2. Access the application in your browser once the containers are up.

  3. Stop the containers:

    docker-compose down
