
Simple RAG App with Next.js and LangChain

A simple Retrieval-Augmented Generation (RAG) application built with Next.js and LangChain that allows you to upload documents (PDF, text files) and query them using OpenAI's GPT models or local LLMs via LM Studio.

Features

  • 📄 Upload PDF and text documents
  • 🔍 Automatic document chunking and vectorization
  • 💬 Chat interface for querying documents
  • 🧠 Powered by OpenAI embeddings and GPT models, or local LLMs via LM Studio
  • 🎨 Clean, modern UI
  • 🔒 Run completely locally with LM Studio (including embeddings)

Prerequisites

  • Node.js 18+ and npm/yarn/pnpm
  • Option 1: An OpenAI API key (created in your OpenAI account)
  • Option 2: LM Studio for running LLMs and embeddings locally (free, fully local, no API key required)
  • Python 3.8+ (for ChromaDB server)
  • pip (Python package manager)

Setup

  1. Install dependencies:
npm install
# or
yarn install
# or
pnpm install
  2. Configure environment variables:

Create a .env.local file in the root directory:

cp example.env .env.local

Option 1: Using OpenAI API

Edit .env.local and add your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

Option 2: Using LM Studio (Local LLMs - Fully Local)

  1. Install and launch LM Studio
  2. Download models from the LM Studio interface:
    • A chat model (e.g., Llama 2 or Mistral) for generating responses
    • An embedding model for document embeddings (check LM Studio's supported embedding models)
  3. Start the local server in LM Studio (usually runs on http://localhost:1234)
  4. Configure .env.local:
OPENAI_API_KEY=lm-studio  # Can be any value, LM Studio doesn't validate this
OPENAI_BASE_URL=http://localhost:1234/v1
OPENAI_MODEL=openai/gpt-oss-20b  # Example: The chat model you loaded in LM Studio
OPENAI_EMBEDDINGS_MODEL=lmstudio-community/embeddinggemma-300m-qat-GGUF  # Example: The embedding model in LM Studio

Note: LM Studio supports OpenAI-compatible APIs for both chat and embeddings, allowing you to run the entire RAG pipeline locally without any external API calls.

See example.env for all available configuration options including optional ChromaDB settings.
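
For orientation, here is a minimal sketch of how the app's LangChain clients might be built from these variables so that the same code works against either OpenAI or LM Studio. The file name and helper functions are illustrative, not the project's actual code, and assume the @langchain/openai package:

// lib/llm.ts (illustrative): build LangChain clients from the env vars above.
// Because LM Studio exposes an OpenAI-compatible API, only the base URL differs.
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";

export function createChatModel() {
  return new ChatOpenAI({
    apiKey: process.env.OPENAI_API_KEY,            // "lm-studio" is fine for LM Studio
    model: process.env.OPENAI_MODEL,               // e.g. openai/gpt-oss-20b
    configuration: process.env.OPENAI_BASE_URL
      ? { baseURL: process.env.OPENAI_BASE_URL }   // e.g. http://localhost:1234/v1
      : undefined,
  });
}

export function createEmbeddings() {
  return new OpenAIEmbeddings({
    apiKey: process.env.OPENAI_API_KEY,
    model: process.env.OPENAI_EMBEDDINGS_MODEL,    // the embedding model loaded in LM Studio
    configuration: process.env.OPENAI_BASE_URL
      ? { baseURL: process.env.OPENAI_BASE_URL }
      : undefined,
  });
}

When OPENAI_BASE_URL is unset, both clients fall back to the official OpenAI endpoint, which is what makes Option 1 and Option 2 interchangeable.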

  3. Start ChromaDB server:

The application uses ChromaDB for persistent vector storage. You need to run a ChromaDB server locally.

Option 1: Using Python (Recommended)

Install ChromaDB Python package:

pip install chromadb

Start the ChromaDB server:

chroma run --path .chroma --port 8000

This starts the ChromaDB server on localhost:8000 and persists data in the .chroma directory.

Option 2: Using Docker

docker run -d -p 8000:8000 -v $(pwd)/.chroma:/chroma/chroma chromadb/chroma

Note: Make sure ChromaDB server is running before starting the Next.js application.
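
Once the server is up, a LangChain Chroma vector store can talk to it over HTTP. The snippet below is a minimal connection check, not the project's lib/vectorStore.ts; the collection name is a made-up example, and it assumes the @langchain/community and chromadb packages are installed:

import { Chroma } from "@langchain/community/vectorstores/chroma";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "@langchain/core/documents";

// Connect to the ChromaDB server started above (http://localhost:8000).
const vectorStore = new Chroma(new OpenAIEmbeddings(), {
  collectionName: "rag-documents",   // hypothetical collection name
  url: "http://localhost:8000",
});

// Smoke test: store one tiny document and retrieve it by similarity.
await vectorStore.addDocuments([
  new Document({ pageContent: "ChromaDB holds the embeddings for this RAG app.", metadata: { source: "smoke-test" } }),
]);
console.log(await vectorStore.similaritySearch("Where are the embeddings stored?", 1));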

  4. Run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
  5. Open your browser:

Navigate to http://localhost:3000

Usage

  1. Upload a document:

    • Click "Choose File" and select a PDF or text file (.pdf, .txt, .md)
    • Click "Upload" to process the document
    • Wait for the confirmation message
  2. Ask questions:

    • Type your question in the chat input
    • Press Enter or click "Send"
    • The app will search through your documents and provide an answer with sources
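
You can also drive the same workflow from a script instead of the browser by calling the two API routes directly. The request shapes below are assumptions (field names such as "file" and "message" may not match what app/api/upload/route.ts and app/api/chat/route.ts actually expect), so treat this as a sketch:

import { readFile } from "node:fs/promises";

// Hypothetical client for the upload and chat endpoints (Node 18+ has fetch, FormData, and Blob built in).
async function uploadDocument(path: string) {
  const form = new FormData();
  form.append("file", new Blob([await readFile(path)]), path);   // "file" field name is assumed
  const res = await fetch("http://localhost:3000/api/upload", { method: "POST", body: form });
  console.log(await res.json());
}

async function askQuestion(message: string) {
  const res = await fetch("http://localhost:3000/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),                           // "message" field name is assumed
  });
  console.log(await res.json());
}

await uploadDocument("./example.pdf");
await askQuestion("What is this document about?");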

How It Works

  1. Document Processing:

    • Documents are parsed (PDFs using pdf-parse, text files directly)
    • Text is split into chunks using LangChain's RecursiveCharacterTextSplitter
    • Each chunk is converted to embeddings using the configured embedding model (OpenAI or LM Studio)
    • Embeddings are stored in ChromaDB (persistent vector database)
  2. Query Processing:

    • User queries are converted to embeddings using the configured embedding model (OpenAI or LM Studio)
    • Similarity search finds the most relevant document chunks
    • Context is passed to the configured LLM (OpenAI or local via LM Studio) along with the query
    • Response is generated and displayed with source citations
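
The sketch below ties those two stages together in simplified form, under the same assumptions as the earlier snippets (OpenAI-compatible clients, the local ChromaDB server, and LangChain's splitter from @langchain/textsplitters); the real route handlers are more involved, but the flow is the same:

import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { Chroma } from "@langchain/community/vectorstores/chroma";

const llm = new ChatOpenAI({ model: process.env.OPENAI_MODEL });
const store = new Chroma(new OpenAIEmbeddings(), {
  collectionName: "rag-documents",   // hypothetical collection name
  url: "http://localhost:8000",
});

// Document processing: split raw text into overlapping chunks and store their embeddings.
export async function ingest(text: string, source: string) {
  const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
  const chunks = await splitter.createDocuments([text], [{ source }]);
  await store.addDocuments(chunks);
}

// Query processing: embed the question, retrieve the closest chunks, and ask the LLM with that context.
export async function ask(question: string) {
  const hits = await store.similaritySearch(question, 4);
  const context = hits.map((doc) => doc.pageContent).join("\n---\n");
  const answer = await llm.invoke(
    `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
  );
  return { answer: answer.content, sources: hits.map((doc) => doc.metadata.source) };
}

The chunk size and overlap shown here (1000/200) are illustrative defaults; the app's actual values may differ.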

Tech Stack

  • Next.js 14+ - React framework with App Router
  • LangChain - LLM orchestration framework
  • OpenAI - Embeddings and GPT models (or LM Studio for local LLMs)
  • LM Studio - Optional local LLM server with OpenAI-compatible API
  • ChromaDB - Persistent vector database for local storage
  • TypeScript - Type safety
  • pdf-parse - PDF document parsing

Project Structure

├── app/
│   ├── api/
│   │   ├── upload/route.ts     # Document upload endpoint
│   │   └── chat/route.ts       # Chat query endpoint
│   ├── layout.tsx              # Root layout
│   ├── page.tsx                # Main UI component
│   └── globals.css             # Global styles
├── lib/
│   ├── documents.ts            # Document processing utilities
│   └── vectorStore.ts          # Vector store management
├── package.json
├── tsconfig.json
└── README.md

Limitations

  • Requires a ChromaDB server to be running; it must be started separately before the Next.js application (see setup instructions above)
  • Maximum file size is limited by Next.js (configured to 10MB)
  • When using LM Studio: both the chat and embedding models must be loaded and the local server must be running before querying
  • LM Studio's embedding model must be compatible with OpenAI's embedding API format

License

MIT
