This is a multi-agent system that answers questions by routing each query to the most appropriate knowledge source: a vector database containing LangChain documentation, or Wikipedia for general-knowledge questions.
The system uses a LangGraph workflow to create a decision tree that:
- Receives a user question
- Routes it to the most appropriate data source
- Retrieves relevant information
- Returns the results
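The four steps above can be sketched without any framework. In the real project LangGraph wires these as graph nodes joined by a conditional edge; here a plain dict maps the router's answer to a node function, and every function body is an illustrative stand-in:

```python
def route_question(state: dict) -> str:
    # The real router asks an LLM; a keyword check stands in here.
    return "vectorstore" if "langchain" in state["question"].lower() else "wiki_search"

def retrieve(state: dict) -> dict:
    # Stand-in for querying the vector database.
    return {**state, "documents": ["<chunks from the vector database>"]}

def wiki_search(state: dict) -> dict:
    # Stand-in for querying Wikipedia.
    return {**state, "documents": ["<snippets from Wikipedia>"]}

NODES = {"vectorstore": retrieve, "wiki_search": wiki_search}

def answer(question: str) -> dict:
    state = {"question": question}   # receive a user question
    node = route_question(state)     # route to the appropriate source
    return NODES[node](state)        # retrieve and return the final state
```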
**Vector Database Agent**
- Purpose: Answers questions about LangChain concepts
- Implementation:
  - Uses Astra DB as a vector database
  - Documents from LangChain websites are split into 500-token chunks
  - Embeddings are created using HuggingFace's "all-MiniLM-L6-v2" model
  - Stored in Astra DB in a table named "multi_agent"
  - Retrieved using a retriever interface for semantic search
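The embed-store-retrieve pipeline can be sketched with toy stand-ins: a bag-of-words vector replaces the all-MiniLM-L6-v2 embedding and an in-memory list replaces the Astra DB table. Only the shape of the retrieval (similarity search over stored vectors) mirrors the real system:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real system uses all-MiniLM-L6-v2.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    def __init__(self):
        self.rows = []  # (vector, chunk) pairs; Astra DB stores these in a table

    def add(self, chunk: str):
        self.rows.append((embed(chunk), chunk))

    def as_retriever(self, k: int = 2):
        # Returns a callable that performs the top-k similarity search,
        # mirroring the retriever interface described above.
        def retrieve(query: str):
            qv = embed(query)
            ranked = sorted(self.rows, key=lambda r: cosine(qv, r[0]), reverse=True)
            return [chunk for _, chunk in ranked[:k]]
        return retrieve
```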
**Wikipedia Search Agent**
- Purpose: Answers general knowledge questions
- Implementation:
  - Uses `WikipediaAPIWrapper` to interface with the Wikipedia API
  - Configured to return the top 2 most relevant results
  - Results are limited to 200 characters per article for conciseness
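The two settings above (top 2 results, 200 characters each) amount to a slice and a truncation. In this sketch, `fetch_articles` is a hypothetical search function standing in for the real `WikipediaAPIWrapper` call:

```python
def wiki_search_sketch(query, fetch_articles, top_k=2, max_chars=200):
    # fetch_articles is a hypothetical stand-in for the Wikipedia API call.
    articles = fetch_articles(query)[:top_k]    # keep the top-k results
    return [a[:max_chars] for a in articles]    # truncate for conciseness
```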
**Router**
- Purpose: Determines which knowledge source to use for each question
- Implementation:
  - Uses a Groq LLM with the llama3-8b-8192 model
  - Structured output using Pydantic models ensures consistent decisions
  - System prompt instructs the LLM on routing logic:
    - LangChain questions → Vector Database
    - General knowledge questions → Wikipedia
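The structured-output contract can be shown with a dependency-free sketch; the project itself uses a Pydantic model for the same job. The point is that the router may answer with exactly one of the two node names, nothing else:

```python
from dataclasses import dataclass
from typing import Literal, get_args

DataSource = Literal["vectorstore", "wiki_search"]

@dataclass
class RouteQuery:
    datasource: str

    def __post_init__(self):
        # Pydantic would enforce this constraint automatically on parse.
        if self.datasource not in get_args(DataSource):
            raise ValueError(f"invalid datasource: {self.datasource!r}")
```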
- Question Input: User submits a question
- Routing: The `route_question` function:
  - Extracts the question from the state
  - Uses the `question_router` to determine the appropriate data source
  - Returns either "vectorstore" or "wiki_search" as the next node
- Retrieval:
  - If routed to "vectorstore": the `retrieve` function queries the vector database
  - If routed to "wiki_search": the `wiki_search` function queries Wikipedia
- Output: The retrieved documents are returned as the final state
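The routing step can be sketched on its own: it extracts the question from the state, asks the router for a structured decision, and returns the next node's name. `FakeRouter` is a stand-in for the Groq-backed structured-output router:

```python
class FakeRouter:
    # Stand-in for the Groq LLM; real decisions come from a model call.
    def invoke(self, question: str):
        class Decision:
            datasource = ("vectorstore" if "langchain" in question.lower()
                          else "wiki_search")
        return Decision()

def route_question(state: dict, question_router=FakeRouter()) -> str:
    question = state["question"]                  # extract from the state
    decision = question_router.invoke(question)   # structured routing decision
    return decision.datasource                    # "vectorstore" | "wiki_search"
```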
We use a retriever interface to interact with the vector database because:
- It provides a standardized way to query vector stores
- It abstracts away the underlying vector database implementation
- It allows for easy swapping of vector stores
- It handles the similarity search operations efficiently
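The swap-ability argument can be made concrete: any object exposing the same retrieval method can back the graph, so changing vector stores touches no calling code. Both retrievers below are illustrative stand-ins, and the method name is an assumption modeled on LangChain's retriever interface:

```python
from typing import Protocol

class Retriever(Protocol):
    def get_relevant_documents(self, query: str) -> list: ...

class AstraRetriever:
    # Stand-in; the real version runs a similarity search against Astra DB.
    def get_relevant_documents(self, query):
        return [f"astra hit for {query!r}"]

class InMemoryRetriever:
    def get_relevant_documents(self, query):
        return [f"local hit for {query!r}"]

def retrieve_node(state: dict, retriever: Retriever) -> dict:
    # The node never cares which store sits behind the interface.
    return {**state, "documents": retriever.get_relevant_documents(state["question"])}
```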
The system uses a `GraphState` TypedDict to manage and pass state between components:
- `question`: The user's original question
- `documents`: The retrieved documents
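The schema itself is small; field names below match the document, and the sample state is illustrative:

```python
from typing import List, TypedDict

class GraphState(TypedDict):
    question: str          # the user's original question
    documents: List[str]   # the retrieved documents

# Each node receives a GraphState and returns an updated one.
state: GraphState = {"question": "What is LangChain?", "documents": []}
```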
- Documents are chunked using token-based splitting rather than character-based
- This ensures consistency across different languages and text types
- The tiktoken encoder aligns with how LLMs process text
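Token-based splitting can be sketched with a whitespace tokenizer standing in for the tiktoken encoder; the real splitter counts tiktoken tokens against the 500-token chunk size described above:

```python
def split_by_tokens(text: str, chunk_size: int = 500) -> list:
    tokens = text.split()  # stand-in for tiktoken encoding
    return [" ".join(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), chunk_size)]
```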
