Open
Labels: enhancement (New feature or request)
📝 Description
Currently, our retriever fetches a fixed number (k=3) of documents, regardless of their actual relevance to the user's query. This can lead to the LLM receiving irrelevant context, which degrades the quality of the final answer and can cause hallucinations.
This issue aims to improve retrieval quality by implementing a relevance-based filtering mechanism. Instead of fetching the top-k documents, the retriever will only return documents that meet a specific similarity score threshold.
✅ Task List
1. Change Retrieval Method:
- Modify the retriever logic in `rag_service.py`. Instead of the default similarity search, use a method that returns documents along with their similarity scores (e.g., `similarity_search_with_score`).
2. Implement Score Thresholding:
- Filter the search results based on a `score_threshold`. Only documents with a similarity score better than the threshold will be passed to the LLM.
- Note: We need to determine the nature of the score. For L2 distance (Chroma's default), a lower score is better. For cosine similarity, a higher score is better.
3. Experiment to Find Optimal Threshold:
- Use the `test_retriever.py` script to experiment with different queries and `score_threshold` values (e.g., 0.4, 0.5, 0.6 for L2 distance) to find a good balance between filtering out irrelevant documents and keeping relevant ones.
4. Handle the "No Results" Case:
- Ensure the RAG chain handles cases where no documents meet the threshold (i.e., the context is an empty list).
- The system prompt should guide the LLM to give a polite "I don't know" or "I can't find information about that" response in this scenario.
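
A minimal sketch of tasks 2 and 4, assuming the search call returns `(document, score)` pairs as `similarity_search_with_score` does. The names `filter_by_score`, `render_prompt`, `lower_is_better`, and the prompt wording are hypothetical and not taken from `rag_service.py`. Plain strings stand in for documents so the example runs without a vector store, and "better than the threshold" is read here as inclusive of the threshold itself.

```python
from typing import List, Sequence, Tuple, TypeVar

Doc = TypeVar("Doc")

# Hypothetical prompt wording; the real system prompt may differ.
SYSTEM_PROMPT = (
    "Answer using only the context below. If the context is empty, "
    "say that you can't find information about that.\n\n"
    "Context:\n{context}"
)


def filter_by_score(
    scored_docs: Sequence[Tuple[Doc, float]],
    score_threshold: float,
    lower_is_better: bool = True,
) -> List[Doc]:
    """Keep documents whose score is at or better than the threshold.

    lower_is_better=True treats scores as distances (e.g. L2);
    lower_is_better=False treats them as similarities (e.g. cosine similarity).
    """
    if lower_is_better:
        return [doc for doc, score in scored_docs if score <= score_threshold]
    return [doc for doc, score in scored_docs if score >= score_threshold]


def render_prompt(docs: Sequence[str]) -> str:
    """Build the system prompt; an empty list yields an empty context section."""
    return SYSTEM_PROMPT.format(context="\n\n".join(docs))


# Made-up stand-in for the (document, score) pairs a call such as
# vectorstore.similarity_search_with_score(query, k=3) would return.
scored = [
    ("Refund policy text", 0.31),
    ("Shipping FAQ text", 0.48),
    ("Unrelated text", 0.92),
]

kept = filter_by_score(scored, score_threshold=0.5)
print(kept)  # ['Refund policy text', 'Shipping FAQ text']

# A stricter threshold leaves nothing, so the prompt carries an empty context.
nothing = filter_by_score(scored, score_threshold=0.1)
print(render_prompt(nothing))
```

Keeping the score direction as an explicit flag means the same filter still works if the collection is later switched from L2 distance to cosine similarity.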
🎯 Acceptance Criteria
- When a user asks a highly relevant question, the chatbot provides an accurate answer based on the retrieved context.
- When a user asks a completely irrelevant question (e.g., "what is the capital of France?"), the retriever should return no documents, and the chatbot should respond that it cannot answer based on the provided information, rather than attempting to hallucinate an answer.
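
The two criteria above could be checked while sweeping candidate thresholds, roughly as sketched below. Everything here is an assumption for illustration: `fake_search` replaces the real retriever, the queries' scores are invented L2 distances rather than measured values, and `meets_acceptance` is a hypothetical helper, not part of `test_retriever.py`.

```python
from typing import Callable, Dict, List, Tuple

ScoredDocs = List[Tuple[str, float]]

RELEVANT_QUERY = "How do I reset my password?"
IRRELEVANT_QUERY = "what is the capital of France?"


def fake_search(query: str) -> ScoredDocs:
    """Hypothetical retriever stub; the distances are invented, not measured."""
    canned: Dict[str, ScoredDocs] = {
        RELEVANT_QUERY: [("Password reset guide", 0.28), ("Account FAQ", 0.55)],
        IRRELEVANT_QUERY: [("Password reset guide", 0.97), ("Account FAQ", 1.04)],
    }
    return canned.get(query, [])


def retrieve(
    query: str,
    score_threshold: float,
    search: Callable[[str], ScoredDocs] = fake_search,
) -> List[str]:
    """Return documents whose L2 distance is at or below the threshold."""
    return [doc for doc, score in search(query) if score <= score_threshold]


def meets_acceptance(score_threshold: float) -> bool:
    """True when the relevant query keeps documents and the irrelevant one keeps none."""
    relevant_docs = retrieve(RELEVANT_QUERY, score_threshold)
    irrelevant_docs = retrieve(IRRELEVANT_QUERY, score_threshold)
    return len(relevant_docs) > 0 and len(irrelevant_docs) == 0


# Sweep a few candidate thresholds and report which ones satisfy both criteria.
for threshold in (0.2, 0.4, 0.5, 0.6, 1.0):
    print(threshold, meets_acceptance(threshold))
```

With real data, the stub would be swapped for the actual scored search, and the query lists would need more than one example per category before a threshold is chosen.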