[FEATURE] Add RAG-powered Chat with Quiz (multi-turn support)#522
mahek2016 wants to merge 11 commits into AOSSIE-Org:main from
Conversation
…jpeg) using OpenCV + Tesseract
… safety, cross-platform tesseract detection)
📝 Walkthrough
Adds a new RAGService (FAISS + SentenceTransformer + T5) and a /chat endpoint; introduces OCR-based image text extraction and safer file handling in FileProcessor; modernizes tokenizer padding/truncation usage in MCQ generation and paraphrasing calls.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Server as Server
    participant RAG as RAGService
    participant Embed as SentenceTransformer
    participant Index as FAISSIndex
    participant Gen as T5Generator
    User->>Server: POST /chat { document_text, question, chat_history? }
    Server->>RAG: index_text(document_text)
    RAG->>RAG: chunk_text(text, chunk_size=400, overlap=50)
    RAG->>Embed: embed(chunks)
    Embed-->>RAG: chunk embeddings
    RAG->>Index: build normalized FAISS index
    Index-->>RAG: index ready
    Server->>RAG: query(question, chat_history, top_k=3)
    RAG->>Embed: embed(question)
    Embed-->>RAG: question embedding
    RAG->>Index: search(top_k)
    Index-->>RAG: top_k chunks
    RAG->>RAG: assemble prompt (system + context + history + question)
    RAG->>Gen: generate(answer_prompt)
    Gen-->>RAG: decoded answer
    RAG-->>Server: answer
    Server-->>User: { answer, status: "success" }
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 7
🧹 Nitpick comments (2)
backend/server.py (1)
33-35: Update startup logging to match lazy initialization behavior. Line 34 says the RAG service is initialized, but Line 33 sets it to None until the first /chat request.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 33 - 35, the startup log incorrectly claims the RAG service is initialized while rag_service is set to None; update the message to reflect lazy initialization by either removing or changing the print("RAG SERVICE INITIALIZED SUCCESSFULLY") to a message like "RAG service will be initialized on first /chat request", or move the success log into the actual initialization code path that creates rag_service (the handler for the /chat endpoint) so the success message is emitted only when rag_service is assigned.

backend/Generator/rag.py (1)
96-96: Drop the unused FAISS distance binding. distances is never consumed; replacing it with _ keeps intent clear.

Proposed fix

```diff
- distances, indices = self.index.search(question_embedding, top_k)
+ _, indices = self.index.search(question_embedding, top_k)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/Generator/rag.py` at line 96, Replace the unused FAISS distance binding by discarding it: in the call to self.index.search(question_embedding, top_k) ignore the first returned value (currently assigned to distances) and only capture indices (e.g., use _ , indices = self.index.search(...)); update the assignment where distances is currently declared so intent is clear and no unused variable remains.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/Generator/main.py`:
- Around line 4-5: There are duplicate imports of T5ForConditionalGeneration and
T5Tokenizer causing a lint (Ruff F811); update the import statement that
currently lists AutoModelForSequenceClassification, AutoTokenizer,
AutoModelForSeq2SeqLM, T5ForConditionalGeneration, T5Tokenizer to remove the
duplicated T5ForConditionalGeneration and T5Tokenizer symbols so only one import
of those classes remains (keep AutoModelForSequenceClassification,
AutoTokenizer, AutoModelForSeq2SeqLM in that statement and rely on the separate
T5 import line).
- Around line 406-413: process_file currently trusts file.filename when building
file_path which allows path traversal; before constructing file_path (and before
calling file.save) sanitize the uploaded name (e.g., use a safe helper like
werkzeug.utils.secure_filename or at minimum os.path.basename) and then join
with self.upload_folder; additionally resolve and validate the final absolute
path is inside self.upload_folder (compare
os.path.abspath(file_path).startswith(os.path.abspath(self.upload_folder))) and
reject or raise if the check fails to prevent ../ traversal (update references
in process_file, file_path, upload_folder, file.save, and filename).
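The sanitization the comment describes could look like the following sketch, using only the standard library (werkzeug.utils.secure_filename would be a stricter drop-in; the helper name and folder layout here are illustrative, not the PR's actual code):

```python
import os

def sanitize_upload_path(filename, upload_folder):
    """Neutralize path-traversal attempts in an uploaded filename.

    Strips any directory components with os.path.basename, then verifies
    the resolved path still lives inside upload_folder.
    """
    safe_name = os.path.basename(filename)
    if not safe_name or safe_name in (".", ".."):
        raise ValueError("invalid filename")
    file_path = os.path.abspath(os.path.join(upload_folder, safe_name))
    root = os.path.abspath(upload_folder)
    # Compare against root + separator so "/uploads_evil" cannot pass as "/uploads"
    if not file_path.startswith(root + os.sep):
        raise ValueError("path escapes upload folder")
    return file_path
```

With this in place, `../../etc/passwd` collapses to a plain `passwd` inside the upload folder instead of escaping it.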
In `@backend/Generator/rag.py`:
- Around line 31-36: The chunk_text method currently computes the range step as
(chunk_size - overlap) which can be zero or negative if overlap >= chunk_size;
add a parameter validation at the start of chunk_text to raise a ValueError (or
adjust behavior) when overlap is not less than chunk_size and when chunk_size <=
0 or overlap < 0; update the function (chunk_text) to either normalize the
values or explicitly raise with a clear message referencing chunk_size and
overlap before using them to compute the range step so the loop for i in
range(0, len(words), chunk_size - overlap) never receives a non-positive step.
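A minimal sketch of chunk_text with the validation the comment asks for (defaults match the values in the walkthrough; the exact body of the PR's method may differ):

```python
def chunk_text(text, chunk_size=400, overlap=50):
    """Split text into word chunks of chunk_size with overlap words shared
    between consecutive chunks. Raises ValueError instead of letting range()
    receive a non-positive step."""
    if chunk_size <= 0:
        raise ValueError(f"chunk_size must be positive, got {chunk_size}")
    if overlap < 0 or overlap >= chunk_size:
        raise ValueError(
            f"overlap must satisfy 0 <= overlap < chunk_size ({chunk_size}), got {overlap}"
        )
    words = text.split()
    step = chunk_size - overlap  # guaranteed positive by the checks above
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]
```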
- Around line 47-74: The update to self.current_text is happening before
embedding/index creation succeeds, which can leave the object in a stale state;
change the function to build everything into temporary variables first (e.g.,
temp_chunks = self.chunk_text(text), temp_embeddings =
self.embedder.encode(...), normalize and cast temp_embeddings, create temp_index
= faiss.IndexFlatIP(...); temp_index.add(...)) and only after all steps succeed
assign self.text_chunks = temp_chunks, self.index = temp_index, self.dimension =
temp_embeddings.shape[1], and finally self.current_text = text; also wrap the
embedding/index creation in a try/except to avoid partial assignments on failure
(references: self.current_text, chunk_text, embedder.encode, faiss.normalize_L2,
faiss.IndexFlatIP, self.index, self.dimension).
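The commit-only-on-success pattern described above can be sketched without the heavy dependencies; here embed_fn and build_index_fn stand in for SentenceTransformer.encode and the FAISS IndexFlatIP construction (the class and parameter names are illustrative):

```python
class RAGIndexState:
    """Build all artifacts into temporaries, assign instance state last,
    so a failure mid-way never leaves the object half-updated."""

    def __init__(self, embed_fn, build_index_fn, chunk_fn):
        self.embed_fn = embed_fn
        self.build_index_fn = build_index_fn
        self.chunk_fn = chunk_fn
        self.current_text = None
        self.text_chunks = []
        self.index = None

    def index_text(self, text):
        # Build everything into temporaries first ...
        temp_chunks = self.chunk_fn(text)
        temp_embeddings = self.embed_fn(temp_chunks)        # may raise
        temp_index = self.build_index_fn(temp_embeddings)   # may raise
        # ... and only commit once every step has succeeded.
        self.text_chunks = temp_chunks
        self.index = temp_index
        self.current_text = text
```

If encode or index construction throws, current_text still names the last successfully indexed document.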
In `@backend/server.py`:
- Around line 39-48: Several service objects (MCQGen, AnswerPredictor/answer,
BoolQGen, ShortQGen, qg/QuestionGenerator, docs_service/GoogleDocsService,
qa_model) were commented out but remain referenced by route handlers, causing
NameError; either restore their initializations (uncomment and ensure proper
imports/credentials like SERVICE_ACCOUNT_FILE/SCOPES for GoogleDocsService and
model pipeline call for qa_model) or add guards in each dependent handler to
check for None/availability and return a 503/meaningful error. Locate the
commented declarations (MCQGen, answer, BoolQGen, ShortQGen, qg, docs_service,
qa_model) and either re-enable their creation with proper error
handling/logging, or wrap routes that use methods on these instances with
existence checks (e.g., if not MCQGen: return error) to prevent runtime
NameError.
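The guard option could be sketched like this; the handler and service names are hypothetical stand-ins for the routes and the commented-out MCQGen instance, and plain tuples stand in for Flask's (jsonify(...), status) responses:

```python
def require_service(service, name):
    """Return a 503 payload when a lazily-disabled service is None,
    instead of letting a route handler hit a NameError."""
    if service is None:
        return {"error": f"{name} is not available"}, 503
    return None

def handle_mcq(mcq_gen, payload):
    # Hypothetical handler showing the existence check before use.
    guard = require_service(mcq_gen, "MCQGen")
    if guard is not None:
        return guard
    return {"questions": mcq_gen.generate(payload)}, 200
```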
- Around line 499-520: Concurrent requests can interleave calls to the shared
rag_service (created from RAGService) causing wrong document state; fix by
adding a synchronization primitive (e.g., a global threading.Lock named
rag_lock) and acquire it around initialization and the pair of operations that
must be atomic: rag_service.index_text(...) and rag_service.query(...); ensure
you also protect the RAGService instantiation (the rag_service = RAGService()
block) with the same lock so initialization and per-request index/query are
serialized, and release the lock in a finally block or via a context manager to
avoid deadlocks.
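The locking pattern above might look like the following sketch; the minimal RAGService here is a stand-in (the real one wraps FAISS + SentenceTransformer + T5), and the `with` block releases the lock on every exit path:

```python
import threading

rag_lock = threading.Lock()
rag_service = None

class RAGService:
    # Minimal stand-in so the locking pattern is runnable.
    def __init__(self):
        self.doc = None
    def index_text(self, text):
        self.doc = text
    def query(self, question, chat_history, top_k=3):
        return f"answer about: {self.doc}"

def chat(document_text, question, chat_history):
    """Serialize initialization plus the index/query pair so concurrent
    requests cannot interleave document state on the shared service."""
    global rag_service
    with rag_lock:  # context manager releases even if index_text/query raise
        if rag_service is None:
            rag_service = RAGService()
        rag_service.index_text(document_text)
        return rag_service.query(question, chat_history)
```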
- Around line 505-528: The exception handler in the /chat endpoint returns raw
exception text (str(e)) to clients; change it to log the full exception
server-side (including stack trace) and return a safe, generic JSON error
message to the client (e.g., {"error":"Internal server error"}) with a 500
status; keep the existing input validation that returns 400 for missing
document_text/question, and consider catching and mapping known/expected
exceptions from rag_service.index_text or rag_service.query to appropriate 4xx
responses, but never expose internal exception strings in the response.
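The safe-error shape described above can be sketched as follows; run_query stands in for the rag_service.index_text/query calls, and plain tuples stand in for Flask responses:

```python
import logging

logger = logging.getLogger("chat")

def safe_chat_handler(run_query):
    """Log the full traceback server-side; return only generic text to clients."""
    try:
        return {"answer": run_query()}, 200
    except ValueError as e:
        # Known/expected input problems map to a client error.
        return {"error": str(e)}, 400
    except Exception:
        # Stack trace goes to the server log only, never into the response.
        logger.exception("unhandled error in /chat")
        return {"error": "Internal server error"}, 500
```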
---
Nitpick comments:
In `@backend/Generator/rag.py`:
- Line 96: Replace the unused FAISS distance binding by discarding it: in the
call to self.index.search(question_embedding, top_k) ignore the first returned
value (currently assigned to distances) and only capture indices (e.g., use _ ,
indices = self.index.search(...)); update the assignment where distances is
currently declared so intent is clear and no unused variable remains.
In `@backend/server.py`:
- Around line 33-35: The startup log incorrectly claims the RAG service is
initialized while rag_service is set to None; update the message to reflect lazy
initialization by either removing or changing the print("RAG SERVICE INITIALIZED
SUCCESSFULLY") to a message like "RAG service will be initialized on first /chat
request" or move the success log into the actual initialization code path that
creates rag_service (the handler for the /chat endpoint) so the success message
is emitted only when rag_service is assigned.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/Generator/main.py
backend/Generator/mcq.py
backend/Generator/rag.py
backend/server.py
Actionable comments posted: 3
♻️ Duplicate comments (1)
backend/server.py (1)
507-512: ⚠️ Potential issue | 🟠 Major
Validate request JSON shape in /chat before field access. Malformed or non-object JSON currently falls into a 500 path instead of a client 400.

Proposed fix

```diff
- data = request.get_json()
+ data = request.get_json(silent=True) or {}
+ if not isinstance(data, dict):
+     return jsonify({"error": "Invalid JSON body"}), 400
  document_text = data.get("document_text")
  question = data.get("question")
  chat_history = data.get("chat_history", [])
+
+ if not isinstance(chat_history, list):
+     return jsonify({"error": "chat_history must be a list"}), 400
+ if any(not isinstance(turn, dict) for turn in chat_history):
+     return jsonify({"error": "chat_history entries must be objects"}), 400
```

Also applies to: 532-534
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 507 - 512, The handler reads request.get_json() and directly accesses data.get(...), causing 500s on malformed or non-object JSON; update the /chat request handling to validate that request.get_json() returned a dict (and not None) before field access, returning a 400 client error for invalid JSON; then validate required fields document_text and question are present and are strings, and ensure chat_history if provided is a list (or default to an empty list) before using it; apply the same validation pattern to the other spot where request.get_json() is used (the block that also accesses document_text/question/chat_history).
🧹 Nitpick comments (1)
backend/server.py (1)
520-526: Consider reducing lock scope for better chat throughput. The current global lock includes retrieval + generation, so one slow request blocks all users. A per-document index cache (keyed by content hash) plus narrower write locking would scale better.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 520 - 526, The current global rag_lock around rag_service creation, index_text and query blocks all requests; narrow the lock to only create or update per-document indexes by introducing a per-document cache keyed by a content hash (use document_text hash) and protect that cache with a short-lived lock instead of rag_lock; keep RAGService.query(chat_history, question) outside the lock so concurrent queries run in parallel; modify code that references rag_service, RAGService, index_text and query to check the per-document cache, acquire the lock only when inserting a new RAGService or calling index_text for that document, and then release before calling query.
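The per-document cache could be sketched like this; make_service stands in for constructing a RAGService and calling index_text on it (the helper name is illustrative):

```python
import hashlib
import threading

_index_cache = {}          # content hash -> service already indexed for that document
_cache_lock = threading.Lock()

def get_service_for(document_text, make_service):
    """Hold the lock only while inserting a new per-document index;
    callers run query() on the returned service outside the lock."""
    key = hashlib.sha256(document_text.encode("utf-8")).hexdigest()
    with _cache_lock:
        service = _index_cache.get(key)
        if service is None:
            service = make_service(document_text)  # index under the lock
            _index_cache[key] = service
    return service                                 # query outside the lock
```

Queries on an already-indexed document never contend for the lock beyond one dictionary lookup.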
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/Generator/main.py`:
- Around line 393-428: The new extract_text_from_image function is implemented
but never called from the upload pipeline; update process_file to detect image
uploads (by MIME type or file extension) and call
extract_text_from_image(file_path) when an uploaded file is an image, then
attach the returned OCR text (e.g., into the file metadata, a text field, or the
same return structure process_file uses) so image uploads produce extracted
text; ensure errors from extract_text_from_image are handled the same way other
extraction failures are handled in process_file.
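The dispatch the comment asks for might look like this sketch; the extract_*_fn callables stand in for the FileProcessor methods, and the extension set is an assumption rather than the PR's actual list:

```python
import os

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tiff"}

def extract_text(file_path, extract_image_fn, extract_pdf_fn, read_txt_fn):
    """Route image uploads to the OCR path (OpenCV + Tesseract in the PR),
    everything else to the existing extractors."""
    ext = os.path.splitext(file_path)[1].lower()
    if ext in IMAGE_EXTENSIONS:
        return extract_image_fn(file_path)
    if ext == ".pdf":
        return extract_pdf_fn(file_path)
    return read_txt_fn(file_path)
```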
- Line 2: Remove the conflicting import of safe_name from pkg_resources and rely
on the local safe_name used in process_file; specifically delete the line
importing safe_name (the top-level "from pkg_resources import safe_name") so it
no longer collides with the local variable named safe_name in the function
process_file, then run linters to confirm Ruff F811 is resolved.
- Around line 443-460: The uploaded temp file saved via file.save(file_path) is
never removed; update the function to delete file_path after extraction (whether
success or error) by moving the extraction logic into a try/except and adding a
finally block that calls os.remove(file_path) (or pathlib.Path.unlink() with an
exists check) so both branches that call self.extract_text_from_pdf and
self.extract_text_from_docx (and the .txt read branch) always remove the temp
file; ensure removal itself is wrapped to ignore FileNotFoundError and does not
mask the original extraction exception.
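A small sketch of the try/finally cleanup described above; extract_fn stands in for the per-format extraction branches:

```python
import os

def process_upload(file_path, extract_fn):
    """Remove the temp file whether extraction succeeds or raises,
    without masking the original extraction exception."""
    try:
        return extract_fn(file_path)
    finally:
        try:
            os.remove(file_path)
        except FileNotFoundError:
            pass  # already gone; nothing to clean up
```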
---
Duplicate comments:
In `@backend/server.py`:
- Around line 507-512: The handler reads request.get_json() and directly
accesses data.get(...), causing 500s on malformed or non-object JSON; update the
/chat request handling to validate that request.get_json() returned a dict (and
not None) before field access, returning a 400 client error for invalid JSON;
then validate required fields document_text and question are present and are
strings, and ensure chat_history if provided is a list (or default to an empty
list) before using it; apply the same validation pattern to the other spot where
request.get_json() is used (the block that also accesses
document_text/question/chat_history).
---
Nitpick comments:
In `@backend/server.py`:
- Around line 520-526: The current global rag_lock around rag_service creation,
index_text and query blocks all requests; narrow the lock to only create or
update per-document indexes by introducing a per-document cache keyed by a
content hash (use document_text hash) and protect that cache with a short-lived
lock instead of rag_lock; keep RAGService.query(chat_history, question) outside
the lock so concurrent queries run in parallel; modify code that references
rag_service, RAGService, index_text and query to check the per-document cache,
acquire the lock only when inserting a new RAGService or calling index_text for
that document, and then release before calling query.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/Generator/main.py`:
- Around line 433-443: The code currently writes uploads directly to file_path
built from safe_name (in Generator.main.py using self.upload_folder, safe_name,
file_path and file.save), which allows concurrent requests with the same
filename to overwrite each other; change the upload flow to generate a unique
temp filename (e.g., include a UUID or use tempfile to create a unique name in
self.upload_folder), write to that unique path instead of file_path, then pass
that unique path to the downstream extraction logic and cleanup; ensure
references to file_path/file.save are updated to use the unique filename and
that any function expecting safe_name or file_path (e.g., the extraction caller)
receives the generated unique path.
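The unique-name generation could be as simple as this sketch (the helper name is illustrative; tempfile.NamedTemporaryFile in self.upload_folder would be an alternative):

```python
import os
import uuid

def unique_upload_path(upload_folder, original_name):
    """Prefix a UUID so two concurrent uploads of report.pdf
    can never write to the same path."""
    safe_name = os.path.basename(original_name)
    return os.path.join(upload_folder, f"{uuid.uuid4().hex}_{safe_name}")
```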
- Around line 61-69: The crash occurs because is_word_available in mcq.py calls
s2v_model.get_best_sense without handling s2v_model=None; update
is_word_available to check if s2v_model is None before calling get_best_sense
(e.g., skip sense2vec-based filtering and return the non-sense2vec availability
result or True/appropriate fallback) so identify_keywords can safely pass
s2v_model=None; reference the symbols is_word_available, s2v_model, and
get_best_sense (and identify_keywords which passes s2v_model) when making the
change.
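The guard could be sketched as follows; the fallback of treating a word as available when no sense2vec model is loaded is one reasonable choice (the real mcq.py may combine this with other availability checks):

```python
def is_word_available(word, s2v_model):
    """Skip sense2vec-based filtering when the model was not loaded,
    instead of crashing on s2v_model.get_best_sense."""
    if s2v_model is None:
        return True  # no model to consult: fall back to "available"
    return s2v_model.get_best_sense(word) is not None
```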
Description
Fixes #508
This PR introduces an AI-powered “Chat with Quiz” feature using a Retrieval-Augmented Generation (RAG) pipeline.
The feature enables users to ask follow-up questions about quiz source content, such as:
Implementation Details
- SentenceTransformer (all-MiniLM-L6-v2) for embeddings
- FAISS (IndexFlatIP with L2 normalization)

Scope
This PR implements:
PDF-based chat and persistent storage are intentionally excluded and can be introduced in a separate PR to keep changes scoped and reviewable.
📸 Screenshots / Recordings
Not applicable.
This PR introduces backend functionality (RAG-based conversational layer) and does not modify the UI.
Testing
Checklist
🤖 AI Usage Disclosure
Summary by CodeRabbit
New Features
Bug Fixes / Improvements
Chores