Refactor: Implement Singleton ModelManager to fix VRAM memory leak#521
Siddhazntx wants to merge 5 commits into AOSSIE-Org:main
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
📝 Walkthrough

Introduces a ModelManager singleton that lazily loads and shares heavy ML models and NLP tools across generators; refactors MCQGenerator, ShortQGenerator, ParaphraseGenerator, and BoolQGenerator to use the shared resources. AnswerPredictor now moves its NLI model to the detected device, sets eval mode, and runs boolean prediction without grad. The test endpoint is renamed.
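The walkthrough's core idea — load each heavy resource once and hand the same reference to every generator — can be sketched as below. This is a minimal, single-threaded illustration: the `get` helper, the resource names, and the loader callables are assumptions for the sketch, not the project's actual API.

```python
class ModelManager:
    """Process-wide cache of heavy, shareable resources (sketch).

    Each generator (MCQGenerator, ShortQGenerator, ...) asks this
    manager for the tokenizer, model, spaCy pipeline, sense2vec
    vectors, etc., so every resource is loaded at most once per
    process. Locking is omitted here for clarity.
    """
    _shared = {}

    @classmethod
    def get(cls, name, loader):
        # Lazy load: the (possibly expensive) loader runs only on
        # the first request for a given resource name.
        if name not in cls._shared:
            cls._shared[name] = loader()
        return cls._shared[name]


# Two generators requesting "nlp" receive the very same object,
# and the loader runs exactly once.
load_calls = []

def fake_nlp_loader():
    load_calls.append(1)            # stands in for e.g. spacy.load(...)
    return object()

nlp_a = ModelManager.get("nlp", fake_nlp_loader)
nlp_b = ModelManager.get("nlp", fake_nlp_loader)
```

In the real refactor the loaders would wrap the `transformers`/`spacy`/sense2vec construction calls; the dictionary keys here are illustrative.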
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ShortQGenerator
    participant ModelManager
    participant NLIModel
    Client->>ShortQGenerator: POST /get_shortq_answer (input)
    ShortQGenerator->>ModelManager: request shared resources (tokenizer, model, device, nlp, s2v, fdist)
    ModelManager-->>ShortQGenerator: return shared references (lazily loaded if needed)
    ShortQGenerator->>ShortQGenerator: prepare inputs (tokenize, preprocess, move to device)
    ShortQGenerator->>NLIModel: call AnswerPredictor.predict_boolean_answer (inputs on device, no-grad, eval)
    NLIModel-->>ShortQGenerator: boolean prediction
    ShortQGenerator-->>Client: response (answer)
```
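The AnswerPredictor step in the diagram — model on device, eval mode, inference without grad — follows a standard PyTorch pattern. A hedged sketch (the `predict_boolean` name and the plain-tensor signature are illustrative; the real predictor operates on a Hugging Face NLI model):

```python
import torch

def predict_boolean(model, input_ids, device):
    """Run one inference pass without building an autograd graph.

    Moving inputs to the model's device and wrapping the forward
    pass in torch.no_grad() prevents activations from being kept
    for backprop — a common source of creeping VRAM usage.
    """
    model.to(device).eval()          # weights on device; dropout/batchnorm frozen
    with torch.no_grad():            # no gradient bookkeeping during inference
        logits = model(input_ids.to(device))
    return bool(logits.argmax(dim=-1).item())
```

With the actual NLI model the logits would come from the model output object rather than a raw tensor, but the device/eval/no-grad shape is the same.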
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/test_server.py (1)
77-77: Align test naming/log label with the new endpoint. The endpoint update is correct, but keeping the old `test_get_answer` naming/output text is misleading during debugging.

✏️ Suggested cleanup

```diff
-def test_get_answer():
+def test_get_shortq_answer():
     endpoint = '/get_shortq_answer'
@@
-    print(f'/get_answer Response: {response}')
+    print(f'/get_shortq_answer Response: {response}')
@@
-    test_get_answer()
+    test_get_shortq_answer()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/test_server.py` at line 77, The test still uses the old name/label "test_get_answer" while the endpoint variable was changed to '/get_shortq_answer'; rename the test function and any log/assert messages to match the new endpoint (e.g., change function name test_get_answer -> test_get_shortq_answer and update any printed/logged strings or pytest ids), and update any references to the endpoint in the test body that mention the old name so test output and labels reflect endpoint variable endpoint = '/get_shortq_answer'.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/Generator/main.py`:
- Around line 34-57: The ModelManager singleton is not thread-safe: both __new__
and __init__ use unsynchronized check-then-act on _instance and _is_initialized
causing multiple heavy initializations; fix by adding a class-level
threading.Lock (e.g., _init_lock) and use it to guard creation/initialization
with double-checked locking — acquire the lock in __new__ around the _instance
check before calling super().__new__, and/or acquire it at start of __init__ and
re-check _is_initialized inside the lock before loading models (update
references to ModelManager._instance and self._is_initialized accordingly) so
only one thread performs the heavy model loads.
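The fix this comment describes can be sketched as a minimal double-checked-locking singleton. Only the locking shape mirrors the suggestion; the `_load_models` body and the `load_count` counter are placeholders for the real model-loading code.

```python
import threading

class ModelManager:
    """Singleton with double-checked locking (sketch).

    The lock-free check keeps the common path cheap once the
    instance exists; the re-check inside the lock guarantees
    that only one thread performs the heavy initialization.
    """
    _instance = None
    _init_lock = threading.Lock()
    load_count = 0                         # test hook; not needed in real code

    def __new__(cls):
        if cls._instance is None:          # fast path, no lock
            with cls._init_lock:
                if cls._instance is None:  # re-check under the lock
                    cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every ModelManager() call, so guard it too.
        if getattr(self, "_is_initialized", False):
            return
        with self._init_lock:
            if getattr(self, "_is_initialized", False):
                return                     # another thread finished first
            self._load_models()
            self._is_initialized = True

    def _load_models(self):
        # Placeholder for the expensive transformers / spaCy loads.
        type(self).load_count += 1
```

Even when many threads construct a `ModelManager` concurrently, the expensive load runs exactly once and every caller receives the same instance.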
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/Generator/main.py`:
- Line 25: Remove the duplicate import of the threading module: keep a single
`import threading` statement and delete the redundant one. Locate both
occurrences of the import in main.py and remove the second so the module is
imported only once, resolving the Ruff F811 redefinition warning.
@coderabbitai, I have addressed both of these findings in the latest commit.
Thanks for catching these!
✅ Thread-Safety: Confirmed that the singleton initialization is now guarded by a lock. ✅ Test Naming: Verified that the test function and log labels now match the new endpoint. Both issues are now fully addressed. Great job on the quick turnaround and thorough implementation! 🎉
Addressed Issues:
Fixes #520
Screenshots/Recordings:
Server Startup Output (screenshot)
Additional Notes:
Summary of Changes:
This is strictly an internal architectural improvement; the external API behavior remains completely unchanged.
Summary by CodeRabbit

- Refactor
- Chores