[Backend] Fix device mismatch for NLI model in AnswerPredictor #441
Siddhazntx wants to merge 2 commits into AOSSIE-Org:main
Conversation
📝 Walkthrough
Explicitly move NLI models to the detected device and set them to eval(); ensure input tensors are moved to the model device and inference runs under torch.no_grad(), preventing device mismatches during prediction.
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/Generator/main.py (2)
253-260: ⚠️ Potential issue | 🟠 Major

Add .eval() to the NLI model after moving it to the device.

The .to(self.device) fix is correct. However, self.nli_model is never put into eval mode. Every other inference model in this file calls .eval() immediately after .to(self.device) (see self.qg_model.eval() at line 418 and self.qae_model.eval() at line 726). Without it, dropout layers remain active during predict_boolean_answer, producing non-deterministic NLI results.

🛠️ Proposed fix

```diff
 self.nli_model = AutoModelForSequenceClassification.from_pretrained(self.nli_model_name)
 # Explicitly push the NLI model to the detected hardware (GPU or CPU)
 self.nli_model.to(self.device)
+self.nli_model.eval()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/Generator/main.py` around lines 253-260: the NLI model is moved to the device but not set to eval mode, causing nondeterministic behavior (dropout active) during predict_boolean_answer; after the existing self.nli_model.to(self.device) call, call self.nli_model.eval() to mirror how other inference models (self.qg_model.eval(), self.qae_model.eval()) are handled so the NLI model runs deterministically in inference.
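The device-plus-eval pattern the review asks for can be sketched with a small stand-in module. TinyClassifier below is hypothetical; the real code loads AutoModelForSequenceClassification.from_pretrained(...) instead, but the effect of .eval() on dropout is the same:

```python
import torch
from torch import nn

# Hypothetical stand-in for the NLI classifier, so the effect of
# .eval() on dropout is easy to observe without downloading weights.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.dropout = nn.Dropout(p=0.5)
        self.linear = nn.Linear(4, 3)

    def forward(self, x):
        return self.linear(self.dropout(x))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyClassifier()
model.to(device)   # push weights to the detected hardware
model.eval()       # disable dropout so inference is deterministic

x = torch.ones(1, 4, device=device)
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)
# In eval mode, repeated passes over the same input are identical.
assert torch.equal(out1, out2)
```

In train mode the two passes would generally differ whenever dropout zeroes different elements, which is exactly the non-determinism the review flags.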
296-323: 🛠️ Refactor suggestion | 🟠 Major

Missing @torch.no_grad() on predict_boolean_answer.

Every other inference method in this file uses @torch.no_grad() (see _generate_question at line 646 and _evaluate_qa at line 774). The NLI forward pass at line 309 will unnecessarily compute and retain gradient tensors, wasting GPU memory, which is especially relevant now that GPU execution is the target of this fix.

♻️ Proposed fix

```diff
+@torch.no_grad()
 def predict_boolean_answer(self, payload):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/Generator/main.py` around lines 296-323: the predict_boolean_answer method is missing the `@torch.no_grad()` decorator, so the NLI forward pass (nli_model(**inputs)) will compute and retain gradients; add `@torch.no_grad()` above def predict_boolean_answer(self, payload) to disable gradient tracking for the entire method, keeping the existing device handling (inputs = {key: value.to(self.device) ...} and torch.cuda.empty_cache()) unchanged so inference runs without allocating unnecessary GPU gradients.
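The effect of `@torch.no_grad()` is visible directly on the output tensors; a minimal sketch:

```python
import torch
from torch import nn

model = nn.Linear(4, 3)
model.eval()
x = torch.ones(2, 4)

# Without no_grad, the forward pass records an autograd graph and the
# output carries a grad_fn, keeping intermediate tensors alive.
y_tracked = model(x)
assert y_tracked.requires_grad

# Under no_grad (the @torch.no_grad() decorator is equivalent for a
# whole method), no graph is built, so inference allocates no
# gradient bookkeeping.
with torch.no_grad():
    y_free = model(x)
assert not y_free.requires_grad
```

On a GPU, the retained graph of the untracked version is what wastes memory during pure inference.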
@coderabbitai I've applied the fixes in main.py.
Verified locally with test_server.py; boolean predictions are working as expected.
✅ Line 254: inputs = {key: value.to(self.device) for key, value in inputs.items()}
These are all solid PyTorch practices for device-safe inference. Great job testing locally with test_server.py.
Addressed Issues:
Closes #442
The NLI model in AnswerPredictor was not explicitly moved to the detected device (CPU/GPU), and input tensors were not aligned with the model device.
Screenshots/Recordings:
N/A - This is a backend architectural fix.
Additional Notes:
The Issue:
While reviewing the backend model loading, I noticed that the distilbert-base-uncased-mnli model in the AnswerPredictor class wasn't being pushed to the hardware device during initialization. Additionally, its input tensors were defaulting to the CPU during prediction.
The fix:
Added .to(self.device) to both the NLI model initialization and the input tensors. This ensures the model actually utilizes the GPU when available and prevents potential PyTorch tensor mismatch crashes (RuntimeError: Expected all tensors to be on the same device).
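A minimal sketch of the mismatch and the fix described above, with nn.Linear standing in for the NLI model (tokenizer outputs land on CPU by default):

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 3).to(device)

# Tokenizers return CPU tensors; on a GPU machine, passing them to a
# CUDA model raises "Expected all tensors to be on the same device".
inputs = {"input_ids": torch.ones(1, 4)}

# The fix: move every tensor in the batch dict onto the model's device.
inputs = {key: value.to(device) for key, value in inputs.items()}
logits = model(inputs["input_ids"])  # devices now agree
assert logits.device.type == device.type
```

On a CPU-only machine both sides are already on `cpu`, so the comprehension is a harmless no-op; the fix only changes behavior when a GPU is detected.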
Note on Testing:
I successfully tested the device synchronization locally and ran the official test_server.py suite. All generation endpoints pass successfully with my fix. During testing, I observed a pre-existing failure in the test_server.py suite on the current main branch that is unrelated to this change. I will open a separate Issue/PR to address that independently.
Summary by CodeRabbit: Refactor, Chores