
⚡ Bolt: Consolidated Aggregate Queries for Statistics#525

Open
RohanExploit wants to merge 1 commit into main from bolt-consolidated-stats-opt-v1-6363476430810632027

Conversation


@RohanExploit commented Mar 8, 2026

This PR implements significant performance optimizations for the platform's statistics and list endpoints. By consolidating multiple sequential database counts into single aggregate queries using SQLAlchemy's case and label functionality, we've reduced database network round-trips by up to 83% (from 6 queries to 1) in the field officer statistics path.

Key Improvements:

  • Consolidated Aggregates: get_visit_statistics, get_system_stats, and get_stats now use unified SQL queries to calculate multiple metrics in a single pass.
  • Column Projection: The admin user list endpoint now selects only the required fields, reducing memory consumption and improving serialization speed.
  • Robustness: Added a new test suite tests/test_bolt_stats.py to ensure all optimized queries return correct data and fixed a missing dependency in the utility router.
  • Documentation: Updated the performance journal with critical learnings about SQL label matching in Python.

Measurable Impact:

  • Field Officer Stats: 1 DB round-trip instead of 6.
  • System Stats: 1 DB round-trip instead of 3.
  • Utility Stats: 1 DB round-trip for core counts instead of 2.
  • Admin User List: Reduced data transfer and memory overhead via column selection.
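The consolidation pattern behind these numbers can be sketched as follows. This is a hypothetical, minimal illustration: the `Visit` model, its columns, and the metric names are stand-ins, not the repository's actual code.

```python
# Hypothetical sketch of the single-pass aggregate pattern; the Visit model,
# its columns, and the metric names are illustrative, not the PR's real code.
from sqlalchemy import Boolean, Column, Float, Integer, case, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Visit(Base):
    __tablename__ = "visits"
    id = Column(Integer, primary_key=True)
    verified = Column(Boolean, default=False)
    distance_m = Column(Float)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as db:
    db.add_all([
        Visit(verified=True, distance_m=10.0),
        Visit(verified=False, distance_m=30.0),
    ])
    db.commit()

    # One round-trip computes every metric; previously each count was its own query.
    row = db.query(
        func.count(Visit.id).label("total"),
        func.sum(case((Visit.verified.is_(True), 1), else_=0)).label("verified"),
        func.avg(Visit.distance_m).label("avg_distance"),
    ).one()
    stats = {"total": row.total, "verified": row.verified,
             "avg_distance": row.avg_distance}
```

Each `func.sum(case(...))` plays the role of one of the former standalone `COUNT` queries, so N sequential round-trips collapse into one.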

PR created automatically by Jules for task 6363476430810632027 started by @RohanExploit


Summary by cubic

Consolidated stats endpoints into single aggregate SQLAlchemy queries and trimmed the admin user list payload to reduce DB round-trips and speed up responses. Field officer stats now run in one query (down from six), with similar wins in admin and utility stats.

  • Performance

    • Unified aggregates in get_visit_statistics, get_system_stats, and get_stats using func.sum(case(...)), func.count(), and func.avg().
    • Admin users endpoint now selects only needed columns.
    • Impact: Field Officer Stats 6→1 queries; System Stats 3→1; Utility Stats 2→1; smaller admin user payloads.
  • Bug Fixes

    • Added tests/test_bolt_stats.py to validate all optimized endpoints.
    • Fixed missing case import and cleaned redundant code in admin stats.

Written for commit a911480. Summary will update on new commits.

Summary by CodeRabbit

  • Performance Improvements

    • Consolidated multiple database queries into single aggregate queries for visit, system, and utility statistics, reducing database round-trips.
  • New Features

    • Added average distance metric to visit statistics.
  • Bug Fixes

    • Corrected response field names (total_users, admin_count, active_users) for consistency.
  • Tests

    • Added comprehensive test suite for statistics endpoints and admin functions.

Optimized statistics endpoints across several routers to reduce database round-trips and improve performance.

1. backend/routers/field_officer.py:
   - get_visit_statistics: Consolidated 6 individual queries into a single aggregate query using func.sum(case(...)) and func.avg().

2. backend/routers/admin.py:
   - get_system_stats: Cleaned up redundant/broken code and ensured it uses a single optimized query.
   - get_users: Implemented column projection to reduce database overhead by avoiding full model instantiation.

3. backend/routers/utility.py:
   - get_stats: Combined total and resolved issue counts into a single query.
   - Added missing 'case' import from sqlalchemy.

4. .jules/bolt.md:
   - Updated with learnings regarding multi-metric aggregate queries and label matching.

5. tests/test_bolt_stats.py:
   - Added new test suite to verify the correctness of all optimized statistics endpoints.
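The column-projection change in item 2 can be illustrated with a small self-contained sketch; the `User` model and its fields here are hypothetical stand-ins for the app's actual schema.

```python
# Hypothetical sketch of the column-projection pattern; the User model and
# fields are illustrative, not the application's actual schema.
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    email = Column(String)
    hashed_password = Column(String)  # heavy/sensitive field the list never needs
    is_active = Column(Boolean, default=True)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as db:
    db.add(User(email="a@example.com", hashed_password="x"))
    db.commit()

    # Selecting named columns yields lightweight Row tuples instead of full ORM
    # instances, skipping identity-map bookkeeping and the unneeded columns.
    rows = db.query(User.id, User.email, User.is_active).order_by(User.id).all()
    payload = [{"id": r.id, "email": r.email, "is_active": r.is_active}
               for r in rows]
```

Because no full `User` objects are instantiated, serialization works directly off the row tuples, which is where the reduced memory and transfer overhead comes from.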
@google-labs-jules

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.


netlify bot commented Mar 8, 2026

Deploy Preview for fixmybharat canceled.

  • 🔨 Latest commit: a911480
  • 🔍 Latest deploy log: https://app.netlify.com/projects/fixmybharat/deploys/69ad8760ae675a0008d6034d


github-actions bot commented Mar 8, 2026

🙏 Thank you for your contribution, @RohanExploit!

PR Details:

Quality Checklist:
Please ensure your PR meets the following criteria:

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Code is commented where necessary
  • Documentation updated (if applicable)
  • No new warnings generated
  • Tests added/updated (if applicable)
  • All tests passing locally
  • No breaking changes to existing functionality

Review Process:

  1. Automated checks will run on your code
  2. A maintainer will review your changes
  3. Address any requested changes promptly
  4. Once approved, your PR will be merged! 🎉

Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.

@github-actions bot added the size/m label Mar 8, 2026
coderabbitai bot commented Mar 8, 2026

📝 Walkthrough

Walkthrough

Consolidates multiple individual database queries into single SQLAlchemy aggregate queries across admin, field officer, and utility routers to reduce database round-trips. Adds documentation of aggregate query patterns and introduces comprehensive test coverage for refactored endpoints.

Changes

  • Documentation (.jules/bolt.md): Expanded multi-metric aggregate pattern documentation and added guidance on aligning SQLAlchemy aggregate labels with API response keys to prevent AttributeError.
  • Admin Router (backend/routers/admin.py): Refactored get_users to select specific columns instead of the full model; consolidated multi-step user count aggregations in get_system_stats into a single query with relabeled fields (total_users, admin_count, active_users).
  • Field Officer Router (backend/routers/field_officer.py): Replaced multiple individual visit statistic queries with a single consolidated aggregate query computing total, verified, geofence status, unique officers, and average distance in one call.
  • Utility Router (backend/routers/utility.py): Consolidated two separate count queries for total and resolved issues into a single aggregate query using case expressions and label aliases; updated the pending calculation accordingly.
  • Test Coverage (tests/test_bolt_stats.py): New test module with fixtures and four endpoint tests validating visit statistics, system stats, utility stats, and admin user retrieval with an in-memory database and mocked authentication.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

size/m

Poem

🐰 Once many queries hopped through the night,
Now bundled as one, they run just right—
Fewer round-trips, faster streams,
SQLAlchemy fulfills our dreams! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title clearly and specifically summarizes the main change: consolidating multiple database queries into single aggregate queries for statistics endpoints.
  • Description check: ✅ Passed. The PR description comprehensively covers all required template sections: a detailed description of the performance optimizations, the type of change (performance improvement), clear measurable impact metrics, and testing validation.




@cubic-dev-ai bot left a comment


2 issues found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".jules/bolt.md">

<violation number="1" location=".jules/bolt.md:58">
P3: This note references `func.label()`, but SQLAlchemy labeling is done with expression `.label(...)` / `ColumnElement.label()`. Update the wording to avoid documenting an incorrect API.</violation>
</file>

<file name="tests/test_bolt_stats.py">

<violation number="1" location="tests/test_bolt_stats.py:11">
P1: The fixture mutates the shared application database engine instead of an isolated test database, so running this test can drop real tables.</violation>
</file>



@pytest.fixture
def db_session():
    Base.metadata.create_all(bind=engine)

@cubic-dev-ai bot Mar 8, 2026


P1: The fixture mutates the shared application database engine instead of an isolated test database, so running this test can drop real tables.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_bolt_stats.py, line 11:

<comment>The fixture mutates the shared application database engine instead of an isolated test database, so running this test can drop real tables.</comment>

<file context>
@@ -0,0 +1,124 @@
+
+@pytest.fixture
+def db_session():
+    Base.metadata.create_all(bind=engine)
+    session = Session(bind=engine)
+    yield session
</file context>

**Action:** Use a single SQLAlchemy query with `func.count()`, `func.sum(case(...))`, and `func.avg()` to calculate all metrics in one go. This reduces network overhead and allows the database to perform calculations in a single pass.

## 2026-02-28 - Label Mismatch in Aggregate Queries
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.

@cubic-dev-ai bot Mar 8, 2026


P3: This note references func.label(), but SQLAlchemy labeling is done with expression .label(...) / ColumnElement.label(). Update the wording to avoid documenting an incorrect API.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At .jules/bolt.md, line 58:

<comment>This note references `func.label()`, but SQLAlchemy labeling is done with expression `.label(...)` / `ColumnElement.label()`. Update the wording to avoid documenting an incorrect API.</comment>

<file context>
@@ -52,4 +52,8 @@
+**Action:** Use a single SQLAlchemy query with `func.count()`, `func.sum(case(...))`, and `func.avg()` to calculate all metrics in one go. This reduces network overhead and allows the database to perform calculations in a single pass.
+
+## 2026-02-28 - Label Mismatch in Aggregate Queries
+**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
+**Action:** Ensure SQL labels in aggregate queries exactly match the keys expected by the API response schema or dictionary mapping.
</file context>
Suggested change
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
**Learning:** When using expression `.label()` (SQLAlchemy `ColumnElement.label()`) in aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.


Copilot AI left a comment


Pull request overview

This PR optimizes several statistics endpoints by consolidating multiple sequential DB queries into single aggregate queries (using case/label patterns) and reduces overhead in the admin users list by projecting only required columns. It also adds a dedicated test suite to validate the optimized stats endpoints and updates the Bolt performance journal with a lesson learned about label-name consistency.

Changes:

  • Consolidate field-officer, system, and utility statistics into fewer aggregate SQLAlchemy queries.
  • Optimize /admin/users by selecting only required user columns.
  • Add tests/test_bolt_stats.py coverage and update .jules/bolt.md with aggregate-query guidance.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
  • tests/test_bolt_stats.py: Adds end-to-end tests for the optimized stats endpoints and admin user list.
  • backend/routers/utility.py: Combines total/resolved issue counts into a single aggregate query.
  • backend/routers/field_officer.py: Replaces multiple visit stats queries with one aggregate query computing all metrics.
  • backend/routers/admin.py: Projects only needed user columns for the admin users list; consolidates stats aggregates.
  • .jules/bolt.md: Documents learnings about label matching for aggregate queries.


Comment on lines +30 to +33
v1 = FieldOfficerVisit(
    issue_id=1,
    officer_email="off1@example.com",
    officer_name="Officer 1",

Copilot AI Mar 8, 2026


The test inserts FieldOfficerVisit rows that reference issue_id values without creating corresponding Issue rows. This passes on SQLite when foreign-key enforcement is off, but will fail on databases/environments where FK constraints are enforced (e.g., Postgres, or SQLite with PRAGMA foreign_keys=ON). Create the referenced Issue records (or explicitly disable FK checks for the test DB) to make the test portable and representative of production behavior.

**Action:** Use a single SQLAlchemy query with `func.count()`, `func.sum(case(...))`, and `func.avg()` to calculate all metrics in one go. This reduces network overhead and allows the database to perform calculations in a single pass.

## 2026-02-28 - Label Mismatch in Aggregate Queries
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.

Copilot AI Mar 8, 2026


The journal entry says "When using func.label()" but SQLAlchemy labeling is done via the .label(...) method on an expression (e.g., func.count(...).label("name")). Using the wrong function name here could confuse readers trying to apply the guidance.

Suggested change
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
**Learning:** When labeling expressions with `.label(...)` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.


@coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.jules/bolt.md:
- Around line 57-59: The note incorrectly references func.label()—update the
documentation to explain that labels are applied by calling .label() on the SQL
expression (e.g., call .label("total") on func.count(...) or any
column/expression) rather than on the func object; replace any mentions of
func.label() with the correct pattern such as func.count(...).label("name") and
clarify that the SQLAlchemy expression's .label("key") must match the
API/dictionary key expected.

In `@backend/routers/admin.py`:
- Around line 23-30: The query building users = db.query(User.id, User.email,
User.full_name, User.role, User.is_active,
User.created_at).offset(skip).limit(limit).all() is missing a deterministic
order, causing unstable pagination; update the query that builds users to
include an explicit order_by (for example order by User.id or a stable compound
key like (User.created_at, User.id)) before applying offset/limit so pagination
is consistent across runs and data changes.

In `@backend/routers/field_officer.py`:
- Around line 415-416: Replace the boolean comparisons that use == True / ==
False in the aggregate expressions with SQLAlchemy boolean predicates: change
FieldOfficerVisit.within_geofence == True to
FieldOfficerVisit.within_geofence.is_(True) and
FieldOfficerVisit.within_geofence == False to
FieldOfficerVisit.within_geofence.is_(False) so the
func.sum(case(...)).label("within") and func.sum(case(...)).label("outside") use
.is_(True)/.is_(False) for explicit SQL and to avoid Ruff E712.

In `@tests/test_bolt_stats.py`:
- Around line 28-65: The test test_get_visit_statistics inserts
FieldOfficerVisit rows using hardcoded issue_id values but doesn't create
corresponding Issue records; create and persist Issue instances first (e.g.,
construct Issue objects for the same issues, add them via
db_session.add_all(...) and commit) or create Issue objects and attach them to
FieldOfficerVisit via the relationship before calling db_session.add_all([v1,
v2, v3]) and db_session.commit(), ensuring the foreign-key constraint for
FieldOfficerVisit.issue_id is satisfied.
- Around line 4-15: The db_session fixture currently uses the application engine
(engine) and calls Base.metadata.create_all / drop_all against it; replace that
by creating a dedicated test engine and sessionmaker inside the fixture (e.g.,
sqlalchemy.create_engine with an isolated test URL such as sqlite:///:memory: or
a test-specific DB) and bind a new Session from that test engine, then call
Base.metadata.create_all(bind=test_engine) and
Base.metadata.drop_all(bind=test_engine) and close the test session; update
symbols referenced: db_session fixture, Base, engine (stop using), and Session
(use sessionmaker(bind=test_engine) to produce sessions).
- Around line 18-26: The fixture client should clear the global cache so tests
aren't order-dependent: import the recent_issues_cache symbol (the cache used by
the /api/stats endpoint) and call recent_issues_cache.clear() before yielding
the TestClient in the client fixture (and optionally call it again after the
yield to fully reset state); update the client fixture where get_db and
get_current_admin_user are overridden to perform this clear.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 49c01a8f-734f-4613-8183-138082da0516

📥 Commits

Reviewing files that changed from the base of the PR and between 92cf991 and a911480.

📒 Files selected for processing (5)
  • .jules/bolt.md
  • backend/routers/admin.py
  • backend/routers/field_officer.py
  • backend/routers/utility.py
  • tests/test_bolt_stats.py

Comment on lines +57 to +59
## 2026-02-28 - Label Mismatch in Aggregate Queries
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
**Action:** Ensure SQL labels in aggregate queries exactly match the keys expected by the API response schema or dictionary mapping.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

In SQLAlchemy, is `func.label()` a valid API for naming expressions, or should `.label()` be called on the SQL expression itself (for example `func.count(...).label("total")`)?

💡 Result:

func.label() is not the intended API for aliasing/naming an expression in SQLAlchemy; func is for generating SQL function calls (e.g., COUNT(), COALESCE()). To name an expression, call .label() on the SQL expression (or use the standalone sqlalchemy.label() helper), e.g.:

from sqlalchemy import func, select

stmt = select(func.count(MyTable.id).label("total"))

SQLAlchemy’s docs describe labeling as a ColumnElement.label() method (and also provide the standalone label(name, element) function), which returns a Label (typically rendered with AS). [1] [2]


func.label() is the wrong API in this note.

SQLAlchemy labels are applied via .label() on the expression itself (e.g., func.count(...).label("total")). Documenting func.label() points readers at a non-existent API, since func is for generating SQL function calls, not for aliasing.

Suggested fix
-**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
+**Learning:** When using `.label()` on SQLAlchemy expressions in aggregate queries (for example, `func.count(...).label("total")`), any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## 2026-02-28 - Label Mismatch in Aggregate Queries
**Learning:** When using `func.label()` in SQLAlchemy aggregate queries, any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
**Action:** Ensure SQL labels in aggregate queries exactly match the keys expected by the API response schema or dictionary mapping.
## 2026-02-28 - Label Mismatch in Aggregate Queries
**Learning:** When using `.label()` on SQLAlchemy expressions in aggregate queries (for example, `func.count(...).label("total")`), any mismatch between the SQL label and the Python attribute accessed in the response dictionary will cause an `AttributeError`.
**Action:** Ensure SQL labels in aggregate queries exactly match the keys expected by the API response schema or dictionary mapping.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md around lines 57 - 59, The note incorrectly references
func.label()—update the documentation to explain that labels are applied by
calling .label() on the SQL expression (e.g., call .label("total") on
func.count(...) or any column/expression) rather than on the func object;
replace any mentions of func.label() with the correct pattern such as
func.count(...).label("name") and clarify that the SQLAlchemy expression's
.label("key") must match the API/dictionary key expected.
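A minimal runnable illustration of the corrected API; the `Item` model here is hypothetical, used only to show that `.label()` belongs on the expression and that the result attribute must match the label string.

```python
# .label() is called on the expression, never on func itself; the attribute
# accessed on the result row must match the label string exactly.
from sqlalchemy import Column, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Item(Base):  # hypothetical model for illustration
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as db:
    db.add_all([Item(), Item(), Item()])
    db.commit()
    row = db.query(func.count(Item.id).label("total")).one()
    total = row.total  # would raise AttributeError if label and key diverged
```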

Comment on lines +23 to +30
users = db.query(
    User.id,
    User.email,
    User.full_name,
    User.role,
    User.is_active,
    User.created_at
).offset(skip).limit(limit).all()


⚠️ Potential issue | 🟡 Minor

Add a deterministic sort before paginating.

Offset/limit over an unordered query is unstable, so /admin/users?skip= can duplicate or skip users as the planner or underlying data changes.

Suggested fix
-    ).offset(skip).limit(limit).all()
+    ).order_by(User.id).offset(skip).limit(limit).all()
📝 Committable suggestion


Suggested change
users = db.query(
    User.id,
    User.email,
    User.full_name,
    User.role,
    User.is_active,
    User.created_at
).offset(skip).limit(limit).all()
users = db.query(
    User.id,
    User.email,
    User.full_name,
    User.role,
    User.is_active,
    User.created_at
).order_by(User.id).offset(skip).limit(limit).all()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/admin.py` around lines 23 - 30, The query building users =
db.query(User.id, User.email, User.full_name, User.role, User.is_active,
User.created_at).offset(skip).limit(limit).all() is missing a deterministic
order, causing unstable pagination; update the query that builds users to
include an explicit order_by (for example order by User.id or a stable compound
key like (User.created_at, User.id)) before applying offset/limit so pagination
is consistent across runs and data changes.

Comment on lines +415 to +416
func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label("within"),
func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label("outside"),


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check if the file exists and examine the context around lines 415-416
if [ -f "backend/routers/field_officer.py" ]; then
  echo "=== File found, examining lines 410-420 ==="
  sed -n '410,420p' backend/routers/field_officer.py
  echo ""
  echo "=== Full context of the function (searching for function definition) ==="
  grep -n "def.*visit" backend/routers/field_officer.py | head -5
else
  echo "File not found"
fi

Repository: RohanExploit/VishwaGuru

Length of output: 1007


🏁 Script executed:

# Check for other boolean comparisons in the same file
echo "=== Searching for similar boolean comparisons in the file ==="
rg "== (True|False)" backend/routers/field_officer.py -n

Repository: RohanExploit/VishwaGuru

Length of output: 414


🌐 Web query:

Ruff E712 rule Python boolean comparison

💡 Result:

Ruff E712 (true-false-comparison) flags equality comparisons to the boolean singletons True / False (e.g., == True, != False) because they’re usually redundant or less clear than direct truthiness checks. [1][2]

What triggers E712

if cond == True:      # E712
    ...

if cond == False:     # E712
    ...

[2]

Preferred fixes

  • Truthiness check:
if cond:
    ...
  • Falsey check:
if not cond:
    ...

[1][2]

When an explicit boolean check is actually intended

If you specifically need to distinguish True/False from other truthy/falsey values (e.g., None, 0, ""), compare by identity:

if cond is True:
    ...

if cond is False:
    ...

This is also what the rule’s message recommends as an alternative. [2]

Sources: Ruff rule list (E712 description) [1], pycodestyle E712 documentation [2].


🌐 Web query:

SQLAlchemy .is_() method boolean predicates

💡 Result:

In SQLAlchemy, .is_() / .is_not() generate SQL IS / IS NOT predicates.

NULL predicates (most common)

IS NULL is what you usually want for None:

stmt = select(User).where(User.deleted_at.is_(None))      # deleted_at IS NULL
stmt = select(User).where(User.deleted_at.is_not(None))  # deleted_at IS NOT NULL

Note: col == None is special-cased by SQLAlchemy to also render col IS NULL automatically. (docs.sqlalchemy.org)

Boolean predicates

.is_() can be used for booleans too, and SQLAlchemy explicitly calls this out as a reason to use it (some platforms/dialects prefer IS TRUE/FALSE semantics). (docs.sqlalchemy.org)

Typical patterns:

from sqlalchemy import true, false

stmt = select(User).where(User.is_active.is_(true()))    # IS TRUE
stmt = select(User).where(User.is_active.is_(false()))   # IS FALSE

You can also do col.is_(True/False), but using true() / false() is the most explicit “SQL boolean constant” form. (This also avoids confusion with Python boolean comparisons.)

.is_not() vs .isnot()

Dialect caveat

Not every database has a native boolean type / boolean literals; behavior can vary by backend, so IS FALSE-style predicates may not be portable everywhere without dialect support. (github.com)



Use SQLAlchemy boolean predicates instead of == True/False.

These comparisons will trip Ruff E712 here, and .is_(True) / .is_(False) keeps the generated SQL explicit.

Suggested fix
-            func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label("within"),
-            func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label("outside"),
+            func.sum(case((FieldOfficerVisit.within_geofence.is_(True), 1), else_=0)).label("within"),
+            func.sum(case((FieldOfficerVisit.within_geofence.is_(False), 1), else_=0)).label("outside"),
📝 Committable suggestion


Suggested change
func.sum(case((FieldOfficerVisit.within_geofence == True, 1), else_=0)).label("within"),
func.sum(case((FieldOfficerVisit.within_geofence == False, 1), else_=0)).label("outside"),
func.sum(case((FieldOfficerVisit.within_geofence.is_(True), 1), else_=0)).label("within"),
func.sum(case((FieldOfficerVisit.within_geofence.is_(False), 1), else_=0)).label("outside"),
🧰 Tools
🪛 Ruff (0.15.4)

[error] 415-415: Avoid equality comparisons to True; use `if FieldOfficerVisit.within_geofence:` for truth checks

Replace with `FieldOfficerVisit.within_geofence`

(E712)


[error] 416-416: Avoid equality comparisons to False; use `if not FieldOfficerVisit.within_geofence:` for false checks

Replace with `not FieldOfficerVisit.within_geofence`

(E712)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routers/field_officer.py` around lines 415 - 416, Replace the boolean
comparisons that use == True / == False in the aggregate expressions with
SQLAlchemy boolean predicates: change FieldOfficerVisit.within_geofence == True
to FieldOfficerVisit.within_geofence.is_(True) and
FieldOfficerVisit.within_geofence == False to
FieldOfficerVisit.within_geofence.is_(False) so the
func.sum(case(...)).label("within") and func.sum(case(...)).label("outside") use
.is_(True)/.is_(False) for explicit SQL and to avoid Ruff E712.

Comment on lines +4 to +15
from backend.database import get_db, Base, engine
from backend.models import FieldOfficerVisit, User, UserRole
from sqlalchemy.orm import Session
from datetime import datetime, timezone

@pytest.fixture
def db_session():
    Base.metadata.create_all(bind=engine)
    session = Session(bind=engine)
    yield session
    session.close()
    Base.metadata.drop_all(bind=engine)
⚠️ Potential issue | 🟠 Major

Don't run schema setup/teardown against the application engine.

This fixture imports backend.database.engine and then calls create_all/drop_all on it. If that engine is wired to anything except an isolated test database, the suite can mutate or drop real data. Build a dedicated test engine/sessionmaker inside the fixture instead of reusing the app engine.

#!/bin/bash
set -euo pipefail

# Expected: this should show an isolated test-only engine/URL, not the shared application engine.
fd '^database\.py$' backend --exec sed -n '1,220p' {}

echo
rg -n "create_engine|sessionmaker|DATABASE_URL|sqlite|postgres" backend tests
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_bolt_stats.py` around lines 4 - 15, The db_session fixture
currently uses the application engine (engine) and calls
Base.metadata.create_all / drop_all against it; replace that by creating a
dedicated test engine and sessionmaker inside the fixture (e.g.,
sqlalchemy.create_engine with an isolated test URL such as sqlite:///:memory: or
a test-specific DB) and bind a new Session from that test engine, then call
Base.metadata.create_all(bind=test_engine) and
Base.metadata.drop_all(bind=test_engine) and close the test session; update
symbols referenced: db_session fixture, Base, engine (stop using), and Session
(use sessionmaker(bind=test_engine) to produce sessions).
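One way to realize the isolated-engine idea is a session helper that builds a fresh in-memory engine per use, so nothing can touch the application database (Base and the model here are stand-ins for the project's declarative base and tables):

```python
# A throwaway in-memory engine per session: create schema, yield, tear down.
from contextlib import contextmanager

from sqlalchemy import Column, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Thing(Base):
    __tablename__ = "things"
    id = Column(Integer, primary_key=True)

@contextmanager
def isolated_session():
    test_engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(bind=test_engine)
    with Session(bind=test_engine) as session:
        yield session
    Base.metadata.drop_all(bind=test_engine)
    test_engine.dispose()

with isolated_session() as s:
    s.add(Thing())
    s.commit()
    count = s.execute(select(func.count()).select_from(Thing)).scalar_one()
    print(count)  # 1
```

The same body works as a pytest fixture: decorate it with `@pytest.fixture`, drop the `@contextmanager`, and yield the session directly.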

Comment on lines +18 to +26
def client(db_session):
    app.dependency_overrides[get_db] = lambda: db_session
    # Mock admin user for stats access
    from backend.dependencies import get_current_admin_user
    app.dependency_overrides[get_current_admin_user] = lambda: User(email="admin@example.com", role=UserRole.ADMIN)

    with TestClient(app) as c:
        yield c
    app.dependency_overrides = {}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Expected: find a clear/delete/reset hook on the cache, or confirm that tests need to swap in a fresh cache instance.
fd '^cache\.py$' backend --exec sed -n '1,220p' {}

echo
rg -n 'recent_issues_cache|get\("stats"\)|set\([^)]*"stats"' backend tests



🏁 Script executed:

# Check if test_bolt_stats.py exists and view its contents
fd 'test_bolt_stats.py' tests --exec wc -l {} \;

echo "---"

# View the test file
fd 'test_bolt_stats.py' tests --exec cat -n {} \;



🏁 Script executed:

# Search for other test files that call /api/stats
rg '"/api/stats"' tests --type py



Clear the global stats cache in the fixture.

The /api/stats endpoint returns cached data via recent_issues_cache.get("stats"), but this fixture only resets DB state and dependency overrides. Since recent_issues_cache has a 5-minute TTL and is never cleared, test_get_utility_stats becomes order-dependent if any earlier test primed the cache. Add recent_issues_cache.clear() before yielding the client.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_bolt_stats.py` around lines 18 - 26, The fixture client should
clear the global cache so tests aren't order-dependent: import the
recent_issues_cache symbol (the cache used by the /api/stats endpoint) and call
recent_issues_cache.clear() before yielding the TestClient in the client fixture
(and optionally call it again after the yield to fully reset state); update the
client fixture where get_db and get_current_admin_user are overridden to perform
this clear.
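The order-dependence the review describes is easy to reproduce with any shared TTL cache. The sketch below uses a hypothetical stand-in for recent_issues_cache; the real object's API may differ, but the fixture-level clear() is the point:

```python
# A shared TTL cache primed by one test will serve stale data to the next
# unless the fixture clears it before yielding the client.
import time

class TTLCache:
    def __init__(self, ttl=300):
        self.ttl = ttl
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        item = self._store.get(key)
        if item and item[1] > time.monotonic():
            return item[0]
        return None

    def clear(self):
        self._store.clear()

recent_issues_cache = TTLCache(ttl=300)
recent_issues_cache.set("stats", {"issues": 42})  # an earlier test primed it
recent_issues_cache.clear()                       # what the fixture should do
print(recent_issues_cache.get("stats"))           # None
```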

Comment on lines +28 to +65
def test_get_visit_statistics(client, db_session):
    # Add some visits
    v1 = FieldOfficerVisit(
        issue_id=1,
        officer_email="off1@example.com",
        officer_name="Officer 1",
        check_in_latitude=19.0,
        check_in_longitude=72.0,
        check_in_time=datetime.now(timezone.utc),
        verified_at=datetime.now(timezone.utc),
        within_geofence=True,
        distance_from_site=10.5,
        status="verified"
    )
    v2 = FieldOfficerVisit(
        issue_id=2,
        officer_email="off1@example.com",
        officer_name="Officer 1",
        check_in_latitude=19.1,
        check_in_longitude=72.1,
        check_in_time=datetime.now(timezone.utc),
        within_geofence=False,
        distance_from_site=150.0,
        status="checked_in"
    )
    v3 = FieldOfficerVisit(
        issue_id=3,
        officer_email="off2@example.com",
        officer_name="Officer 2",
        check_in_latitude=19.2,
        check_in_longitude=72.2,
        check_in_time=datetime.now(timezone.utc),
        within_geofence=True,
        distance_from_site=20.0,
        status="checked_in"
    )
    db_session.add_all([v1, v2, v3])
    db_session.commit()

⚠️ Potential issue | 🟠 Major

Seed Issue rows before inserting FieldOfficerVisit records.

FieldOfficerVisit.issue_id is a real foreign key. Hardcoding 1/2/3 works only on backends that don't enforce it; backends with FK checks enabled will reject these inserts and make the test environment-dependent.

Suggested fix
 def test_get_visit_statistics(client, db_session):
-    # Add some visits
+    from backend.models import Issue
+
+    i1 = Issue(description="Issue 1", category="Road", status="open")
+    i2 = Issue(description="Issue 2", category="Road", status="open")
+    i3 = Issue(description="Issue 3", category="Road", status="open")
+    db_session.add_all([i1, i2, i3])
+    db_session.flush()
+
     v1 = FieldOfficerVisit(
-        issue_id=1,
+        issue_id=i1.id,
         officer_email="off1@example.com",
         officer_name="Officer 1",
         check_in_latitude=19.0,
@@
     v2 = FieldOfficerVisit(
-        issue_id=2,
+        issue_id=i2.id,
         officer_email="off1@example.com",
         officer_name="Officer 1",
         check_in_latitude=19.1,
@@
     v3 = FieldOfficerVisit(
-        issue_id=3,
+        issue_id=i3.id,
         officer_email="off2@example.com",
         officer_name="Officer 2",
         check_in_latitude=19.2,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_bolt_stats.py` around lines 28 - 65, The test
test_get_visit_statistics inserts FieldOfficerVisit rows using hardcoded
issue_id values but doesn't create corresponding Issue records; create and
persist Issue instances first (e.g., construct Issue objects for the same
issues, add them via db_session.add_all(...) and commit) or create Issue objects
and attach them to FieldOfficerVisit via the relationship before calling
db_session.add_all([v1, v2, v3]) and db_session.commit(), ensuring the
foreign-key constraint for FieldOfficerVisit.issue_id is satisfied.
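Why hardcoded issue_ids are environment-dependent is easy to demonstrate with raw SQLite, where foreign-key checks are off by default and must be enabled per connection (table names below are stand-ins for the project's schema):

```python
# With PRAGMA foreign_keys = ON, inserting a child row that references a
# nonexistent parent id fails; with the pragma off it would be accepted.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE issues (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE visits (id INTEGER PRIMARY KEY,"
    " issue_id INTEGER REFERENCES issues(id))"
)

try:
    conn.execute("INSERT INTO visits (issue_id) VALUES (1)")  # no parent row
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
print("FK enforced:", fk_enforced)

# Seeding the parent first (as the suggested fix does) makes it pass:
conn.execute("INSERT INTO issues (id) VALUES (1)")
conn.execute("INSERT INTO visits (issue_id) VALUES (1)")
conn.commit()
```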
