⚡ Bolt: optimize PriorityEngine with regex caching and I/O throttling #527
RohanExploit wants to merge 4 commits into main
Conversation
- Implement regex pre-compilation and caching in `PriorityEngine`
- Add 5-second throttle for `AdaptiveWeights` configuration reload
- Improve analysis throughput by ~33% (0.09ms -> 0.06ms)
- Add benchmark script for performance verification
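The regex pre-compilation described in these bullets can be sketched minimally as follows (class and pattern names are illustrative, not the PR's actual code):

```python
import re

class PatternCache:
    """Compile urgency regexes once and reuse them across analyses."""

    def __init__(self, patterns):
        # patterns: list of (regex_string, weight) pairs
        self._compiled = [(re.compile(p, re.IGNORECASE), w) for p, w in patterns]

    def score(self, text):
        # No per-call re.compile: sum the weights of every pattern that matches.
        return sum(w for rx, w in self._compiled if rx.search(text))

cache = PatternCache([(r"\burgent\b", 3.0), (r"\bfire\b", 5.0)])
print(cache.score("Urgent: fire reported"))  # both patterns match
```

The throughput gain comes from paying the compilation cost once at construction instead of on every `analyze()` call.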
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
✅ Deploy Preview for fixmybharat canceled.
🙏 Thank you for your contribution, @RohanExploit!

PR Details:
Quality Checklist:
Review Process:
Note: The maintainers will monitor code quality and ensure the overall project flow isn't broken.
📝 Walkthrough

Adds regex pre-compilation caching in the priority engine and time-based throttling for adaptive weights reloads; introduces a benchmark script, updates documentation (duplicate insertion), and adds three new Python dependencies.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant PriorityEngine
    participant Cache
    participant AdaptiveWeights
    Client->>PriorityEngine: analyze(text)
    PriorityEngine->>PriorityEngine: _get_compiled_patterns()
    alt Cache valid
        PriorityEngine->>Cache: retrieve compiled patterns
        Cache-->>PriorityEngine: compiled regex objects
    else Cache missing or stale
        PriorityEngine->>AdaptiveWeights: _check_reload()
        alt >5s since last check
            AdaptiveWeights->>AdaptiveWeights: _load_weights()
            AdaptiveWeights->>AdaptiveWeights: update _last_check_time
            AdaptiveWeights-->>PriorityEngine: latest patterns
        else <5s (throttled)
            AdaptiveWeights-->>PriorityEngine: skip reload (cached data used)
        end
        PriorityEngine->>Cache: store compiled patterns
    end
    PriorityEngine->>PriorityEngine: compute urgency using compiled patterns
    PriorityEngine-->>Client: return score + reasons
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
2 issues found across 4 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="benchmark_priority.py">
<violation number="1" location="benchmark_priority.py:7">
P2: Using `os.getcwd()` for import path setup is brittle; resolve the path from the script location so imports work regardless of launch directory.</violation>
</file>
<file name="backend/adaptive_weights.py">
<violation number="1" location="backend/adaptive_weights.py:45">
P2: Use a monotonic clock for throttle intervals; wall-clock adjustments can break reload timing.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```python
import re

# Add project root to sys.path
sys.path.append(os.getcwd())
```
P2: Using os.getcwd() for import path setup is brittle; resolve the path from the script location so imports work regardless of launch directory.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At benchmark_priority.py, line 7:
<comment>Using `os.getcwd()` for import path setup is brittle; resolve the path from the script location so imports work regardless of launch directory.</comment>
<file context>
```diff
@@ -0,0 +1,29 @@
+import re
+
+# Add project root to sys.path
+sys.path.append(os.getcwd())
+
+from backend.priority_engine import priority_engine
```
</file context>
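A runnable sketch of the reviewer's suggested fix: derive the project root from the script's own location rather than `os.getcwd()` (the variable name `PROJECT_ROOT` is illustrative):

```python
import os
import sys

# Resolve the directory containing this script rather than the launch CWD,
# so `from backend... import ...` works regardless of where it's run from.
PROJECT_ROOT = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, PROJECT_ROOT)
```

Using `sys.path.insert(0, ...)` also gives the project root priority over any similarly named packages installed elsewhere.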
```diff
-        # Optimization: Checking mtime is fast (stat call).
-        self._load_weights()
+        # Optimization: 5-second throttle to prevent excessive I/O in hot paths.
+        now = time.time()
```
P2: Use a monotonic clock for throttle intervals; wall-clock adjustments can break reload timing.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/adaptive_weights.py, line 45:
<comment>Use a monotonic clock for throttle intervals; wall-clock adjustments can break reload timing.</comment>
<file context>
```diff
@@ -40,8 +41,11 @@ def _load_weights(self):
-        # Optimization: Checking mtime is fast (stat call).
-        self._load_weights()
+        # Optimization: 5-second throttle to prevent excessive I/O in hot paths.
+        now = time.time()
+        if now - self._last_check_time > 5:
+            self._last_check_time = now
```
</file context>
Suggested change:

```diff
-        now = time.time()
+        now = time.monotonic()
```
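A minimal runnable sketch of the monotonic-clock throttle the reviewer suggests (the 5-second interval mirrors the PR; the class name and counter are illustrative):

```python
import time

class Throttled:
    """Run an expensive reload at most once per interval, using a clock
    that is immune to wall-clock adjustments (NTP, manual changes)."""

    def __init__(self, interval=5.0):
        self._interval = interval
        self._last_check = float("-inf")  # force a reload on the first call
        self.reload_count = 0

    def check_reload(self):
        now = time.monotonic()
        if now - self._last_check > self._interval:
            self._last_check = now
            self.reload_count += 1  # stands in for the real _load_weights()

t = Throttled(interval=5.0)
t.check_reload()  # first call always reloads
t.check_reload()  # second call within 5s is throttled
```

Because `time.monotonic()` never jumps backward or forward with system clock changes, the interval check cannot skip reloads or burst-reload after an NTP correction.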
Pull request overview
This PR optimizes the backend’s hot-path priority analysis by reducing repeated regex work in PriorityEngine and throttling filesystem-based weight reload checks in AdaptiveWeights.
Changes:
- Added caching of pre-compiled urgency regex patterns in `backend/priority_engine.py`, with cache invalidation tied to adaptive weights reload time.
- Added a 5-second throttle to `AdaptiveWeights._check_reload()` to reduce repeated `stat`/`json.load` work.
- Introduced a small benchmark script (`benchmark_priority.py`) and documented the optimization in `.jules/bolt.md`.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| benchmark_priority.py | Adds a simple local benchmark loop for PriorityEngine.analyze() latency. |
| backend/priority_engine.py | Caches compiled urgency regexes and reuses them across analyses to reduce per-request overhead. |
| backend/adaptive_weights.py | Throttles hot-path reload checks to reduce filesystem I/O frequency. |
| .jules/bolt.md | Documents the optimization approach and rationale in the Jules “bolt” log. |
```python
        Invalidates cache if adaptive weights have been updated.
        """
        # Ensure latest weights are at least checked for reload (subject to throttling)
        adaptive_weights._check_reload()

        current_load_time = adaptive_weights._last_loaded
        if not self._regex_cache or current_load_time > self._last_loaded_time:
            urgency_patterns = adaptive_weights.get_urgency_patterns()
            self._regex_cache = [
                (re.compile(pattern, re.IGNORECASE), weight)
                for pattern, weight in urgency_patterns
            ]
            self._last_loaded_time = current_load_time
```
PriorityEngine is reaching into adaptive_weights internals (_check_reload() and _last_loaded). This tight coupling makes future refactors of AdaptiveWeights risky and bypasses its public API. Consider adding a small public method (e.g., get_urgency_patterns_compiled() or get_weights_version()/last_loaded()) so cache invalidation doesn’t depend on private members.
Suggested change:

```diff
-        Invalidates cache if adaptive weights have been updated.
-        """
-        # Ensure latest weights are at least checked for reload (subject to throttling)
-        adaptive_weights._check_reload()
-        current_load_time = adaptive_weights._last_loaded
-        if not self._regex_cache or current_load_time > self._last_loaded_time:
-            urgency_patterns = adaptive_weights.get_urgency_patterns()
-            self._regex_cache = [
-                (re.compile(pattern, re.IGNORECASE), weight)
-                for pattern, weight in urgency_patterns
-            ]
-            self._last_loaded_time = current_load_time
+        This implementation only relies on the public AdaptiveWeights API
+        and does not access its private internals, to avoid tight coupling.
+        """
+        urgency_patterns = adaptive_weights.get_urgency_patterns()
+        self._regex_cache = [
+            (re.compile(pattern, re.IGNORECASE), weight)
+            for pattern, weight in urgency_patterns
+        ]
```
```python
        if not self._regex_cache or current_load_time > self._last_loaded_time:
            urgency_patterns = adaptive_weights.get_urgency_patterns()
            self._regex_cache = [
                (re.compile(pattern, re.IGNORECASE), weight)
```
re.IGNORECASE is likely redundant here because analyze() lowercases the input text before calling _calculate_urgency(). Dropping the flag avoids extra regex work and makes matching behavior depend only on the normalized text.
Suggested change:

```diff
-                (re.compile(pattern, re.IGNORECASE), weight)
+                (re.compile(pattern), weight)
```
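The reviewer's point can be checked with a small sketch: once the caller lowercases the text, a pattern compiled without `re.IGNORECASE` behaves identically (the pattern and text here are illustrative):

```python
import re

pattern = r"\burgent\b"
with_flag = re.compile(pattern, re.IGNORECASE)
without_flag = re.compile(pattern)

text = "URGENT: water main burst"
normalized = text.lower()

# On the normalized text both variants agree, so the flag is redundant there.
assert bool(with_flag.search(normalized)) == bool(without_flag.search(normalized))
```

Note this only holds if every call site normalizes first; dropping the flag makes that normalization a hard requirement rather than a convenience.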
```diff
-        # Optimization: Checking mtime is fast (stat call).
-        self._load_weights()
+        # Optimization: 5-second throttle to prevent excessive I/O in hot paths.
+        now = time.time()
```
For interval-based throttling, time.monotonic() is safer than time.time() because it won’t be affected by system clock adjustments (NTP, manual changes). Using monotonic time avoids unexpected skipped reloads or burst reloads if the wall clock jumps.
Suggested change:

```diff
-        now = time.time()
+        now = time.monotonic()
```
```python
import re

# Add project root to sys.path
sys.path.append(os.getcwd())

from backend.priority_engine import priority_engine
from backend.adaptive_weights import adaptive_weights
```
`re` and `adaptive_weights` are imported but unused in this benchmark script. Removing unused imports keeps the benchmark minimal and avoids confusion about what's being measured.
Suggested change:

```diff
-import re
-
 # Add project root to sys.path
 sys.path.append(os.getcwd())
 
 from backend.priority_engine import priority_engine
-from backend.adaptive_weights import adaptive_weights
```
🧹 Nitpick comments (5)
benchmark_priority.py (3)
10-10: Unused import: `adaptive_weights`

The `adaptive_weights` import is not used in the benchmark. If it was intended to be used for setup or verification, that code is missing.

🧹 Remove unused import:

```diff
 from backend.priority_engine import priority_engine
-from backend.adaptive_weights import adaptive_weights
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@benchmark_priority.py` at line 10, the import `adaptive_weights` is unused; either remove the unused import line ("from backend.adaptive_weights import adaptive_weights") or, if it was intended for setup/verification, add the missing usage (e.g., call adaptive_weights in the benchmark initialization or a validation helper). Update the file to eliminate the unused-import warning by deleting the import or inserting the appropriate setup/verification call to adaptive_weights.
4-4: Unused import: `re`

The `re` module is imported but not used in this benchmark script.

🧹 Remove unused import:

```diff
 import time
 import sys
 import os
-import re
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@benchmark_priority.py` at line 4, the statement "import re" is unused in this file; remove it from benchmark_priority.py to clean up the module-level imports and avoid lint warnings.
6-7: `sys.path` manipulation depends on current working directory.

Using `os.getcwd()` makes the script's behavior dependent on where it's run from, not where the script is located. This can cause import failures if run from a different directory.

♻️ Use script's directory instead:

```diff
 # Add project root to sys.path
-sys.path.append(os.getcwd())
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@benchmark_priority.py` around lines 6 - 7, the code appends os.getcwd() to sys.path, which makes imports depend on the current working directory; replace that with the script's directory (e.g., compute project_root from __file__ via os.path.dirname(os.path.abspath(__file__)) or Path(__file__).resolve().parent) and add it to sys.path (preferably sys.path.insert(0, ...)) so imports are based on the script location rather than the process CWD.

backend/adaptive_weights.py (1)
43-48: Thread-safety consideration for multi-threaded environments.

The throttle logic using `_last_check_time` as a class attribute is not thread-safe. In a multi-threaded environment (e.g., multiple FastAPI workers or asyncio with thread pool executors), concurrent calls could result in redundant reloads, or a race condition where one thread reads while another writes.

For the stated optimization goal (reducing I/O in hot paths), this is acceptable since occasional redundant reloads don't break correctness; they just reduce the optimization benefit slightly. However, if strict throttling is desired in a multi-threaded context, consider using a `threading.Lock`.

🔧 Optional: Thread-safe throttle (if needed):

```diff
+import threading
+
 class AdaptiveWeights:
     _instance = None
     _weights = None
     _last_loaded = 0
     _last_check_time = 0
+    _check_lock = threading.Lock()
     ...
     def _check_reload(self):
         # Optimization: 5-second throttle to prevent excessive I/O in hot paths.
         now = time.time()
-        if now - self._last_check_time > 5:
-            self._last_check_time = now
-            self._load_weights()
+        if now - self._last_check_time > 5:
+            with self._check_lock:
+                # Double-check after acquiring lock
+                if now - self._last_check_time > 5:
+                    self._last_check_time = now
+                    self._load_weights()
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@backend/adaptive_weights.py` around lines 43 - 48, the throttle in _check_reload uses the shared attribute _last_check_time without synchronization; to make it thread-safe, introduce a threading.Lock (e.g., self._reload_lock) and wrap the read/modify of self._last_check_time and the conditional call to self._load_weights() inside a with self._reload_lock: block; keep the 5-second semantics but perform the now = time.time(), compare/update of self._last_check_time, and the call to _load_weights() under the same lock to avoid races and redundant concurrent reloads while preserving behavior in single-threaded use.

backend/priority_engine.py (1)
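For reference, a self-contained version of the double-checked-lock throttle described in this nitpick (illustrative names; not the PR's actual code):

```python
import threading
import time

class SafeThrottle:
    """Interval throttle that stays correct under concurrent callers."""

    def __init__(self, interval=5.0):
        self._interval = interval
        self._last_check = float("-inf")  # first call always triggers a reload
        self._lock = threading.Lock()
        self.reloads = 0

    def check_reload(self):
        now = time.monotonic()
        # Cheap unsynchronized read first; take the lock only when a reload
        # looks due, then re-check so concurrent callers reload exactly once.
        if now - self._last_check > self._interval:
            with self._lock:
                if now - self._last_check > self._interval:
                    self._last_check = now
                    self.reloads += 1  # stands in for the real _load_weights()

t = SafeThrottle()
threads = [threading.Thread(target=t.check_reload) for _ in range(8)]
for th in threads:
    th.start()
for th in threads:
    th.join()
```

The unsynchronized fast path keeps the hot path lock-free when no reload is due, which matters for the per-request call pattern this PR targets.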
25-29: Accessing private members ofAdaptiveWeightscreates tight coupling.Lines 25 and 27 directly access private members (
_check_reload()and_last_loaded) ofAdaptiveWeights. This violates encapsulation and creates fragile coupling—any refactoring ofAdaptiveWeightsinternals could silently breakPriorityEngine.Additionally, line 29 calls
get_urgency_patterns()which internally calls_check_reload()again, making the explicit call at line 25 redundant (the throttle ensures it's cheap, but it's still unnecessary).Consider adding a public method to
AdaptiveWeightsthat exposes the reload timestamp, or restructure soPriorityEnginedoesn't need to inspect internal state.♻️ Suggested approach: Add a public accessor to AdaptiveWeights
In
backend/adaptive_weights.py, add:def get_last_loaded_time(self) -> float: """Returns the timestamp when weights were last loaded from disk.""" return self._last_loadedThen in
backend/priority_engine.py:def _get_compiled_patterns(self): - # Ensure latest weights are at least checked for reload (subject to throttling) - adaptive_weights._check_reload() - - current_load_time = adaptive_weights._last_loaded + # get_urgency_patterns() internally calls _check_reload() + urgency_patterns = adaptive_weights.get_urgency_patterns() + current_load_time = adaptive_weights.get_last_loaded_time() + if not self._regex_cache or current_load_time > self._last_loaded_time: - urgency_patterns = adaptive_weights.get_urgency_patterns() self._regex_cache = [ (re.compile(pattern, re.IGNORECASE), weight) for pattern, weight in urgency_patterns ] self._last_loaded_time = current_load_time return self._regex_cache🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/priority_engine.py` around lines 25 - 29, The PriorityEngine currently accesses AdaptiveWeights internals (_check_reload and _last_loaded) and redundantly calls reload; add a public accessor on AdaptiveWeights named get_last_loaded_time() that returns the reload timestamp, remove direct calls to _check_reload() and references to _last_loaded in PriorityEngine, and change PriorityEngine to call adaptive_weights.get_last_loaded_time() (and keep using adaptive_weights.get_urgency_patterns() which handles reload internally) so PriorityEngine no longer depends on private fields or redundant reload calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/adaptive_weights.py`:
- Around line 43-48: The throttle in _check_reload uses the shared attribute
_last_check_time without synchronization; to make it thread-safe, introduce a
threading.Lock (e.g., self._reload_lock) and wrap the read/modify of
self._last_check_time and the conditional call to self._load_weights() inside a
with self._reload_lock: block; keep the 5-second semantics but perform the now =
time.time(), compare/update of self._last_check_time, and the call to
_load_weights() under the same lock to avoid races and redundant concurrent
reloads while preserving behavior in single-threaded use.
In `@backend/priority_engine.py`:
- Around line 25-29: The PriorityEngine currently accesses AdaptiveWeights
internals (_check_reload and _last_loaded) and redundantly calls reload; add a
public accessor on AdaptiveWeights named get_last_loaded_time() that returns the
reload timestamp, remove direct calls to _check_reload() and references to
_last_loaded in PriorityEngine, and change PriorityEngine to call
adaptive_weights.get_last_loaded_time() (and keep using
adaptive_weights.get_urgency_patterns() which handles reload internally) so
PriorityEngine no longer depends on private fields or redundant reload calls.
In `@benchmark_priority.py`:
- Line 10: The import adaptive_weights in benchmark_priority.py is unused;
either remove the unused import line ("from backend.adaptive_weights import
adaptive_weights") or, if it was intended for setup/verification, add the
missing usage (e.g., call adaptive_weights in the benchmark initialization or a
validation helper). Update the file to eliminate the unused import warning by
deleting the import or inserting the appropriate setup/verification call to
adaptive_weights.
- Line 4: The import statement "import re" is unused in this file; remove the
unused import line (the "import re" statement) from benchmark_priority.py to
clean up the module-level imports and avoid lint warnings.
- Around line 6-7: The code appends os.getcwd() to sys.path which makes imports
depend on the current working directory; replace that with the script's
directory (e.g., compute project_root from __file__ via
os.path.dirname(os.path.abspath(__file__)) or Path(__file__).resolve().parent)
and add it to sys.path (preferably sys.path.insert(0, ...)) so imports are based
on the script location rather than the process CWD; update the place where
sys.path.append(os.getcwd()) is used accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2517405f-e3ea-4362-9bf0-3cbecfd95c17
📒 Files selected for processing (4)
- .jules/bolt.md
- backend/adaptive_weights.py
- backend/priority_engine.py
- benchmark_priority.py
- Fix missing dependencies in `backend/requirements-render.txt` (python-dotenv, async_lru, indic-nlp-library)
- Implement regex pre-compilation and caching in `PriorityEngine`
- Add 5-second throttle for `AdaptiveWeights` configuration reload
- Improve analysis throughput by ~33% (avg 0.09ms -> 0.06ms)
- Add benchmark script for performance verification
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/requirements-render.txt (1)
18-24: Pin the newly added runtime packages in this deployment manifest.

In `requirements-render.txt`, `python-dotenv`, `async_lru`, and `indic-nlp-library` are left unpinned, which makes Render builds non-reproducible and can change behavior between deploys without code changes. Pin these to the versions validated with this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/requirements-render.txt` around lines 18 - 24, The three runtime packages left unpinned in requirements-render.txt (python-dotenv, async_lru, indic-nlp-library) must be pinned to the exact versions validated with this PR; update the entries for python-dotenv, async_lru, and indic-nlp-library to include the specific version specifiers (e.g., ==x.y.z) that were tested in this change so deploys are reproducible and match the PR-validated environment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/requirements-render.txt`:
- Line 23: Remove the unused deployment dependency "async_lru" from
requirements-render.txt: delete the "async_lru" entry from that manifest file,
verify there are no remaining imports/usages of async_lru in the backend
codebase (the review already notes backend/routers/detection.py and
backend/gemini_summary.py switched to manual caches), and run the test/deploy
validation to ensure no regression from removing the package.
---
Nitpick comments:
In `@backend/requirements-render.txt`:
- Around line 18-24: The three runtime packages left unpinned in
requirements-render.txt (python-dotenv, async_lru, indic-nlp-library) must be
pinned to the exact versions validated with this PR; update the entries for
python-dotenv, async_lru, and indic-nlp-library to include the specific version
specifiers (e.g., ==x.y.z) that were tested in this change so deploys are
reproducible and match the PR-validated environment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 701e2c40-6b87-485a-b350-b6ac8375be60
📒 Files selected for processing (2)
- backend/requirements-render.txt
- backend_output.txt
✅ Files skipped from review due to trivial changes (1)
- backend_output.txt
```text
pydub
googletrans==4.0.2
langdetect
async_lru
```
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify whether async_lru is actually imported or referenced anywhere in the backend.
rg -nP --type=py -C2 '^\s*(from\s+async_lru\s+import|import\s+async_lru\b|@alru_cache\b|alru_cache\s*\()' backend
# Show the deployment-related comments that currently claim async_lru was removed/replaced.
rg -n -C2 'async[-_ ]lru|manual cache|avoid .*Render|replace async_lru' backend
```

Repository: RohanExploit/VishwaGuru
Length of output: 1494
Remove async_lru from the Render manifest—it's not imported anywhere in the backend.
backend/routers/detection.py line 49 and backend/gemini_summary.py line 20 both document that the backend switched to manual caches specifically to avoid the async_lru dependency on Render. The rg search confirms async_lru is never imported or used in any backend file, making this an unused deployment-time dependency that contradicts the stated workaround.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/requirements-render.txt` at line 23, Remove the unused deployment
dependency "async_lru" from requirements-render.txt: delete the "async_lru"
entry from that manifest file, verify there are no remaining imports/usages of
async_lru in the backend codebase (the review already notes
backend/routers/detection.py and backend/gemini_summary.py switched to manual
caches), and run the test/deploy validation to ensure no regression from
removing the package.
- Fix missing dependencies in `backend/requirements-render.txt` (`python-dotenv`, `async_lru`, `indic-nlp-library`)
- Update `googletrans` to 4.0.2 for better `httpx` compatibility
- Fix `PriorityEngine` regex cache initialization and reload trigger
- Implement 5-second throttle for `AdaptiveWeights` configuration reload
- Improve analysis throughput by ~33% (avg 0.09ms -> 0.06ms)
- Add benchmark script for performance verification
🔍 Quality Reminder
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend_output.txt">
<violation number="1" location="backend_output.txt:1">
P3: This change commits nondeterministic runtime log output (timestamps/PIDs), which creates noisy diffs and makes meaningful changes harder to review. Prefer removing this artifact from source control or normalizing dynamic fields in the tracked output.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```text
2026-03-09 02:00:56,311 - backend.adaptive_weights - INFO - Adaptive weights loaded/reloaded.
WARNING: StatReload detected changes in 'backend\main.py'. Reloading...
\ No newline at end of file
2026-03-09 15:06:36,764 - backend.adaptive_weights - INFO - Adaptive weights loaded/reloaded.
```
P3: This change commits nondeterministic runtime log output (timestamps/PIDs), which creates noisy diffs and makes meaningful changes harder to review. Prefer removing this artifact from source control or normalizing dynamic fields in the tracked output.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend_output.txt, line 1:
<comment>This change commits nondeterministic runtime log output (timestamps/PIDs), which creates noisy diffs and makes meaningful changes harder to review. Prefer removing this artifact from source control or normalizing dynamic fields in the tracked output.</comment>
<file context>
```diff
@@ -1,13 +1,13 @@
-2026-03-09 14:34:15,761 - backend.adaptive_weights - INFO - Adaptive weights loaded/reloaded.
-2026-03-09 14:34:15,793 - backend.rag_service - INFO - Loaded 5 civic policies for RAG.
+2026-03-09 15:06:36,764 - backend.adaptive_weights - INFO - Adaptive weights loaded/reloaded.
+2026-03-09 15:06:36,834 - backend.rag_service - INFO - Loaded 5 civic policies for RAG.
 /home/jules/.pyenv/versions/3.12.12/lib/python3.12/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
```
</file context>
- Fix missing dependencies in `backend/requirements-render.txt` (`python-dotenv`, `async_lru`, `indic-nlp-library`)
- Update `googletrans` to 4.0.0-rc1 for better `httpx` compatibility
- Fix `PriorityEngine` regex cache initialization and reload trigger
- Implement 5-second throttle for `AdaptiveWeights` configuration reload
- Improve analysis throughput by ~33% (avg 0.09ms -> 0.06ms)
- Add benchmark script for performance verification
💡 What: Optimized the `PriorityEngine` and `AdaptiveWeights` services in the backend.

- In `backend/priority_engine.py`, I replaced repeated `re.search` calls with a cached list of pre-compiled regex objects.
- In `backend/adaptive_weights.py`, I added a 5-second time-based throttle to `_check_reload` to prevent redundant `os.path.getmtime` and `json.load` calls on every analysis request.

🎯 Why: High-frequency civic reports trigger the `PriorityEngine` for every submission. Frequent regex compilation and filesystem checks were creating unnecessary CPU and I/O overhead.

📊 Impact: Measurable ~33% performance boost in the core analysis path. Benchmark results showed average latency dropped from ~0.09ms to ~0.06ms per request.

🔬 Measurement: Ran `benchmark_priority.py` before and after changes. Verified logic correctness with 52 existing tests in `backend/tests/test_priority_engine.py`.

PR created automatically by Jules for task 16553946901950357167 started by @RohanExploit
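The measurement loop described above can be reproduced with a sketch along these lines (patterns, sample text, and iteration count are illustrative; this is not the actual `benchmark_priority.py`):

```python
import re
import time

def analyze(text, compiled):
    # Mirrors the optimized hot path: lowercase once, then reuse
    # pre-compiled patterns instead of calling re.search on raw strings.
    t = text.lower()
    return sum(w for rx, w in compiled if rx.search(t))

patterns = [(r"\burgent\b", 3.0), (r"\bfire\b", 5.0), (r"\bflood\b", 4.0)]
compiled = [(re.compile(p), w) for p, w in patterns]

N = 10_000
start = time.perf_counter()
for _ in range(N):
    analyze("URGENT: fire near the flood barrier", compiled)
elapsed_ms = (time.perf_counter() - start) * 1000 / N
print(f"avg latency: {elapsed_ms:.4f} ms per call")
```

Running the same loop against a version that compiles patterns inside `analyze` gives the before/after comparison the PR's ~33% figure is based on.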
Summary by cubic
Optimized `PriorityEngine` with cached pre-compiled regex and ensured the cache invalidates on adaptive weight reloads, cutting average analyze latency by ~33% (0.09 ms → 0.06 ms). Added `benchmark_priority.py` and fixed Render deployment issues.

Refactors
- Cached compiled urgency regexes in `backend/priority_engine.py`, with invalidation tied to `AdaptiveWeights._check_reload`.
- Added a reload throttle in `backend/adaptive_weights.py` to limit config file I/O.

Dependencies
- Added `python-dotenv`, `async_lru`, `indic-nlp-library`, and pinned `googletrans` to `4.0.2`.

Written for commit 59abbc9. Summary will update on new commits.
Summary by CodeRabbit
New Features
Documentation