Skip to content

Comments

v0.6.0: Alert API, eval queue, queue status UI, progressive display#30

Closed
NiveditJain wants to merge 8 commits intomainfrom
v0.6.0
Closed

v0.6.0: Alert API, eval queue, queue status UI, progressive display#30
NiveditJain wants to merge 8 commits intomainfrom
v0.6.0

Conversation

@NiveditJain
Copy link
Member

@NiveditJain NiveditJain commented Feb 24, 2026

Summary

  • Alert APIapp.alert('name', fn) registers callbacks that fire after all evals+enrichments complete for a session. Supports Slack webhooks, CI notifications, logging. Error-isolated via Promise.allSettled.
  • Eval Queue — Priority queue for background (--queue-interval) and foreground (UI) processing. Force-refresh options for re-run actions. All execution paths route through the queue so alerts always fire.
  • Queue Status UI — Collapsible panel on /dashboard showing pending/processing/processed counts, background processor indicator, items table, and recent errors. Auto-polls every 5s, self-hides when inactive.
  • Progressive Eval & Enrichment Display — Panels show names immediately on load; cached results appear instantly, uncached stream in one-by-one. Per-item cached badges. Self-sufficient panels (no parent prop dependency).
  • Documentation — Full CHANGELOG, README, API reference, CONTRIBUTING updates. New examples/alerts.js.

Test plan

  • npx tsc --noEmit passes clean
  • npm test — all 415 tests pass
  • Register app.alert('test', ctx => console.log('[ALERT]', ctx)), open a session — verify alert fires
  • --queue-interval=10 — verify background processing and alerts fire
  • Re-run All in eval panel — verify forceRefresh and alert fires with fresh results
  • Re-run single eval — verify forceEvalNames and alert fires with all results
  • Register a throwing alert — verify evals still complete normally
  • Open /dashboard with queue activity — verify Queue Status panel appears
  • Queue Status panel polls and updates every 5s

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Session-wide Alerts API; in-process priority queue with foreground/background processing, force-refresh and targeted runs; CLI flags and examples for queue controls; live Queue Status panel integrated into dashboard.
  • UI/UX

    • Progressive per-item eval/enrichment loading with cached badges, fast cache probes, auto-refreshing queue panel and recent errors.
  • Tests

    • Comprehensive tests for cache probes, queue engine, and alert system.
  • Bug Fixes

    • Safer theme storage and improved error/fallback handling during processing.
  • Chores

    • Version bumped to 0.6.0; docs updated.

NiveditJain and others added 3 commits February 24, 2026 14:55
Show eval/enrichment names the moment the page loads and stream each
result into the UI as it completes. Cached results appear instantly,
slow evals don't block anything.

- Add enricherName param to runEnrichments and runSubagentEnrichments
- Add checkEvalCacheAndList and checkEnrichmentCacheAndList cache probes
- Rewrite EvalResultsPanel and EnrichmentResultsPanel for two-phase
  progressive loading (probe cache → stream uncached results)
- Remove runSessionDashboard batch — panels now self-load
- Fix localStorage.setItem unhandled error in ThemeContext tests
- Bump version to 0.6.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- app.alert() API for post-eval callbacks (Slack, CI, logging)
- Priority eval queue with background processor (--queue-interval)
- Queue Status panel on /dashboard with live polling
- Force-refresh options for re-run actions routed through queue
- CLI flags: --queue-interval, --queue-concurrency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CHANGELOG: added Alert API, Eval Queue, Queue Status UI sections to v0.6.0
- README: alerts section, queue processing section, updated CLI reference and How It Works
- API reference: full app.alert() docs with types, background queue processing section, updated full example and tips
- examples/alerts.js: complete alert example with evals, enrichments, and 3 alert patterns
- CONTRIBUTING: added examples dir, key subsystems table, environment variables table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a chainable alert API and registry, implements an in-process priority eval/enrichment queue (foreground/background) with scanning and status APIs, converts panels to per-item progressive cache probing and rerun paths, adds server actions/CLI/env flags, examples, and tests; removes the previous run-session-dashboard batch orchestrator.

Changes

Cohort / File(s) Summary
Documentation & Release
CHANGELOG.md, CONTRIBUTING.md, README.md, docs/api-reference.md, package.json
Release 0.6.0 docs and examples: documents Alerts API, background queue, CLI flags (--queue-interval, --queue-concurrency) and env vars; package version bump.
Alert System
lib/evals/alert-types.ts, lib/evals/alert-registry.ts, lib/evals/alert-dispatcher.ts, lib/evals/app.ts, lib/evals/index.ts, __tests__/lib/alert-system.test.ts
New alert types, global registry (globalThis), dispatcher using Promise.allSettled, chainable app.alert() registration, and unit tests for registry/dispatcher.
Eval Queue Core
lib/eval-queue.ts, instrumentation.ts, scripts/launch.ts, scripts/parse-script-args.ts
New in-process priority queue with HIGH/LOW priorities, dedupe/priority upgrades, foreground coalescing, scanAndEnqueue, processSession pipeline (evals/enrichments/filters/alerts), batch processing/concurrency, status APIs, background processor starter, and CLI/env wiring.
Server Actions — Cache Probes & Queue
app/actions/check-eval-cache.ts, app/actions/check-enrichment-cache.ts, app/actions/queue-session.ts, app/actions/get-queue-status.ts
Added per-item cache-probe actions for session/subagent scopes, queueAndProcessSession with coalescing and force options, and getQueueStatusAction for UI polling.
Run / Targeted Execution
app/actions/run-enrichments.ts, app/actions/run-subagent-enrichments.ts, app/actions/run-session-dashboard.ts, app/actions/run-session-action.ts
Added optional enricherName to target single enrichers (modifies caching semantics for targeted runs); removed run-session-dashboard batch orchestrator; runSessionAction accepts itemLabel for clearer error messages.
Progressive UI Panels
app/components/eval-results-panel.tsx, app/components/enrichment-results-panel.tsx, app/components/raw-log-viewer.tsx
Panels converted from batch initialResult to two-phase per-item probing and progressive loading; per-item state (loading/done/error), cached badges, new loading/error rows, and removed dashboard-batch dependency.
Queue Status UI
app/components/queue-status-panel.tsx, app/dashboard/page.tsx
New live-polling QueueStatusPanel with counts, items table, recent errors, auto-hide when idle; panel added to dashboard in multiple locations.
Examples & Tests
examples/alerts.js, __tests__/lib/*, __tests__/actions/*, __tests__/actions/check-eval-cache.test.ts, __tests__/actions/check-enrichment-cache.test.ts
Added alerts example and extensive Vitest suites for eval-queue behavior, alert system, and cache-probe actions covering scopes and edge cases.
Integration & Misc
app/components/*, contexts/ThemeContext.tsx, scripts/*
Defensive localStorage write in ThemeContext; CLI parsing for queue interval/concurrency; spawned Next.js env propagation; adjusted component prop surfaces (removed initialResult).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Browser/UI
    participant UI as Dashboard/Panel
    participant Action as Server Action
    participant Queue as In-Memory Queue
    participant Pipeline as ProcessSession
    participant Cache as Per-Item Cache
    participant Alert as Alert Dispatcher

    Client->>UI: load page / trigger rerun
    UI->>Action: checkEvalCacheAndList / checkEnrichmentCacheAndList
    Action->>Cache: probe per-item cache
    Action-->>UI: cached results + uncached names
    UI->>UI: render cached items, placeholders for uncached

    alt foreground re-run for uncached
        UI->>Action: queueAndProcessSession()
        Action->>Queue: push(HIGH priority)
        Action->>Pipeline: processSession immediately
        Pipeline->>Cache: write per-item results
        Pipeline->>Alert: fireAlerts(context)
        Pipeline-->>Action: processing complete
        Action-->>UI: return completion / cached flag
        UI->>UI: update rows with new results
    end

    alt background processing
        Queue->>Queue: scanAndEnqueue() (LOW priority)
        Queue->>Pipeline: processBatch(concurrency)
        Pipeline->>Cache: cache results
        Pipeline->>Alert: fireAlerts(...)
    end

    UI->>Action: getQueueStatusAction() (poll)
    Action->>Queue: getDetailedStatus()
    Action-->>UI: QueueStatusPayload
    UI->>UI: render queue panel if active
Loading
sequenceDiagram
    participant User as Library User
    participant App as createApp
    participant Registry as Alert Registry (globalThis)
    participant Dispatcher as Alert Dispatcher
    participant Handler as Alert Handler

    User->>App: app.alert("name", fn)
    App->>Registry: registerAlert(name, fn)
    Registry-->>App: stored
    App-->>User: returns app

    Note over Dispatcher: During processSession after evals/enrichments
    Dispatcher->>Registry: getRegisteredAlerts()
    Registry-->>Dispatcher: list of alerts
    loop each alert
        Dispatcher->>Handler: invoke alert.fn(context)
        alt success
            Handler-->>Dispatcher: resolve
        else failure
            Handler-->>Dispatcher: reject
            Dispatcher->>Dispatcher: console.error (log), continue
        end
    end
    Dispatcher-->>Dispatcher: await Promise.allSettled
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐇 I hopped the queue from low to high, then sped,

probing cached carrots where the seedlings led.
Alerts I planted, one by one they sprung,
none could stop the others — each song was sung.
Hop, coalesce, and finish — the meadow's work is done.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 24.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly summarizes the main changes: v0.6.0 release introducing Alert API, eval queue, queue status UI, and progressive display features.
Description check ✅ Passed The PR description covers the key features (Alert API, Eval Queue, Queue Status UI, Progressive Display), includes documentation updates, provides a test plan checklist, and references the PR context clearly.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch v0.6.0

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/actions/run-enrichments.ts (1)

17-29: ⚠️ Potential issue | 🟡 Minor

Misleading error wording when a named enricher is missing.

runSessionAction reports Eval "<name>" not found for invalid evalName; with enricherName, that message becomes confusing. Consider making the label derive from opts.kind or adding an override so it reads "Enricher/Enrichment" here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/run-enrichments.ts` around lines 17 - 29, runSessionAction's
generic "Eval \"<name>\" not found" error is confusing when called from
runEnrichments with an enricherName; change the call in runEnrichments to pass
an explicit label (e.g. "Enricher" or "Enrichment") or a new override field so
runSessionAction uses that label instead of "Eval" when constructing the
not-found message. Specifically, update the runEnrichments invocation of
runSessionAction to include a descriptive override (referencing runEnrichments,
the runSessionAction call, the evalName parameter and opts.kind) and modify
runSessionAction to prefer that override when formatting the "not found" error.
🧹 Nitpick comments (10)
lib/eval-queue.ts (2)

462-470: getDetailedStatus() exposes a mutable reference to state.items.

Line 466 returns the live state.items array. While this is safe for server-action callers (serialization creates a copy), any in-process caller that mutates the returned items would corrupt queue state. Consider returning a shallow copy for safety.

♻️ Proposed fix
   return {
     ...getStatus(),
-    items: state.items,
+    items: [...state.items],
     processingKeys: Array.from(state.processing),
     recentErrors: state.errors.slice(-10),
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 462 - 470, getDetailedStatus currently
returns the live state.items array (from getQueueState()) which can be mutated
by in-process callers; change getDetailedStatus to return a shallow copy of
items (e.g., using slice() or spread) instead of state.items so callers can't
corrupt the internal queue state while keeping processingKeys and recentErrors
behavior unchanged.

474-488: Overlapping background cycles when processing takes longer than interval.

setInterval fires on a fixed cadence regardless of whether the previous async callback has completed. If scanAndEnqueue() + processBatch() takes longer than intervalSec, multiple overlapping cycles will run concurrently, causing redundant cache probing and extra I/O pressure on large workspaces.

A simple guard prevents this:

♻️ Proposed fix: add a running guard to prevent overlap
 export function startBackgroundProcessor(intervalSec: number): void {
   const state = getQueueState();
   if (state.intervalId) return; // Already running

   console.log(`[eval-queue] Starting background processor (interval: ${intervalSec}s)`);

+  let running = false;
   state.intervalId = setInterval(async () => {
+    if (running) return;
+    running = true;
     try {
       await scanAndEnqueue();
       await processBatch();
     } catch (err) {
       console.error("[eval-queue] Background processing error:", err);
+    } finally {
+      running = false;
     }
   }, intervalSec * 1000);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 474 - 488, The background processor callback
can overlap when processing takes longer than the interval; modify
startBackgroundProcessor's interval callback to use a running guard on the queue
state (e.g., add a boolean like state.isProcessing or state.running via
getQueueState()) so the callback early-returns if already running, set the flag
before calling scanAndEnqueue() and processBatch(), and clear it in a finally
block (ensuring the flag is cleared even on errors) to prevent concurrent
executions while keeping state.intervalId behavior unchanged.
app/actions/check-eval-cache.ts (1)

41-48: Non-deterministic ordering of cachedResults from Promise.all.

Promise.all resolves in completion order, so cachedResults may not preserve the registration order of evals. If the UI renders results in the array's order, the display order could vary between loads depending on which cache lookups resolve first.

If stable ordering matters for the panels, sort after collection or switch to sequential iteration:

♻️ Option: sequential loop for predictable ordering
-    await Promise.all(items.map(async (item) => {
-      const itemCodeHash = hashItemCode(item.fn);
-      const cached = await getPerItemCache<EvalRunResult>(
-        "evals", projectName, sessionKey, item.name, itemCodeHash, contentHash,
-      );
-      if (cached) cachedResults.push(cached.value);
-      else uncachedNames.push(item.name);
-    }));
+    for (const item of items) {
+      const itemCodeHash = hashItemCode(item.fn);
+      const cached = await getPerItemCache<EvalRunResult>(
+        "evals", projectName, sessionKey, item.name, itemCodeHash, contentHash,
+      );
+      if (cached) cachedResults.push(cached.value);
+      else uncachedNames.push(item.name);
+    }

Note: The trade-off is sequential I/O vs. parallel cache lookups. If you prefer parallel execution, you can sort cachedResults by registration order after collection instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/check-eval-cache.ts` around lines 41 - 48, The current
Promise.all over items causes cachedResults to be filled in completion order,
making result order non-deterministic; update the logic in check-eval-cache (the
block using items.map, hashItemCode, getPerItemCache, cachedResults,
uncachedNames) so ordering is preserved—either iterate items sequentially (for
.. of) and push results in registration order, or keep the parallel cache
lookups but collect {index, value} pairs and after Promise.all sort by index
before pushing into cachedResults; ensure uncachedNames are recorded in the same
deterministic order as items.
app/components/queue-status-panel.tsx (2)

15-23: timeAgo doesn't handle values beyond hours.

Items older than 24 hours will display as e.g., "48h ago" or "168h ago". Consider adding a day-level bucket for readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/queue-status-panel.tsx` around lines 15 - 23, The timeAgo
function currently returns only seconds, minutes or hours and will display large
hour counts for values past 24h; update the timeAgo(ms: number) implementation
to add a day-level bucket (e.g., "Xd ago" for >=24 hours and <7 days, optionally
"Xw ago" for >=7 days) so items older than 24 hours are shown as days (or weeks)
instead of large hour counts; locate the timeAgo function in
app/components/queue-status-panel.tsx and replace the final hours return logic
with branching that computes days (and optionally weeks) and returns a
human-friendly string like "2d ago" or "1w ago".

55-80: Polling can overlap if a request takes longer than the interval.

If getQueueStatusAction() takes longer than 5 seconds (e.g., under heavy load), setInterval will fire a new poll before the previous one completes. While this is non-critical for a read-only status endpoint, you could avoid unnecessary in-flight requests by switching to a setTimeout-based chain:

♻️ Suggested change
  useEffect(() => {
    mountedRef.current = true;

    const poll = async () => {
      try {
        const data = await getQueueStatusAction();
        if (mountedRef.current) setStatus(data);
      } catch {
        // Silently ignore polling errors
      }
+     if (mountedRef.current) {
+       timeoutId = setTimeout(poll, POLL_INTERVAL_MS);
+     }
    };

    poll();
-   const id = setInterval(poll, POLL_INTERVAL_MS);
+   let timeoutId: ReturnType<typeof setTimeout>;

    return () => {
      mountedRef.current = false;
-     clearInterval(id);
+     clearTimeout(timeoutId);
    };
  }, []);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/queue-status-panel.tsx` around lines 55 - 80, The polling in
QueueStatusPanel uses setInterval which can overlap if getQueueStatusAction()
takes longer than POLL_INTERVAL_MS; refactor the useEffect to replace
setInterval with a recursive setTimeout chain: create the async poll function
(currently named poll) to await getQueueStatusAction(), setStatus when
mountedRef.current is true, then schedule the next poll via setTimeout(poll,
POLL_INTERVAL_MS); start it once and on cleanup clear the timeout and set
mountedRef.current = false so no new polls run after unmounting; ensure you
still catch errors as before and reference the existing mountedRef, poll,
getQueueStatusAction, and POLL_INTERVAL_MS symbols.
app/actions/run-subagent-enrichments.ts (2)

29-37: Consider using an options object instead of growing positional parameters.

The function now has 7 positional parameters, several of which are optional. This makes call sites fragile (e.g., callers must pass undefined for subagentType and false for forceRefresh to reach enricherName). An options object would improve readability and make future extensions safer.

♻️ Sketch
interface RunSubagentEnrichmentsOptions {
  projectName: string;
  sessionId: string;
  agentId: string;
  subagentType?: string;
  subagentDescription?: string;
  forceRefresh?: boolean;
  enricherName?: string;
}

export async function runSubagentEnrichments(
  options: RunSubagentEnrichmentsOptions,
): Promise<SubagentEnrichActionResult> { ... }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/run-subagent-enrichments.ts` around lines 29 - 37, Refactor
runSubagentEnrichments to accept a single options object instead of seven
positional parameters: create a RunSubagentEnrichmentsOptions interface
containing projectName, sessionId, agentId (required) and optional subagentType,
subagentDescription, forceRefresh, enricherName, then change the function
signature to runSubagentEnrichments(options: RunSubagentEnrichmentsOptions):
Promise<SubagentEnrichActionResult>, update all internal references to use
options.<field>, and update all callers to pass an object (preserve default for
forceRefresh inside the function if undefined); keep the original function name
and return type to minimize ripple.

62-81: Targeted enricher runs always bypass cache, even if already cached.

When enricherName is provided, the condition on Line 62 (!enricherName) forces all filtered enrichers into uncachedItems, skipping the per-item cache lookup entirely. If this is intentional (because the UI only calls this path for known-uncached enrichers), a brief comment would clarify the design choice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/run-subagent-enrichments.ts` around lines 62 - 81, The current
conditional uses "!enricherName" which causes per-item cache checks to be
skipped when an enricherName filter is present; change the logic so cache
lookups still run for the selected enrichers unless forceRefresh is true.
Specifically, remove the "!enricherName" requirement from the if condition and
ensure you iterate only over the filtered subagentEnrichers (respecting
enricherName) when calling getPerItemCache in the block that populates
cachedResults and uncachedItems; keep the else branch for the full push when
forceRefresh is true. Use the existing symbols subagentEnrichers, enricherName,
forceRefresh, contentHash, getPerItemCache, cachedResults, and uncachedItems to
locate and update the code.
__tests__/actions/check-enrichment-cache.test.ts (1)

99-122: Nit: stubEnricher("sub-enricher") still has scope: "session" despite testing the subagent path.

The mock controls routing via mockGetSubagentEnrichers, so this doesn't affect correctness, but using a scope value like "subagent" would make the test fixture more self-documenting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@__tests__/actions/check-enrichment-cache.test.ts` around lines 99 - 122, The
test uses stubEnricher("sub-enricher") but the returned object still has scope:
"session"; update the fixture so the enricher reflects the subagent path — e.g.
call stubEnricher with an explicit scope override or construct the stub enricher
object with scope: "subagent" (update the test invocation of stubEnricher in the
"uses subagent scope when agentId provided" case so mockGetSubagentEnrichers
returns an enricher whose scope is "subagent" instead of "session"); keep
assertions against mockGetSubagentEnrichers, mockHashSubagent,
mockGetPerItemCache, and result unchanged.
app/actions/queue-session.ts (1)

14-18: cached field is always false.

queueAndProcessSession always returns cached: false (Lines 59 and 64). Since the processSession function internally handles caching transparently, the caller has no way to know whether results were served from cache. If this field isn't meaningful in the foreground path, consider removing it from QueueSessionResult to avoid confusion — or populate it based on actual cache hit status.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/queue-session.ts` around lines 14 - 18, QueueSessionResult's
cached is always false because queueAndProcessSession never exposes
processSession's cache status; either remove cached from QueueSessionResult to
avoid dead data or update the flow so cache hits are propagated: change
processSession (or the cache-check helper it uses) to return a cacheHit/cached
boolean (or expose it via its result object) and then set cached accordingly
inside queueAndProcessSession before returning; update the QueueSessionResult
type and any callers of queueAndProcessSession/processSession (and their tests)
to reflect the new boolean or removed field.
app/components/eval-results-panel.tsx (1)

164-240: Significant structural duplication with enrichment-results-panel.tsx.

The two-phase loading effect, runAll handler, state management (items, probing, allCached, globalRunning, error, mountedRef), and derived header values are nearly identical between EvalResultsPanel and EnrichmentResultsPanel. Consider extracting a shared hook (e.g., useCacheProbeAndProcess) that encapsulates:

  • Two-phase loading with cache probe → queue/process → re-probe
  • runAll and per-item re-run logic
  • Derived loading/error/cached states

This would reduce ~150 lines of duplicated logic and ensure bug fixes (like the unhandled rejection and stuck-spinner issues above) are applied consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 164 - 240, The
EvalResultsPanel and EnrichmentResultsPanel contain nearly identical two-phase
loading and run logic (states: items, probing, noEvals, allCached,
globalRunning, error, mountedRef; functions: checkEvalCacheAndList,
queueAndProcessSession, runSubagentEvals and per-item re-run behavior), so
extract this into a shared hook (e.g., useCacheProbeAndProcess) that
encapsulates the initial cache probe, building the initial items array, queue vs
per-item processing, re-probing to fill results, runAll and per-item re-run
handlers, and exposes derived flags (probing, allCached, globalRunning, error,
items, setItems) to both panels; replace the duplicated useEffect and related
logic in EvalResultsPanel and EnrichmentResultsPanel with calls to the new hook,
preserving calls to checkEvalCacheAndList, queueAndProcessSession, and
runSubagentEvals and ensuring mountedRef behavior and rejection/error handling
are kept unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/actions/queue-session.ts`:
- Around line 36-57: The foreground path can race with the background processor
because push(...) only adds to state.items, so dequeueNext() may steal and
process the same session before the foreground calls processSession(...); fix by
moving the item into state.processing immediately after push(...) (and remove it
from state.items if needed) so the background dequeueNext() will skip
it—specifically modify the sequence around push(projectName, sessionId, HIGH,
contentHash) to also insert the constructed QueueItem into state.processing
using the same key before awaiting processSession(item, processOptions); ensure
dequeueNext() logic still respects state.processing and markDone() handles
cleanup.

In `@app/components/enrichment-results-panel.tsx`:
- Around line 184-213: The runAll handler leaves items stuck in "loading" when
the agentId branch returns a non-ok result or no enrichers; update the agentId
branch in runAll to handle both !result.ok and !result.hasEnrichers by setting
items back to a terminal state (e.g., setItems(prev => prev.map(i => ({ status:
"error", name: i.name, error: result.error || "Enrichment failed" })))) and
ensure mountedRef.current is checked before mutating state; also
setAllCached(false) or appropriate flag when enrichers are absent so UI state
stays consistent.
- Around line 136-178: The promise chain started by queueAndProcessSession(...)
and the forEach(async ...) calling runSubagentEnrichments(...) can produce
unhandled rejections; add error handlers: attach a .catch(...) to the
queueAndProcessSession(...).then(...) chain to set any remaining loading items
to error and log the error, and replace the forEach(async ...) with a mapped
array of promises + Promise.all(...) (or add per-promise .catch) so each
rejection is caught and updates items via setItems (using mountedRef.current
guard) to set the corresponding enricher item to status "error"; also ensure the
re-probe call to checkEnrichmentCacheAndList(...) is wrapped in try/catch or its
errors are caught and handled similarly.

In `@app/components/eval-results-panel.tsx`:
- Around line 273-303: The empty catch in rerunSingleEval leaves items stuck in
"loading"; update the catch to handle errors by setting the specific item's
status to "error" (use setItems to map the item with name === evalName to {
status: "error", name: evalName, error: err?.message || String(err) ||
"Unexpected error" }) and optionally log the error (e.g., console.error or a
logger), and preserve the mountedRef check before calling setItems to avoid
state updates on unmounted components; make sure this update is applied inside
the rerunSingleEval useCallback.
- Around line 243-271: The runAll callback leaves items in "loading" when agent
and non-agent branches return no results; update runAll to handle error/no-eval
paths by calling setItems to transition those loading items to a terminal state:
in the agentId branch, if result.ok is false setItems to status "error" with
result.error (or a fallback message) and if result.ok but !result.hasEvals
setItems to status "error" (or "no-evals") with a clear message; in the
non-agent branch, after queueResult succeeds but freshProbe is falsy or
!freshProbe.hasEvals likewise setItems to mark items as "error"/"no-evals" with
an explanatory message; keep mountedRef checks and still clear setGlobalRunning
in finally.
- Around line 195-236: The .then() chain off queueAndProcessSession (and the
async re-probe call checkEvalCacheAndList inside it) can throw unhandled
rejections, and the per-eval async callbacks over probe.uncachedNames can also
produce unhandled rejections; add explicit error handling so loading items are
moved to error state. Specifically, attach a .catch to the Promise returned by
queueAndProcessSession(...).then(...) that sets any items with status "loading"
to {status:"error", error: err.message || "Phase 2 failed"}; also wrap the
re-probe call (checkEvalCacheAndList) in try/catch or propagate errors to that
.catch. For the subagent branch, replace probe.uncachedNames.forEach(async ...)
with probe.uncachedNames.map(...) and await Promise.allSettled(...) (or attach
.catch to each spawned promise) and on any failure call setItems to set the
corresponding evalName(s) from "loading" to error; reference
queueAndProcessSession, checkEvalCacheAndList, runSubagentEvals, setItems, and
probe.uncachedNames when locating the changes.

In `@app/components/queue-status-panel.tsx`:
- Around line 153-175: status.processingKeys is a string[] so the current
mapping in processingKeys.map(...) always renders <PriorityBadge priority={0} />
(HIGH); change the rendering to avoid hardcoding priority: either (A) remove the
priority cell for processing rows by rendering an empty/placeholder <td> instead
of <PriorityBadge>, or (B) render PriorityBadge only when priority metadata
exists (e.g., check for a parsed priority field on the item) and otherwise
render a neutral/default state; update the JSX inside processingKeys.map to
conditionally render based on presence of priority and keep symbols
PriorityBadge and status.processingKeys to locate the change.

---

Outside diff comments:
In `@app/actions/run-enrichments.ts`:
- Around line 17-29: runSessionAction's generic "Eval \"<name>\" not found"
error is confusing when called from runEnrichments with an enricherName; change
the call in runEnrichments to pass an explicit label (e.g. "Enricher" or
"Enrichment") or a new override field so runSessionAction uses that label
instead of "Eval" when constructing the not-found message. Specifically, update
the runEnrichments invocation of runSessionAction to include a descriptive
override (referencing runEnrichments, the runSessionAction call, the evalName
parameter and opts.kind) and modify runSessionAction to prefer that override
when formatting the "not found" error.

---

Nitpick comments:
In `@__tests__/actions/check-enrichment-cache.test.ts`:
- Around line 99-122: The test uses stubEnricher("sub-enricher") but the
returned object still has scope: "session"; update the fixture so the enricher
reflects the subagent path — e.g. call stubEnricher with an explicit scope
override or construct the stub enricher object with scope: "subagent" (update
the test invocation of stubEnricher in the "uses subagent scope when agentId
provided" case so mockGetSubagentEnrichers returns an enricher whose scope is
"subagent" instead of "session"); keep assertions against
mockGetSubagentEnrichers, mockHashSubagent, mockGetPerItemCache, and result
unchanged.

In `@app/actions/check-eval-cache.ts`:
- Around line 41-48: The current Promise.all over items causes cachedResults to
be filled in completion order, making result order non-deterministic; update the
logic in check-eval-cache (the block using items.map, hashItemCode,
getPerItemCache, cachedResults, uncachedNames) so ordering is preserved—either
iterate items sequentially (for .. of) and push results in registration order,
or keep the parallel cache lookups but collect {index, value} pairs and after
Promise.all sort by index before pushing into cachedResults; ensure
uncachedNames are recorded in the same deterministic order as items.

In `@app/actions/queue-session.ts`:
- Around line 14-18: QueueSessionResult's cached is always false because
queueAndProcessSession never exposes processSession's cache status; either
remove cached from QueueSessionResult to avoid dead data or update the flow so
cache hits are propagated: change processSession (or the cache-check helper it
uses) to return a cacheHit/cached boolean (or expose it via its result object)
and then set cached accordingly inside queueAndProcessSession before returning;
update the QueueSessionResult type and any callers of
queueAndProcessSession/processSession (and their tests) to reflect the new
boolean or removed field.

In `@app/actions/run-subagent-enrichments.ts`:
- Around line 29-37: Refactor runSubagentEnrichments to accept a single options
object instead of seven positional parameters: create a
RunSubagentEnrichmentsOptions interface containing projectName, sessionId,
agentId (required) and optional subagentType, subagentDescription, forceRefresh,
enricherName, then change the function signature to
runSubagentEnrichments(options: RunSubagentEnrichmentsOptions):
Promise<SubagentEnrichActionResult>, update all internal references to use
options.<field>, and update all callers to pass an object (preserve default for
forceRefresh inside the function if undefined); keep the original function name
and return type to minimize ripple.
- Around line 62-81: The current conditional uses "!enricherName" which causes
per-item cache checks to be skipped when an enricherName filter is present;
change the logic so cache lookups still run for the selected enrichers unless
forceRefresh is true. Specifically, remove the "!enricherName" requirement from
the if condition and ensure you iterate only over the filtered subagentEnrichers
(respecting enricherName) when calling getPerItemCache in the block that
populates cachedResults and uncachedItems; keep the else branch for the full
push when forceRefresh is true. Use the existing symbols subagentEnrichers,
enricherName, forceRefresh, contentHash, getPerItemCache, cachedResults, and
uncachedItems to locate and update the code.

In `@app/components/eval-results-panel.tsx`:
- Around line 164-240: The EvalResultsPanel and EnrichmentResultsPanel contain
nearly identical two-phase loading and run logic (states: items, probing,
noEvals, allCached, globalRunning, error, mountedRef; functions:
checkEvalCacheAndList, queueAndProcessSession, runSubagentEvals and per-item
re-run behavior), so extract this into a shared hook (e.g.,
useCacheProbeAndProcess) that encapsulates the initial cache probe, building the
initial items array, queue vs per-item processing, re-probing to fill results,
runAll and per-item re-run handlers, and exposes derived flags (probing,
allCached, globalRunning, error, items, setItems) to both panels; replace the
duplicated useEffect and related logic in EvalResultsPanel and
EnrichmentResultsPanel with calls to the new hook, preserving calls to
checkEvalCacheAndList, queueAndProcessSession, and runSubagentEvals and ensuring
mountedRef behavior and rejection/error handling are kept unchanged.

In `@app/components/queue-status-panel.tsx`:
- Around line 15-23: The timeAgo function currently returns only seconds,
minutes or hours and will display large hour counts for values past 24h; update
the timeAgo(ms: number) implementation to add a day-level bucket (e.g., "Xd ago"
for >=24 hours and <7 days, optionally "Xw ago" for >=7 days) so items older
than 24 hours are shown as days (or weeks) instead of large hour counts; locate
the timeAgo function in app/components/queue-status-panel.tsx and replace the
final hours return logic with branching that computes days (and optionally
weeks) and returns a human-friendly string like "2d ago" or "1w ago".
- Around line 55-80: The polling in QueueStatusPanel uses setInterval which can
overlap if getQueueStatusAction() takes longer than POLL_INTERVAL_MS; refactor
the useEffect to replace setInterval with a recursive setTimeout chain: create
the async poll function (currently named poll) to await getQueueStatusAction(),
setStatus when mountedRef.current is true, then schedule the next poll via
setTimeout(poll, POLL_INTERVAL_MS); start it once and on cleanup clear the
timeout and set mountedRef.current = false so no new polls run after unmounting;
ensure you still catch errors as before and reference the existing mountedRef,
poll, getQueueStatusAction, and POLL_INTERVAL_MS symbols.

In `@lib/eval-queue.ts`:
- Around line 462-470: getDetailedStatus currently returns the live state.items
array (from getQueueState()) which can be mutated by in-process callers; change
getDetailedStatus to return a shallow copy of items (e.g., using slice() or
spread) instead of state.items so callers can't corrupt the internal queue state
while keeping processingKeys and recentErrors behavior unchanged.
- Around line 474-488: The background processor callback can overlap when
processing takes longer than the interval; modify startBackgroundProcessor's
interval callback to use a running guard on the queue state (e.g., add a boolean
like state.isProcessing or state.running via getQueueState()) so the callback
early-returns if already running, set the flag before calling scanAndEnqueue()
and processBatch(), and clear it in a finally block (ensuring the flag is
cleared even on errors) to prevent concurrent executions while keeping
state.intervalId behavior unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3daf99 and 05ab20c.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (30)
  • CHANGELOG.md
  • CONTRIBUTING.md
  • README.md
  • __tests__/actions/check-enrichment-cache.test.ts
  • __tests__/actions/check-eval-cache.test.ts
  • app/actions/check-enrichment-cache.ts
  • app/actions/check-eval-cache.ts
  • app/actions/get-queue-status.ts
  • app/actions/queue-session.ts
  • app/actions/run-enrichments.ts
  • app/actions/run-session-dashboard.ts
  • app/actions/run-subagent-enrichments.ts
  • app/components/enrichment-results-panel.tsx
  • app/components/eval-results-panel.tsx
  • app/components/queue-status-panel.tsx
  • app/components/raw-log-viewer.tsx
  • app/dashboard/page.tsx
  • contexts/ThemeContext.tsx
  • docs/api-reference.md
  • examples/alerts.js
  • instrumentation.ts
  • lib/eval-queue.ts
  • lib/evals/alert-dispatcher.ts
  • lib/evals/alert-registry.ts
  • lib/evals/alert-types.ts
  • lib/evals/app.ts
  • lib/evals/index.ts
  • package.json
  • scripts/launch.ts
  • scripts/parse-script-args.ts
💤 Files with no reviewable changes (1)
  • app/actions/run-session-dashboard.ts

@NiveditJain
Copy link
Member Author

Code review

Found 3 issues:

  1. Queue item leak: queueAndProcessSession bypasses dequeueNext, leaving items permanently in the queue

push() adds the session to state.items[], then a new QueueItem is constructed and processSession() is called directly. dequeueNext() is the only function that removes items from state.items (via splice) and adds them to state.processing. Because it is bypassed: (a) the item stays in state.items forever, (b) markDone() does a no-op processing.delete since the key was never added, (c) when the background processor runs, it finds the leftover item via dequeueNext() and processes it again — firing alerts a second time. Additionally, both EvalResultsPanel and EnrichmentResultsPanel independently call queueAndProcessSession for the same session on mount, so two concurrent processSession calls run and fireAlerts() fires twice per page load.

// Push at HIGH priority (for dedup if background has it queued)
push(projectName, sessionId, HIGH, contentHash);
// Process immediately — don't go through dequeue since we know which session
const item: QueueItem = {
key,
projectName,
sessionId,
priority: HIGH,
addedAt: Date.now(),
contentHash,
};
const processOptions: ProcessSessionOptions | undefined = options
? {
forceRefresh: options.forceRefresh,
forceEvalNames: options.forceEvalNames,
forceEnricherNames: options.forceEnricherNames,
}
: undefined;
await processSession(item, processOptions);

  1. rerunSingleEval has an empty catch block that leaves items stuck in "loading" state permanently

The item is set to { status: "loading" } at the start, then the rest is wrapped in try { ... } catch { /* handled by row status */ }. The catch block does nothing — if queueAndProcessSession or checkEvalCacheAndList throws (network error, file not found, timeout), the item remains in "loading" state with an infinite spinner and no way to retry. The comment "handled by row status" is incorrect — EvalLoadingRow has no timeout or fallback. The isLoading flag stays true, keeping the Re-run button disabled.

const rerunSingleEval = useCallback(async (evalName: string) => {
setItems(prev => prev.map(i => i.name === evalName ? { status: "loading", name: evalName } : i));
try {
if (agentId) {
const result = await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, false, evalName);
if (!mountedRef.current) return;
if (result.ok && result.hasEvals) {
const fresh = result.summary.results[0];
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
}
}
} else {
const queueResult = await queueAndProcessSession(projectName, sessionId, { forceEvalNames: [evalName] });
if (!mountedRef.current) return;
if (!queueResult.ok) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: queueResult.error || "Queue processing failed" } : i));
return;
}
const freshProbe = await checkEvalCacheAndList(projectName, sessionId, agentId, subagentType);
if (!mountedRef.current) return;
if (freshProbe.ok && freshProbe.hasEvals) {
const fresh = freshProbe.cachedResults.find(r => r.name === evalName);
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
}
}
}
} catch { /* handled by row status */ }
}, [projectName, sessionId, agentId, subagentType, subagentDescription]);

  1. Both panels independently call queueAndProcessSession for the same session, causing duplicate alert firing

EvalResultsPanel and EnrichmentResultsPanel are rendered together for the same projectName/sessionId. Both have independent useEffect mount hooks that call queueAndProcessSession(projectName, sessionId) when uncached items exist. Since each call constructs its own QueueItem and calls processSession() directly (bypassing dequeue), two concurrent processSession calls run for the same session, each calling fireAlerts() at the end. Every registered alert fires twice on the first uncached page load.

if (probe.uncachedNames.length > 0 && !agentId) {
queueAndProcessSession(projectName, sessionId).then(async (queueResult) => {
if (!mountedRef.current) return;

if (probe.uncachedNames.length > 0 && !agentId) {
queueAndProcessSession(projectName, sessionId).then(async (queueResult) => {
if (!mountedRef.current) return;

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@NiveditJain
Copy link
Member Author

@copilot review

Copy link
Contributor

Copilot AI commented Feb 24, 2026

@NiveditJain I've opened a new pull request, #31, to work on those changes. Once the pull request is ready, I'll request review from you.

- Fix promise coalescing silently dropping force options for concurrent callers (C1)
- Fix processSession swallowing errors causing infinite loading spinners (I3)
- Fix processing Set leak on error paths via finally block (I4)
- Replace forEach+async with Promise.all in eval/enrichment panels (I5)
- Add Priority enum, priorityLabel helper, visibility-based polling pause
- Fix PriorityBadge to use server-provided label instead of hardcoded check
- Add 26 tests for eval-queue core (push/dequeue/dedup/priority/status)
- Add 11 tests for alert registry and dispatcher (register/replace/fire/isolation)
- Replace as-any test stubs with properly typed RegisteredEval/RegisteredEnricher

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (4)
app/components/enrichment-results-panel.tsx (2)

137-155: Phase 2 .then() chain lacks a .catch() — same issue as in eval-results-panel.tsx.

If the inner await checkEnrichmentCacheAndList(...) on Line 147 throws unexpectedly, the promise chain rejects without being caught, leaving uncached items as permanent loading spinners.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 137 - 155, The
.then() branch started after queueAndProcessSession(projectName, sessionId) can
reject if the inner await checkEnrichmentCacheAndList(...) throws, leaving items
stuck in "loading"; wrap the async body in a try/catch (or attach a .catch) so
any thrown error from checkEnrichmentCacheAndList is caught and you call
setItems to mark loading items as error (similar to the queueResult error
handling), and ensure you still check mountedRef.current before updating; update
the block that references probe.uncachedNames, queueAndProcessSession,
checkEnrichmentCacheAndList, mountedRef, and setItems to include this error
handling.

184-213: runAll has paths where items remain stuck in "loading" — same pattern as eval panel.

In the agentId branch (Line 192): if !result.ok or !result.hasEnrichers, items stay in "loading". In the non-agent branch: if freshProbe doesn't return enrichers, same issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 184 - 213, runAll
can leave items stuck in "loading" when the agent branch returns !result.ok or
!result.hasEnrichers (and similarly when freshProbe lacks enrichers); update the
failure branches in runAll to always update items state instead of leaving them
loading: when result.ok is false setItems to map each item to status: "error"
with an appropriate error message (use result.error || "Enrichment failed"), and
when result.hasEnrichers is false setItems to map each item to status: "error"
with message like "No enrichers found" (do the same for freshProbe: handle
!freshProbe.ok and !freshProbe.hasEnrichers by setting items to error and ensure
setAllCached is updated appropriately). Ensure these changes are made inside the
runAll function where result and freshProbe are handled.
app/components/eval-results-panel.tsx (2)

195-213: Phase 2 .then() chain still lacks a .catch() — items can get stuck in "loading".

If the inner await checkEvalCacheAndList(...) on Line 205 throws an unexpected error, the promise chain rejects without being caught, leaving uncached items as permanent loading spinners.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 195 - 213, The promise
handler passed to queueAndProcessSession(projectName, sessionId).then(...) can
throw from the inner await checkEvalCacheAndList(...) and currently has no error
handler, so wrap the async callback body in a try/catch (or append a .catch on
the returned promise) to catch any exceptions from checkEvalCacheAndList; in the
catch branch ensure you early-return if !mountedRef.current and call setItems to
turn any items with status "loading" into errors (use the same shape as the
existing error update), and include the caught error message; reference the
existing symbols queueAndProcessSession, checkEvalCacheAndList, mountedRef, and
setItems when applying the change.

243-271: runAll still has paths where items remain stuck in "loading".

In the agentId branch: if !result.ok or !result.hasEvals, items set to loading on Line 245 are never transitioned to done/error. In the non-agent branch: if freshProbe doesn't contain evals, loading items are likewise stranded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 243 - 271, The runAll
function leaves items stuck in "loading" when runSubagentEvals returns !ok or
hasEvals=false, or when checkEvalCacheAndList (freshProbe) does not have evals;
update runAll to handle those failure/empty branches by calling setItems to mark
items as error (use a descriptive message from result.error or freshProbe.error
or a default like "No evals found"), and ensure setAllCached is not incorrectly
changed; specifically modify the agentId branch after await
runSubagentEvals(...) to handle both !result.ok and !result.hasEvals by setting
items via setItems(prev => prev.map(i => ({ status: "error" as const, name:
i.name, error: result.error || "No evals found" }))) and similarly after
freshProbe when !(freshProbe.ok && freshProbe.hasEvals) set items to error using
freshProbe.error or a default; keep the mountedRef checks and existing finally
block that clears setGlobalRunning.
🧹 Nitpick comments (2)
app/components/enrichment-results-panel.tsx (1)

106-182: Consider extracting the shared two-phase loading pattern into a custom hook.

EvalResultsPanel and EnrichmentResultsPanel share near-identical state management, two-phase mount logic, runAll patterns, and derived header computations. Extracting common logic (e.g., useCacheProbeAndProcess) would reduce the ~150 lines of duplicated orchestration across both panels and make future fixes (like the missing .catch()) apply in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 106 - 182,
EnrichmentResultsPanel duplicates a two-phase mount/orchestration pattern also
used by EvalResultsPanel; extract that logic into a custom hook (e.g.,
useCacheProbeAndProcess) that encapsulates the state (items, probing,
noEnrichers, allCached, globalRunning, error, collapsed, mountedRef), the
initial cache probe (checkEnrichmentCacheAndList), the queue path
(queueAndProcessSession) and the subagent path (runSubagentEnrichments) plus the
per-item state updates, and return the state and control functions to the
component; while extracting, ensure the promise chains (the initial
checkEnrichmentCacheAndList().then(...), queueAndProcessSession(...).then(...),
and Promise.all(...)) all have .catch handlers to surface errors into setError
and to set items to error state where appropriate so unhandled rejections are
avoided.
lib/eval-queue.ts (1)

516-530: Background scan cycles can overlap if a cycle exceeds the interval.

setInterval fires the next callback regardless of whether the previous async callback has settled. Under heavy I/O (many projects/sessions), scanAndEnqueue + processBatch could exceed intervalSec, leading to concurrent scans hammering the filesystem.

Consider guarding with a simple isRunning flag to skip a cycle when the previous one is still in progress:

Proposed fix
   state.intervalId = setInterval(async () => {
+    if ((state as any)._bgRunning) return;
+    (state as any)._bgRunning = true;
     try {
       await scanAndEnqueue();
       await processBatch();
     } catch (err) {
       console.error("[eval-queue] Background processing error:", err);
+    } finally {
+      (state as any)._bgRunning = false;
     }
   }, intervalSec * 1000);

A cleaner approach would be to add a bgRunning: boolean field to EvalQueueState instead of casting through any.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 516 - 530, The background processor can start
overlapping cycles; modify startBackgroundProcessor to use a boolean flag on
EvalQueueState (e.g., add bgRunning: boolean to EvalQueueState) and check it at
the top of the interval callback—if state.bgRunning is true, return/skip this
cycle; otherwise set state.bgRunning = true before calling scanAndEnqueue() and
processBatch() and reset state.bgRunning = false in a finally block so it always
clears even on errors; reference the existing functions
startBackgroundProcessor, scanAndEnqueue, processBatch and the EvalQueueState
object—do not use any any casts, initialize bgRunning when creating the queue
state, and ensure the interval callback is async-safe by guarding with this
flag.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@__tests__/lib/eval-queue.test.ts`:
- Around line 188-195: The test title is misleading because it claims to verify
the MAX_ERRORS cap but directly pushes 60 entries into state.errors (bypassing
the cap logic in recordError). Fix by either renaming the test to reflect what
it actually does (e.g., "errors array is accessible through state" or
"state.errors can hold entries when mutated directly") or change the test to
exercise the cap indirectly by invoking the public flow that calls recordError
(e.g., trigger a failing processSession or other exported API that records
errors) and then assert that state.errors.length <= MAX_ERRORS (reference
MAX_ERRORS constant and the recordError behavior) — update the assertion and
test title accordingly and use getQueueState to read the state after the public
call.

In `@app/components/eval-results-panel.tsx`:
- Around line 277-300: The rerunSingleEval flow can leave an item stuck in
"loading" when non-exception failure paths occur; update the branches that call
runSubagentEvals and checkEvalCacheAndList so they always resolve the item's
state: after awaiting runSubagentEvals (when agentId is set) check result.ok and
if false call setItems to set that eval's status to "error" with a helpful
message, and if result.ok && result.hasEvals but result.summary.results[0] is
undefined also set status to "error"; similarly, after checkEvalCacheAndList
(non-agent branch) if freshProbe.ok is false or freshProbe.hasEvals is false or
the find for evalName returns undefined call setItems to mark the eval as
"error" (include freshProbe.error or a clear message), preserving the
mountedRef.current checks and keeping the same item identity updates used by
setItems.

In `@app/components/queue-status-panel.tsx`:
- Around line 122-131: The toggle buttons lack aria-expanded; update the main
header button (the one using setCollapsed and reading collapsed) and the
recent-errors toggle (the button around lines 228–237, likely using a
recentErrorsCollapsed state setter) to include aria-expanded set to the expanded
boolean (i.e., aria-expanded={!collapsed} for the main toggle and
aria-expanded={!recentErrorsCollapsed} for the recent-errors toggle) so screen
readers reflect open/closed state.
- Around line 107-116: The visibility gate uses status.processedCount (a
cumulative server-side counter) which prevents auto-hiding; update the condition
in the component that checks the status object so it no longer includes
status.processedCount — keep the null check for status and only consider
status.queued, status.processing, status.backgroundRunning, and
status.errorCount when deciding to return null, leaving the processedCount
display in the header unchanged.

---

Duplicate comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 137-155: The .then() branch started after
queueAndProcessSession(projectName, sessionId) can reject if the inner await
checkEnrichmentCacheAndList(...) throws, leaving items stuck in "loading"; wrap
the async body in a try/catch (or attach a .catch) so any thrown error from
checkEnrichmentCacheAndList is caught and you call setItems to mark loading
items as error (similar to the queueResult error handling), and ensure you still
check mountedRef.current before updating; update the block that references
probe.uncachedNames, queueAndProcessSession, checkEnrichmentCacheAndList,
mountedRef, and setItems to include this error handling.
- Around line 184-213: runAll can leave items stuck in "loading" when the agent
branch returns !result.ok or !result.hasEnrichers (and similarly when freshProbe
lacks enrichers); update the failure branches in runAll to always update items
state instead of leaving them loading: when result.ok is false setItems to map
each item to status: "error" with an appropriate error message (use result.error
|| "Enrichment failed"), and when result.hasEnrichers is false setItems to map
each item to status: "error" with message like "No enrichers found" (do the same
for freshProbe: handle !freshProbe.ok and !freshProbe.hasEnrichers by setting
items to error and ensure setAllCached is updated appropriately). Ensure these
changes are made inside the runAll function where result and freshProbe are
handled.

In `@app/components/eval-results-panel.tsx`:
- Around line 195-213: The promise handler passed to
queueAndProcessSession(projectName, sessionId).then(...) can throw from the
inner await checkEvalCacheAndList(...) and currently has no error handler, so
wrap the async callback body in a try/catch (or append a .catch on the returned
promise) to catch any exceptions from checkEvalCacheAndList; in the catch branch
ensure you early-return if !mountedRef.current and call setItems to turn any
items with status "loading" into errors (use the same shape as the existing
error update), and include the caught error message; reference the existing
symbols queueAndProcessSession, checkEvalCacheAndList, mountedRef, and setItems
when applying the change.
- Around line 243-271: The runAll function leaves items stuck in "loading" when
runSubagentEvals returns !ok or hasEvals=false, or when checkEvalCacheAndList
(freshProbe) does not have evals; update runAll to handle those failure/empty
branches by calling setItems to mark items as error (use a descriptive message
from result.error or freshProbe.error or a default like "No evals found"), and
ensure setAllCached is not incorrectly changed; specifically modify the agentId
branch after await runSubagentEvals(...) to handle both !result.ok and
!result.hasEvals by setting items via setItems(prev => prev.map(i => ({ status:
"error" as const, name: i.name, error: result.error || "No evals found" }))) and
similarly after freshProbe when !(freshProbe.ok && freshProbe.hasEvals) set
items to error using freshProbe.error or a default; keep the mountedRef checks
and existing finally block that clears setGlobalRunning.

---

Nitpick comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 106-182: EnrichmentResultsPanel duplicates a two-phase
mount/orchestration pattern also used by EvalResultsPanel; extract that logic
into a custom hook (e.g., useCacheProbeAndProcess) that encapsulates the state
(items, probing, noEnrichers, allCached, globalRunning, error, collapsed,
mountedRef), the initial cache probe (checkEnrichmentCacheAndList), the queue
path (queueAndProcessSession) and the subagent path (runSubagentEnrichments)
plus the per-item state updates, and return the state and control functions to
the component; while extracting, ensure the promise chains (the initial
checkEnrichmentCacheAndList().then(...), queueAndProcessSession(...).then(...),
and Promise.all(...)) all have .catch handlers to surface errors into setError
and to set items to error state where appropriate so unhandled rejections are
avoided.

In `@lib/eval-queue.ts`:
- Around line 516-530: The background processor can start overlapping cycles;
modify startBackgroundProcessor to use a boolean flag on EvalQueueState (e.g.,
add bgRunning: boolean to EvalQueueState) and check it at the top of the
interval callback—if state.bgRunning is true, return/skip this cycle; otherwise
set state.bgRunning = true before calling scanAndEnqueue() and processBatch()
and reset state.bgRunning = false in a finally block so it always clears even on
errors; reference the existing functions startBackgroundProcessor,
scanAndEnqueue, processBatch and the EvalQueueState object—do not use any any
casts, initialize bgRunning when creating the queue state, and ensure the
interval callback is async-safe by guarding with this flag.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05ab20c and 69a0445.

📒 Files selected for processing (10)
  • __tests__/actions/check-enrichment-cache.test.ts
  • __tests__/actions/check-eval-cache.test.ts
  • __tests__/lib/alert-system.test.ts
  • __tests__/lib/eval-queue.test.ts
  • app/actions/get-queue-status.ts
  • app/actions/queue-session.ts
  • app/components/enrichment-results-panel.tsx
  • app/components/eval-results-panel.tsx
  • app/components/queue-status-panel.tsx
  • lib/eval-queue.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/actions/get-queue-status.ts

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new post-processing “alerts” hook and routes all eval/enrichment execution through a shared priority queue, adding a dashboard UI surface for queue state and updating panels to progressively display cached vs. newly computed results.

Changes:

  • Adds app.alert(name, fn) with a global registry and a dispatcher that isolates alert failures via Promise.allSettled.
  • Introduces an eval/enrichment priority queue (foreground + optional background worker) plus a /dashboard Queue Status panel.
  • Refactors eval/enrichment panels to probe cache first, show item names immediately, and progressively fill in results (removing the monolithic runSessionDashboard action).

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
scripts/parse-script-args.ts Adds --queue-interval / --queue-concurrency CLI parsing.
scripts/launch.ts Passes queue settings into the Next.js process via env vars.
package.json Bumps package version to 0.6.0.
package-lock.json Updates lock metadata version fields (currently inconsistent with package.json).
lib/evals/index.ts Re-exports alert-related public types.
lib/evals/app.ts Adds chainable app.alert() registration API.
lib/evals/alert-types.ts Defines AlertContext / AlertFunction / RegisteredAlert.
lib/evals/alert-registry.ts Implements a globalThis-backed singleton alert registry.
lib/evals/alert-dispatcher.ts Fires alerts with per-alert error isolation and Promise.allSettled.
lib/eval-queue.ts Adds a global eval/enrichment/filter processing queue with background scanning/processing.
instrumentation.ts Starts background queue processing on server startup when configured.
examples/alerts.js Adds an example showing alert registration and usage.
docs/api-reference.md Documents alerts API + background queue processing and UI.
contexts/ThemeContext.tsx Wraps theme persistence in try/catch for SSR/test environments.
app/dashboard/page.tsx Adds Queue Status panel to the dashboard page(s).
app/components/raw-log-viewer.tsx Removes batched dashboard action usage; panels fetch independently.
app/components/queue-status-panel.tsx Implements client UI for live queue status (polling + collapsible display).
app/components/eval-results-panel.tsx Adds progressive eval display using cache probes + queue-based processing.
app/components/enrichment-results-panel.tsx Adds progressive enrichment display using cache probes + queue-based processing.
app/actions/run-subagent-enrichments.ts Adds optional single-enricher execution and adjusts caching behavior.
app/actions/run-enrichments.ts Adds optional single-enricher parameter (re-using shared runner helper).
app/actions/queue-session.ts Adds foreground queue processing server action with promise coalescing.
app/actions/get-queue-status.ts Adds server action to provide queue status payload to the UI.
app/actions/check-eval-cache.ts Adds cache-probe server action returning names + cached results + misses.
app/actions/check-enrichment-cache.ts Adds cache-probe server action returning names + cached results + misses.
app/actions/run-session-dashboard.ts Removes the monolithic “run everything in one round-trip” server action.
tests/lib/eval-queue.test.ts Adds unit tests for queue ordering/state helpers/coalescing.
tests/lib/alert-system.test.ts Adds unit tests for alert registry + dispatcher isolation semantics.
tests/actions/check-eval-cache.test.ts Adds tests for the eval cache-probe server action.
tests/actions/check-enrichment-cache.test.ts Adds tests for the enrichment cache-probe server action.
README.md Updates CLI/docs to describe alerts and background processing.
CONTRIBUTING.md Updates repo layout docs and environment variables list.
CHANGELOG.md Adds 0.6.0 release notes covering alerts/queue/UI/progressive panels.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 101 to 112
if (flag === "--queue-interval") {
const value = inlineValue ?? args[i + 1];
if (value === undefined || (inlineValue === null && value.startsWith("-"))) {
console.error("Error: --queue-interval requires a positive integer (seconds)");
process.exit(1);
}
const parsed = parseInt(value, 10);
if (isNaN(parsed) || parsed <= 0) {
console.error("Error: --queue-interval must be a positive integer");
process.exit(1);
}
queueInterval = parsed;
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--queue-interval is documented/validated as a positive integer, but parseInt() will accept values like "30s" or "1.5" (parsing them as 30/1) without error. Consider validating with a stricter check (e.g., /^�d+�$/) before parsing so non-integer inputs are rejected as the error message implies.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Added strict regex validation (/^\d+$/) before parseInt() to reject non-integer inputs like "30s" or "1.5". Applied the same fix to --queue-concurrency for consistency.

Comment on lines 522 to 529
state.intervalId = setInterval(async () => {
try {
await scanAndEnqueue();
await processBatch();
} catch (err) {
console.error("[eval-queue] Background processing error:", err);
}
}, intervalSec * 1000);
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startBackgroundProcessor uses setInterval with an async callback. If scanAndEnqueue()/processBatch() takes longer than intervalSec, multiple overlapping runs can occur, leading to concurrent queue mutation and duplicate processing. Add a re-entrancy guard (e.g., a running flag) or switch to a setTimeout loop that schedules the next tick only after the previous run completes.

Suggested change
state.intervalId = setInterval(async () => {
try {
await scanAndEnqueue();
await processBatch();
} catch (err) {
console.error("[eval-queue] Background processing error:", err);
}
}, intervalSec * 1000);
const runTick = async () => {
try {
await scanAndEnqueue();
await processBatch();
} catch (err) {
console.error("[eval-queue] Background processing error:", err);
} finally {
// Only schedule the next run if we haven't been stopped.
if (state.intervalId !== null) {
state.intervalId = setTimeout(runTick, intervalSec * 1000);
}
}
};
// Schedule the first run.
state.intervalId = setTimeout(runTick, intervalSec * 1000);

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed. The current implementation uses recursive setTimeout via scheduleNext() (lines 522-533), not setInterval. Each tick only schedules the next after the current one completes, preventing overlap.

Comment on lines 120 to 124
checkEnrichmentCacheAndList(projectName, sessionId, agentId, subagentType).then(probe => {
if (!mountedRef.current) return;
if (!probe.ok) { setError(probe.error); setProbing(false); return; }
if (!probe.hasEnrichers) { setNoEnrichers(true); setProbing(false); return; }

Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial checkEnrichmentCacheAndList(...) call is not wrapped with a .catch. If the server action rejects, probing stays true and the rejection can go unhandled. Add a catch that sets an error state and clears probing (and similarly guard the inner queueAndProcessSession/freshProbe awaits).

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed. The outer .catch() at line 184-188 handles rejections from the initial checkEnrichmentCacheAndList() call, setting error state and clearing the probing flag.

Comment on lines 188 to 194
it("caps errors at MAX_ERRORS (50)", () => {
const state = getQueueState();
for (let i = 0; i < 60; i++) {
state.errors.push({ key: `proj/sess-${i}`, error: `error-${i}`, at: Date.now() });
}
// The cap is enforced in recordError, but we can verify the field is present
expect(state.errors.length).toBe(60);
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test claims to verify the MAX_ERRORS (50) cap, but it pushes 60 entries directly onto state.errors and then asserts length is 60. That doesn't exercise the actual cap logic in recordError, so the test is currently misleading. Prefer triggering the recordError path (e.g., via a failing processSession) or refactor to make the cap logic testable.

Suggested change
it("caps errors at MAX_ERRORS (50)", () => {
const state = getQueueState();
for (let i = 0; i < 60; i++) {
state.errors.push({ key: `proj/sess-${i}`, error: `error-${i}`, at: Date.now() });
}
// The cap is enforced in recordError, but we can verify the field is present
expect(state.errors.length).toBe(60);
it("caps recentErrors at MAX_ERRORS (50) in getDetailedStatus", () => {
const state = getQueueState();
for (let i = 0; i < 60; i++) {
state.errors.push({ key: `proj/sess-${i}`, error: `error-${i}`, at: Date.now() });
}
// getDetailedStatus should expose at most MAX_ERRORS recent errors
const detailed = getDetailedStatus();
expect(detailed.recentErrors).toHaveLength(50);

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid (duplicate of comment #8). Fixed with the same test rename — the test now accurately describes that it tests direct state mutation, not the recordError cap.

Comment on lines 178 to 182
checkEvalCacheAndList(projectName, sessionId, agentId, subagentType).then(probe => {
if (!mountedRef.current) return;
if (!probe.ok) { setError(probe.error); setProbing(false); return; }
if (!probe.hasEvals) { setNoEvals(true); setProbing(false); return; }

Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial checkEvalCacheAndList(...) call is not wrapped with a .catch. If the server action rejects (network/server error), this effect will leave probing stuck true and can trigger an unhandled promise rejection. Add a catch that sets an error state and clears probing (and similarly guard the inner queueAndProcessSession/freshProbe awaits).

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed. The outer .catch() at line 242-246 handles rejections from the initial checkEvalCacheAndList() call, setting error state and clearing the probing flag.

if (freshProbe.ok && freshProbe.hasEvals) {
setItems(prev => prev.map(i => {
const fresh = freshProbe.cachedResults.find(r => r.name === i.name);
return fresh ? { status: "done", name: i.name, result: fresh, cached: false } : i;
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After queue processing completes, the refresh mapping sets cached:false for every item with a freshProbe match, including items that were already shown as cached in Phase 1. That makes the per-item "cached" badge disappear even for cache hits. Update only the items that were previously loading (or preserve the existing cached flag when an item was already done).

Suggested change
return fresh ? { status: "done", name: i.name, result: fresh, cached: false } : i;
if (!fresh) return i;
// Preserve existing cached flag for items that were already done;
// only mark as non-cached for items that were previously loading.
const nextCached = i.status === "loading" ? false : i.cached;
return { ...i, status: "done", result: fresh, cached: nextCached };

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed. Line 210 already guards with i.status === "loading": return (fresh && i.status === "loading") ? { status: "done", ...cached: false } : i. Items already marked "done" with cached: true from Phase 1 are preserved.

if (freshProbe.ok && freshProbe.hasEnrichers) {
setItems(prev => prev.map(i => {
const fresh = freshProbe.cachedResults.find(r => r.name === i.name);
return fresh ? { status: "done", name: i.name, result: fresh, cached: false } : i;
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After queue processing completes, the refresh mapping sets cached:false for every item found in freshProbe, including items that were already displayed as cached in Phase 1. This breaks the per-item cached indicator. Consider only updating items that were previously loading, or preserve the existing cached flag for items already done.

Suggested change
return fresh ? { status: "done", name: i.name, result: fresh, cached: false } : i;
if (!fresh) return i;
// If this item was previously loading, it's a newly computed result (not from cache).
if (i.status === "loading") {
return { status: "done" as const, name: i.name, result: fresh, cached: false };
}
// Preserve existing cached flag (e.g., items that were already marked cached in Phase 1).
return { ...i, status: "done" as const, result: fresh };

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already addressed. Line 152 uses the same guard: return (fresh && i.status === "loading") ? { status: "done", ...cached: false } : i. Phase 1 cached items are not overwritten.

Comment on lines 23 to 31
const result = await runSessionAction<any, EnrichRunResult, EnrichRunSummary>({
kind: "enrichments",
projectName,
sessionId,
forceRefresh,
forceRefresh: enricherName ? true : forceRefresh,
evalName: enricherName,
getItems: getSessionScopedEnrichers,
run: (rawLines, stats, items) =>
runAllEnrichers(rawLines, stats, projectName, sessionId, items as any, { source: 'session' }),
Copy link

Copilot AI Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runEnrichments passes enricherName via the runSessionAction "evalName" option. When an enricher name is not found, runSessionAction will return an error message like Eval "..." not found, which is confusing/inaccurate for enrichments. Consider updating the shared helper to use a neutral option name (e.g., itemName) or allow the caller to customize the not-found error string.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Added itemLabel option to runSessionAction (defaults to "Eval"). run-enrichments.ts now passes itemLabel: "Enricher", so the not-found error reads Enricher "..." not found instead of Eval "..." not found.

…ancy

Address remaining PR #30 review comments:
- Add .catch() to all promise chains in eval/enrichment panels (C2/C4/C14/C16)
- Add error fallbacks for unhappy paths in runAll and rerunSingleEval (C3/C5/C6/C9)
- Preserve cached badge in Phase 2 refresh by guarding on loading status (C17/C18)
- Replace setInterval with self-scheduling setTimeout to prevent re-entrancy (C13)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/eval-queue.ts (2)

293-306: Duplicated view-name derivation logic between scanAndEnqueue and processSession.

Lines 295 and 412 contain identical logic:

views.length > 0 ? views.map(v => v.name) : (defaultFilters.length > 0 ? ["default"] : [])

Consider extracting a small helper (e.g., getActiveViewNames()) to keep this in sync.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 293 - 306, Extract the duplicated view-name
derivation into a small helper function (e.g., getActiveViewNames) and use it
from both scanAndEnqueue and processSession: move the expression that computes
view names from views and defaultFilters (currently written as views.length > 0
? views.map(v => v.name) : (defaultFilters.length > 0 ? ["default"] : [])) into
getActiveViewNames(views, defaultFilters) which returns string[]; replace the
inline logic in the cache probe loop and the corresponding spot around line 412
with calls to getActiveViewNames so the logic is centralized and consistent.

260-313: scanAndEnqueue scans all sessions across all projects — consider bounding for large installations.

For repositories with many projects and sessions, this full scan on every background tick could become expensive. The per-session cache probing (eval + enrichment + filter) involves multiple async I/O calls. Currently bounded by batchAll(..., 10) concurrency, but unbounded in total count.

Consider adding a configurable limit on the number of sessions to scan per tick (e.g., scan only the N most recently modified sessions), or tracking which sessions have already been scanned since the last evalsHash change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 260 - 313, The scan currently iterates all
sessions every tick (sessions.map(...)) causing unbounded I/O; limit the work by
adding a configurable cap or scan-tracking: add a MAX_SESSIONS_PER_TICK config
and apply it before batchAll (e.g., pick the N most-recently-modified sessions
or simply sessions.slice(0, MAX_SESSIONS_PER_TICK)), or maintain a
scannedSessions Set keyed by `${projectName}/${sessionId}` that you clear when
evalsHash changes and skip already-scanned sessions; update the scan logic that
builds the sessions array (the code that calls batchAll and references
hashSessionFile, getPerItemCache/getCachedResult, and push) to use the
limited/filtered list so only up to N sessions are probed per tick.
app/components/enrichment-results-panel.tsx (1)

106-191: Near-identical two-phase loading logic duplicated across eval and enrichment panels.

The mount effect, state management, runAll, error handling, and rendering logic in this file are structurally identical to eval-results-panel.tsx (~80% shared). Consider extracting a shared hook (e.g., useProgressiveItems) that encapsulates the probe → populate → queue → re-probe flow, parameterized by cache-check and run functions. This would eliminate the risk of fixes being applied to one panel but not the other (as seen with the !result.hasEnrichers/!result.hasEvals gap in both files).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 106 - 191, The
EnrichmentResultsPanel duplicates the two-phase probe→populate→run→reprobe flow
already present in eval-results-panel.tsx; extract that shared behavior into a
new hook (e.g., useProgressiveItems) and replace the mount effect/state logic in
EnrichmentResultsPanel with the hook; the hook should be parameterized with the
probe function (checkEnrichmentCacheAndList), the queue/run function
(queueAndProcessSession for session-mode or runSubagentEnrichments for
subagent-mode), and callbacks to map results into EnrichItemState, and must
preserve mountedRef cancellation, probing/error/noEnrichers flags, and the logic
that marks cached vs fresh results (ensure you unify the hasEnrichers/hasEvals
checks to avoid the existing gap); then update eval-results-panel to use the
same hook so both panels share the identical flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 161-183: The item stays stuck in "loading" when
runSubagentEnrichments returns { ok: true, hasEnrichers: false } because the
code just returns without updating state; modify the branch in the Promise.all
loop that currently reads `if (!result.hasEnrichers) return;` to instead check
mountedRef.current and call setItems to update the matching entry for
enricherName (e.g., set status to "error" with a clear message like "no
enrichers available" or other appropriate terminal state), using the same
mapping logic as the other branches so the UI stops spinning; reference
runSubagentEnrichments, mountedRef, setItems, and probe.uncachedNames when
making this change.

In `@app/components/eval-results-panel.tsx`:
- Around line 219-241: When runSubagentEvals returns ok:true but hasEvals:false
the loop currently returns early leaving that eval item stuck in "loading";
update the branch inside the Promise.all for each evalName so that when
result.ok is true and result.hasEvals is false you still call setItems to
transition the item out of loading (for example setItems(prev => prev.map(i =>
i.name === evalName ? { status: "done", name: evalName, result: null, cached:
false } : i))). Make this change in the block that calls runSubagentEvals
(referencing probe.uncachedNames, runSubagentEvals, mountedRef, and setItems) so
every code path updates the item state.

---

Nitpick comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 106-191: The EnrichmentResultsPanel duplicates the two-phase
probe→populate→run→reprobe flow already present in eval-results-panel.tsx;
extract that shared behavior into a new hook (e.g., useProgressiveItems) and
replace the mount effect/state logic in EnrichmentResultsPanel with the hook;
the hook should be parameterized with the probe function
(checkEnrichmentCacheAndList), the queue/run function (queueAndProcessSession
for session-mode or runSubagentEnrichments for subagent-mode), and callbacks to
map results into EnrichItemState, and must preserve mountedRef cancellation,
probing/error/noEnrichers flags, and the logic that marks cached vs fresh
results (ensure you unify the hasEnrichers/hasEvals checks to avoid the existing
gap); then update eval-results-panel to use the same hook so both panels share
the identical flow.

In `@lib/eval-queue.ts`:
- Around line 293-306: Extract the duplicated view-name derivation into a small
helper function (e.g., getActiveViewNames) and use it from both scanAndEnqueue
and processSession: move the expression that computes view names from views and
defaultFilters (currently written as views.length > 0 ? views.map(v => v.name) :
(defaultFilters.length > 0 ? ["default"] : [])) into getActiveViewNames(views,
defaultFilters) which returns string[]; replace the inline logic in the cache
probe loop and the corresponding spot around line 412 with calls to
getActiveViewNames so the logic is centralized and consistent.
- Around line 260-313: The scan currently iterates all sessions every tick
(sessions.map(...)) causing unbounded I/O; limit the work by adding a
configurable cap or scan-tracking: add a MAX_SESSIONS_PER_TICK config and apply
it before batchAll (e.g., pick the N most-recently-modified sessions or simply
sessions.slice(0, MAX_SESSIONS_PER_TICK)), or maintain a scannedSessions Set
keyed by `${projectName}/${sessionId}` that you clear when evalsHash changes and
skip already-scanned sessions; update the scan logic that builds the sessions
array (the code that calls batchAll and references hashSessionFile,
getPerItemCache/getCachedResult, and push) to use the limited/filtered list so
only up to N sessions are probed per tick.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69a0445 and e292070.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • app/components/enrichment-results-panel.tsx
  • app/components/eval-results-panel.tsx
  • lib/eval-queue.ts

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 33 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 263 to 264
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error", name: i.name, error: "Eval run failed" } : i
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the subagent branch of runAll, a non-ok response discards the server-provided result.error and replaces it with a generic "Eval run failed" for every row. This hides actionable debugging information; use result.error (or a fallback) when populating error rows.

Suggested change
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error", name: i.name, error: "Eval run failed" } : i
const errorMessage = (result as any).error || "Eval run failed";
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error" as const, name: i.name, error: errorMessage } : i

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Updated the runAll subagent error branch to use result.error (with fallback) instead of the generic "Eval run failed" message, so server-provided error details are now surfaced in the UI.

setAllCached(false);
} else {
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error", name: i.name, error: "Enrichment run failed" } : i
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the subagent branch of runAll, a non-ok response discards the server-provided result.error and replaces it with a generic "Enrichment run failed" for every row. Preserve result.error (with a fallback) so failures are diagnosable from the UI.

Suggested change
i.status === "loading" ? { status: "error", name: i.name, error: "Enrichment run failed" } : i
i.status === "loading"
? { status: "error", name: i.name, error: result.error || "Enrichment run failed" }
: i

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Updated the runAll subagent error branch to use result.error (with fallback) instead of the generic "Enrichment run failed" message, preserving server-provided error details in the UI.

Comment on lines 156 to 159
if (existing) {
// Upgrade priority if higher (lower number = higher priority)
if (priority < existing.priority) {
existing.priority = priority;
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an item is already queued, push() returns early after possibly upgrading priority, but it never updates contentHash. If the session log file changes between scans/enqueues, the queued item can be processed with a stale hash, leading to incorrect cache hits and writing fresh results under the old hash. Consider updating existing.contentHash (and possibly addedAt) when the new contentHash differs, or recomputing the hash inside processSession() just before cache access.

Suggested change
if (existing) {
// Upgrade priority if higher (lower number = higher priority)
if (priority < existing.priority) {
existing.priority = priority;
if (existing) {
let changed = false;
// Refresh content hash (and timestamp) if the session contents have changed
if (existing.contentHash !== contentHash) {
existing.contentHash = contentHash;
existing.addedAt = Date.now();
changed = true;
}
// Upgrade priority if higher (lower number = higher priority)
if (priority < existing.priority) {
existing.priority = priority;
changed = true;
}
if (changed) {

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Updated push() to refresh contentHash and addedAt when the session contents have changed since the item was first queued. This prevents stale hash values from causing incorrect cache hits during processing.

- Fix stuck-loading when !hasEvals/!hasEnrichers on subagent path
- Remove cumulative processedCount from queue panel visibility gate
- Add aria-expanded to queue panel toggle buttons
- Add strict regex validation for --queue-interval/--queue-concurrency
- Rename misleading eval-queue test title
- Add itemLabel to runSessionAction for correct enricher error messages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Preserve server-provided error messages in runAll subagent branches
  instead of generic "Eval/Enrichment run failed" strings
- Update contentHash and addedAt in push() when session contents change,
  preventing stale cache hits during queue processing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/components/enrichment-results-panel.tsx (1)

281-289: ⚠️ Potential issue | 🟡 Minor

Collapse button is missing aria-expanded.

The PR commit message explicitly lists "improves accessibility (aria-expanded)" as a change, and the toggle correctly manages collapsed state, but the <button> at Line 281 doesn't expose aria-expanded to assistive technologies.

♿ Proposed fix
         <button
           onClick={() => setCollapsed((prev) => !prev)}
+          aria-expanded={!collapsed}
           className="flex items-center gap-2 hover:opacity-80 transition-opacity"
         >
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 281 - 289, The
collapse toggle button in the EnrichmentResultsPanel is missing an accessibility
attribute: add aria-expanded to the <button> that reflects the current collapsed
state (use the existing collapsed state and set aria-expanded to the boolean for
"expanded" i.e. !collapsed) so assistive tech can detect expansion; update the
button where setCollapsed and collapsed are used (the element rendering
ChevronDown and Database) to include aria-expanded and consider also adding
aria-controls pointing to the collapsible region id if present.
♻️ Duplicate comments (1)
app/components/eval-results-panel.tsx (1)

233-236: ⚠️ Potential issue | 🟠 Major

Subagent mount path still has a missing terminal branch.

At Line 233, when result.hasEvals is true but result.summary.results[0] is absent, the item remains in loading indefinitely.

Proposed fix
             const fresh = result.summary.results[0];
             if (fresh) {
               setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
+            } else {
+              setItems(prev => prev.map(i =>
+                i.name === evalName
+                  ? { status: "error", name: evalName, error: "No result returned" }
+                  : i
+              ));
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 233 - 236, The mapping
that updates items when an eval summary arrives lacks a terminal branch for the
case where result.hasEvals is true but result.summary.results[0] (fresh) is
undefined, leaving the item stuck in "loading"; update the setItems call in the
component that handles result (referencing result.hasEvals,
result.summary.results, setItems and evalName) to include an else branch that
maps the matching item to a terminal state (e.g., set status to "done" or
"no-results", set result to null and cached: false) so the UI no longer remains
in loading when no summary result is present.
🧹 Nitpick comments (3)
app/actions/run-enrichments.ts (1)

27-27: forceRefresh override is redundant when enricherName is set.

runSessionAction already bypasses the per-item cache when opts.evalName is truthy (its guard is if (!opts.forceRefresh && !opts.evalName)), so coercing forceRefresh to true here has no additional effect.

♻️ Proposed simplification
-    forceRefresh: enricherName ? true : forceRefresh,
+    forceRefresh,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/actions/run-enrichments.ts` at line 27, The current ternary that sets
forceRefresh: enricherName ? true : forceRefresh is redundant; update the call
that builds the options passed to runSessionAction (the object containing
forceRefresh) to stop forcing true when enricherName/enricherName is set and
instead pass the existing forceRefresh value unchanged (i.e., remove the
enricherName ? true override), leaving runSessionAction to handle opts.evalName
and cache bypass logic.
app/components/queue-status-panel.tsx (1)

15-23: timeAgo has no days branch — old items display as 48h ago / 168h ago.

After the hours bucket the value grows without bound. Adding a simple days branch improves readability for items that have been sitting in the queue for over a day.

♻️ Proposed fix
  const hours = Math.floor(minutes / 60);
- return `${hours}h ago`;
+ if (hours < 24) return `${hours}h ago`;
+ const days = Math.floor(hours / 24);
+ return `${days}d ago`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/queue-status-panel.tsx` around lines 15 - 23, The timeAgo
function grows hours unbounded (e.g., "48h ago"); modify timeAgo to add a days
branch after the hours calculation: compute const days = Math.floor(hours / 24)
and if days < X (use 7 or no upper cap as desired) return `${days}d ago`,
otherwise keep existing behavior for hours when hours < 24; update the logic
around the existing variables seconds, minutes, hours in function timeAgo to
return `${hours}h ago` only when hours < 24 and `${days}d ago` when >= 24.
app/components/eval-results-panel.tsx (1)

389-392: Add ARIA state to the collapse toggle for better accessibility.

The toggle button controls collapsed content but does not expose expanded state to assistive tech.

Proposed fix
         <button
           onClick={() => setCollapsed((prev) => !prev)}
           className="flex items-center gap-2 hover:opacity-80 transition-opacity"
+          aria-expanded={!collapsed}
+          aria-controls="eval-results-list"
         >
@@
-        <div className="divide-y divide-border/50">
+        <div id="eval-results-list" className="divide-y divide-border/50">

Also applies to: 450-452

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 389 - 392, The collapse
toggle button that calls setCollapsed((prev) => !prev) lacks exposed state for
assistive tech; update the button (the element using collapsed and setCollapsed)
to include aria-expanded={collapsed ? "true" : "false"} and aria-controls
pointing to the id of the collapsible content, and give the collapsible panel
element (the DOM node rendered when collapsed/expanded around lines ~450) a
stable id and appropriate aria-hidden (or role="region") so screen readers know
its visibility; ensure the same changes are applied to the second toggle
referenced at lines 450-452.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 36-49: EnrichmentLoadingRow duplicates vertical spacing by adding
"pt-2 first:pt-0" on its root div; remove those utility classes from the root
div's className in the EnrichmentLoadingRow component so it relies on the parent
wrapper (consistent with EnricherGroup and EnrichmentErrorRow) and keep the
remaining classNames and internal layout unchanged.
- Around line 219-228: runAll resets all items to "loading" but when
freshProbe.cachedResults is partial some items are never updated and stay
"loading"; update the success branch that handles freshProbe (the block using
checkEnrichmentCacheAndList and setItems) to map over the existing or resetting
list and for each enricher: if it appears in freshProbe.cachedResults mark
status "done" and attach result, otherwise convert any remaining "loading"
entries to an "error" state (e.g., status: "error", error: "Enrichment missing
after probe") so no item remains perpetually "loading"; use the same identifiers
used in the diff (checkEnrichmentCacheAndList, freshProbe.cachedResults,
setItems, setAllCached, mountedRef) to locate and implement this merge-fallback
logic.

In `@app/components/eval-results-panel.tsx`:
- Around line 294-329: The per-item rerun in rerunSingleEval updates an item to
cached: false via setItems but never updates the header-level allCached state,
causing the cached badge to stay visible; fix by updating allCached whenever you
change an item's cached flag (e.g., inside the setItems functional updater,
compute const updated = prev.map(...) and then call setAllCached(updated.every(i
=> i.cached)) before returning updated) and apply this in both the agentId
branch and the queue branch where you set cached: false (and similarly
recompute/setAllCached when you set cached: true elsewhere).
- Around line 205-212: When re-probing with checkEvalCacheAndList returns
non-success or missing entries, ensure loading rows are transitioned out of
"loading" — after the mountedRef.current check, always call setItems and map
existing items: if freshProbe.ok && freshProbe.hasEvals and a matching
freshProbe.cachedResults entry exists, set the item to { status: "done", result:
fresh, cached: false }; otherwise for items still "loading" set a terminal state
(e.g., status: "error" or "no-eval" and clear/annotate result) so rows don’t
remain stuck; use the existing symbols setItems, freshProbe.cachedResults,
checkEvalCacheAndList, mountedRef.current and preserve existing item.name
matching logic.
- Around line 176-179: The current mountedRef check can allow stale async
results to overwrite state because it’s reset on each effect run; introduce a
per-request token (e.g., a unique requestId or AbortController signal) created
at the start of the effect and stored in a ref, pass that token/signal into
checkEvalCacheAndList, queueAndProcessSession, and runSubagentEvals (and into
runAll and rerunSingleEval), and in every async continuation verify the token
still matches (or signal not aborted) before writing state; also ensure the
effect cleanup invalidates/aborts the token so any in-flight promise will be
ignored when it resolves.

---

Outside diff comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 281-289: The collapse toggle button in the EnrichmentResultsPanel
is missing an accessibility attribute: add aria-expanded to the <button> that
reflects the current collapsed state (use the existing collapsed state and set
aria-expanded to the boolean for "expanded" i.e. !collapsed) so assistive tech
can detect expansion; update the button where setCollapsed and collapsed are
used (the element rendering ChevronDown and Database) to include aria-expanded
and consider also adding aria-controls pointing to the collapsible region id if
present.

---

Duplicate comments:
In `@app/components/eval-results-panel.tsx`:
- Around line 233-236: The mapping that updates items when an eval summary
arrives lacks a terminal branch for the case where result.hasEvals is true but
result.summary.results[0] (fresh) is undefined, leaving the item stuck in
"loading"; update the setItems call in the component that handles result
(referencing result.hasEvals, result.summary.results, setItems and evalName) to
include an else branch that maps the matching item to a terminal state (e.g.,
set status to "done" or "no-results", set result to null and cached: false) so
the UI no longer remains in loading when no summary result is present.

---

Nitpick comments:
In `@app/actions/run-enrichments.ts`:
- Line 27: The current ternary that sets forceRefresh: enricherName ? true :
forceRefresh is redundant; update the call that builds the options passed to
runSessionAction (the object containing forceRefresh) to stop forcing true when
enricherName/enricherName is set and instead pass the existing forceRefresh
value unchanged (i.e., remove the enricherName ? true override), leaving
runSessionAction to handle opts.evalName and cache bypass logic.

In `@app/components/eval-results-panel.tsx`:
- Around line 389-392: The collapse toggle button that calls setCollapsed((prev)
=> !prev) lacks exposed state for assistive tech; update the button (the element
using collapsed and setCollapsed) to include aria-expanded={collapsed ? "true" :
"false"} and aria-controls pointing to the id of the collapsible content, and
give the collapsible panel element (the DOM node rendered when
collapsed/expanded around lines ~450) a stable id and appropriate aria-hidden
(or role="region") so screen readers know its visibility; ensure the same
changes are applied to the second toggle referenced at lines 450-452.

In `@app/components/queue-status-panel.tsx`:
- Around line 15-23: The timeAgo function grows hours unbounded (e.g., "48h
ago"); modify timeAgo to add a days branch after the hours calculation: compute
const days = Math.floor(hours / 24) and if days < X (use 7 or no upper cap as
desired) return `${days}d ago`, otherwise keep existing behavior for hours when
hours < 24; update the logic around the existing variables seconds, minutes,
hours in function timeAgo to return `${hours}h ago` only when hours < 24 and
`${days}d ago` when >= 24.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e292070 and 8d247c9.

📒 Files selected for processing (7)
  • __tests__/lib/eval-queue.test.ts
  • app/actions/run-enrichments.ts
  • app/actions/run-session-action.ts
  • app/components/enrichment-results-panel.tsx
  • app/components/eval-results-panel.tsx
  • app/components/queue-status-panel.tsx
  • scripts/parse-script-args.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • scripts/parse-script-args.ts
  • tests/lib/eval-queue.test.ts

Comment on lines 176 to 179
mountedRef.current = true;

checkEvalCacheAndList(projectName, sessionId, agentId, subagentType).then(probe => {
if (!mountedRef.current) return;
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n app/components/eval-results-panel.tsx | head -200

Repository: exospherehost/claudeye

Length of output: 8670


🏁 Script executed:

rg -n 'mountedRef\.current|checkEvalCacheAndList\(|queueAndProcessSession\(|runSubagentEvals\(' app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 1986


🏁 Script executed:

rg -n 'request(Id|Seq|Token)|AbortController|sessionKeyRef' app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 48


🏁 Script executed:

sed -n '164,252p' app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 4436


🏁 Script executed:

sed -n '252,330p' app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 4171


mountedRef alone does not prevent stale async updates after prop changes.

Because the effect at line 175 resets mountedRef.current = true on every run (when projectName, sessionId, agentId, or subagentType change), older in-flight promises from the previous effect can still pass mount checks and overwrite state with stale data. Add a per-request token (or abort signal) and validate it before each state write in the effect, runAll, and rerunSingleEval.

This applies to all async chains: checkEvalCacheAndList, queueAndProcessSession, and runSubagentEvals across the three async contexts (initial load, runAll, and rerunSingleEval).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 176 - 179, The current
mountedRef check can allow stale async results to overwrite state because it’s
reset on each effect run; introduce a per-request token (e.g., a unique
requestId or AbortController signal) created at the start of the effect and
stored in a ref, pass that token/signal into checkEvalCacheAndList,
queueAndProcessSession, and runSubagentEvals (and into runAll and
rerunSingleEval), and in every async continuation verify the token still matches
(or signal not aborted) before writing state; also ensure the effect cleanup
invalidates/aborts the token so any in-flight promise will be ignored when it
resolves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Added a per-effect let stale = false flag that is set to true in the cleanup function. All async continuations now check stale || \!mountedRef.current before writing state, preventing stale updates from previous effect runs when props change. Applied the same pattern to the enrichment panel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@NiveditJain have exceeded the limit for the number of chat messages per hour. Please wait 0 minutes and 8 seconds before sending another message.

Comment on lines 294 to 329
const rerunSingleEval = useCallback(async (evalName: string) => {
setItems(prev => prev.map(i => i.name === evalName ? { status: "loading", name: evalName } : i));
try {
const result = agentId
? await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, false, evalName)
: await runEvals(projectName, sessionId, false, evalName);
if (controller.signal.aborted) return;
if (!result.ok) {
setError(result.error);
return;
}
if (!result.hasEvals) return;

// Merge the single fresh result into existing summary
const freshResult = result.summary.results[0];
if (!freshResult) return;

if (controller.signal.aborted) return;
setSummary((prev) => {
if (!prev) return prev;
const updatedResults = prev.results.map((r) =>
r.name === freshResult.name ? freshResult : r,
);
let passCount = 0, failCount = 0, errorCount = 0, skippedCount = 0, totalDurationMs = 0;
for (const r of updatedResults) {
totalDurationMs += r.durationMs || 0;
if (r.skipped) skippedCount++;
else if (r.error) errorCount++;
else if (r.pass) passCount++;
else failCount++;
if (agentId) {
const result = await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, false, evalName);
if (!mountedRef.current) return;
if (result.ok && result.hasEvals) {
const fresh = result.summary.results[0];
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
} else {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "No result returned" } : i));
}
} else {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "Eval re-run failed" } : i));
}
return {
...prev,
results: updatedResults,
passCount,
failCount,
errorCount,
skippedCount,
totalDurationMs,
};
});
setCached(false);
} catch (e) {
if (controller.signal.aborted) return;
setError(e instanceof Error ? e.message : "Failed to rerun eval");
} else {
const queueResult = await queueAndProcessSession(projectName, sessionId, { forceEvalNames: [evalName] });
if (!mountedRef.current) return;
if (!queueResult.ok) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: queueResult.error || "Queue processing failed" } : i));
return;
}
const freshProbe = await checkEvalCacheAndList(projectName, sessionId, agentId, subagentType);
if (!mountedRef.current) return;
if (freshProbe.ok && freshProbe.hasEvals) {
const fresh = freshProbe.cachedResults.find(r => r.name === evalName);
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
} else {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "No result after re-run" } : i));
}
} else {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "Eval unavailable after processing" } : i));
}
}
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

allCached can become stale after single-item re-runs.

At Line 294, per-item re-run transitions an item to cached: false, but allCached is not updated. This can keep the header-level “cached” badge visible incorrectly at Line 428.

Proposed fix
   const rerunSingleEval = useCallback(async (evalName: string) => {
+    setAllCached(false);
     setItems(prev => prev.map(i => i.name === evalName ? { status: "loading", name: evalName } : i));

Also applies to: 428-432

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 294 - 329, The per-item
rerun in rerunSingleEval updates an item to cached: false via setItems but never
updates the header-level allCached state, causing the cached badge to stay
visible; fix by updating allCached whenever you change an item's cached flag
(e.g., inside the setItems functional updater, compute const updated =
prev.map(...) and then call setAllCached(updated.every(i => i.cached)) before
returning updated) and apply this in both the agentId branch and the queue
branch where you set cached: false (and similarly recompute/setAllCached when
you set cached: true elsewhere).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid. Added setAllCached(false) at the start of rerunSingleEval so the header "cached" badge is cleared immediately when any single item is re-run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate Limit Exceeded

@NiveditJain have exceeded the limit for the number of chat messages per hour. Please wait 0 minutes and 5 seconds before sending another message.

- Add per-effect `stale` flag to prevent stale async updates after prop
  changes (both eval and enrichment panels)
- Handle non-success and partial results from Phase 2 re-probe, ensuring
  all loading items reach a terminal state
- Add cleanup pass in enrichment runAll for items not covered by fresh probe
- Clear allCached on single-item eval re-run to prevent stale badge
- Remove duplicate pt-2 spacing from EnrichmentLoadingRow/EnrichmentErrorRow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 33 out of 34 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +264 to 268
// Re-run All handler
const runAll = useCallback(async () => {
setGlobalRunning(true);
setItems(prev => prev.map(i => ({ status: "loading" as const, name: i.name })));
try {
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When subagent run-all fails (either ok:false or hasEvals:false), the UI sets a generic "Eval run failed" message and discards the server-provided error. Prefer surfacing result.error (or a distinct "No evals available" message) so users can understand what went wrong.

Copilot uses AI. Check for mistakes.
Comment on lines +306 to +309
setItems(prev => prev.map(i => i.name === evalName ? { status: "loading", name: evalName } : i));
try {
const result = agentId
? await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, false, evalName)
: await runEvals(projectName, sessionId, false, evalName);
if (controller.signal.aborted) return;
if (!result.ok) {
setError(result.error);
return;
}
if (!result.hasEvals) return;

// Merge the single fresh result into existing summary
const freshResult = result.summary.results[0];
if (!freshResult) return;

if (controller.signal.aborted) return;
setSummary((prev) => {
if (!prev) return prev;
const updatedResults = prev.results.map((r) =>
r.name === freshResult.name ? freshResult : r,
);
let passCount = 0, failCount = 0, errorCount = 0, skippedCount = 0, totalDurationMs = 0;
for (const r of updatedResults) {
totalDurationMs += r.durationMs || 0;
if (r.skipped) skippedCount++;
else if (r.error) errorCount++;
else if (r.pass) passCount++;
else failCount++;
if (agentId) {
const result = await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, false, evalName);
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On subagent single-eval re-run failure, the UI sets a generic "Eval re-run failed" message even when the server action provides a specific result.error. Propagate result.error (and handle hasEvals:false separately) so the row shows actionable failure details.

Copilot uses AI. Check for mistakes.
Comment on lines +207 to 210
const runAll = useCallback(async () => {
setGlobalRunning(true);
setItems(prev => prev.map(i => ({ status: "loading" as const, name: i.name })));
try {
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When subagent run-all fails (either ok:false or hasEnrichers:false), the UI sets a generic "Enrichment run failed" message and discards the server-provided error. Prefer surfacing result.error (or a distinct "No enrichers available" message) so users can diagnose the failure.

Copilot uses AI. Check for mistakes.
Comment on lines +106 to +110
// Hide when inactive: no items, not processing, nothing processed, no background
if (
!status ||
(status.queued === 0 &&
status.processing === 0 &&
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hide condition comment says "nothing processed", but processedCount is not part of the condition. Either include processedCount === 0 in the hide check, or update the comment to match the actual behavior (hide when queued/processing/background/error are all zero).

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (5)
lib/eval-queue.ts (2)

533-544: state.intervalId holds a stale (already-fired) timer ID during callback execution.

When the setTimeout callback fires, state.intervalId still contains the ID of the timer that just expired. Any concurrent call to startBackgroundProcessor would see a non-null state.intervalId and correctly return early (Line 529). Calling stopBackgroundProcessor during async processing clears this stale ID (a clearTimeout no-op) and sets it to null, causing Line 541's guard to correctly skip rescheduling. The logic is sound, but updating state.intervalId to null at the top of the callback (before the async work) and relying on scheduleNext to set a new ID at the end would make the lifecycle more explicit and easier to reason about.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 533 - 544, The timeout callback currently
leaves state.intervalId set to the fired timer ID while running async work,
which can mislead concurrent start/stop logic; modify the callback in
scheduleNext so the first action inside the setTimeout handler is to set
state.intervalId = null, then perform await scanAndEnqueue() and await
processBatch(), and finally call scheduleNext() to create and assign a fresh
timer ID; ensure startBackgroundProcessor and stopBackgroundProcessor semantics
remain the same and that scheduleNext is the only place that assigns
state.intervalId.

441-469: Alerts fire on every processSession invocation, including UI-triggered force-refresh reruns.

allEvalResults and allEnrichResults accumulate both cached and freshly computed results, so fireAlerts is called every time processSession runs — even when all results came from cache (Phase 2 coalescing) or when the user clicks "Re-run" (which calls queueAndProcessSession({ forceRefresh: true })). Per the PR description, alerts are designed to fire "after all evals and enrichments complete for a session," but force-refresh reruns from the UI would fire alerts redundantly alongside each rerun click.

If this is intentional, consider documenting it. Otherwise, consider gating fireAlerts on whether any fresh (non-cached) work was actually performed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/eval-queue.ts` around lines 441 - 469, The current logic calls fireAlerts
whenever hasAlerts() is true because allEvalResults/allEnrichResults include
cached results, causing alerts on every processSession (including UI
force-refresh reruns); modify the gating so fireAlerts only runs when there was
at least one fresh (non-cached) evaluation or enrichment performed in this run —
detect fresh work by tracking a boolean (e.g., named sawFreshEval or
sawFreshEnrich) where eval/enrich execution paths set it to true (ensure
queueAndProcessSession({ forceRefresh }) and processSession propagate
forceRefresh to force execution), and change the condition around fireAlerts to
require hasAlerts() && (sawFreshEval || sawFreshEnrich) before calling
fireAlerts with the same payload (projectName, sessionId, evalSummary,
enrichSummary).
app/components/enrichment-results-panel.tsx (2)

231-242: Cleanup pass at Lines 239–242 is a no-op in the success branch and redundant in the else branch.

When freshProbe.ok && freshProbe.hasEnrichers is true, Line 232 calls setItems(freshProbe.cachedResults.map(...)) which replaces the entire items array with only "done" entries — there are no "loading" items left for the cleanup pass to act on. When the else branch fires, Lines 235–237 already transition all "loading" items to "error", so the cleanup pass is again a no-op. The comment suggests it covers partial cachedResults, but that scenario is not reachable because the non-agentId runAll path calls queueAndProcessSession with forceRefresh: true, which should persist all enricher results before the re-probe. Consider removing the dead cleanup pass, or replacing the wholesale setItems(freshProbe.cachedResults.map(...)) with a merge pattern similar to Phase 2 (so dropped items surface as errors rather than silently disappearing).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 231 - 242, The
final setItems cleanup pass is redundant: when freshProbe.ok &&
freshProbe.hasEnrichers you replace items via
setItems(freshProbe.cachedResults.map(...)) so there are no "loading" entries,
and in the else branch you already map loading→error; remove the trailing
cleanup pass (the setItems(prev => prev.map(...)) at the end) OR instead of
replacing items wholesale in the success branch, merge freshProbe.cachedResults
with the existing items (using the existing Phase 2 merge pattern) so entries
not present in freshProbe.cachedResults are preserved and can be turned into
errors; update references in this component to freshProbe, setItems and
cachedResults and ensure callers like
runAll/queueAndProcessSession/forceRefresh/agentId behavior remain consistent.

158-163: Same dead-code ternary as in the eval panel.

When execution reaches this else branch, freshProbe.ok is always false, making the third operand unreachable.

🧹 Proposed simplification
-                ? { status: "error", name: i.name, error: freshProbe.ok ? "Enrichments unavailable after processing" : (!freshProbe.ok ? freshProbe.error : "Failed to refresh enrichment cache") }
+                ? { status: "error", name: i.name, error: freshProbe.ok ? "Enrichments unavailable after processing" : (freshProbe.error || "Failed to refresh enrichment cache") }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 158 - 163, The
ternary inside the setItems call in enrichment-results-panel (where
setItems(prev => prev.map(...))) contains a dead branch because in this else
branch freshProbe.ok is always false; simplify the error assignment by removing
the unreachable third operand and choosing the correct message based on
freshProbe.ok being false (e.g., use freshProbe.error or a single fallback
message like "Enrichments unavailable after processing"). Update the mapping
logic inside setItems (the branch that checks i.status === "loading") to set
error to either freshProbe.error if present or the fallback message, eliminating
the redundant conditional involving freshProbe.ok.
app/components/eval-results-panel.tsx (1)

217-221: Dead code in the error-message ternary.

Inside the else branch (Line 216), freshProbe.ok is guaranteed to be false, making !freshProbe.ok always true. The third operand "Failed to refresh eval cache" is unreachable.

🧹 Proposed simplification
-                ? { status: "error", name: i.name, error: freshProbe.ok ? "Evals unavailable after processing" : (!freshProbe.ok ? freshProbe.error : "Failed to refresh eval cache") }
+                ? { status: "error", name: i.name, error: freshProbe.ok ? "Evals unavailable after processing" : (freshProbe.error || "Failed to refresh eval cache") }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 217 - 221, The ternary
constructing the error string inside the setItems mapping is dead/overly complex
because freshProbe.ok is known false there; simplify the logic in the
setItems(prev => prev.map(...)) branch for i.status === "loading" to set error
to freshProbe.error (or freshProbe.error ?? "Failed to refresh eval cache")
instead of using freshProbe.ok ? ... : (!freshProbe.ok ? freshProbe.error :
"Failed to refresh eval cache"); update the mapping that touches status, name
and error (the setItems callback referencing prev, i and freshProbe) to remove
the unreachable branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 186-188: The branch that updates items when `fresh` exists leaves
the item stuck in "loading" if `fresh` is falsy (e.g.,
`result.summary.results[0]` is undefined); update the logic around the
`setItems` call for `enricherName` to include an else branch that sets the
item's status out of "loading" (e.g., { status: "done", name: enricherName,
result: undefined, cached: false } or an appropriate error status) so the UI
won't remain in loading; modify the code where `if (fresh) { setItems(prev =>
prev.map(...)) }` to handle the falsy case for `fresh` using the same
`setItems`/`enricherName` symbols.

In `@app/components/eval-results-panel.tsx`:
- Around line 244-246: The current check only updates the item when fresh is
truthy, leaving the item stuck in "loading" if result.summary.results[0] is
undefined; update the state in an else branch after the existing if (fresh) so
setItems(...) still runs and clears the loading state (e.g., set status to
"done" for the item identified by evalName and provide a result of undefined or
an appropriate empty value and cached: false) using the same setItems callback
to ensure the UI doesn't remain in loading when fresh is absent.
- Around line 265-301: runAll (and rerunSingleEval) can resume after an effect
reset because they read the shared mountedRef; add a per-invocation cancellation
token (e.g., an AbortController or a local cancelled flag) inside runAll and
rerunSingleEval so each call can detect if it was superseded before applying
state. Specifically: create a local controller/token at the top of
runAll/rerunSingleEval, check token.cancelled/ signal.aborted before every
setItems and setGlobalRunning and after each await, and abort or ignore results
when the token is triggered; where possible pass the signal into async helpers
like runSubagentEvals/queueAndProcessSession/checkEvalCacheAndList so they can
early-return. Ensure the token is scoped to the invocation (not mountedRef) so
stale callbacks never mutate state using old closure values.

---

Nitpick comments:
In `@app/components/enrichment-results-panel.tsx`:
- Around line 231-242: The final setItems cleanup pass is redundant: when
freshProbe.ok && freshProbe.hasEnrichers you replace items via
setItems(freshProbe.cachedResults.map(...)) so there are no "loading" entries,
and in the else branch you already map loading→error; remove the trailing
cleanup pass (the setItems(prev => prev.map(...)) at the end) OR instead of
replacing items wholesale in the success branch, merge freshProbe.cachedResults
with the existing items (using the existing Phase 2 merge pattern) so entries
not present in freshProbe.cachedResults are preserved and can be turned into
errors; update references in this component to freshProbe, setItems and
cachedResults and ensure callers like
runAll/queueAndProcessSession/forceRefresh/agentId behavior remain consistent.
- Around line 158-163: The ternary inside the setItems call in
enrichment-results-panel (where setItems(prev => prev.map(...))) contains a dead
branch because in this else branch freshProbe.ok is always false; simplify the
error assignment by removing the unreachable third operand and choosing the
correct message based on freshProbe.ok being false (e.g., use freshProbe.error
or a single fallback message like "Enrichments unavailable after processing").
Update the mapping logic inside setItems (the branch that checks i.status ===
"loading") to set error to either freshProbe.error if present or the fallback
message, eliminating the redundant conditional involving freshProbe.ok.

In `@app/components/eval-results-panel.tsx`:
- Around line 217-221: The ternary constructing the error string inside the
setItems mapping is dead/overly complex because freshProbe.ok is known false
there; simplify the logic in the setItems(prev => prev.map(...)) branch for
i.status === "loading" to set error to freshProbe.error (or freshProbe.error ??
"Failed to refresh eval cache") instead of using freshProbe.ok ? ... :
(!freshProbe.ok ? freshProbe.error : "Failed to refresh eval cache"); update the
mapping that touches status, name and error (the setItems callback referencing
prev, i and freshProbe) to remove the unreachable branch.

In `@lib/eval-queue.ts`:
- Around line 533-544: The timeout callback currently leaves state.intervalId
set to the fired timer ID while running async work, which can mislead concurrent
start/stop logic; modify the callback in scheduleNext so the first action inside
the setTimeout handler is to set state.intervalId = null, then perform await
scanAndEnqueue() and await processBatch(), and finally call scheduleNext() to
create and assign a fresh timer ID; ensure startBackgroundProcessor and
stopBackgroundProcessor semantics remain the same and that scheduleNext is the
only place that assigns state.intervalId.
- Around line 441-469: The current logic calls fireAlerts whenever hasAlerts()
is true because allEvalResults/allEnrichResults include cached results, causing
alerts on every processSession (including UI force-refresh reruns); modify the
gating so fireAlerts only runs when there was at least one fresh (non-cached)
evaluation or enrichment performed in this run — detect fresh work by tracking a
boolean (e.g., named sawFreshEval or sawFreshEnrich) where eval/enrich execution
paths set it to true (ensure queueAndProcessSession({ forceRefresh }) and
processSession propagate forceRefresh to force execution), and change the
condition around fireAlerts to require hasAlerts() && (sawFreshEval ||
sawFreshEnrich) before calling fireAlerts with the same payload (projectName,
sessionId, evalSummary, enrichSummary).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d247c9 and 96aeec8.

📒 Files selected for processing (3)
  • app/components/enrichment-results-panel.tsx
  • app/components/eval-results-panel.tsx
  • lib/eval-queue.ts

Comment on lines +186 to +188
if (fresh) {
setItems(prev => prev.map(i => i.name === enricherName ? { status: "done", name: enricherName, result: fresh, cached: false } : i));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Same missing else branch as in the eval panel — item stuck in "loading" when result.summary.results[0] is undefined.

🛠️ Proposed fix
             const fresh = result.summary.results[0];
             if (fresh) {
               setItems(prev => prev.map(i => i.name === enricherName ? { status: "done", name: enricherName, result: fresh, cached: false } : i));
-            }
+            } else {
+              setItems(prev => prev.map(i => i.name === enricherName ? { status: "error", name: enricherName, error: "No result returned" } : i));
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/enrichment-results-panel.tsx` around lines 186 - 188, The
branch that updates items when `fresh` exists leaves the item stuck in "loading"
if `fresh` is falsy (e.g., `result.summary.results[0]` is undefined); update the
logic around the `setItems` call for `enricherName` to include an else branch
that sets the item's status out of "loading" (e.g., { status: "done", name:
enricherName, result: undefined, cached: false } or an appropriate error status)
so the UI won't remain in loading; modify the code where `if (fresh) {
setItems(prev => prev.map(...)) }` to handle the falsy case for `fresh` using
the same `setItems`/`enricherName` symbols.

Comment on lines +244 to +246
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing else branch leaves item stuck in "loading" when result.summary.results[0] is undefined.

In the Phase 2 subagent path, result.summary.results[0] may be undefined (e.g., if runSubagentEvals returns { ok: true, hasEvals: true, summary: { results: [] } }). The existing if (fresh) guard never transitions the item out of "loading" in that case.

🛠️ Proposed fix
             const fresh = result.summary.results[0];
             if (fresh) {
               setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
-            }
+            } else {
+              setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "No result returned" } : i));
+            }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
}
const fresh = result.summary.results[0];
if (fresh) {
setItems(prev => prev.map(i => i.name === evalName ? { status: "done", name: evalName, result: fresh, cached: false } : i));
} else {
setItems(prev => prev.map(i => i.name === evalName ? { status: "error", name: evalName, error: "No result returned" } : i));
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 244 - 246, The current
check only updates the item when fresh is truthy, leaving the item stuck in
"loading" if result.summary.results[0] is undefined; update the state in an else
branch after the existing if (fresh) so setItems(...) still runs and clears the
loading state (e.g., set status to "done" for the item identified by evalName
and provide a result of undefined or an appropriate empty value and cached:
false) using the same setItems callback to ensure the UI doesn't remain in
loading when fresh is absent.

Comment on lines +265 to 301
const runAll = useCallback(async () => {
setGlobalRunning(true);
setItems(prev => prev.map(i => ({ status: "loading" as const, name: i.name })));
try {
const result = agentId
? await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, forceRefresh)
: await runEvals(projectName, sessionId, forceRefresh);
if (controller.signal.aborted) return;
if (!result.ok) {
setError(result.error);
} else if (!result.hasEvals) {
setNoEvals(true);
if (agentId) {
const result = await runSubagentEvals(projectName, sessionId, agentId, subagentType, subagentDescription, true);
if (!mountedRef.current) return;
if (result.ok && result.hasEvals) {
setItems(result.summary.results.map(r => ({ status: "done" as const, name: r.name, result: r, cached: false })));
setAllCached(false);
} else {
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error", name: i.name, error: (!result.ok ? result.error : null) || "Eval run failed" } : i
));
}
} else {
setSummary(result.summary);
setCached(result.cached);
const queueResult = await queueAndProcessSession(projectName, sessionId, { forceRefresh: true });
if (!mountedRef.current) return;
if (!queueResult.ok) {
setItems(prev => prev.map(i => ({ status: "error" as const, name: i.name, error: queueResult.error || "Queue processing failed" })));
return;
}
const freshProbe = await checkEvalCacheAndList(projectName, sessionId, agentId, subagentType);
if (!mountedRef.current) return;
if (freshProbe.ok && freshProbe.hasEvals) {
setItems(freshProbe.cachedResults.map(r => ({ status: "done" as const, name: r.name, result: r, cached: false })));
setAllCached(false);
} else {
setItems(prev => prev.map(i =>
i.status === "loading" ? { status: "error", name: i.name, error: "Evals unavailable after processing" } : i
));
}
}
} catch (e) {
if (controller.signal.aborted) return;
setError(e instanceof Error ? e.message : "Failed to run evals");
} finally {
if (!controller.signal.aborted) {
setLoading(false);
}
if (mountedRef.current) setGlobalRunning(false);
}
}, [projectName, sessionId, agentId, subagentType, subagentDescription]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

fd -t f "eval-results-panel.tsx"

Repository: exospherehost/claudeye

Length of output: 103


🏁 Script executed:

wc -l app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 107


🏁 Script executed:

cat -n app/components/eval-results-panel.tsx | head -100

Repository: exospherehost/claudeye

Length of output: 3800


🏁 Script executed:

cat -n app/components/eval-results-panel.tsx | sed -n '100,250p'

Repository: exospherehost/claudeye

Length of output: 7939


🏁 Script executed:

cat -n app/components/eval-results-panel.tsx | sed -n '250,350p'

Repository: exospherehost/claudeye

Length of output: 5681


🏁 Script executed:

cat -n app/components/eval-results-panel.tsx | sed -n '350,472p'

Repository: exospherehost/claudeye

Length of output: 6160


🏁 Script executed:

# Verify if there are other places where runAll or rerunSingleEval are called
rg -n "runAll|rerunSingleEval" app/components/eval-results-panel.tsx

Repository: exospherehost/claudeye

Length of output: 381


🏁 Script executed:

# Check the dependency arrays more carefully
cat -n app/components/eval-results-panel.tsx | sed -n '265,305p'

Repository: exospherehost/claudeye

Length of output: 2325


runAll and rerunSingleEval lack per-invocation cancellation—stale callbacks can observe a refreshed mountedRef after effect cleanup/restart.

When props change mid-flight, the effect cleanup sets mountedRef.current = false and the new effect immediately restores it to true. If an in-flight async operation in runAll (or rerunSingleEval) resumes after that reset, it sees a fresh mountedRef.current value and proceeds with setItems/setGlobalRunning using stale closure values (old projectName, sessionId, etc.), potentially writing incorrect state to the newly mounted component.

The effect protects itself with a local stale flag (line 261 cleanup), but the callbacks cannot access it. Using an AbortController or local cancelled flag for each callback invocation—separate from the component's effect lifecycle—would reliably prevent stale work from completing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/components/eval-results-panel.tsx` around lines 265 - 301, runAll (and
rerunSingleEval) can resume after an effect reset because they read the shared
mountedRef; add a per-invocation cancellation token (e.g., an AbortController or
a local cancelled flag) inside runAll and rerunSingleEval so each call can
detect if it was superseded before applying state. Specifically: create a local
controller/token at the top of runAll/rerunSingleEval, check token.cancelled/
signal.aborted before every setItems and setGlobalRunning and after each await,
and abort or ignore results when the token is triggered; where possible pass the
signal into async helpers like
runSubagentEvals/queueAndProcessSession/checkEvalCacheAndList so they can
early-return. Ensure the token is scoped to the invocation (not mountedRef) so
stale callbacks never mutate state using old closure values.

@NiveditJain
Copy link
Member Author

Closing in favor of a fresh PR with consolidated changes.

NiveditJain added a commit that referenced this pull request Feb 25, 2026
* feat: progressive eval & enrichment display (v0.6.0)

Show eval/enrichment names the moment the page loads and stream each
result into the UI as it completes. Cached results appear instantly,
slow evals don't block anything.

- Add enricherName param to runEnrichments and runSubagentEnrichments
- Add checkEvalCacheAndList and checkEnrichmentCacheAndList cache probes
- Rewrite EvalResultsPanel and EnrichmentResultsPanel for two-phase
  progressive loading (probe cache → stream uncached results)
- Remove runSessionDashboard batch — panels now self-load
- Fix localStorage.setItem unhandled error in ThemeContext tests
- Bump version to 0.6.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: alert API, eval queue, and queue status UI

- app.alert() API for post-eval callbacks (Slack, CI, logging)
- Priority eval queue with background processor (--queue-interval)
- Queue Status panel on /dashboard with live polling
- Force-refresh options for re-run actions routed through queue
- CLI flags: --queue-interval, --queue-concurrency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: alert API, eval queue, queue status UI, and alerts example

- CHANGELOG: added Alert API, Eval Queue, Queue Status UI sections to v0.6.0
- README: alerts section, queue processing section, updated CLI reference and How It Works
- API reference: full app.alert() docs with types, background queue processing section, updated full example and tips
- examples/alerts.js: complete alert example with evals, enrichments, and 3 alert patterns
- CONTRIBUTING: added examples dir, key subsystems table, environment variables table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve code review issues across eval queue, alerts, and UI panels

- Fix promise coalescing silently dropping force options for concurrent callers (C1)
- Fix processSession swallowing errors causing infinite loading spinners (I3)
- Fix processing Set leak on error paths via finally block (I4)
- Replace forEach+async with Promise.all in eval/enrichment panels (I5)
- Add Priority enum, priorityLabel helper, visibility-based polling pause
- Fix PriorityBadge to use server-provided label instead of hardcoded check
- Add 26 tests for eval-queue core (push/dequeue/dedup/priority/status)
- Add 11 tests for alert registry and dispatcher (register/replace/fire/isolation)
- Replace as-any test stubs with properly typed RegisteredEval/RegisteredEnricher

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: error handling, cached badge preservation, and processor re-entrancy

Address remaining PR #30 review comments:
- Add .catch() to all promise chains in eval/enrichment panels (C2/C4/C14/C16)
- Add error fallbacks for unhappy paths in runAll and rerunSingleEval (C3/C5/C6/C9)
- Preserve cached badge in Phase 2 refresh by guarding on loading status (C17/C18)
- Replace setInterval with self-scheduling setTimeout to prevent re-entrancy (C13)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve PR #30 code review issues (7 valid findings)

- Fix stuck-loading when !hasEvals/!hasEnrichers on subagent path
- Remove cumulative processedCount from queue panel visibility gate
- Add aria-expanded to queue panel toggle buttons
- Add strict regex validation for --queue-interval/--queue-concurrency
- Rename misleading eval-queue test title
- Add itemLabel to runSessionAction for correct enricher error messages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve remaining PR #30 review comments (3 valid findings)

- Preserve server-provided error messages in runAll subagent branches
  instead of generic "Eval/Enrichment run failed" strings
- Update contentHash and addedAt in push() when session contents change,
  preventing stale cache hits during queue processing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: stale effect guards, stuck-loading fallbacks, and UI spacing

- Add per-effect `stale` flag to prevent stale async updates after prop
  changes (both eval and enrichment panels)
- Handle non-success and partial results from Phase 2 re-probe, ensuring
  all loading items reach a terminal state
- Add cleanup pass in enrichment runAll for items not covered by fresh probe
- Clear allCached on single-item eval re-run to prevent stale badge
- Remove duplicate pt-2 spacing from EnrichmentLoadingRow/EnrichmentErrorRow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: v0.6.0 — unified queue, bun migration, alerts, progressive display

Unified Queue System:
- Single priority queue for all eval/enrichment processing
- Per-item granularity with HIGH/LOW priority
- Subagent queue integration via /api/queue-item
- Completed items tracking with TTL-based pruning
- Queue details page at /queue with three tabs
- Navbar dropdown with recently completed items

Bun Migration:
- Switched to Bun as primary runtime and package manager
- CI updated to use oven-sh/setup-bun@v2
- Removed tsx dependency

Alert API:
- app.alert() fires after all evals/enrichments complete
- Error isolation via Promise.allSettled
- Chainable API

Progressive Display:
- Two-phase loading: fast cache probe then stream uncached
- Per-item cached badges
- Self-sufficient panels (no parent initialResult)

Bug Fixes:
- Unhandled promise rejection handlers
- Stuck loading state transitions
- Cached badge preservation on refresh
- Background processor re-entrancy fix

Docs:
- Fixed CLI reference (queue flags are env-var-only)
- Consolidated redundant CHANGELOG sections
- Corrected concurrency default to 3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: prevent event-loop starvation during eval processing

- Add request coalescing in runtimeCache to prevent cache stampede
- Add yield points in enrichment pass and pre-sort for large arrays
- Parallelize cache checks in scanAndEnqueue and tryFireSessionAlerts
- Add 10s timeout-based 202 response for slow queue items
- Handle queued state in client helpers and panel components
- Remove automatic queue details panel from dashboard page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve react-hooks lint violations breaking CI

Remove staleRef render-time access in queue page (react-hooks/refs) and
consolidate useEffect calls in queue-status to avoid set-state-in-effect
violations. Also includes hash robustness, loader cleanup, and test updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants