Dashboard aggregates with mandatory collect+reduce #29
- Improved dashboard load times by implementing parallel session file discovery and cache operations, significantly speeding up performance for projects with many sessions.
- Updated cache invalidation and file stat calls to run in parallel, enhancing efficiency.
- UI improvements include immediate loading spinners for eval and enrichment results panels when no initial result is available, and various text overflow fixes for better display of tool names.
- Updated CHANGELOG.md to reflect these changes.
…dling
- Introduced a new `batchAll` utility to manage concurrency for async tasks, preventing file descriptor exhaustion during parallel operations.
- Updated various actions (compute-dashboard, run-session-action, run-subagent-enrichments, run-subagent-evals) to use `batchAll` for concurrency-limited execution of cache operations and file handling.
- Enhanced cache invalidation in `LocalCacheBackend` to use `batchAll`, improving performance during file deletions.
- Updated CHANGELOG.md to document these changes and improvements.
Update CHANGELOG.md to reference batchAll instead of Promise.allSettled/Promise.all, matching the actual implementation. Add vitest suite for the batchAll concurrency utility covering order preservation, bounded concurrency, empty inputs, and rejection handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
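The commits above describe `batchAll` but don't show its source. A helper with the properties the test suite covers (order preservation, bounded concurrency, empty inputs, rejection handling) could be sketched roughly as follows; the function signature and the settled-result shape are assumptions for illustration, not the project's actual implementation:

```typescript
// Hypothetical sketch of a batchAll-style helper: run async tasks with at
// most `limit` in flight, returning settled results in input order.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

async function batchAll<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<Array<Settled<T>>> {
  const results: Array<Settled<T>> = new Array(tasks.length);
  let next = 0;

  // Each worker claims the next unclaimed index; since the claim happens
  // synchronously (no await between check and increment), indices are
  // never processed twice and at most `limit` tasks run concurrently.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      try {
        results[i] = { status: "fulfilled", value: await tasks[i]() };
      } catch (reason) {
        results[i] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Because rejections are captured per-task rather than propagated, one failed file operation cannot abort the rest of the batch, which matches the cache-invalidation use case described above.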
…filtering (v0.5.5)
Resolves JavaScript heap out-of-memory crashes on large workspaces by:
- Adding LRU eviction (maxSize) to runtimeCache, capping getCachedSessionLog at 20 entries
- Introducing DashboardIndex in globalThis for incremental session tracking (only new/changed/deleted sessions are processed)
- Bypassing the runtime cache in computeDashboard to allow immediate GC of parsed JSONL data
- Moving filtering and pagination server-side with debounced re-fetches
- Using incremental accumulators for filter metadata instead of unbounded arrays
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep incremental DashboardIndex and hashEvalsModule import from feature branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Auto-reduce produced cluttered tables with mixed types. Users now provide both collect and reduce functions for full control over aggregate output columns and rows. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough
Adds Dashboard Aggregates (v0.5.6): a new app.dashboard.aggregate API with per-session collect and cross-session reduce phases, incremental per-session caching/indexing via the computeAggregates server action, an aggregate registry and types, client UI for sortable aggregate tables, and per-eval recompute support (single-eval rerun bypassing cache).
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User
    participant AggClient as AggregateClient
    participant Server as computeAggregates
    participant Registry as DashboardRegistry
    participant Storage as GlobalIndex
    User->>AggClient: open dashboard view
    AggClient->>Server: computeAggregates(viewName)
    Server->>Registry: getAggregatesForView(viewName)
    Registry-->>Server: aggregates[]
    Server->>Storage: getOrCreateIndex(viewName,...)
    Storage-->>Server: AggregateIndex
    Note over Server: Phase 1 — Session Discovery
    Server->>Server: discover sessions, hash contents
    Server->>Storage: diff hashes -> changedSessions
    Storage-->>Server: changedSessions
    Note over Server: Phase 2 — Per-Session Collect
    loop per changed session
        Server->>Server: parse session, build AggregateContext
        Server->>Server: aggregate.collect(context)
        Server->>Storage: cache per-session result (fire-and-forget)
    end
    Note over Server: Phase 3 — Reduce
    Server->>Server: aggregate.reduce(collectedSessions)
    Server-->>AggClient: AggregatePayload (aggregates, totals)
    AggClient->>AggClient: render AggregateCustomTable
    User->>AggClient: click column header
    AggClient->>AggClient: toggle sort & re-render
```
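The final "toggle sort & re-render" step of the diagram can be illustrated with a small framework-free sketch. The state shape and function names here are hypothetical, not taken from aggregate-custom-table.tsx:

```typescript
// Illustrative sort-toggle logic for a sortable aggregate table:
// clicking the active column flips direction; a new column starts ascending.
type SortDir = "asc" | "desc";
interface SortState { column: string | null; dir: SortDir; }

function toggleSort(state: SortState, column: string): SortState {
  if (state.column === column) {
    return { column, dir: state.dir === "asc" ? "desc" : "asc" };
  }
  return { column, dir: "asc" };
}

// Return a sorted copy of the rows; numbers compare numerically,
// everything else falls back to locale-aware string comparison.
function sortRows(
  rows: Array<Record<string, string | number>>,
  state: SortState
): Array<Record<string, string | number>> {
  if (!state.column) return rows;
  const col = state.column;
  const sign = state.dir === "asc" ? 1 : -1;
  return [...rows].sort((a, b) => {
    const av = a[col];
    const bv = b[col];
    if (typeof av === "number" && typeof bv === "number") return (av - bv) * sign;
    return String(av).localeCompare(String(bv)) * sign;
  });
}
```

In a React component this state would live in `useState`, with the `<th>` click handler calling `toggleSort` and the render sorting via `sortRows`.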
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This pull request introduces a comprehensive dashboard aggregation system for cross-session analytics, along with significant performance improvements to prevent memory issues in large workspaces. The changes span two versions (0.5.5 and 0.5.6) and include a new public API, incremental computation infrastructure, and server-side filtering/pagination.
Changes:
- New aggregate API: Adds `app.dashboard.aggregate()` with a mandatory `{ collect, reduce }` pattern for defining cross-session summary tables with full control over output columns and rows
- Performance optimizations: Implements an LRU-bounded runtime cache, an incremental dashboard index to avoid reprocessing unchanged sessions, and server-side filtering/pagination to reduce payload sizes
- Memory safety: Eliminates OOM crashes on large workspaces by capping the runtime cache at 20 entries and bypassing it during dashboard filter computation
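To make the mandatory collect/reduce split concrete, here is a self-contained sketch of the two phases. The types, field names, and key convention (`<eval>_pass` / `<eval>_score`) are invented for this example, based on the doc snippets quoted later in this thread, and are not the library's exact API:

```typescript
// Per-session collect: extract only the small values reduce will need.
interface CollectedSession {
  sessionId: string;
  values: Record<string, unknown>;
}
type AggregateRow = Record<string, string | number>;

function collect(session: { id: string; passed: boolean; score: number }): CollectedSession {
  return {
    sessionId: session.id,
    values: { quality_pass: session.passed, quality_score: session.score },
  };
}

// Cross-session reduce: the user fully controls output columns and rows.
function reduceAggregates(collected: CollectedSession[]): AggregateRow[] {
  // Empty-state guard discussed in the review thread below.
  if (collected.length === 0) return [];
  const n = collected.length;
  const passes = collected.filter((s) => s.values["quality_pass"] === true).length;
  const totalScore = collected.reduce((sum, s) => {
    const v = s.values["quality_score"];
    return sum + (typeof v === "number" ? v : 0);
  }, 0);
  return [{ Eval: "quality", "Pass Rate": passes / n, "Avg Score": totalScore / n }];
}
```

The collect phase runs once per changed session and is cached; the reduce phase runs over all cached per-session payloads to produce the table rows.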
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| package.json | Version bump to 0.5.6 |
| lib/runtime-cache.ts | Adds optional LRU eviction with maxSize parameter |
| lib/log-entries.ts | Applies maxSize limit (20) to session log cache |
| lib/evals/dashboard-types.ts | Defines aggregate types (AggregateContext, CollectedSession, AggregateTableRow, etc.) and serialization types for server-side filtering |
| lib/evals/dashboard-registry.ts | Adds aggregate registry functions (registerAggregate, getAggregatesForView, etc.) |
| lib/evals/app.ts | Exposes aggregate API on DashboardBuilder and DashboardViewBuilder |
| lib/evals/index.ts | Exports new aggregate types for public API |
| app/actions/compute-dashboard.ts | Implements incremental DashboardIndex with server-side filtering and pagination |
| app/actions/compute-aggregates.ts | New server action that runs collect/reduce phases with incremental caching |
| app/dashboard/dashboard-client.tsx | Refactored for server-side filtering with debounced re-fetches |
| app/dashboard/dashboard-sessions-table.tsx | Updated for server-side pagination |
| app/dashboard/aggregate-client.tsx | Client component that fetches and displays aggregates |
| app/dashboard/aggregate-section.tsx | Wrapper component for individual aggregate tables |
| app/dashboard/aggregate-custom-table.tsx | Sortable table component for aggregate results |
| app/dashboard/page.tsx | Integrates AggregateClient for default view |
| app/dashboard/[viewName]/page.tsx | Integrates AggregateClient for named views |
| app/actions/list-dashboard-views.ts | Includes aggregate count in view metadata |
| docs/api-reference.md | Documents aggregate API with examples and type signatures |
| README.md | Adds aggregate feature documentation with examples |
| CHANGELOG.md | Details 0.5.5 and 0.5.6 changes |
| __tests__/lib/runtime-cache.test.ts | Comprehensive LRU eviction tests |
| test_evals.js | Example aggregate usage for testing |
```ts
'Pass Rate': collected.filter(s => s.values[`${name}_pass`]).length / collected.length,
'Avg Score': collected.reduce((sum, s) => {
  const v = s.values[`${name}_score`];
  return sum + (typeof v === 'number' ? v : 0);
}, 0) / collected.length,
```
Division by zero risk when collected array is empty. If no sessions match the aggregate's condition or if all sessions fail during collect, the division by collected.length will result in Infinity. Consider adding a safeguard like const n = collected.length || 1; or checking if the array is empty before performing divisions.
Same as the README comment — this is a documentation example, not runtime code. The reduce function is fully user-authored, so edge case handling is the user's responsibility.
```ts
return Array.from(evalNames).map(name => ({
  'Eval': name,
  'Pass Rate': collected.filter(s => s.values[`${name}_pass`]).length / collected.length,
```
Division by zero risk when collected array is empty. If no sessions match the aggregate's condition or if all sessions fail during collect, the division by collected.length will result in Infinity. Consider adding a safeguard like const n = collected.length || 1; or checking if the array is empty before performing divisions.
```diff
-return Array.from(evalNames).map(name => ({
-  'Eval': name,
-  'Pass Rate': collected.filter(s => s.values[`${name}_pass`]).length / collected.length,
+const totalSessions = collected.length || 1;
+return Array.from(evalNames).map(name => ({
+  'Eval': name,
+  'Pass Rate': collected.filter(s => s.values[`${name}_pass`]).length / totalSessions,
```
This is a documentation example, not runtime code. The reduce function is user-authored — edge case handling is the user's responsibility. The AggregateCustomTable component already renders "No data returned from reduce function." when rows are empty. Adding defensive guards to doc examples adds noise to what should be concise illustrative code.
Missing cleanup for debounce timer on component unmount. The debounceRef.current timeout should be cleared when the component unmounts to prevent calling state setters on an unmounted component. Add a cleanup function to clear the timeout on unmount or when the component re-renders.
```tsx
useEffect(() => {
  return () => {
    if (debounceRef.current) {
      clearTimeout(debounceRef.current);
    }
  };
}, []);
```
Good catch. Fixed in e53914c — added useEffect cleanup that clears the debounce timer on unmount.
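For context, the debounce-with-cleanup pattern being fixed here can be sketched generically. This is not the component's actual code; the `cancel()` method plays the role of the unmount cleanup above:

```typescript
// Generic debounce sketch: each call resets the timer, so the wrapped
// function fires only after `waitMs` of quiet. cancel() mirrors the
// useEffect cleanup that clears the pending timer on unmount.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): { call: (...args: A) => void; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | null = null;
  return {
    call: (...args: A) => {
      if (timer) clearTimeout(timer);
      timer = setTimeout(() => fn(...args), waitMs);
    },
    cancel: () => {
      if (timer) {
        clearTimeout(timer);
        timer = null;
      }
    },
  };
}
```

Without `cancel()`, a pending timer could fire after the component unmounts and call state setters on a dead component, which is exactly the leak the review flagged.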
app/actions/compute-dashboard.ts
Outdated
Inconsistent naming convention for global registry keys. The key uses __VIZAUDE_DASHBOARD_INDEX__ but the rest of the codebase uses __CLAUDEYE_ prefix for global keys. Consider renaming to __CLAUDEYE_DASHBOARD_INDEX__ for consistency with other registry keys like __CLAUDEYE_EVAL_REGISTRY__, __CLAUDEYE_DASHBOARD_FILTER_REGISTRY__, etc.
```ts
const INDEX_KEY = "__CLAUDEYE_DASHBOARD_INDEX__";
```
Good catch. Fixed in e53914c — renamed to __CLAUDEYE_DASHBOARD_INDEX__ for consistency.
app/actions/compute-aggregates.ts
Outdated
```ts
  contentHashes: Map<string, string>;
}

const INDEX_KEY = "__VIZAUDE_AGGREGATE_INDEX__";
```
Inconsistent naming convention for global registry keys. The key uses __VIZAUDE_AGGREGATE_INDEX__ but the rest of the codebase uses __CLAUDEYE_ prefix for global keys. Consider renaming to __CLAUDEYE_AGGREGATE_INDEX__ for consistency with other registry keys like __CLAUDEYE_EVAL_REGISTRY__, __CLAUDEYE_DASHBOARD_FILTER_REGISTRY__, etc.
```diff
-const INDEX_KEY = "__VIZAUDE_AGGREGATE_INDEX__";
+const INDEX_KEY = "__CLAUDEYE_AGGREGATE_INDEX__";
```
Good catch. Fixed in e53914c — renamed to __CLAUDEYE_AGGREGATE_INDEX__ for consistency.
Expired cache entries are not removed before checking capacity. When a cache hit occurs but the entry has expired (line 31 check fails), the expired entry remains in the cache Map. If the cache is at maxSize, line 43 will evict the oldest entry instead of replacing the expired one, potentially keeping more entries than maxSize. Consider removing expired entries before inserting new ones, or clearing expired entries when checking capacity.
This is a non-issue. When an expired entry is found (line 31 check fails), execution falls through to fn(...args), then cache.set(key, ...) at line 48 overwrites the expired entry in-place because it's the same key — the Map size doesn't increase. The only theoretical concern would be a different key's expired entry taking up space, but with a cap of 20 entries and natural replacement on next access, adding a scan-all-entries path on every set would be unnecessary overhead.
Actionable comments posted: 3
🧹 Nitpick comments (2)
app/dashboard/page.tsx (1)
69-71: Gate default aggregates render to avoid unnecessary compute/flash. Right now the default `AggregateClient` renders even when no aggregates are registered, which still triggers `computeAggregates` and a brief loading state. Consider plumbing a `hasDefaultAggregates` (or default aggregate count) from `listDashboardViews` and conditionally rendering these blocks.

♻️ Example adjustment (requires listDashboardViews to expose a default-aggregate flag)

```diff
-  const { views, hasDefaultFilters } = result;
+  const { views, hasDefaultFilters, hasDefaultAggregates } = result;
...
-  <div className="mt-8">
-    <AggregateClient viewName="default" />
-  </div>
+  {hasDefaultAggregates && (
+    <div className="mt-8">
+      <AggregateClient viewName="default" />
+    </div>
+  )}
...
-  <div className="mt-8">
-    <AggregateClient viewName="default" />
-  </div>
+  {hasDefaultAggregates && (
+    <div className="mt-8">
+      <AggregateClient viewName="default" />
+    </div>
+  )}
```

Also applies to: 89-91
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/dashboard/page.tsx` around lines 69-71: The default AggregateClient is rendered unconditionally and triggers computeAggregates even when there are no registered default aggregates; update the dashboard render logic to accept a flag from listDashboardViews (e.g., hasDefaultAggregates or defaultAggregateCount) and only render <AggregateClient viewName="default" /> (and the similar block at lines 89-91) when that flag indicates defaults exist; modify the code paths that call listDashboardViews to return the new flag and use it in the page component to gate rendering so computeAggregates is not invoked unnecessarily.

test_evals.js (1)

46-63: Handle empty aggregate input explicitly. Returning rows when `collected` is empty yields a table of zeros and bypasses the empty-state message. Consider returning `[]` to indicate no data.

♻️ Suggested guard for empty aggregates

```diff
 reduce: (collected) => {
+  if (collected.length === 0) return [];
   const models = new Map();
   let totalTurns = 0;
   let totalTools = 0;
   let completions = 0;
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test_evals.js` around lines 46-63, The reduce function in the provided test_evals.js aggregates metrics over the collected array but still returns a table of zeroed metrics when collected is empty; change the logic in the reduce reducer (the function named reduce that takes collected) to explicitly check if collected.length === 0 and return [] immediately so the empty-state handling can run instead of emitting rows of zeros; update references inside reduce (totalTurns/totalTools/completions/models calculation and the final return) to only run when collected is non-empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/actions/compute-aggregates.ts`:
- Around line 233-239: The code currently builds filterValues by only adding
results where !r.skipped && !r.error, which omits keys and violates the
FilterValue contract; change the loop that processes runAllFilters(...)
(filterSummary.results) to always set filterValues[r.name] = r.value for every
result (regardless of r.skipped or r.error) so each filter key is populated with
the returned FilterValue object; keep any existing logic that handles
skipped/error consumers unchanged but ensure filterValues is fully populated by
function runAllFilters and the variable filterValues.
- Around line 154-165: The loop updates index.contentHashes immediately which
can leave hashes advanced when a subsequent per-session collect fails, causing
stale aggregates; change the flow in the code that iterates over hashResults
(using result.value { key, contentHash, project, file }) to only set
index.contentHashes.set(key, contentHash) after the collect for that key
completes successfully (i.e., after the work that consumes toCompute for that
key finishes), or alternatively collect a provisional list of {key, task,
contentHash} and update index.contentHashes only once the collect step for each
entry succeeds; ensure index.collectedSessions remains consistent with
contentHashes by updating both together after successful collect rather than
before.
In `@app/dashboard/aggregate-custom-table.tsx`:
- Around line 61-71: The header cells rendered in the columns.map currently use
a clickable <th> with onClick(toggleSort) which isn't keyboard-accessible and
lacks aria-sort; make the header interactive and accessible by making the <th>
focusable (add tabIndex={0} or replace with a <button>/role="button"), add an
onKeyDown handler that triggers toggleSort(col) for Enter and Space, and expose
current sort state via aria-sort on the header (compute
"none"/"ascending"/"descending" from your sort state and apply it to the same
<th> rendered in columns.map); keep the ArrowUpDown icon and ensure toggleSort
and the sort state variables are referenced to determine aria-sort.
---
Nitpick comments:
In `@app/dashboard/page.tsx`:
- Around line 69-71: The default AggregateClient is rendered unconditionally and
triggers computeAggregates even when there are no registered default aggregates;
update the dashboard render logic to accept a flag from listDashboardViews
(e.g., hasDefaultAggregates or defaultAggregateCount) and only render
<AggregateClient viewName="default" /> (and the similar block at lines 89-91)
when that flag indicates defaults exist; modify the code paths that call
listDashboardViews to return the new flag and use it in the page component to
gate rendering so computeAggregates is not invoked unnecessarily.
In `@test_evals.js`:
- Around line 46-63: The reduce function in the provided test_evals.js
aggregates metrics over the collected array but still returns a table of zeroed
metrics when collected is empty; change the logic in the reduce reducer (the
function named reduce that takes collected) to explicitly check if
collected.length === 0 and return [] immediately so the empty-state handling can
run instead of emitting rows of zeros; update references inside reduce
(totalTurns/totalTools/completions/models calculation and the final return) to
only run when collected is non-empty.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
- CHANGELOG.md
- README.md
- app/actions/compute-aggregates.ts
- app/actions/list-dashboard-views.ts
- app/dashboard/[viewName]/page.tsx
- app/dashboard/aggregate-client.tsx
- app/dashboard/aggregate-custom-table.tsx
- app/dashboard/aggregate-section.tsx
- app/dashboard/page.tsx
- docs/api-reference.md
- lib/evals/app.ts
- lib/evals/dashboard-registry.ts
- lib/evals/dashboard-types.ts
- lib/evals/index.ts
- package.json
- test_evals.js
```ts
const filterValues: Record<string, FilterValue> = {};
if (filters.length > 0) {
  const filterSummary = await runAllFilters(rawLines, stats, project.name, sid, filters);
  for (const r of filterSummary.results) {
    if (!r.skipped && !r.error) {
      filterValues[r.name] = r.value;
    }
```
Always populate filterValues to match the FilterValue contract.
runAllFilters already returns a value even on skipped/error results, but this logic drops those entries. That leaves missing keys in filterValues, which breaks consumers that expect a FilterValue for each filter.
✅ Suggested fix

```diff
-    for (const r of filterSummary.results) {
-      if (!r.skipped && !r.error) {
-        filterValues[r.name] = r.value;
-      }
-    }
+    for (const r of filterSummary.results) {
+      filterValues[r.name] = r.value;
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/actions/compute-aggregates.ts` around lines 233 - 239, The code currently
builds filterValues by only adding results where !r.skipped && !r.error, which
omits keys and violates the FilterValue contract; change the loop that processes
runAllFilters(...) (filterSummary.results) to always set filterValues[r.name] =
r.value for every result (regardless of r.skipped or r.error) so each filter key
is populated with the returned FilterValue object; keep any existing logic that
handles skipped/error consumers unchanged but ensure filterValues is fully
populated by function runAllFilters and the variable filterValues.
This is intentional and matches the same pattern used in compute-dashboard.ts. A skipped filter was never computed — it has no meaningful value. An errored filter has no valid value. The AggregateContext.filterValues type is Record<string, FilterValue>, not Record<string, FilterValue | undefined> — including skipped/errored results would require providing a fabricated value. Collect functions already handle missing keys gracefully (the key simply isn't present). The guard is correct.
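The intentional guard defended here is small enough to isolate in a sketch. The `FilterResult` shape below is an assumption based on the fields named in the thread (`name`, `value`, `skipped`, `error`), not the project's exact type:

```typescript
// Sketch of the guard under discussion: only filters that actually produced
// a value get a key, so a missing key means "skipped or errored".
type FilterValue = boolean | number | string;

interface FilterResult {
  name: string;
  value: FilterValue;
  skipped?: boolean;
  error?: string;
}

function buildFilterValues(results: FilterResult[]): Record<string, FilterValue> {
  const filterValues: Record<string, FilterValue> = {};
  for (const r of results) {
    if (!r.skipped && !r.error) {
      filterValues[r.name] = r.value;
    }
  }
  return filterValues;
}
```

Consumers then test presence with the `in` operator or an `undefined` check rather than assuming every filter name is populated.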
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Verify the filterValues pattern in compute-dashboard.ts matches the pattern in compute-aggregates.ts
rg -n -A5 -B2 "filterValues\[r\.name\]" app/actions/compute-dashboard.ts
```
Length of output: 0
🏁 Script executed:
```bash
#!/bin/bash
# Also look at the broader filter handling context in compute-dashboard.ts
rg -n -A10 "runAllFilters" app/actions/compute-dashboard.ts | head -30
```
Length of output: 1116
🏁 Script executed:
```bash
#!/bin/bash
# Check the FilterValue type definition to understand the contract
rg -n "type FilterValue|interface FilterValue" lib/evals/
```
Length of output: 84
@NiveditJain, you're absolutely right—thank you for the clarification! I've verified that compute-dashboard.ts uses the identical pattern at lines 336-339, and your explanation makes perfect sense. Record<string, FilterValue> doesn't require all keys to be present; it just constrains the type when they are. Including skipped or errored filters would indeed require fabricating values or changing the type contract. The guard is correct, and I apologize for the noise.
ᏊˊᄇˋᏊ learned something new
✏️ Learnings added
Learnt from: NiveditJain
Repo: exospherehost/claudeye PR: 29
File: app/actions/compute-aggregates.ts:233-239
Timestamp: 2026-02-24T08:23:08.197Z
Learning: In the claudeye project, `filterValues` in `compute-dashboard.ts` and `compute-aggregates.ts` intentionally omits skipped and errored filters using the guard `if (!result.skipped && !result.error)`. This matches the type contract `Record<string, FilterValue>` (where FilterValue = boolean | number | string) which allows missing keys. Skipped filters were never computed and errored filters have no valid value, so they should not be included. Consumers handle missing keys gracefully.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: NiveditJain
Repo: exospherehost/claudeye PR: 11
File: app/actions/run-subagent-enrichments.ts:53-54
Timestamp: 2026-02-15T13:56:52.379Z
Learning: In the claudeye project's 0.5.0 unified data model, server actions like `runSubagentEnrichments` in `app/actions/run-subagent-enrichments.ts` intentionally pass combined session + all subagent data (from `getCachedSessionLog`) directly to enrichers without server-side filtering by `agentId`. It is the responsibility of individual enricher implementations to filter by `_source` field if they need agent-specific data. This design provides flexibility for enrichers to analyze either scoped or combined data as needed.
…cleanup
- Defer contentHashes update until after successful collect to prevent stale aggregates when collect fails
- Rename __VIZAUDE_* globalThis keys to __CLAUDEYE_* for consistency
- Add useEffect cleanup for debounce timer on unmount
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
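The "defer the hash update until collect succeeds" fix can be sketched in isolation. The names (`SessionIndex`, `processChangedSession`, `collectSession`) are assumptions for illustration, not the project's actual identifiers:

```typescript
// Sketch: only record the new content hash after collect succeeds, so a
// failed collect leaves the session marked "changed" and retried next run.
interface SessionIndex {
  contentHashes: Map<string, string>;
  collectedSessions: Map<string, unknown>;
}

async function processChangedSession(
  index: SessionIndex,
  key: string,
  contentHash: string,
  collectSession: (key: string) => Promise<unknown>
): Promise<void> {
  if (index.contentHashes.get(key) === contentHash) return; // unchanged: skip
  try {
    const collected = await collectSession(key);
    // Update the collected result and the hash together, after success,
    // keeping the two maps consistent with each other.
    index.collectedSessions.set(key, collected);
    index.contentHashes.set(key, contentHash);
  } catch {
    // Leave the old hash in place: the session stays "changed" and will be
    // re-attempted on the next computeAggregates run.
  }
}
```

Updating the hash eagerly (before collect) is the bug the review caught: a crash mid-collect would leave the index claiming the session was processed, producing stale aggregates.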
Each eval result row now has a refresh button that reruns just that eval, bypassing cache and merging the result back in-place. Supports both session and subagent scopes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/components/eval-results-panel.tsx (1)
144-215: ⚠️ Potential issue | 🟠 Major. Prevent stale per-row rerun results from overwriting a newer full rerun.
If a user clicks Re-run while a row rerun is in-flight, the row rerun can finish later and merge older data into the fresh summary. Add a lightweight sequencing guard so per-row reruns are ignored when a newer full run starts.
🔧 Suggested fix (sequence guard)
```diff
-  const abortRef = useRef<AbortController | null>(null);
+  const abortRef = useRef<AbortController | null>(null);
+  const fullRunSeqRef = useRef(0);

   const run = useCallback(async (forceRefresh = false) => {
+    const seq = ++fullRunSeqRef.current;
     // Abort any in-flight request
     abortRef.current?.abort();
     const controller = new AbortController();
     abortRef.current = controller;
@@
-    if (controller.signal.aborted) return;
+    if (controller.signal.aborted || seq !== fullRunSeqRef.current) return;
@@
-    if (controller.signal.aborted) return;
+    if (controller.signal.aborted || seq !== fullRunSeqRef.current) return;
@@
-    if (!controller.signal.aborted) {
+    if (!controller.signal.aborted && seq === fullRunSeqRef.current) {
       setLoading(false);
     }
   }, [projectName, sessionId, agentId, subagentType, subagentDescription]);

   const runSingleEval = useCallback(async (evalName: string) => {
+    const seqAtStart = fullRunSeqRef.current;
     try {
@@
-      setSummary((prev) => {
+      if (seqAtStart !== fullRunSeqRef.current) return;
+      setSummary((prev) => {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/components/eval-results-panel.tsx` around lines 144 - 215, Row-level reruns (runSingleEval) can finish after a newer full run (run) and overwrite fresh summary; introduce a lightweight sequence guard: add a useRef number (e.g., runSeqRef) that you increment at the start of run() (abortRef handling place), capture the current seq into runSingleEval before calling runEvals/runSubagentEvals, and when the row response arrives compare captured seq to runSeqRef.current and ignore/return if they differ so stale per-row results are not merged; update references in run, runSingleEval, and the abort logic to use runSeqRef for sequencing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/components/eval-results-panel.tsx`:
- Around line 112-119: The rerun button in eval-results-panel (button using
onClick={handleRerun} and the RefreshCw icon) lacks an accessible name; add an
aria-label (for example aria-label="Re-run evaluation" or similar descriptive
text) to the button element so screen readers receive a proper accessible name
while keeping the existing title, disabled state, and icon/animation logic
(preserve props: onClick={handleRerun}, disabled={rerunning || globalLoading},
and the RefreshCw usage).
---
Outside diff comments:
In `@app/components/eval-results-panel.tsx`:
- Around line 144-215: Row-level reruns (runSingleEval) can finish after a newer
full run (run) and overwrite fresh summary; introduce a lightweight sequence
guard: add a useRef number (e.g., runSeqRef) that you increment at the start of
run() (abortRef handling place), capture the current seq into runSingleEval
before calling runEvals/runSubagentEvals, and when the row response arrives
compare captured seq to runSeqRef.current and ignore/return if they differ so
stale per-row results are not merged; update references in run, runSingleEval,
and the abort logic to use runSeqRef for sequencing.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- CHANGELOG.md
- README.md
- app/actions/compute-aggregates.ts
- app/actions/compute-dashboard.ts
- app/actions/run-evals.ts
- app/actions/run-session-action.ts
- app/actions/run-subagent-evals.ts
- app/components/eval-results-panel.tsx
- app/dashboard/dashboard-client.tsx
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Code review: Found 2 issues:
- claudeye/app/components/eval-results-panel.tsx, lines 174 to 215 in 2e231b5
- claudeye/app/actions/compute-aggregates.ts, lines 143 to 150 in 2e231b5

🤖 Generated with Claude Code. If this code review was useful, please react with 👍. Otherwise, react with 👎.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/components/eval-results-panel.tsx`:
- Around line 191-210: When merging freshResult in the setSummary updater (the
callback passed to setSummary), also recompute summary.totalDurationMs from the
updatedResults instead of leaving it unchanged: after creating updatedResults
(which replaces the matching result by freshResult) sum each result's duration
field (e.g., result.durationMs or 0 if missing) into totalDurationMs and return
it along with passCount, failCount, errorCount, and skippedCount so the duration
display reflects the rerun immediately.
…ration
- Add singleEvalAbortRef so run() aborts in-flight single-eval requests, preventing stale per-eval results from overwriting a fresh global rerun
- Replace contentHash = "" with crypto.randomUUID() on hash failure so consecutive failures never match and permanently skip sessions
- Recompute totalDurationMs in setSummary updater after single-eval rerun
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
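The `crypto.randomUUID()` fallback in the commit above is worth a quick sketch: a constant sentinel like `""` would compare equal across two consecutive failures, making the diff logic think the session is unchanged and skip it forever. The function name and error path below are illustrative, not the project's code:

```typescript
// Sketch: on hash failure, return a unique sentinel so the failed session
// never matches a previously stored hash and is always re-processed.
import { createHash, randomUUID } from "node:crypto";

function hashContent(content: string | null): string {
  try {
    if (content === null) throw new Error("read failed"); // stand-in for an I/O error
    return createHash("sha256").update(content).digest("hex");
  } catch {
    // Unique per call: two consecutive failures can never compare equal,
    // unlike a constant "" sentinel which would mask the change forever.
    return randomUUID();
  }
}
```

This trades a small amount of wasted recomputation on persistent failures for the guarantee that no session is silently and permanently dropped from aggregation.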
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/actions/compute-aggregates.ts (1)
259-274: Pass the already-built `context` to `agg.condition` instead of the stripped `condCtx`. The full `AggregateContext` (including `evalResults`, `enrichResults`, and `filterValues`) is computed at lines 244–253 before this loop. The inline `condCtx` object is a manual subset that unnecessarily prevents condition authors from writing predicates like `ctx.filterValues["myFilter"] === true`. Since TypeScript allows passing a structural supertype to a function expecting a subset (no excess-property check on variables), this is a free improvement.

♻️ Proposed fix
```diff
   if (agg.condition) {
-    const condCtx = { entries: rawLines, stats, projectName: project.name, sessionId: sid, source: "session" };
-    const condResult = await agg.condition(condCtx);
+    const condResult = await agg.condition(context);
     if (!condResult) continue;
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/actions/compute-aggregates.ts` around lines 259 - 274, The loop that evaluates aggregate conditions currently builds and passes a stripped condCtx to agg.condition, which prevents condition functions from accessing the full AggregateContext (including evalResults, enrichResults, filterValues); change the call to pass the already-built context object (the AggregateContext computed at lines just above the loop) into agg.condition and remove or stop using the ad-hoc condCtx object; update any related variable usage in this block so agg.condition(context) is awaited and behaves the same on truthy/falsy results.
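The "structural supertype" point the nitpick relies on can be shown in isolation: TypeScript's excess-property check applies only to fresh object literals, so a variable of a wider type passes freely where a narrower type is expected. The interface and field names below are hypothetical stand-ins, not the project's actual `AggregateContext`.

```typescript
// Narrow context an aggregate condition is declared against (hypothetical names).
interface ConditionContext {
  sessionId: string;
  projectName: string;
}

// Wider context that additionally carries filter values, analogous to the
// full AggregateContext described in the review.
interface FullContext extends ConditionContext {
  filterValues: Record<string, unknown>;
}

type Condition = (ctx: ConditionContext) => boolean;

// A condition author can narrow the type to reach fields beyond the declared
// subset; optional chaining keeps this safe if only the subset was passed.
const condition: Condition = (ctx) =>
  (ctx as FullContext).filterValues?.["myFilter"] === true;

const context: FullContext = {
  sessionId: "s1",
  projectName: "demo",
  filterValues: { myFilter: true },
};

// Passing the wider variable where the subset is expected compiles fine:
// excess-property checks fire only on inline object literals, not variables.
const passes = condition(context);
```

This is why swapping `condCtx` for the full `context` costs nothing at the type level while unlocking richer predicates.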
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/actions/compute-aggregates.ts`:
- Around line 256-294: The code flattens per-aggregate collected maps into
mergedValues via Object.assign(mergedValues, values) (and similarly on
cache-restore), causing silent key collisions when different aggregates emit the
same key; fix by preserving per-aggregate scoping instead of flattening: either
namespace flat keys when building mergedValues (e.g., prefix keys with
`${agg.name}/`) before assigning, or change the CollectedSession.payload (and
the reduce call sites) to pass the scoped map allCollectedValues[agg.name] to
each aggregate's reduce so each aggregate only receives its own values; update
setCachedResult usage and the restore path to keep the same per-aggregate shape
(allCollectedValues) and adjust any consumers of CollectedSession.values to
expect namespaced keys or a per-aggregate map accordingly.
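Both remedies the comment proposes can be sketched side by side. The type and function names below (`AllCollectedValues`, `flattenNamespaced`, `reduceScoped`) are illustrative, not the project's actual API; the real shapes live in `compute-aggregates.ts`.

```typescript
// Per-aggregate collected values keyed by aggregate name, as the review
// suggests, instead of one flat merged map (names are hypothetical).
type CollectedValues = Record<string, unknown>;
type AllCollectedValues = Record<string, CollectedValues>; // keyed by agg.name

// Option A: namespace flat keys so two aggregates emitting "count" cannot
// collide — later aggregates no longer overwrite earlier ones.
function flattenNamespaced(all: AllCollectedValues): CollectedValues {
  const merged: CollectedValues = {};
  for (const [aggName, values] of Object.entries(all)) {
    for (const [key, value] of Object.entries(values)) {
      merged[`${aggName}/${key}`] = value;
    }
  }
  return merged;
}

// Option B: keep per-aggregate scoping and hand each reduce() only its own
// map, so an aggregate never sees another aggregate's values at all.
function reduceScoped(
  all: AllCollectedValues,
  aggregates: { name: string; reduce: (values: CollectedValues) => unknown }[]
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const agg of aggregates) {
    out[agg.name] = agg.reduce(all[agg.name] ?? {});
  }
  return out;
}
```

Whichever option is chosen, the cache write (`setCachedResult`) and the cache-restore branch must store the same shape, or restored sessions will reintroduce the collision.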
---
Nitpick comments:
In `@app/actions/compute-aggregates.ts`:
- Around line 259-274: The loop that evaluates aggregate conditions currently
builds and passes a stripped condCtx to agg.condition, which prevents condition
functions from accessing the full AggregateContext (including evalResults,
enrichResults, filterValues); change the call to pass the already-built context
object (the AggregateContext computed at lines just above the loop) into
agg.condition and remove or stop using the ad-hoc condCtx object; update any
related variable usage in this block so agg.condition(context) is awaited and
behaves the same on truthy/falsy results.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- app/actions/compute-aggregates.ts
- app/actions/compute-dashboard.ts
- app/components/eval-results-panel.tsx
Code review

Found 3 issues:
1. claudeye/app/actions/compute-aggregates.ts (lines 5–6 and 168–170 in 168b02c) — compare with claudeye/app/actions/run-evals.ts (lines 3 and 29 in 168b02c)
2. claudeye/app/actions/compute-aggregates.ts (lines 168–170 in 168b02c) — compare with claudeye/app/actions/run-enrichments.ts (line 27 in 168b02c)
3. claudeye/app/dashboard/page.tsx (lines 69–71 in 168b02c) — compare with the guarded version in the named view page: claudeye/app/dashboard/[viewName]/page.tsx (lines 36–40 in 168b02c)

🤖 Generated with Claude Code
…teClient
- Use getSessionScopedEvals/getSessionScopedEnrichers instead of getRegisteredEvals/getRegisteredEnrichers in compute-aggregates.ts to match the established pattern in run-evals.ts and run-enrichments.ts
- Guard AggregateClient renders on default dashboard page behind hasDefaultAggregates check to avoid unnecessary server round-trips

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
♻️ Duplicate comments (1)
app/actions/compute-aggregates.ts (1)
166-294: ⚠️ Potential issue | 🟠 Major

Potential key collisions when flattening aggregate values.

Both the cache-restore path and the live collect path flatten per-aggregate maps into one `mergedValues` object. If two aggregates emit the same key, later ones silently overwrite earlier values, and `reduce()` sees corrupted input. Consider namespacing keys (e.g., `${agg.name}/key`) or passing per-aggregate values into each `reduce()` so aggregates only see their own data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/actions/compute-aggregates.ts` around lines 166 - 294, The flattening of per-aggregate maps into mergedValues causes key collisions (seen in the cache restore branch and live collect loop using mergedValues), so change to a collision-safe representation: either keep allCollectedValues as a map keyed by agg.name (i.e., do not Object.assign into a flat mergedValues) and update downstream reduce() consumers to iterate per-aggregate entries, or namespace flattened keys consistently (e.g., `${agg.name}/${key}`) both when writing in agg.collect results, when merging in the live path (in the loop around agg.collect and Object.assign), and when restoring from getCachedResult; ensure setCachedResult and the cached-restore branch use the same format so reduce() and other consumers read per-aggregate values without overwrite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@app/actions/compute-aggregates.ts`:
- Around line 166-294: The flattening of per-aggregate maps into mergedValues
causes key collisions (seen in the cache restore branch and live collect loop
using mergedValues), so change to a collision-safe representation: either keep
allCollectedValues as a map keyed by agg.name (i.e., do not Object.assign into a
flat mergedValues) and update downstream reduce() consumers to iterate
per-aggregate entries, or namespace flattened keys consistently (e.g.,
`${agg.name}/${key}`) both when writing in agg.collect results, when merging in
the live path (in the loop around agg.collect and Object.assign), and when
restoring from getCachedResult; ensure setCachedResult and the cached-restore
branch use the same format so reduce() and other consumers read per-aggregate
values without overwrite.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- app/actions/compute-aggregates.ts
- app/actions/list-dashboard-views.ts
- app/dashboard/page.tsx
Summary
- `app.dashboard.aggregate()` API for cross-session aggregations on dashboard views, requiring explicit `{ collect, reduce }` for full control over output tables
- `batchAll` utility to prevent file descriptor exhaustion on large workspaces

Test plan
- `npm run build` — no type errors
- `npm run build:evals` — dist declarations compile
- `npx vitest run` — all 403 tests pass
- `npm run dev -- --evals ./test_evals.js` — aggregate table renders with user-defined columns

🤖 Generated with Claude Code
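The `batchAll` utility named in the summary is not shown in this excerpt; a minimal sketch of the idea — a worker pool that keeps at most `limit` tasks in flight while preserving input order — might look like the following. The signature is assumed, not the project's actual one.

```typescript
// Minimal sketch of a concurrency-limited batchAll (signature assumed).
// Runs async tasks with at most `limit` in flight at once, preserving the
// order of results relative to the input array. Bounding concurrency like
// this is what prevents file descriptor exhaustion when thousands of
// session files are opened in parallel.
async function batchAll<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // shared cursor; safe because JS is single-threaded between awaits
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, Math.max(tasks.length, 1)) },
    () => worker()
  );
  await Promise.all(workers); // rejects on the first task failure
  return results;
}
```

Under this shape, the test-plan items above (order preservation, bounded concurrency, empty inputs, rejection handling) each map to one observable property of `batchAll`.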