Skip to content

Comments

feat(reviewer): add spec-reviewer sub-agent for artifact quality gates#92

Merged
tzachbon merged 21 commits intomainfrom
feat/reviewer-subagent
Feb 18, 2026
Merged

feat(reviewer): add spec-reviewer sub-agent for artifact quality gates#92
tzachbon merged 21 commits intomainfrom
feat/reviewer-subagent

Conversation

@tzachbon
Copy link
Owner

@tzachbon tzachbon commented Feb 17, 2026

Summary

  • Introduces a spec-reviewer sub-agent that validates artifacts against type-specific rubrics (research, requirements, design, tasks, execution)
  • Adds review loops to all 4 phase commands (research, requirements, design, tasks) with max 3 iterations and graceful degradation
  • Adds Layer 5: Artifact Review to the implement.md coordinator's verification layers for execution-time reviews
  • Adds review step to plan-synthesizer for quick mode, where automated review adds the most value since there's no human-in-the-loop
  • Bumps version from 3.4.0 to 3.5.0

Key Changes

File Change
agents/spec-reviewer.md New read-only reviewer agent with 5 rubrics, REVIEW_PASS/REVIEW_FAIL signal protocol
commands/research.md Review loop after merge, before walkthrough
commands/requirements.md Review loop after product-manager, before walkthrough
commands/design.md Review loop after architect-reviewer, before walkthrough
commands/tasks.md Review loop after task-planner, before walkthrough
commands/implement.md Layer 5 artifact review in verification layers
agents/plan-synthesizer.md Review step after artifact generation in quick mode
plugin.json / marketplace.json Version bump to 3.5.0

Design Decisions

  • Max 3 review iterations: Research shows quality plateaus after 2-3 iterations. Graceful degradation proceeds with warnings.
  • REVIEW_PASS/REVIEW_FAIL signals: Distinct from qa-engineer's VERIFICATION signals. Clearer intent, same structure.
  • Read-only reviewer: Reviewer never modifies files. Coordinator applies changes via phase agent re-invocation.
  • Quick mode skip: Phase commands skip review in quick mode (plan-synthesizer handles its own review).
  • Error handling: Missing signals treated as REVIEW_PASS (permissive). Failed revisions retry once then proceed with original.

Test Plan

  • spec-reviewer.md has correct frontmatter (name, description, model: inherit)
  • All 4 phase commands have Artifact Review section with consistent patterns
  • implement.md has Layer 5 with spec-reviewer delegation
  • plan-synthesizer has review step after artifact generation
  • REVIEW_PASS/REVIEW_FAIL signals consistent across all files
  • Error handling ("treat as REVIEW_PASS") in all 5 commands
  • Version 3.5.0 in both plugin.json and marketplace.json
  • All 24 acceptance criteria verified (see .progress.md)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Integrated a spec-reviewer sub-agent and mandatory artifact review workflow across phases with up to 3 review iterations and a quick-mode bypass.
  • Documentation
    • Added comprehensive reviewer rubrics, delegation prompts, iteration logging, error-handling rules, progress-tracking conventions, and design/requirements for review integration.
  • Chores
    • Bumped plugin metadata version to 3.5.0.

tzachbon and others added 16 commits February 17, 2026 23:34
Spec artifacts:
- research.md: feasibility analysis and codebase exploration
- requirements.md: user stories and acceptance criteria
- design.md: architecture and technical decisions
- tasks.md: POC-first implementation plan

Ready for implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough

Walkthrough

Adds a read-only spec-reviewer sub-agent and integrates a bounded (max 3 iterations) Artifact Review loop into Ralph‑Specum phase and execution flows, bumps plugin version 3.4.0→3.5.0, and introduces per-iteration logging, revision delegation, and updated verification gating across multiple command docs and specs.

Changes

Cohort / File(s) Summary
Plugin metadata & marketplace
.claude-plugin/marketplace.json, plugins/ralph-specum/.claude-plugin/plugin.json
Bumped plugin version from 3.4.03.5.0 (metadata update).
Spec-reviewer agent doc
plugins/ralph-specum/agents/spec-reviewer.md
New read-only spec-reviewer specification: rubrics by artifact type, deterministic output schema, required REVIEW_PASS/REVIEW_FAIL signals, iteration semantics, and edge-case handling.
Phase command reviews
plugins/ralph-specum/commands/research.md, plugins/ralph-specum/commands/requirements.md, plugins/ralph-specum/commands/design.md, plugins/ralph-specum/commands/tasks.md
Inserted an Artifact Review loop (max 3 iterations) before Walkthrough in each command: delegate to spec-reviewer, parse signals, log per-iteration to .progress.md, handle revisions, graceful degradation, quick-mode bypass, and error rules. (design.md contains a duplicated review block.)
Execution / implement flow
plugins/ralph-specum/commands/implement.md
Added Layer 5 Artifact Review to the verification sequence, mapping reviewer signals to fix-task generation and state transitions; updated messaging from 4 → 5 verifications.
Plan synthesizer
plugins/ralph-specum/agents/plan-synthesizer.md
Reordered synthesis steps to require pre-commit artifact review and enforced iterative validation before committing specs.
Reviewer-subagent specs & progress
specs/reviewer-subagent/...
specs/reviewer-subagent/requirements.md, specs/reviewer-subagent/design.md, specs/reviewer-subagent/research.md, specs/reviewer-subagent/tasks.md, specs/reviewer-subagent/.progress.md
Added comprehensive spec/design/requirements/tasks docs and a .progress.md tracker describing reviewer integration, rubrics, input/output contracts, iteration logging conventions, error modes, and acceptance criteria.

Sequence Diagram(s)

sequenceDiagram
    participant PhaseAgent as Phase Agent
    participant Coordinator as Coordinator
    participant SpecReviewer as spec-reviewer
    participant RevisingAgent as Phase Agent (revision)
    participant Logger as .progress.md

    loop Artifact Review (max 3)
        Coordinator->>SpecReviewer: Delegate review (artifact, type, upstreams, iteration)
        SpecReviewer->>SpecReviewer: Evaluate artifact vs rubric
        alt REVIEW_PASS
            SpecReviewer-->>Coordinator: REVIEW_PASS
            Coordinator->>Logger: Append iteration PASS
            Coordinator->>PhaseAgent: Proceed to Walkthrough
        else REVIEW_FAIL
            SpecReviewer-->>Coordinator: REVIEW_FAIL + findings
            Coordinator->>Logger: Append iteration FAIL + findings
            alt iterations < 3
                Coordinator->>RevisingAgent: Delegate revision (feedback, iteration)
                RevisingAgent->>RevisingAgent: Update artifact
                Coordinator->>Coordinator: Increment iteration (loop)
            else max iterations reached
                Coordinator->>Logger: Append graceful-degradation warning
                Coordinator->>PhaseAgent: Proceed to Walkthrough
            end
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 I hopped through specs with careful cheer,

Three gentle rounds to make things clear.
Reviewer nibbles, findings laid,
I patch and hop and not dismayed—
Now artifacts gleam, the path sincere.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: introducing a spec-reviewer sub-agent for artifact quality gates, which is the central feature of this PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/reviewer-subagent

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/ralph-specum/.claude-plugin/plugin.json (1)

1-10: ⚠️ Potential issue | 🟠 Major

Add agents array to plugin.json to register all agent files.

The plugin manifest must include an agents array per the coding guidelines. While agent files exist in plugins/ralph-specum/agents/ (including spec-reviewer.md), they are not registered in plugin.json. Add an agents array listing all agents: architect-reviewer, plan-synthesizer, product-manager, qa-engineer, refactor-specialist, research-analyst, spec-executor, spec-reviewer, and task-planner. Also bump the version according to semantic versioning and update marketplace.json accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/.claude-plugin/plugin.json` around lines 1 - 10, Update
the plugin manifest (plugin.json) to include an "agents" array registering each
agent file (architect-reviewer, plan-synthesizer, product-manager, qa-engineer,
refactor-specialist, research-analyst, spec-executor, spec-reviewer,
task-planner) so the plugin framework can discover them; add each agent as an
entry in that array (matching the agent filenames or identifiers used in the
agents/ folder). Also increment the "version" field in plugin.json following
semver (e.g., bump patch or minor as appropriate) and mirror that same version
bump in marketplace.json so both manifests stay consistent. Ensure the updated
JSON remains valid and includes the existing fields (name, description, author,
license, keywords).
🧹 Nitpick comments (2)
plugins/ralph-specum/agents/plan-synthesizer.md (1)

105-131: Review loop omits per-iteration progress logging for REVIEW_PASS and intermediate REVIEW_FAIL.

Phase commands (requirements.md, research.md, design.md, tasks.md, implement.md) all have an explicit ### Review Iteration Logging section that appends a structured entry to .progress.md after every iteration regardless of outcome:

### Review: $artifactType (Iteration $iteration)
- Status: REVIEW_PASS or REVIEW_FAIL
- Findings: …
- Action: …

Plan-synthesizer's loop only logs to .progress.md on graceful degradation (max iterations reached). REVIEW_PASS and intermediate REVIEW_FAIL iterations leave no trace. For a quick-mode run — where there's no human walkthrough to provide observability — this makes debugging reviewer behavior harder.

Consider adding a ### Review Iteration Logging block consistent with the other five commands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/agents/plan-synthesizer.md` around lines 105 - 131, The
review loop in plan-synthesizer.md lacks per-iteration progress entries in
.progress.md for REVIEW_PASS and intermediate REVIEW_FAIL; update the WHILE
iteration body (after parsing the reviewer signal) to always append a "###
Review: $artifactType (Iteration $iteration)" block to .progress.md with fields
"- Status: REVIEW_PASS or REVIEW_FAIL", "- Findings: …", and "- Action: …"
(matching the exact format used by
requirements.md/research.md/design.md/tasks.md/implement.md). Place this append
step immediately after handling the parsed signal in the loop (before proceeding
to the next artifact or revising/continuing iterations), and keep the existing
max-iteration warning append (iteration >= 3) but still include the
per-iteration entry for that final iteration as well.
plugins/ralph-specum/commands/design.md (1)

276-294: Consider passing upstream context in the Revision Delegation Prompt

The task-planner revision prompt in tasks.md explicitly includes requirements and design content, giving the agent full context for revision. The architect-reviewer revision prompt here only provides the current artifact and reviewer feedback. On a REVIEW_FAIL for consistency/traceability dimensions, the architect-reviewer may need to re-read requirements.md to correctly align the design—consider including it here as the tasks.md pattern does.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/commands/design.md` around lines 276 - 294, The
architect-reviewer revision prompt (subagent_type: architect-reviewer) currently
only provides the current artifact and reviewer feedback; update the Revision
Delegation Prompt for this agent so it also passes upstream context (e.g., the
requirements/tasks) like the task-planner prompt does. Specifically, modify the
prompt template that produces the revision instructions for architect-reviewer
to include the relevant requirements.md or tasks.md content when available, and
ensure the prompt mentions both the current artifact (./specs/$spec/design.md)
and the upstream spec requirements (requirements.md or tasks.md) so the agent
can re-read them for consistency/traceability when addressing REVIEW_FAIL
findings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/ralph-specum/agents/spec-reviewer.md`:
- Around line 30-50: The fenced code block containing the numbered execution
flow starting with "1. Parse artifactType from delegation prompt" is missing a
language identifier and triggers markdownlint MD040; update the opening
triple-backtick for that block to include the language specifier "text" (i.e.,
change ``` to ```text) so the block matches other code blocks and resolves the
lint error.

In `@plugins/ralph-specum/commands/implement.md`:
- Around line 1091-1127: Layer 5 is ambiguous for parallelGroup batches: when
multiple tasks complete in parallel, clarify which index and file/task
aggregates to use in the spec-reviewer invocation and progress logs; update
Layer 5 and the review loop to use parallelGroup.startIndex as the
representative $taskIndex, compute the union of all tasks' Files lists for the
"Changed files: [Content...]" field, and concatenate all task blocks for "Task
description: [Full task block...]", and ensure .progress.md headings and any
references to $taskIndex reflect the representative index and note the full list
of task indices in the batch.

In `@plugins/ralph-specum/commands/requirements.md`:
- Line 201: Four fenced code blocks currently use bare backtick fences (``` )
with no language tag; update each opening fence to include an appropriate
language specifier (e.g., ```text, ```markdown, or ```yaml) so markdownlint rule
MD040 is satisfied. Locate the four code blocks flagged in the PR comment (the
blocks that start with ``` and contain plain text/markup/yaml snippets) and edit
their opening fences to the correct language token, ensuring the closing fences
remain ``` unchanged.

In `@plugins/ralph-specum/commands/research.md`:
- Around line 594-600: The pseudocode's step "Re-merge results into research.md"
is incorrect for the revision path because revision delegation (Revision
Delegation Prompt / research-analyst) writes a full revised artifact directly to
./specs/$spec/research.md; update the instruction for the revision branch (the
block handling REVIEW_FAIL -> re-invoke research-analyst) to "Re-read
research.md" or "Reload ./specs/$spec/research.md" instead of attempting to
re-merge partial files so the coordinator reads the newly written full file
rather than trying to merge non-existent partials.

In `@specs/reviewer-subagent/.progress.md`:
- Line 64: Update the stale pattern description string ("Pattern: All phase
commands follow identical structure: Validate -> Interview -> Execute ->
Walkthrough -> Review -> Update State -> Commit -> Stop") to reflect the new
phase ordering implemented after the PR: "Execute -> Review (spec-reviewer) ->
Walkthrough -> Review & Feedback Loop" and ensure the full flow now reads:
Validate -> Interview -> Execute -> Review (spec-reviewer) -> Walkthrough ->
Review & Feedback Loop -> Update State -> Commit -> Stop; edit the text in
.progress.md where that Pattern line appears so the documented flow matches the
actual implementation.

In `@specs/reviewer-subagent/design.md`:
- Around line 169-206: MD040: Two fenced code blocks (the "Revision Delegation
Prompt Template" and the initial review prompt block in the "Review Delegation
Prompt Template") are missing a language identifier; update each triple-backtick
fence to include a language token (e.g., use ```text) so the linter stops
flagging MD040, verify both the "Revision Delegation Prompt Template" and the
earlier prompt block include the identifier, and run the markdown linter to
confirm the warning is resolved.

In `@specs/reviewer-subagent/requirements.md`:
- Line 86: Update the FR-12 entry in the requirements table so it references the
correct implementation file: change the linked file from start.md to
plan-synthesizer.md (since the review step for quick-mode synthesis lives in
plan-synthesizer.md and satisfies US-4). Ensure the table row still lists the
same description ("Add review step in start.md for quick mode flow") but replace
the file name to "plan-synthesizer.md" and optionally add a short note that US-4
ACs are satisfied by that file to avoid future confusion.

In `@specs/reviewer-subagent/tasks.md`:
- Around line 54-166: The current phase headings ("Phase 2: Complete
Integration", "Phase 3: Refinement and Version Bump", and "Phase 5: PR
Lifecycle") do not follow the CLAUDE.md required 4-phase POC-First Workflow;
update the headings and content to use the canonical phases: rename "Phase 2:
Complete Integration" to "Phase 2: Refactoring", rename "Phase 3: Refinement and
Version Bump" to "Phase 3: Testing", collapse/move any PR lifecycle steps from
"Phase 5: PR Lifecycle" into "Phase 4: Quality Gates" (include push/PR/CI items
there), and ensure any phase-specific checklist items and descriptions reference
the correct phase names and ordering and that the document overall lists only
Phase 1–4 in that sequence.
- Around line 168-222: Phase 3 currently has four consecutive tasks
(3.1→3.2→3.3→3.4) before the VERIFY gate (3.5); insert a new quality checkpoint
after 3.2 (make it the new 3.3 VERIFY) to enforce the 2–3 task rule, then
renumber the existing 3.3→3.4→3.5 to 3.4→3.5→3.6 respectively; update the new
checkpoint's "Do/Done when/Verify/Commit" entries to assert the version bumps
and rubric edits (reuse the grep checks from the original 3.5 verify lines), and
update any cross-references in the Phase 3 list and related files
(plugins/ralph-specum/.claude-plugin/plugin.json,
.claude-plugin/marketplace.json, plugins/ralph-specum/agents/spec-reviewer.md,
and the commands/*.md entries) so all task numbers and verify commands match the
new numbering.

---

Outside diff comments:
In `@plugins/ralph-specum/.claude-plugin/plugin.json`:
- Around line 1-10: Update the plugin manifest (plugin.json) to include an
"agents" array registering each agent file (architect-reviewer,
plan-synthesizer, product-manager, qa-engineer, refactor-specialist,
research-analyst, spec-executor, spec-reviewer, task-planner) so the plugin
framework can discover them; add each agent as an entry in that array (matching
the agent filenames or identifiers used in the agents/ folder). Also increment
the "version" field in plugin.json following semver (e.g., bump patch or minor
as appropriate) and mirror that same version bump in marketplace.json so both
manifests stay consistent. Ensure the updated JSON remains valid and includes
the existing fields (name, description, author, license, keywords).

---

Nitpick comments:
In `@plugins/ralph-specum/agents/plan-synthesizer.md`:
- Around line 105-131: The review loop in plan-synthesizer.md lacks
per-iteration progress entries in .progress.md for REVIEW_PASS and intermediate
REVIEW_FAIL; update the WHILE iteration body (after parsing the reviewer signal)
to always append a "### Review: $artifactType (Iteration $iteration)" block to
.progress.md with fields "- Status: REVIEW_PASS or REVIEW_FAIL", "- Findings:
…", and "- Action: …" (matching the exact format used by
requirements.md/research.md/design.md/tasks.md/implement.md). Place this append
step immediately after handling the parsed signal in the loop (before proceeding
to the next artifact or revising/continuing iterations), and keep the existing
max-iteration warning append (iteration >= 3) but still include the
per-iteration entry for that final iteration as well.

In `@plugins/ralph-specum/commands/design.md`:
- Around line 276-294: The architect-reviewer revision prompt (subagent_type:
architect-reviewer) currently only provides the current artifact and reviewer
feedback; update the Revision Delegation Prompt for this agent so it also passes
upstream context (e.g., the requirements/tasks) like the task-planner prompt
does. Specifically, modify the prompt template that produces the revision
instructions for architect-reviewer to include the relevant requirements.md or
tasks.md content when available, and ensure the prompt mentions both the current
artifact (./specs/$spec/design.md) and the upstream spec requirements
(requirements.md or tasks.md) so the agent can re-read them for
consistency/traceability when addressing REVIEW_FAIL findings.

Comment on lines 1091 to 1127
```text
Set reviewIteration = 1

WHILE reviewIteration <= 3:
1. Collect changed files from the task (from the task's Files list and git diff)
2. Read ./specs/$spec/design.md and ./specs/$spec/requirements.md
3. Invoke spec-reviewer via Task tool (see delegation prompt below)
4. Parse the last line of spec-reviewer output for signal:
- If output contains "REVIEW_PASS":
a. Log review iteration to .progress.md (see Review Iteration Logging below)
b. Break loop, proceed to State Update (section 8)
- If output contains "REVIEW_FAIL" AND reviewIteration < 3:
a. Log review iteration to .progress.md (see Review Iteration Logging below)
b. Extract "Feedback for Revision" from reviewer output
c. Coordinator can:
- Add fix tasks to tasks.md (same pattern as Section 6c fix task generator):
Generate a fix task from the reviewer feedback, insert after current task,
delegate to spec-executor, and on TASK_COMPLETE re-run Layer 5
- Log suggested spec updates in .progress.md for manual review:
Append reviewer suggestions under "## Review Suggestions" section
d. reviewIteration = reviewIteration + 1
e. Continue loop
- If output contains "REVIEW_FAIL" AND reviewIteration >= 3:
a. Log review iteration to .progress.md (see Review Iteration Logging below)
b. Log warnings to .progress.md:
```markdown
### Review Warning: execution (Task $taskIndex)
- Max iterations (3) reached without REVIEW_PASS
- Proceeding with best available implementation
- Outstanding issues: [findings from last REVIEW_FAIL]
```
c. Break loop, proceed to State Update (section 8)
- If output contains NEITHER signal (reviewer error):
a. Treat as REVIEW_PASS (permissive)
b. Log review iteration to .progress.md with status "REVIEW_PASS (no signal)"
c. Break loop, proceed to State Update (section 8)
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Layer 5 is underspecified for parallel batch execution.

Section 5 allows multiple tasks to complete simultaneously in a parallelGroup. When Section 7 runs after a parallel batch, Layer 5's variables are ambiguous:

  • $taskIndex — which task's index is used in the review prompt and progress-log heading?
  • "Changed files: [Content of each file listed in the task's Files section]" — should combine all tasks' Files lists for the batch, but this isn't stated.
  • "Task description: [Full task block from tasks.md]" — similarly needs to include all batch task blocks.

Recommend adding a note clarifying parallel-batch behavior (e.g., use parallelGroup.startIndex as the representative $taskIndex and union all Files lists from tasks in the batch).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/commands/implement.md` around lines 1091 - 1127, Layer 5
is ambiguous for parallelGroup batches: when multiple tasks complete in
parallel, clarify which index and file/task aggregates to use in the
spec-reviewer invocation and progress logs; update Layer 5 and the review loop
to use parallelGroup.startIndex as the representative $taskIndex, compute the
union of all tasks' Files lists for the "Changed files: [Content...]" field, and
concatenate all task blocks for "Task description: [Full task block...]", and
ensure .progress.md headings and any references to $taskIndex reflect the
representative index and note the full list of task indices in the batch.

a. Log review iteration to .progress.md (see Review Iteration Logging below)
b. Extract "Feedback for Revision" from reviewer output
c. Re-invoke product-manager with revision prompt (see below)
d. Re-read updated requirements.md
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Four fenced code blocks lack language specifiers (MD040).

Markdownlint flags lines 201, 266, 287, and 301. Add a language tag (e.g., text, markdown, or yaml as appropriate) to each opening fence.

Also applies to: 266-266, 287-287, 301-301

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 201-201: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/commands/requirements.md` at line 201, Four fenced code
blocks currently use bare backtick fences (``` ) with no language tag; update
each opening fence to include an appropriate language specifier (e.g., ```text,
```markdown, or ```yaml) so markdownlint rule MD040 is satisfied. Locate the
four code blocks flagged in the PR comment (the blocks that start with ``` and
contain plain text/markup/yaml snippets) and edit their opening fences to the
correct language token, ensuring the closing fences remain ``` unchanged.

Comment on lines +169 to +206
```text
You are reviewing the $artifactType artifact for spec: $spec
Spec path: $specPath

Review iteration: $iteration of 3

Artifact content:
[Full content of the artifact being reviewed]

Upstream artifacts (for cross-referencing):
[Content of prior artifacts - research for requirements, research+requirements for design, etc.]

Apply the $artifactType rubric. Output structured findings with REVIEW_PASS or REVIEW_FAIL.

If REVIEW_FAIL, provide specific, actionable feedback for revision. Reference line numbers or sections.
```

## Revision Delegation Prompt Template

Used by phase commands when re-invoking phase agent after REVIEW_FAIL:

```text
You are revising the $artifactType for spec: $spec
Spec path: $specPath

Current artifact: [path to artifact]

Reviewer feedback (iteration $iteration):
$reviewerFindings

Your task:
1. Read the current artifact
2. Address each finding from the reviewer
3. Update the artifact to resolve all issues
4. Maintain consistency with upstream artifacts

Focus on the specific issues flagged. Do not rewrite sections that passed review.
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing language identifier on fenced code block (MD040)

Static analysis flags a fenced code block around line 201 in this file without a language specifier. Verify the template blocks in the Revision Delegation Prompt Template and Review Delegation Prompt Template sections both carry a language identifier (e.g., text).

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 201-201: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/reviewer-subagent/design.md` around lines 169 - 206, MD040: Two fenced
code blocks (the "Revision Delegation Prompt Template" and the initial review
prompt block in the "Review Delegation Prompt Template") are missing a language
identifier; update each triple-backtick fence to include a language token (e.g.,
use ```text) so the linter stops flagging MD040, verify both the "Revision
Delegation Prompt Template" and the earlier prompt block include the identifier,
and run the markdown linter to confirm the warning is resolved.

"name": "ralph-specum",
"description": "Spec-driven development with research, requirements, design, tasks, and autonomous execution. Fresh context per task.",
"version": "3.4.0",
"description": "Spec-driven development with research, requirements, design, tasks, and autonomous execution with artifact review. Fresh context per task and reviewer sub-agent quality gates.",
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary....

"version": "3.4.0",
"description": "Spec-driven development with task-by-task execution. Research, requirements, design, tasks, and autonomous implementation with fresh context per task.",
"version": "3.5.0",
"description": "Spec-driven development with task-by-task execution and artifact review. Research, requirements, design, tasks, and autonomous implementation with fresh context per task and reviewer sub-agent quality gates.",
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary

- Revert plugin descriptions to pre-PR values (owner feedback)
- Fix REVIEW_PASS template contradiction (all-PASS example, no FAIL dims)
- Add `text` language identifier to bare code fence in spec-reviewer
- Change "Re-merge" to "Re-read" in research.md review loop
- Add per-iteration review logging to plan-synthesizer
- Add requirements.md upstream context to design.md revision prompt
- Clarify Layer 5 parallel batch behavior in implement.md
- Fix stale pattern description in .progress.md
- Fix FR-12 reference from start.md to plan-synthesizer.md
- Rename task phases to match POC-First Workflow, collapse Phase 5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
plugins/ralph-specum/agents/plan-synthesizer.md (1)

110-140: Nested fenced code blocks with identical delimiters produce broken markdown rendering (MD040).

The inner ```yaml at line ~111 terminates the outer ```text block (standard markdown parsers close a fenced block when they encounter three backticks, regardless of the language hint). As a result, the closing ``` at lines 123 and 141 are interpreted as new, unlanguaged code blocks, which triggers the MD040 linting warnings.

Use four backticks for the outer block to allow three-backtick inner fences:

♻️ Proposed fix
-```text
+````text
 Set iteration = 1
 
 WHILE iteration <= 3:
   1. Read the artifact content from <basePath>/<artifact>.md
   2. Invoke spec-reviewer via Task tool:
      ```yaml
      subagent_type: spec-reviewer
      ...
      ```
   3. Parse signal:
      ...
-```
+````
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/agents/plan-synthesizer.md` around lines 110 - 140, The
outer fenced block in plan-synthesizer.md uses three backticks and contains an
inner ```yaml``` fence which prematurely closes the outer block (MD040); change
the outer fence to four backticks (i.e., replace the outer ```text ... ``` with
````text ... ````) so the inner ```yaml``` fence remains nested and markdown
renders correctly, ensuring the opening marker around "Set iteration = 1" and
its matching closing marker are updated to four backticks.
specs/reviewer-subagent/tasks.md (3)

192-196: Non-standard task number 3.2a — sequential renumbering preferred.

The checkpoint was inserted mid-sequence using an alphabetic suffix rather than renumbering. Prefer fully sequential integers (3.3 → 3.4 → 3.5 → 3.6) for consistency with every other phase.

♻️ Proposed renumbering
-- [x] 3.2a [VERIFY] Version bump checkpoint
+- [x] 3.3 [VERIFY] Version bump checkpoint
   - **Do**: Verify version bumps are correct in both files
   ...

-- [x] 3.3 Review and refine spec-reviewer rubrics
+- [x] 3.4 Review and refine spec-reviewer rubrics
   ...

-- [x] 3.4 Ensure error handling in all review loops
+- [x] 3.5 Ensure error handling in all review loops
   ...

-- [x] 3.5 [VERIFY] Refinement checkpoint
+- [x] 3.6 [VERIFY] Refinement checkpoint

Update total_tasks: 20 to 21 in the frontmatter if the new task count changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/reviewer-subagent/tasks.md` around lines 192 - 196, The task item uses
a non-standard ordinal "3.2a" which breaks sequential numbering—rename the task
to the next integer in sequence (e.g., change "3.2a" to "3.3" and increment
subsequent tasks accordingly so numbering remains contiguous), and if adding
this renumbered task increases the total task count, update the frontmatter key
total_tasks (e.g., change from 20 to 21) to match; locate the task line
containing "3.2a" and the frontmatter key total_tasks to apply these changes
consistently.

230-230: Phase 4 name carries an extra suffix beyond the canonical "Quality Gates".

The required phase name is "Quality Gates". "& PR Lifecycle" is redundant since PR creation/CI are standard quality gate activities and the CLAUDE.md spec treats them as such.

-## Phase 4: Quality Gates & PR Lifecycle
+## Phase 4: Quality Gates

Based on learnings: "All specs must follow 4-phase POC-First Workflow: … Phase 4 (Quality Gates - lint, types, CI, PR)"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/reviewer-subagent/tasks.md` at line 230, Update the Phase 4 header
string "## Phase 4: Quality Gates & PR Lifecycle" to exactly "## Phase 4:
Quality Gates" in the tasks.md content; replace the redundant suffix so the
phase name matches the canonical "Quality Gates" used by the 4-phase POC-First
Workflow (locate the header line that contains "Phase 4: Quality Gates & PR
Lifecycle" and change it to "Phase 4: Quality Gates").

168-228: Phase 3 content doesn't align with the "Testing" phase definition.

Per the CLAUDE.md POC-First Workflow, Phase 3 is for testing (unit/integration/e2e). Tasks 3.1–3.2 are version bumps (release-readiness) and 3.3–3.4 are rubric refinement/error-handling audits — none are testing tasks. The Notes section acknowledges "No build/test step", which explains the intent, but the phase tasks themselves are better categorised as follows:

  • 3.1–3.2 (version bumps) → Phase 4 Quality Gates (release readiness)
  • 3.3–3.4 (refinement/error-handling) → Phase 2 Refactoring

Consider restructuring so Phase 3 holds only grep-verification tasks (or keep the acknowledgement in a phase description comment to make the deviation explicit).

Based on learnings: "All specs must follow 4-phase POC-First Workflow: Phase 1 (Make It Work - POC validation, skip tests), Phase 2 (Refactoring - code cleanup), Phase 3 (Testing - unit, integration, e2e), Phase 4 (Quality Gates - lint, types, CI, PR)"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/reviewer-subagent/tasks.md` around lines 168 - 228, Phase 3 currently
contains non-testing work: move tasks 3.1 and 3.2 (version bumps) into Phase 4
Quality Gates and move tasks 3.3 and 3.4 (rubric refinement and error-handling
audits) into Phase 2 Refactoring, or alternatively update the Phase 3
header/notes to explicitly state these are exceptions to the CLAUDE.md POC-First
Workflow; specifically update the headings and task assignments for "3.1 Bump
version in plugin.json", "3.2 Bump version in marketplace.json", "3.3 Review and
refine spec-reviewer rubrics", and "3.4 Ensure error handling in all review
loops" so they appear under the correct phase or are marked as deliberate
deviations per the CLAUDE.md POC-First Workflow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/ralph-specum/agents/spec-reviewer.md`:
- Around line 74-85: The rubric currently uses P0/P1/P2 priority notation (in
the "Completeness" PASS criterion and the PASS example "FR-1 (P0): Create
reviewer agent") which conflicts with the project MoSCoW standard; update the
rubric text and examples to use Must/Should/Could instead of P0/P1/P2 (e.g.,
change "FRs have priorities (P0/P1/P2)" to "FRs have priorities
(Must/Should/Could)" and change the example "FR-1 (P0): Create reviewer agent"
to "FR-1 (Must): Create reviewer agent"), and scan the surrounding rubric block
(the Completeness/Testability/Traceability/Scope table and examples) for any
other P0/P1/P2 occurrences and replace them accordingly so the rubric matches
the project template format.

In `@plugins/ralph-specum/commands/implement.md`:
- Around line 1102-1112: The current loop treats both "Add fix tasks" and "Log
suggested spec updates" the same by incrementing reviewIteration and re-running
the reviewer, which wastes iterations when no code changes occur; update the
REVIEW_FAIL handling so that when you take the "Log suggested spec updates" path
(append under "## Review Suggestions" in .progress.md) you do NOT increment
reviewIteration or re-invoke the reviewer—instead record the suggestion as a
manual/spec-level issue and break/exit the review loop (or mark review as
deferred) so only the "Add fix tasks" path triggers re-run and
reviewIteration++; keep the existing behavior for the "Add fix tasks" branch
(generate fix task in tasks.md, delegate to spec-executor, on TASK_COMPLETE
re-run Layer 5 and increment reviewIteration).

In `@specs/reviewer-subagent/.progress.md`:
- Line 66: Update the stale version annotation in the Learnings line inside
specs/reviewer-subagent/.progress.md by replacing "(currently 3.4.0)" with
"(currently 3.5.0)" so it matches the PR's bump; ensure the surrounding sentence
still reads correctly and that any other mention of the current version in this
file is updated to 3.5.0 as well.

In `@specs/reviewer-subagent/requirements.md`:
- Around line 81-86: FR-7 and FR-12 are duplicate requirements both referencing
adding a review step in plan-synthesizer.md; remove FR-12 (or merge its wording
into FR-7) so the requirements list contains a single entry for that
functionality, update the table row for FR-7 to reflect the consolidated
description and keep its identifier (FR-7) and associated why/priority (Should |
US-4), and ensure no other references in the document still point to FR-12.

---

Nitpick comments:
In `@plugins/ralph-specum/agents/plan-synthesizer.md`:
- Around line 110-140: The outer fenced block in plan-synthesizer.md uses three
backticks and contains an inner ```yaml``` fence which prematurely closes the
outer block (MD040); change the outer fence to four backticks (i.e., replace the
outer ```text ... ``` with ````text ... ````) so the inner ```yaml``` fence
remains nested and markdown renders correctly, ensuring the opening marker
around "Set iteration = 1" and its matching closing marker are updated to four
backticks.

In `@specs/reviewer-subagent/tasks.md`:
- Around line 192-196: The task item uses a non-standard ordinal "3.2a" which
breaks sequential numbering—rename the task to the next integer in sequence
(e.g., change "3.2a" to "3.3" and increment subsequent tasks accordingly so
numbering remains contiguous), and if adding this renumbered task increases the
total task count, update the frontmatter key total_tasks (e.g., change from 20
to 21) to match; locate the task line containing "3.2a" and the frontmatter key
total_tasks to apply these changes consistently.
- Line 230: Update the Phase 4 header string "## Phase 4: Quality Gates & PR
Lifecycle" to exactly "## Phase 4: Quality Gates" in the tasks.md content;
replace the redundant suffix so the phase name matches the canonical "Quality
Gates" used by the 4-phase POC-First Workflow (locate the header line that
contains "Phase 4: Quality Gates & PR Lifecycle" and change it to "Phase 4:
Quality Gates").
- Around line 168-228: Phase 3 currently contains non-testing work: move tasks
3.1 and 3.2 (version bumps) into Phase 4 Quality Gates and move tasks 3.3 and
3.4 (rubric refinement and error-handling audits) into Phase 2 Refactoring, or
alternatively update the Phase 3 header/notes to explicitly state these are
exceptions to the CLAUDE.md POC-First Workflow; specifically update the headings
and task assignments for "3.1 Bump version in plugin.json", "3.2 Bump version in
marketplace.json", "3.3 Review and refine spec-reviewer rubrics", and "3.4
Ensure error handling in all review loops" so they appear under the correct
phase or are marked as deliberate deviations per the CLAUDE.md POC-First
Workflow.

Comment on lines 81 to 86
| FR-7 | Add review step in plan-synthesizer.md for quick mode | Should | US-4 |
| FR-8 | Implement max 3 iteration enforcement in all review loops | Must | US-3 |
| FR-9 | Implement graceful degradation (proceed with warnings) on max iterations | Must | US-3 |
| FR-10 | Log review findings in .progress.md per iteration | Must | US-3 |
| FR-11 | Bump version in plugin.json and marketplace.json | Must | US-5 |
| FR-12 | Add review step in plan-synthesizer.md for quick mode flow | Should | US-4 |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

FR-7 and FR-12 are now near-duplicates.

After the previous fix applied to FR-12 (changing start.mdplan-synthesizer.md), both entries describe the same requirement:

FR Description
FR-7 Add review step in plan-synthesizer.md for quick mode
FR-12 Add review step in plan-synthesizer.md for quick mode flow

Remove FR-12 (or merge it into FR-7) to avoid ambiguity.

✏️ Suggested correction
 | FR-7 | Add review step in plan-synthesizer.md for quick mode | Should | US-4 |
-| FR-12 | Add review step in plan-synthesizer.md for quick mode flow | Should | US-4 |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| FR-7 | Add review step in plan-synthesizer.md for quick mode | Should | US-4 |
| FR-8 | Implement max 3 iteration enforcement in all review loops | Must | US-3 |
| FR-9 | Implement graceful degradation (proceed with warnings) on max iterations | Must | US-3 |
| FR-10 | Log review findings in .progress.md per iteration | Must | US-3 |
| FR-11 | Bump version in plugin.json and marketplace.json | Must | US-5 |
| FR-12 | Add review step in plan-synthesizer.md for quick mode flow | Should | US-4 |
| FR-7 | Add review step in plan-synthesizer.md for quick mode | Should | US-4 |
| FR-8 | Implement max 3 iteration enforcement in all review loops | Must | US-3 |
| FR-9 | Implement graceful degradation (proceed with warnings) on max iterations | Must | US-3 |
| FR-10 | Log review findings in .progress.md per iteration | Must | US-3 |
| FR-11 | Bump version in plugin.json and marketplace.json | Must | US-5 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/reviewer-subagent/requirements.md` around lines 81 - 86, FR-7 and FR-12
are duplicate requirements both referencing adding a review step in
plan-synthesizer.md; remove FR-12 (or merge its wording into FR-7) so the
requirements list contains a single entry for that functionality, update the
table row for FR-7 to reflect the consolidated description and keep its
identifier (FR-7) and associated why/priority (Should | US-4), and ensure no
other references in the document still point to FR-12.

tzachbon and others added 3 commits February 18, 2026 11:53
…ture)

- Replace P0/P1/P2 with Must/Should/Could in spec-reviewer rubric
- Split REVIEW_FAIL handling into two distinct paths in implement.md
- Update stale version reference 3.4.0 → 3.5.0 in .progress.md
- Remove duplicate FR-12 (merged into FR-7) in requirements.md
- Fix nested markdown fences in plan-synthesizer.md
- Renumber 3.2a → 3.3 and fix Phase 4 header in tasks.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
plugins/ralph-specum/agents/plan-synthesizer.md (1)

621-629: Consider adding a review-completion checklist item.

The Quality Checklist has no entry confirming the Artifact Review step ran. Given that review is now <mandatory>, a checklist item provides a safety net for cases where an LLM skips the step.

✏️ Suggested addition
 - [ ] **State file updated**: phase="execution", totalTasks set correctly
+- [ ] **Artifact review completed**: all 4 artifacts reviewed (REVIEW_PASS or graceful degradation logged)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/agents/plan-synthesizer.md` around lines 621 - 629, Add
a new checklist item to the "Before completing:" Quality Checklist in
plan-synthesizer.md that explicitly confirms the Artifact Review step ran and
its mandatory status; update the list near the existing entries (e.g., alongside
"All artifacts have `generated: auto` in frontmatter") with a line like "- [ ]
Artifact Review completed (`<mandatory>`)" so the synthesizer must acknowledge
completion before finalizing. Ensure the phrasing matches checklist style and
appears before the completion guard to act as a safety net when an LLM might
skip the review step.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/ralph-specum/agents/spec-reviewer.md`:
- Around line 108-126: Adjust the Completeness rule in spec-reviewer.md so that
the "Every task has Do, Files, Done when, Verify, and Commit fields" requirement
is conditional: allow tasks that are explicit checkpoint/VERIFY tasks (detect by
`[VERIFY]` in the `Do` or by titles like "POC Checkpoint"/"1.X POC Checkpoint")
to omit the `Files` field; update the PASS/FAIL table row and the Completeness
examples to show that checkpoint/VERIFY tasks PASS without `Files` while regular
tasks still require all five fields, and ensure the reviewer logic/text
references "Verify/Checkpoint tasks" when describing this exception.

In `@plugins/ralph-specum/commands/implement.md`:
- Around line 1206-1210: Update the Layer 5 Error Handling text to replace the
misleading "Phase agent fails during revision" phrasing with an actor-specific
description naming the spec-executor and the work it performs: say something
like "spec-executor fails while executing a fix task" (or "spec-executor fails
during revision/fix execution"), and keep the behavior unchanged (retry the fix
task once; if it fails again, use the original implementation and proceed).
Ensure the unique term "spec-executor" appears in the rule and that references
to "phase agent" or phase-command terminology are removed or replaced.

---

Duplicate comments:
In `@specs/reviewer-subagent/.progress.md`:
- Line 67: Remove the hard-coded inline version annotation "(currently 3.5.0)"
from the changelog/note so it won't become stale; update the sentence "Pattern:
Version bumps required in BOTH plugin.json AND marketplace.json (currently
3.5.0)" by deleting the parenthetical or replacing it with a
non-version-specific phrase like "(ensure both files are bumped)" so the
guidance remains correct without needing future edits of the version string.

---

Nitpick comments:
In `@plugins/ralph-specum/agents/plan-synthesizer.md`:
- Around line 621-629: Add a new checklist item to the "Before completing:"
Quality Checklist in plan-synthesizer.md that explicitly confirms the Artifact
Review step ran and its mandatory status; update the list near the existing
entries (e.g., alongside "All artifacts have `generated: auto` in frontmatter")
with a line like "- [ ] Artifact Review completed (`<mandatory>`)" so the
synthesizer must acknowledge completion before finalizing. Ensure the phrasing
matches checklist style and appears before the completion guard to act as a
safety net when an LLM might skip the review step.

Comment on lines +108 to +126
| Dimension | PASS Criteria | FAIL Criteria |
|-----------|--------------|---------------|
| Completeness | Every task has Do, Files, Done when, Verify, and Commit fields | Any task missing required fields |
| Traceability | Tasks reference requirements (FR-*) and/or design sections | Tasks exist without tracing to requirements or design |
| Actionability | Do steps are concrete with specific instructions (file names, code patterns, section names) | Do steps are vague (e.g., "implement the feature", "add appropriate code") |
| Structure | POC-first 4-phase structure followed (Phase 1: POC, Phase 2: Refactoring, Phase 3: Testing, Phase 4: Quality) | Phases are out of order, missing, or don't follow POC-first approach |
| Quality Gates | [VERIFY] tasks present at appropriate intervals (every 2-3 tasks) | No [VERIFY] tasks, or gaps of more than 3 tasks without a checkpoint |

**Examples**:
- Completeness PASS: Task has all five fields: `Do` (numbered steps), `Files` (list), `Done when` (criteria), `Verify` (shell command), `Commit` (message).
- Completeness FAIL: Task has `Do` and `Files` but no `Verify` command.
- Traceability PASS: Task footer says "_Requirements: FR-1_ / _Design: Component A_".
- Traceability FAIL: Task has no FR-* or design section references.
- Actionability PASS: "Add `## Artifact Review` section after line 45 in `commands/research.md` with iteration counter starting at 1."
- Actionability FAIL: "Implement the review feature in the appropriate files."
- Structure PASS: Phase 1 is POC (minimal wiring), Phase 2 is full integration, Phase 3 is testing, Phase 4 is quality gates.
- Structure FAIL: Phase 1 jumps straight to testing; or Phase 2 is labeled "POC" but Phase 1 already exists.
- Quality Gates PASS: [VERIFY] task after tasks 1.2 and 2.3 (every 2-3 tasks).
- Quality Gates FAIL: 6 consecutive tasks with no [VERIFY] checkpoint.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Tasks Rubric Completeness criterion will false-positive on every correctly-structured tasks.md.

The criterion Every task has Do, Files, Done when, Verify, and Commit fields applies unconditionally to all tasks. However, [VERIFY] checkpoint tasks — which are required by the quality guidelines every 2–3 tasks — legitimately omit the Files field because they make no file changes. The plan-synthesizer.md template itself includes a 1.X POC Checkpoint task without a Files field (lines 470–475 of that file).

Result: the spec-reviewer will output REVIEW_FAIL on any tasks.md that follows the required checkpoint pattern (which is every generated tasks.md), causing 3 wasted iterations and graceful degradation on all tasks reviews, regardless of actual artifact quality.

🐛 Proposed fix — allow `[VERIFY]` checkpoint tasks to omit `Files`
-| Completeness | Every task has Do, Files, Done when, Verify, and Commit fields | Any task missing required fields |
+| Completeness | Every implementation task has Do, Files, Done when, Verify, and Commit fields; `[VERIFY]` checkpoint tasks may omit Files | Implementation tasks missing any of Do, Files, Done when, Verify, or Commit; `[VERIFY]` tasks missing Do, Done when, or Verify |

Update the Completeness examples accordingly:

-- Completeness FAIL: Task has `Do` and `Files` but no `Verify` command.
+- Completeness FAIL: Implementation task has `Do` and `Files` but no `Verify` command. (Note: `[VERIFY]` checkpoint tasks may omit `Files`.)
+- Completeness PASS: `[VERIFY]` task has `Do`, `Done when`, `Verify`, and `Commit` but no `Files` — this is expected for checkpoint tasks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/agents/spec-reviewer.md` around lines 108 - 126, Adjust
the Completeness rule in spec-reviewer.md so that the "Every task has Do, Files,
Done when, Verify, and Commit fields" requirement is conditional: allow tasks
that are explicit checkpoint/VERIFY tasks (detect by `[VERIFY]` in the `Do` or
by titles like "POC Checkpoint"/"1.X POC Checkpoint") to omit the `Files` field;
update the PASS/FAIL table row and the Completeness examples to show that
checkpoint/VERIFY tasks PASS without `Files` while regular tasks still require
all five fields, and ensure the reviewer logic/text references
"Verify/Checkpoint tasks" when describing this exception.

Comment on lines +1206 to +1210
**Layer 5 Error Handling**:

- **Reviewer fails to output signal**: treat as REVIEW_PASS (permissive) and log with status "REVIEW_PASS (no signal)"
- **Phase agent fails during revision**: retry the fix task once; if it fails again, use the original implementation and proceed
- **Iteration counter edge cases**: if reviewIteration is missing or invalid, default to 1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

"Phase agent fails during revision" uses phase-command terminology in an execution context.

In Layer 5, the "agent doing revision" is spec-executor (executing fix tasks generated by the coordinator), not a phase agent (research-analyst, product-manager, etc.). The wording borrows the template from phase commands but the actor is different here.

✏️ Proposed fix
-- **Phase agent fails during revision**: retry the fix task once; if it fails again, use the original implementation and proceed
+- **spec-executor fails during fix task**: retry the fix task once; if it fails again, use the original implementation and proceed
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
**Layer 5 Error Handling**:
- **Reviewer fails to output signal**: treat as REVIEW_PASS (permissive) and log with status "REVIEW_PASS (no signal)"
- **Phase agent fails during revision**: retry the fix task once; if it fails again, use the original implementation and proceed
- **Iteration counter edge cases**: if reviewIteration is missing or invalid, default to 1
**Layer 5 Error Handling**:
- **Reviewer fails to output signal**: treat as REVIEW_PASS (permissive) and log with status "REVIEW_PASS (no signal)"
- **spec-executor fails during fix task**: retry the fix task once; if it fails again, use the original implementation and proceed
- **Iteration counter edge cases**: if reviewIteration is missing or invalid, default to 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/commands/implement.md` around lines 1206 - 1210, Update
the Layer 5 Error Handling text to replace the misleading "Phase agent fails
during revision" phrasing with an actor-specific description naming the
spec-executor and the work it performs: say something like "spec-executor fails
while executing a fix task" (or "spec-executor fails during revision/fix
execution"), and keep the behavior unchanged (retry the fix task once; if it
fails again, use the original implementation and proceed). Ensure the unique
term "spec-executor" appears in the rule and that references to "phase agent" or
phase-command terminology are removed or replaced.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/ralph-specum/commands/implement.md`:
- Around line 1127-1129: The review-iteration logging currently lacks an Action
value for Path B (spec-level deferral) and logs before the path decision, making
Action ambiguous; update the logic that emits the "Action" field (references:
REVIEW_FAIL, REVIEW_PASS, reviewIteration and the logging step labeled “a” vs
decision step “c”) to (1) move the logging step to after the path decision so
Action can be set deterministically and (2) add a defined Action string for Path
B (e.g., "deferred_to_spec" or similar) that the coordinator will record when
the spec-level deferral branch is taken (ensure the branch that logs suggestions
and breaks the loop without incrementing reviewIteration sets this Action).

---

Duplicate comments:
In `@plugins/ralph-specum/commands/implement.md`:
- Line 1187: Update the wording that incorrectly uses "Phase agent" for the
revision step: change the heading/text "Phase agent fails during revision" to
reference the actual execution actor and task (e.g., "spec-executor fails during
revision" or "Spec executor fails while executing fix task") and adjust
surrounding text to mention the spec-executor performing a fix task/revision
rather than phase agents like research-analyst or product-manager; ensure
occurrences of "phase agent" in this section are replaced with "spec-executor"
or "executor" and that the flow describes retrying the fix task once, falling
back to the original implementation if the spec-executor fails again.

Comment on lines +1127 to +1129
- "fix task added" if REVIEW_FAIL and reviewIteration < 3 (fix task generated)
- "warnings appended, proceeded" if REVIEW_FAIL and reviewIteration >= 3 (graceful degradation)
- "proceeded" if REVIEW_PASS
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Action field in Review Iteration Logging has no defined value for Path B (spec-level deferral).

Lines 1127–1129 define three Action values, but only cover Path A (fix task) and REVIEW_PASS/graceful-degradation outcomes. When Path B is chosen (spec-level deferral: suggestions logged, loop broken without incrementing reviewIteration), there is no defined Action string, so the coordinator has no guidance on what to record.

Additionally, the log step (a) fires before the path decision (c), so the Action value cannot be known at the time of logging — a coordinator would have to retrospectively fill it in or leave it ambiguous.

✏️ Suggested fix
   - "fix task added" if REVIEW_FAIL and reviewIteration < 3 (fix task generated)
+  - "spec update logged, proceeded (deferred)" if REVIEW_FAIL and reviewIteration < 3 AND Path B taken (spec-level deferral)
   - "warnings appended, proceeded" if REVIEW_FAIL and reviewIteration >= 3 (graceful degradation)
   - "proceeded" if REVIEW_PASS

Consider also moving the logging step to after the path decision (after step c) so the Action field can be accurately recorded.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/ralph-specum/commands/implement.md` around lines 1127 - 1129, The
review-iteration logging currently lacks an Action value for Path B (spec-level
deferral) and logs before the path decision, making Action ambiguous; update the
logic that emits the "Action" field (references: REVIEW_FAIL, REVIEW_PASS,
reviewIteration and the logging step labeled “a” vs decision step “c”) to (1)
move the logging step to after the path decision so Action can be set
deterministically and (2) add a defined Action string for Path B (e.g.,
"deferred_to_spec" or similar) that the coordinator will record when the
spec-level deferral branch is taken (ensure the branch that logs suggestions and
breaks the loop without incrementing reviewIteration sets this Action).

@tzachbon tzachbon merged commit 25f556c into main Feb 18, 2026
3 checks passed
@tzachbon tzachbon deleted the feat/reviewer-subagent branch February 18, 2026 10:16
tzachbon added a commit that referenced this pull request Feb 19, 2026
- Bump version 3.5.0 → 3.5.1 (main already has 3.5.0 from PR #92)
- implement.md: add content field to SendMessage shutdown
- requirements.md: add language specifier to code block
- research.md: add note that feedback loop skips team pattern
- design.md: remove ambiguous mermaid arrow, add code block language
- .progress.md: remove duplicated task 2.2 entry
- index-state.json: add task counts for enforce-teams-instead
- index.md: fix phase value for return-ralph-wrigum

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant