From 6c0ada2124ea1487307dd9fb67f4d42ab52a07ef Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Thu, 29 Jan 2026 21:19:44 +0200
Subject: [PATCH 01/37] spec(refactor-plugins): add requirements

---
 specs/refactor-plugins/requirements.md | 176 +++++++++++++++++++++++++
 1 file changed, 176 insertions(+)
 create mode 100644 specs/refactor-plugins/requirements.md
diff --git a/specs/refactor-plugins/requirements.md b/specs/refactor-plugins/requirements.md
new file mode 100644
index 00000000..93b1c67d
--- /dev/null
+++ b/specs/refactor-plugins/requirements.md
@@ -0,0 +1,176 @@
+---
+spec: refactor-plugins
+phase: requirements
+created: 2026-01-29
+---
+
+# Requirements: Plugin Refactoring to Best Practices
+
+## Goal
+
+Refactor ralph-specum and ralph-speckit plugins to fully comply with plugin-dev skills best practices, fixing 61 identified issues across agents, skills, hooks, and commands.
+
+## User Decisions
+
+| Question | Response |
+|----------|----------|
+| Primary users | Both developers and end users |
+| Priority tradeoffs | Prioritize thoroughness over speed |
+| Success criteria | Full compliance + documentation (all issues fixed plus validation scripts and docs) |
+| Problem statement | Improve all plugins according to plugin-dev skills best practices |
+| Constraints | Use only plugin-dev skills to improve all plugins |
+
+---
+
+## User Stories
+
+### US-1: Agent Color and Examples
+**As a** plugin developer
+**I want** all agents to have proper `color` and `<example>` blocks
+**So that** Claude can correctly identify when to invoke each agent
+
+**Acceptance Criteria:**
+- [ ] AC-1.1: All 8 ralph-specum agents have `color` field in frontmatter
+- [ ] AC-1.2: All 6 ralph-speckit agents have `color` field in frontmatter
+- [ ] AC-1.3: All 14 agents have at least 2 `<example>` blocks in description
+- [ ] AC-1.4: Each example follows Context/user/assistant/commentary format
+- [ ] AC-1.5: Colors match semantic guidelines (blue=analysis, cyan=planning, green=execution, yellow=validation, magenta=transformation)
+
+### US-2: Skill Version and Description Format
+**As a** plugin user
+**I want** skills to have proper version and trigger-phrase descriptions
+**So that** Claude correctly identifies when to use each skill
+
+**Acceptance Criteria:**
+- [ ] AC-2.1: All 6 ralph-specum skills have `version: 0.1.0` field
+- [ ] AC-2.2: All 4 ralph-speckit skills have `version: 0.1.0` field
+- [ ] AC-2.3: All skill descriptions use third-person format ("This skill should be used when...")
+- [ ] AC-2.4: All skill descriptions include at least 3 trigger phrases in quotes
+- [ ] AC-2.5: interview-framework skill description is rewritten in correct format
+
+### US-3: Hook Matcher Fields
+**As a** plugin developer
+**I want** all hook entries to have an explicit `matcher` field
+**So that** hook configuration matches official plugin patterns
+
+**Acceptance Criteria:**
+- [ ] AC-3.1: ralph-specum hooks.json Stop entry has `matcher: "*"`
+- [ ] AC-3.2: ralph-specum hooks.json SessionStart entry has `matcher: "*"`
+- [ ] AC-3.3: ralph-speckit hooks.json Stop entry has `matcher: "*"`
+
+### US-4: Command Migration and Fixes
+**As a** ralph-speckit user
+**I want** all commands consolidated in `commands/` with proper frontmatter
+**So that** the plugin follows standard structure
+
+**Acceptance Criteria:**
+- [ ] AC-4.1: All 5 modern ralph-speckit commands have `name` field
+- [ ] AC-4.2: All 9 legacy commands in `.claude/commands/` accounted for: 8 migrated to `commands/`, the duplicate handled per AC-4.3
+- [ ] AC-4.3: Duplicate implement.md resolved (keep one, remove other)
+- [ ] AC-4.4: Legacy `.claude/commands/` directory removed
+- [ ] AC-4.5: Migrated commands have proper frontmatter (name, description, allowed_tools)
+
+### US-5: Validation and Documentation
+**As a** plugin maintainer
+**I want** validation scripts and updated documentation
+**So that** future changes maintain compliance
+
+**Acceptance Criteria:**
+- [ ] AC-5.1: Validation script checks all agents have `color` field
+- [ ] AC-5.2: Validation script checks all agents have `<example>` blocks
+- [ ] AC-5.3: Validation script checks all skills have `version` field
+- [ ] AC-5.4: Validation script checks all hooks have `matcher` field
+- [ ] AC-5.5: CLAUDE.md updated with best practices reference
+
+---
+
+## Functional Requirements
+
+| ID | Requirement | Priority | Acceptance Criteria |
+|----|-------------|----------|---------------------|
+| FR-1 | Add `color` field to all 14 agents | P0 | Agents render with semantic colors |
+| FR-2 | Add `<example>` blocks to all 14 agent descriptions | P0 | Each agent has 2+ examples with correct format |
+| FR-3 | Fix skill descriptions to third-person format | P0 | All descriptions start with "This skill should be used when" |
+| FR-4 | Add `version: 0.1.0` to all 10 skills | P1 | All skills report version in metadata |
+| FR-5 | Add `matcher: "*"` to all hook entries | P1 | Hook config matches official patterns |
+| FR-6 | Add `name` field to 5 ralph-speckit commands | P1 | Commands register with correct names |
+| FR-7 | Migrate the 8 non-duplicate legacy commands to `commands/` | P1 | All commands in standard location |
+| FR-8 | Remove duplicate implement.md | P1 | Only one implement command exists |
+| FR-9 | Create validation script | P2 | Script exits non-zero on compliance failures |
+| FR-10 | Update CLAUDE.md documentation | P2 | Best practices documented |
+
+---
+
+## Non-Functional Requirements
+
+| ID | Requirement | Metric | Target |
+|----|-------------|--------|--------|
+| NFR-1 | Backward compatibility | Breaking changes | 0 breaking changes to existing workflows |
+| NFR-2 | Validation speed | Script runtime | < 5 seconds |
+| NFR-3 | Code consistency | Style | Match official plugin-dev patterns exactly |
+
+---
+
+## Glossary
+
+- **Agent**: Subagent definition (markdown file in `agents/`) invoked via Task tool
+- **Skill**: Contextual knowledge (markdown in `skills/*/SKILL.md`) auto-loaded when relevant
+- **Hook**: Event-driven action (JSON in `hooks/hooks.json`) triggered on lifecycle events
+- **Command**: Slash command (markdown in `commands/`) invoked by user
+- **Matcher**: Hook field specifying which events trigger the hook (`*` = all)
+- **Frontmatter**: YAML metadata block at top of markdown files (between `---` markers)
+- **Third-person description**: Format starting with "This skill/agent should be used when..."
+
+---
+
+## Out of Scope
+
+- Adding new agents, skills, or commands
+- Changing agent behavior or prompts beyond frontmatter fixes
+- Adding `tools` restrictions to agents (noted in research but not required)
+- Adding SessionStart hook to ralph-speckit (optional enhancement)
+- Enhancing plugin.json with repository/homepage fields (nice-to-have)
+- Performance optimization of plugins
+- CI/CD integration of validation script
+
+---
+
+## Dependencies
+
+- plugin-dev skills must be available for reference patterns
+- Both plugins must be in working state before refactor
+- No external service dependencies
+
+---
+
+## Risk Assessment
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Breaking agent triggering | High | Medium | Test each agent after color/example changes |
+| Legacy command loss | High | Low | Backup before migration, verify all commands work |
+| Skill trigger regression | Medium | Low | Test skill matching after description changes |
+
+---
+
+## Success Criteria
+
+1. All 61 identified issues resolved
+2. Validation script passes for both plugins
+3. No regressions in existing plugin functionality
+4. Both plugins match official plugin-dev patterns
+
+---
+
+## Unresolved Questions
+
+- Should colors be unique per agent or grouped by function? (Recommendation: group by function per research.md)
+- Should ralph-speckit get SessionStart hook? (Out of scope for this refactor)
+
+---
+
+## Next Steps
+
+1. Run `/ralph-specum:design` to create technical design for implementation
+2. Define file-by-file change plan for each component type
+3. Create task breakdown with quality checkpoints

From 15ad4843c093e983b8d349ccd1d00f3fd6cac5ac Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Thu, 29 Jan 2026 22:33:28 +0200
Subject: [PATCH 02/37] spec(refactor-plugins): add technical design

Two-phase approach:
- Phase A: Metadata fixes (color, version, matcher, examples)
- Phase B: Skill consolidation (11 new skills, simplified commands/agents)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 specs/refactor-plugins/design.md   | 520 +++++++++++++++++++++++++++++
 specs/refactor-plugins/research.md | 360 ++++++++++++++++++++
 2 files changed, 880 insertions(+)
 create mode 100644 specs/refactor-plugins/design.md
 create mode 100644 specs/refactor-plugins/research.md

diff --git a/specs/refactor-plugins/design.md b/specs/refactor-plugins/design.md
new file mode 100644
index 00000000..9e94168c
--- /dev/null
+++ b/specs/refactor-plugins/design.md
@@ -0,0 +1,520 @@
+---
+spec: refactor-plugins
+phase: design
+created: 2026-01-29
+updated: 2026-01-29
+---
+
+# Design: Plugin Refactoring to Best Practices
+
+## Overview
+
+Two-phase refactoring: (1) add missing frontmatter fields (color, version, matcher) and example blocks; (2) consolidate procedural logic from commands and agents into skills. Commands become thin wrappers; agents reference skills for expertise.
+
+## Design Inputs (Interview Responses)
+
+| Parameter | Value |
+|-----------|-------|
+| Architecture style | Edit files in place |
+| Technology constraints | Bash only for validation |
+| Integration approach | Minimal - frontmatter/metadata only |
+| New direction | Consolidate heavy-lifting into skills |
+
+## Skill Consolidation Strategy
+
+### Current State Analysis
+
+**Commands** (heavy - 100-1200 lines each):
+- `implement.md`: 1200+ lines of coordinator logic, recovery orchestration, state machine
+- `research.md`: 700+ lines including parallel research patterns, merge logic
+- `start.md`: 980+ lines with branch management, quick mode, intent classification
+- `design.md`, `requirements.md`, `tasks.md`: roughly 300 lines each with interview patterns
+
+**Agents** (moderate - 250-500 lines each):
+- `spec-executor.md`: 440 lines with execution rules, phase-specific rules, commit discipline
+- `task-planner.md`: 520 lines with POC workflow, quality checkpoints, VF task generation
+- `research-analyst.md`: 340 lines with quality command discovery, methodology
+- `architect-reviewer.md`: 250 lines with design structure template
+
+**Skills** (light - 50-200 lines each):
+- `interview-framework`: 200 lines - already well-consolidated
+- `communication-style`: 105 lines - good
+- `delegation-principle`: 48 lines - good
+- `spec-workflow`: 45 lines - just command reference
+- `reality-verification`: ~100 lines - good
+- `smart-ralph`: ~100 lines - good
+
+### Consolidation Goals
+
+| Goal | Benefit |
+|------|---------|
+| Skills = reusable knowledge | Referenced by multiple commands/agents |
+| Commands = thin orchestration | ~80-200 lines, delegate to skills |
+| Agents = focused expertise | Reference skills, don't duplicate patterns |
+
+### Content Migration Matrix
+
+| Source | Content to Extract | New Skill | Lines Saved |
+|--------|-------------------|-----------|-------------|
+| implement.md | Recovery orchestration (6b-6d) | `failure-recovery` | ~400 |
+| implement.md | Verification layers (7) | `verification-layers` | ~70 |
+| implement.md | Coordinator prompt pattern | `coordinator-pattern` | ~150 |
+| start.md | Branch management logic | `branch-management` | ~200 |
+| start.md | Intent classification | `intent-classification` | ~100 |
+| start.md | Spec scanner | `spec-scanner` | ~80 |
+| research.md | Parallel execution pattern | `parallel-research` | ~300 |
+| research.md | Merge results algorithm | (in parallel-research) | ~100 |
+| spec-executor.md | Phase-specific rules | `phase-rules` | ~50 |
+| spec-executor.md | Commit discipline | `commit-discipline` | ~60 |
+| task-planner.md | POC-first workflow rules | `poc-workflow` | ~80 |
+| task-planner.md | Quality checkpoint rules | `quality-checkpoints` | ~100 |
+| task-planner.md | VF task generation | (in reality-verification) | ~40 |
+| research-analyst.md | Quality command discovery | `quality-commands` | ~80 |
+
+### New Skills to Create
+
+| Skill | Purpose | Referenced By |
+|-------|---------|---------------|
+| `failure-recovery` | Iterative fix task generation, recovery loop | implement.md, coordinator |
+| `verification-layers` | 4-layer verification (contradiction, uncommitted, checkmark, signal) | implement.md, spec-executor |
+| `coordinator-pattern` | Task delegation, state management, completion signal | implement.md |
+| `branch-management` | Branch creation, worktree setup, naming conventions | start.md |
+| `intent-classification` | Goal analysis for question counts | start.md, all phase commands |
+| `spec-scanner` | Related specs discovery | start.md, research.md |
+| `parallel-research` | Multi-agent research spawning, merge logic | research.md |
+| `phase-rules` | POC/Refactor/Testing/Quality phase behaviors | spec-executor, task-planner |
+| `commit-discipline` | Commit rules, spec file inclusion | spec-executor |
+| `quality-checkpoints` | [VERIFY] task format, frequency rules | task-planner |
+| `quality-commands` | Discovery from package.json/Makefile/CI | research-analyst |
+
+### Command Simplification Plan
+
+After consolidation, commands become thin orchestrators:
+
+```markdown
+# Before: implement.md (1200 lines)
+- Full coordinator prompt inline
+- Recovery orchestration logic inline
+- Verification layers inline
+- State machine inline
+
+# After: implement.md (~150 lines)
+1. Determine active spec
+2. Validate prerequisites
+3. Parse arguments
+4. Initialize state
+5. Reference skills:
+   - Invoke skill: coordinator-pattern
+   - Invoke skill: failure-recovery (if --recovery-mode)
+   - Invoke skill: verification-layers
+6. Invoke Ralph Loop
+```
+
+| Command | Before | After | References Skills |
+|---------|--------|-------|-------------------|
+| implement.md | 1200 | ~150 | coordinator-pattern, failure-recovery, verification-layers |
+| start.md | 980 | ~200 | branch-management, intent-classification, spec-scanner, interview-framework |
+| research.md | 700 | ~150 | parallel-research, interview-framework |
+| design.md | 300 | ~80 | interview-framework |
+| requirements.md | 294 | ~80 | interview-framework |
+| tasks.md | 314 | ~80 | interview-framework |
+
+### Agent Simplification Plan
+
+Agents reference skills instead of duplicating patterns:
+
+```markdown
+# Before: spec-executor.md (440 lines)
+- Phase rules inline (50 lines)
+- Commit discipline inline (60 lines)
+- Verification handling inline (100 lines)
+
+# After: spec-executor.md (~200 lines)
+- Core execution logic
+- Reference skill: phase-rules
+- Reference skill: commit-discipline
+- Reference skill: verification-layers (for [VERIFY] tasks)
+```
+
+| Agent | Before | After | References Skills |
+|-------|--------|-------|-------------------|
+| spec-executor.md | 440 | ~200 | phase-rules, commit-discipline, verification-layers |
+| task-planner.md | 520 | ~250 | poc-workflow, quality-checkpoints, phase-rules |
+| research-analyst.md | 340 | ~200 | quality-commands |
+| architect-reviewer.md | 250 | ~200 | (minimal change - mostly template) |
+
+### Skill Reference Pattern
+
+Commands/agents reference skills using:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/failure-recovery/SKILL.md`
+Use the failure recovery pattern when spec-executor does not output TASK_COMPLETE and recoveryMode is true.
+</skill-reference>
+```
+
+Or inline reference:
+```markdown
+**Failure Recovery**: Apply standard recovery loop from `skills/failure-recovery/SKILL.md`
+```
+
+## Change Strategy
+
+**Approach**: Two-phase refactoring
+1. **Phase A**: Metadata fixes (original scope) - color, version, matcher, examples
+2. **Phase B**: Skill consolidation - extract patterns to skills, simplify commands/agents
+
+**Safety**:
+- Phase A metadata changes are additive (fields are added, none removed); the only deletions are the legacy `.claude/commands/` files after migration
+- Skill references maintain full behavior
+- Backward compatible - existing usage unchanged
+- Phase A can be deployed independently
+
+## File Change Matrix
+
+### ralph-specum Agents (8 files)
+
+| File | Add `color` | Add `<example>` blocks |
+|------|-------------|------------------------|
+| `plugins/ralph-specum/agents/research-analyst.md` | `blue` | 2 examples |
+| `plugins/ralph-specum/agents/product-manager.md` | `cyan` | 2 examples |
+| `plugins/ralph-specum/agents/architect-reviewer.md` | `blue` | 2 examples |
+| `plugins/ralph-specum/agents/task-planner.md` | `cyan` | 2 examples |
+| `plugins/ralph-specum/agents/spec-executor.md` | `green` | 2 examples |
+| `plugins/ralph-specum/agents/plan-synthesizer.md` | `green` | 2 examples |
+| `plugins/ralph-specum/agents/qa-engineer.md` | `yellow` | 2 examples |
+| `plugins/ralph-specum/agents/refactor-specialist.md` | `magenta` | 2 examples |
+
+### ralph-speckit Agents (6 files)
+
+| File | Add `color` | Add `<example>` blocks |
+|------|-------------|------------------------|
+| `plugins/ralph-speckit/agents/constitution-architect.md` | `magenta` | 2 examples |
+| `plugins/ralph-speckit/agents/spec-analyst.md` | `blue` | 2 examples |
+| `plugins/ralph-speckit/agents/qa-engineer.md` | `yellow` | 2 examples |
+| `plugins/ralph-speckit/agents/spec-executor.md` | `green` | 2 examples |
+| `plugins/ralph-speckit/agents/plan-architect.md` | `cyan` | 2 examples |
+| `plugins/ralph-speckit/agents/task-planner.md` | `cyan` | 2 examples |
+
+### ralph-specum Skills (6 files)
+
+| File | Add `version` | Fix description |
+|------|---------------|-----------------|
+| `plugins/ralph-specum/skills/communication-style/SKILL.md` | `0.1.0` | No (OK) |
+| `plugins/ralph-specum/skills/delegation-principle/SKILL.md` | `0.1.0` | No (OK) |
+| `plugins/ralph-specum/skills/interview-framework/SKILL.md` | `0.1.0` | Yes - rewrite |
+| `plugins/ralph-specum/skills/reality-verification/SKILL.md` | `0.1.0` | No (OK) |
+| `plugins/ralph-specum/skills/smart-ralph/SKILL.md` | `0.1.0` | No (OK) |
+| `plugins/ralph-specum/skills/spec-workflow/SKILL.md` | `0.1.0` | No (OK) |
+
+### ralph-speckit Skills (4 files)
+
+| File | Add `version` | Fix description |
+|------|---------------|-----------------|
+| `plugins/ralph-speckit/skills/communication-style/SKILL.md` | `0.1.0` | Yes - rewrite |
+| `plugins/ralph-speckit/skills/delegation-principle/SKILL.md` | `0.1.0` | Yes - rewrite |
+| `plugins/ralph-speckit/skills/smart-ralph/SKILL.md` | `0.1.0` | Yes - rewrite |
+| `plugins/ralph-speckit/skills/speckit-workflow/SKILL.md` | `0.1.0` | Yes - rewrite |
+
+### Hooks (2 files)
+
+| File | Change |
+|------|--------|
+| `plugins/ralph-specum/hooks/hooks.json` | Add `"matcher": "*"` to Stop and SessionStart entries |
+| `plugins/ralph-speckit/hooks/hooks.json` | Add `"matcher": "*"` to Stop entry |
+
+### ralph-speckit Commands (5 files + 9 legacy)
+
+**Modern commands - add `name` field:**
+
+| File | Add `name` |
+|------|-----------|
+| `plugins/ralph-speckit/commands/start.md` | `start` |
+| `plugins/ralph-speckit/commands/status.md` | `status` |
+| `plugins/ralph-speckit/commands/switch.md` | `switch` |
+| `plugins/ralph-speckit/commands/cancel.md` | `cancel` |
+| `plugins/ralph-speckit/commands/implement.md` | `implement` |
+
+**Legacy commands - migrate from `.claude/commands/` to `commands/`:**
+
+| Source | Destination | Add frontmatter |
+|--------|-------------|-----------------|
+| `.claude/commands/speckit.analyze.md` | `commands/analyze.md` | name, allowed_tools |
+| `.claude/commands/speckit.checklist.md` | `commands/checklist.md` | name, allowed_tools |
+| `.claude/commands/speckit.clarify.md` | `commands/clarify.md` | name, allowed_tools |
+| `.claude/commands/speckit.constitution.md` | `commands/constitution.md` | name, allowed_tools |
+| `.claude/commands/speckit.implement.md` | REMOVE (duplicate) | - |
+| `.claude/commands/speckit.plan.md` | `commands/plan.md` | name, allowed_tools |
+| `.claude/commands/speckit.specify.md` | `commands/specify.md` | name, allowed_tools |
+| `.claude/commands/speckit.tasks.md` | `commands/tasks.md` | name, allowed_tools |
+| `.claude/commands/speckit.taskstoissues.md` | `commands/taskstoissues.md` | name, allowed_tools |
+
+**Post-migration**: Remove `plugins/ralph-speckit/.claude/commands/` directory.
+
+### Validation Script (1 new file)
+
+| File | Purpose |
+|------|---------|
+| `scripts/validate-plugins.sh` | Check compliance, exit non-zero on failure |
+
+### Documentation (1 file)
+
+| File | Change |
+|------|--------|
+| `CLAUDE.md` | Add plugin best practices reference section |
+
+## Agent Color Assignments
+
+Color grouping by semantic function:
+
+| Color | Function | Agents |
+|-------|----------|--------|
+| `blue` | Analysis, review, investigation | research-analyst, architect-reviewer, spec-analyst |
+| `cyan` | Planning, coordination | product-manager, task-planner (both), plan-architect |
+| `green` | Generation, execution | spec-executor (both), plan-synthesizer |
+| `yellow` | Validation, quality | qa-engineer (both) |
+| `magenta` | Transformation, creative | refactor-specialist, constitution-architect |
+
+## Example Block Format
+
+Each agent gets 2 examples in description:
+
+```markdown
+<example>
+Context: [Scenario setup]
+user: "[User message]"
+assistant: "[Claude response about using agent]"
+<commentary>
+[Why this triggers the agent]
+</commentary>
+</example>
+```
+
+## Validation Script Design
+
+**Location**: `scripts/validate-plugins.sh`
+
+**Checks**:
+
+```bash
+#!/bin/bash
+# Plugin compliance validation
+
+ERRORS=0
+
+# 1. Check agents have color field
+for agent in plugins/*/agents/*.md; do
+  if ! grep -q "^color:" "$agent"; then
+    echo "FAIL: Missing color in $agent"
+    ((ERRORS++))
+  fi
+done
+
+# 2. Check agents have <example> blocks (at least 2)
+for agent in plugins/*/agents/*.md; do
+  count=$(grep -c "<example>" "$agent" || true)  # grep -c already prints 0 when nothing matches
+  if [ "$count" -lt 2 ]; then
+    echo "FAIL: Need 2+ examples in $agent (found $count)"
+    ((ERRORS++))
+  fi
+done
+
+# 3. Check skills have version field
+for skill in plugins/*/skills/*/SKILL.md; do
+  if ! grep -q "^version:" "$skill"; then
+    echo "FAIL: Missing version in $skill"
+    ((ERRORS++))
+  fi
+done
+
+# 4. Check hooks have matcher field
+for hooks in plugins/*/hooks/hooks.json; do
+  if ! grep -q '"matcher"' "$hooks"; then
+    echo "FAIL: Missing matcher in $hooks"
+    ((ERRORS++))
+  fi
+done
+
+# 5. Check no legacy commands remain
+if [ -d "plugins/ralph-speckit/.claude/commands" ]; then
+  echo "FAIL: Legacy commands directory still exists"
+  ((ERRORS++))
+fi
+
+# Summary
+if [ $ERRORS -eq 0 ]; then
+  echo "PASS: All plugins compliant"
+  exit 0
+else
+  echo "FAIL: $ERRORS compliance issues"
+  exit 1
+fi
+```
+
+## Execution Order
+
+### Phase A: Metadata Fixes (Original Scope)
+
+| Step | Tasks | Files |
+|------|-------|-------|
+| A1 | Fix ralph-specum agents (color + examples) | 8 files |
+| A2 | Fix ralph-speckit agents (color + examples) | 6 files |
+| A3 | Fix ralph-specum skills (version + interview-framework description) | 6 files |
+| A4 | Fix ralph-speckit skills (version + descriptions) | 4 files |
+| A5 | Fix hooks (matcher) | 2 files |
+| A6 | Fix ralph-speckit modern commands (name) | 5 files |
+| A7 | Migrate legacy commands | 8 files created |
+| A8 | Remove legacy directory | 1 directory |
+| A9 | Create validation script | 1 file |
+| A10 | Update CLAUDE.md | 1 file |
+| QC-A | Run Phase A validation, verify no regressions | - |
+
+### Phase B: Skill Consolidation (New Scope)
+
+| Step | Tasks | Files |
+|------|-------|-------|
+| B1 | Create failure-recovery skill | 1 skill |
+| B2 | Create verification-layers skill | 1 skill |
+| B3 | Create coordinator-pattern skill | 1 skill |
+| B4 | Create branch-management skill | 1 skill |
+| B5 | Create intent-classification skill | 1 skill |
+| B6 | Create spec-scanner skill | 1 skill |
+| B7 | Create parallel-research skill | 1 skill |
+| B8 | Create phase-rules skill | 1 skill |
+| B9 | Create commit-discipline skill | 1 skill |
+| B10 | Create quality-checkpoints skill | 1 skill |
+| B11 | Create quality-commands skill | 1 skill |
+| B12 | Simplify implement.md | 1 command |
+| B13 | Simplify start.md | 1 command |
+| B14 | Simplify research.md | 1 command |
+| B15 | Simplify design.md, requirements.md, tasks.md | 3 commands |
+| B16 | Simplify spec-executor.md | 1 agent |
+| B17 | Simplify task-planner.md | 1 agent |
+| B18 | Simplify research-analyst.md | 1 agent |
+| QC-B | Run Phase B validation, test skill references | - |
+
+## Technical Decisions
+
+### Phase A Decisions
+
+| Decision | Options | Choice | Rationale |
+|----------|---------|--------|-----------|
+| Edit approach | In-place vs temp files | In-place | Simpler, git tracks changes |
+| Color strategy | Unique per agent vs grouped | Grouped by function | Consistent semantic meaning |
+| Legacy command naming | Keep speckit. prefix vs strip | Strip prefix | Match modern command style |
+| Validation location | scripts/ vs plugin dir | scripts/ | Project-wide, not plugin-specific |
+| Skill version | Use plugin version vs 0.1.0 | 0.1.0 | Standard initial version |
+
+### Phase B Decisions
+
+| Decision | Options | Choice | Rationale |
+|----------|---------|--------|-----------|
+| Skill extraction granularity | Few large vs many small | Many small (11) | Better reuse, focused context |
+| Reference pattern | Inline expand vs reference | Reference with summary | Keeps commands/agents readable |
+| Phase B timing | Same PR vs separate | Same PR | Single atomic refactor |
+| Skill naming | Generic vs specific | Specific (failure-recovery not just recovery) | Clear intent, searchable |
+| Skill organization | Flat vs grouped | Flat in skills/ | Plugin convention, auto-discovery |
+| Content preserved vs trimmed | Keep full detail vs summarize | Keep full detail in skill | Skills are the source of truth |
+
+## Error Handling
+
+| Error | Handling |
+|-------|----------|
+| Edit fails to match | Investigate file format, use Read to verify |
+| Legacy command has unique content | Compare with modern version, keep better |
+| Validation fails after changes | Review failed check, fix file |
+
+## Rollback Strategy
+
+Git provides rollback:
+```bash
+git checkout -- plugins/  # Revert uncommitted changes to tracked plugin files
+git checkout -- scripts/  # Revert uncommitted changes to tracked scripts
+```
+
+No database, no external state, no destructive operations.
+
+## Test Strategy
+
+**Validation script**: Run after all changes, must pass
+**Manual verification**:
+- Restart Claude Code with `--plugin-dir` for both plugins
+- Test agent triggering via Task tool
+- Test skill loading via context matching
+- Test command invocation
+
+## File Summary
+
+### Phase A: Metadata Fixes
+
+| Category | Files Changed | Files Created | Files Deleted |
+|----------|---------------|---------------|---------------|
+| Agents | 14 | 0 | 0 |
+| Skills (version) | 10 | 0 | 0 |
+| Hooks | 2 | 0 | 0 |
+| Commands | 5 | 8 | 9 |
+| Scripts | 0 | 1 | 0 |
+| Docs | 1 | 0 | 0 |
+| **Subtotal A** | **32** | **9** | **9** |
+
+### Phase B: Skill Consolidation
+
+| Category | Files Changed | Files Created | Files Deleted |
+|----------|---------------|---------------|---------------|
+| Commands (simplify) | 6 | 0 | 0 |
+| Agents (simplify) | 4 | 0 | 0 |
+| Skills (new) | 0 | 11 | 0 |
+| **Subtotal B** | **10** | **11** | **0** |
+
+### Combined Total
+
+| Category | Files Changed | Files Created | Files Deleted |
+|----------|---------------|---------------|---------------|
+| **Grand Total** | **42** | **20** | **9** |
+
+## Implementation Steps
+
+### Phase A: Metadata Fixes
+
+1. Add `color` field to 8 ralph-specum agents
+2. Add `<example>` blocks to 8 ralph-specum agents
+3. Add `color` field to 6 ralph-speckit agents
+4. Add `<example>` blocks to 6 ralph-speckit agents
+5. Add `version: 0.1.0` to 6 ralph-specum skills
+6. Add `version: 0.1.0` to 4 ralph-speckit skills
+7. Rewrite interview-framework skill description (ralph-specum)
+8. Rewrite 4 ralph-speckit skill descriptions
+9. Add `"matcher": "*"` to ralph-specum hooks.json
+10. Add `"matcher": "*"` to ralph-speckit hooks.json
+11. Add `name` field to 5 ralph-speckit commands
+12. Migrate 8 legacy commands to `commands/` with frontmatter
+13. Remove duplicate speckit.implement.md
+14. Remove `.claude/commands/` directory
+15. Create `scripts/validate-plugins.sh`
+16. Update CLAUDE.md with best practices reference
+17. Run validation script for Phase A
+18. Test both plugins with Claude Code
+
+### Phase B: Skill Consolidation
+
+19. Create `skills/failure-recovery/SKILL.md` from implement.md sections 6b-6d
+20. Create `skills/verification-layers/SKILL.md` from implement.md section 7
+21. Create `skills/coordinator-pattern/SKILL.md` from implement.md coordinator prompt
+22. Create `skills/branch-management/SKILL.md` from start.md
+23. Create `skills/intent-classification/SKILL.md` from start.md
+24. Create `skills/spec-scanner/SKILL.md` from start.md
+25. Create `skills/parallel-research/SKILL.md` from research.md
+26. Create `skills/phase-rules/SKILL.md` from spec-executor.md
+27. Create `skills/commit-discipline/SKILL.md` from spec-executor.md
+28. Create `skills/quality-checkpoints/SKILL.md` from task-planner.md
+29. Create `skills/quality-commands/SKILL.md` from research-analyst.md
+30. Simplify implement.md to reference skills
+31. Simplify start.md to reference skills
+32. Simplify research.md to reference skills
+33. Simplify design.md, requirements.md, tasks.md to reference interview-framework
+34. Simplify spec-executor.md to reference skills
+35. Simplify task-planner.md to reference skills
+36. Simplify research-analyst.md to reference skills
+37. Run validation script for Phase B
+38. Test skill references work correctly
diff --git a/specs/refactor-plugins/research.md b/specs/refactor-plugins/research.md
new file mode 100644
index 00000000..893a7d8e
--- /dev/null
+++ b/specs/refactor-plugins/research.md
@@ -0,0 +1,360 @@
+---
+spec: refactor-plugins
+phase: research
+created: 2026-01-29
+---
+
+# Research: refactor-plugins
+
+## Executive Summary
+
+Analysis of `ralph-specum` and `ralph-speckit` plugins against plugin-dev skills best practices reveals both plugins are functional but have significant gaps. The research used:
+1. All plugin-dev skills for best practice patterns
+2. Official Claude Code plugins (plugin-dev, feature-dev, hookify, ralph-loop) as reference implementations
+
+**Key findings:**
+- **37 total issues** to fix across both plugins
+- Agents: Missing `color` field (CRITICAL) and `<example>` blocks in descriptions (CRITICAL)
+- Skills: Missing `version` field; some use an incorrect description format
+- Hooks: Missing `matcher` field in hook entries
+- Commands: ralph-speckit has legacy commands in `.claude/commands/` that need migration
+
+## Official Plugin Reference Patterns
+
+### Agent Pattern (from plugin-dev/agents/agent-creator.md)
+
+```markdown
+---
+name: agent-creator
+description: Use this agent when the user asks to "create an agent", "generate an agent", "build a new agent"... Examples:
+
+<example>
+Context: User wants to create a code review agent
+user: "Create an agent that reviews code for quality issues"
+assistant: "I'll use the agent-creator agent to generate the agent configuration."
+<commentary>
+User requesting new agent creation, trigger agent-creator to generate it.
+</commentary>
+</example>
+
+model: sonnet
+color: magenta
+tools: ["Write", "Read"]
+---
+```
+
+### Skill Pattern (from plugin-dev/skills/*)
+
+```markdown
+---
+name: Skill Name
+description: This skill should be used when the user asks to "specific phrase 1", "specific phrase 2", "specific phrase 3". Include exact phrases users would say.
+version: 0.1.0
+---
+```
+
+### Hook Pattern (from ralph-loop/hooks/hooks.json)
+
+```json
+{
+  "description": "Brief explanation of hooks",
+  "hooks": {
+    "Stop": [
+      {
+        "matcher": "*",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "${CLAUDE_PLUGIN_ROOT}/hooks/script.sh"
+          }
+        ]
+      }
+    ]
+  }
+}
+```
+
+**Note:** The `matcher` field can be omitted when a hook should run for every occurrence of its event, but official plugins include it explicitly.
+
+### Color Guidelines (from agent-development skill)
+
+| Color | Use For |
+|-------|---------|
+| blue/cyan | Analysis, review, investigation |
+| green | Generation, creation, success-oriented |
+| yellow | Validation, warnings, caution |
+| red | Security, critical analysis |
+| magenta | Transformation, creative, refactoring |
+
+---
+
+## ralph-specum Gap Analysis
+
+### Agents (8 total)
+
+| Agent | `color` | `<example>` blocks | Issues |
+|-------|-------|------------------|--------|
+| research-analyst | MISSING | MISSING | 2 |
+| product-manager | MISSING | MISSING | 2 |
+| architect-reviewer | MISSING | MISSING | 2 |
+| task-planner | MISSING | MISSING | 2 |
+| spec-executor | MISSING | MISSING | 2 |
+| plan-synthesizer | MISSING | MISSING | 2 |
+| qa-engineer | MISSING | MISSING | 2 |
+| refactor-specialist | MISSING | MISSING | 2 |
+
+**Total Agent Issues: 16 CRITICAL**
+
+**Recommended colors:**
+- research-analyst: `blue` (investigation)
+- product-manager: `cyan` (analysis)
+- architect-reviewer: `blue` (review)
+- task-planner: `cyan` (planning)
+- spec-executor: `green` (execution)
+- plan-synthesizer: `green` (generation)
+- qa-engineer: `yellow` (validation)
+- refactor-specialist: `magenta` (transformation)
+
+### Skills (6 total)
+
+| Skill | version | description format | Issues |
+|-------|---------|-------------------|--------|
+| communication-style | MISSING | OK (third-person) | 1 |
+| delegation-principle | MISSING | OK (third-person) | 1 |
+| interview-framework | MISSING | WRONG (not third-person) | 2 |
+| reality-verification | MISSING | OK (third-person) | 1 |
+| smart-ralph | MISSING | OK (third-person) | 1 |
+| spec-workflow | MISSING | OK (third-person) | 1 |
+
+**Total Skill Issues: 7 (6 HIGH + 1 CRITICAL)**
+
+### Hooks (hooks/hooks.json)
+
+| Entry | matcher field | Status |
+|-------|--------------|--------|
+| Stop | MISSING | NEEDS FIX |
+| SessionStart | MISSING | NEEDS FIX |
+
+**Total Hook Issues: 2 HIGH**
+
+### Commands (13 total)
+
+Commands follow best practices with proper frontmatter. No issues found.
+
+---
+
+## ralph-speckit Gap Analysis
+
+### Agents (6 total)
+
+| Agent | `color` | `<example>` blocks | Issues |
+|-------|-------|------------------|--------|
+| constitution-architect | MISSING | MISSING | 2 |
+| spec-analyst | MISSING | MISSING | 2 |
+| qa-engineer | MISSING | MISSING | 2 |
+| spec-executor | MISSING | MISSING | 2 |
+| plan-architect | MISSING | MISSING | 2 |
+| task-planner | MISSING | MISSING | 2 |
+
+**Total Agent Issues: 12 CRITICAL**
+
+### Skills (4 total)
+
+| Skill | version | description format | Issues |
+|-------|---------|-------------------|--------|
+| communication-style | MISSING | WRONG (not third-person) | 2 |
+| delegation-principle | MISSING | WRONG (not third-person) | 2 |
+| smart-ralph | MISSING | WRONG (not third-person) | 2 |
+| speckit-workflow | MISSING | WRONG (not third-person) | 2 |
+
+**Total Skill Issues: 8 CRITICAL**
+
+### Hooks (hooks/hooks.json)
+
+| Entry | matcher field | Status |
+|-------|--------------|--------|
+| Stop | MISSING | NEEDS FIX |
+
+**Total Hook Issues: 1 HIGH**
+
+### Commands
+
+**Modern commands (5 in `commands/`):**
+| Command | name field | Status |
+|---------|-----------|--------|
+| start.md | MISSING | NEEDS FIX |
+| status.md | MISSING | NEEDS FIX |
+| switch.md | MISSING | NEEDS FIX |
+| cancel.md | MISSING | NEEDS FIX |
+| implement.md | MISSING | NEEDS FIX |
+
+**Legacy commands (9 in `.claude/commands/`):**
+- speckit.analyze.md
+- speckit.checklist.md
+- speckit.clarify.md
+- speckit.constitution.md
+- speckit.implement.md (DUPLICATE!)
+- speckit.plan.md
+- speckit.specify.md
+- speckit.tasks.md
+- speckit.taskstoissues.md
+
+**Total Command Issues: 5 missing name + 9 need migration + 1 duplicate**
+
+---
+
+## Summary of All Issues
+
+| Plugin | Component | Critical | High | Total |
+|--------|-----------|----------|------|-------|
+| ralph-specum | Agents | 16 | 0 | 16 |
+| ralph-specum | Skills | 1 | 6 | 7 |
+| ralph-specum | Hooks | 0 | 2 | 2 |
+| ralph-speckit | Agents | 12 | 0 | 12 |
+| ralph-speckit | Skills | 8 | 0 | 8 |
+| ralph-speckit | Hooks | 0 | 1 | 1 |
+| ralph-speckit | Commands | 0 | 15 | 15 |
+| **TOTAL** | | **37** | **24** | **61** |
+
+---
+
+## Sample Fixes
+
+### Agent Fix Example (research-analyst.md)
+
+**Current:**
+```yaml
+---
+name: research-analyst
+description: This agent should be used to "research a feature"...
+model: inherit
+---
+```
+
+**Fixed:**
+```yaml
+---
+name: research-analyst
+description: This agent should be used to "research a feature", "analyze feasibility", "explore codebase", "find existing patterns", "gather context before requirements". Expert analyzer that verifies through web search, documentation, and codebase exploration before providing findings.
+
+<example>
+Context: User wants to add authentication to their app
+user: "I need to add OAuth support"
+assistant: "Let me research OAuth best practices and analyze your codebase for existing auth patterns."
+<commentary>
+Research-analyst is triggered to explore OAuth implementations and codebase patterns before requirements phase.
+</commentary>
+</example>
+
+<example>
+Context: User starting new spec
+user: "/ralph-specum:research"
+assistant: "Starting research phase. I'll analyze best practices and your codebase."
+<commentary>
+Research-analyst is explicitly invoked via the research command.
+</commentary>
+</example>
+
+model: inherit
+color: blue
+---
+```
+
+### Skill Fix Example (interview-framework/SKILL.md)
+
+**Current:**
+```yaml
+---
+name: interview-framework
+description: Standard single-question adaptive interview loop used across all spec phases
+---
+```
+
+**Fixed:**
+```yaml
+---
+name: interview-framework
+description: This skill should be used when implementing "interview questions", "adaptive interview loop", "single-question flow", "parameter chain", "question piping", or building interview flows for spec phases.
+version: 0.1.0
+---
+```
+
+### Hooks Fix Example (hooks/hooks.json)
+
+**Current:**
+```json
+"Stop": [
+  {
+    "hooks": [...]
+  }
+]
+```
+
+**Fixed:**
+```json
+"Stop": [
+  {
+    "matcher": "*",
+    "hooks": [...]
+  }
+]
+```
+
+---
+
+## Recommendations for Requirements
+
+### Priority 1: Critical (Must Fix)
+
+1. **Add `color` to all 14 agents** across both plugins
+2. **Add `<example>` blocks to all 14 agent descriptions**
+3. **Add `matcher` to all hook entries** in both plugins
+4. **Fix skill descriptions** to use third-person format with trigger phrases
+
+### Priority 2: High (Should Fix)
+
+5. **Add `version: 0.1.0`** to all 10 skills
+6. **Add `name` field** to ralph-speckit commands
+7. **Migrate legacy commands** from `.claude/commands/` to `commands/`
+8. **Remove duplicate** implement.md in ralph-speckit
+
+### Priority 3: Nice to Have
+
+9. **Enhance plugin.json** with repository, homepage, full author info
+10. **Create validation script** to check plugin compliance
+
+---
+
+## Open Questions
+
+1. Should we use unique colors for each agent or group by function?
+2. Should we add `tools` restrictions to limit agent access?
+3. Should ralph-speckit get a SessionStart hook like ralph-specum?
+
+---
+
+## Sources
+
+### Plugin-Dev Skills
+- plugin-structure/SKILL.md - Directory layout, manifest format
+- agent-development/SKILL.md - Agent frontmatter requirements
+- skill-development/SKILL.md - Skill frontmatter requirements
+- hook-development/SKILL.md - Hook configuration format
+- command-development/SKILL.md - Command frontmatter format
+
+### Official Plugin References
+- plugin-dev/agents/agent-creator.md - Agent with color and examples
+- feature-dev/agents/code-architect.md - Agent with tools restriction
+- ralph-loop/hooks/hooks.json - Hooks with matcher field
+- hookify/skills/writing-rules/SKILL.md - Skill with version
+
+### Analyzed Files
+- plugins/ralph-specum/agents/*.md (8 files)
+- plugins/ralph-specum/skills/*/SKILL.md (6 files)
+- plugins/ralph-specum/hooks/hooks.json
+- plugins/ralph-specum/commands/*.md (13 files)
+- plugins/ralph-speckit/agents/*.md (6 files)
+- plugins/ralph-speckit/skills/*/SKILL.md (4 files)
+- plugins/ralph-speckit/hooks/hooks.json
+- plugins/ralph-speckit/commands/*.md (5 files)
+- plugins/ralph-speckit/.claude/commands/*.md (9 files)

From 9471f9208b8e4b60d55fc8143ba582773f70444f Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 18:40:49 +0200
Subject: [PATCH 03/37] spec(refactor-plugins): add implementation tasks

32 tasks across 2 phases:
- Phase A: 14 tasks (metadata fixes)
- Phase B: 18 tasks (skill consolidation)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 specs/refactor-plugins/tasks.md | 576 ++++++++++++++++++++++++++++++++
 1 file changed, 576 insertions(+)
 create mode 100644 specs/refactor-plugins/tasks.md

diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
new file mode 100644
index 00000000..9fa231cf
--- /dev/null
+++ b/specs/refactor-plugins/tasks.md
@@ -0,0 +1,576 @@
+---
+spec: refactor-plugins
+phase: tasks
+total_tasks: 32
+created: 2026-01-29
+---
+
+# Tasks: Plugin Refactoring to Best Practices
+
+## Phase 1: Make It Work (POC) - Phase A Metadata Fixes
+
+Focus: Fix all missing frontmatter fields (color, version, matcher, name) and add example blocks to agents.
+
+### A1: Agent Metadata
+
+- [ ] 1.1 Add color and examples to ralph-specum agents (8 files)
+  - **Do**:
+    1. For each of 8 agents, add `color` field after `model` in frontmatter
+    2. Add 2 `<example>` blocks with Context/user/assistant/commentary format to description
+    3. Colors: research-analyst=blue, product-manager=cyan, architect-reviewer=blue, task-planner=cyan, spec-executor=green, plan-synthesizer=green, qa-engineer=yellow, refactor-specialist=magenta
+  - **Files**:
+    - `plugins/ralph-specum/agents/research-analyst.md`
+    - `plugins/ralph-specum/agents/product-manager.md`
+    - `plugins/ralph-specum/agents/architect-reviewer.md`
+    - `plugins/ralph-specum/agents/task-planner.md`
+    - `plugins/ralph-specum/agents/spec-executor.md`
+    - `plugins/ralph-specum/agents/plan-synthesizer.md`
+    - `plugins/ralph-specum/agents/qa-engineer.md`
+    - `plugins/ralph-specum/agents/refactor-specialist.md`
+  - **Done when**: All 8 agents have color field and 2+ example blocks
+  - **Verify**: `for f in plugins/ralph-specum/agents/*.md; do grep -q "^color:" "$f" && test $(grep -c "<example>" "$f") -ge 2 || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(ralph-specum): add color and examples to all agents`
+  - _Requirements: AC-1.1, AC-1.3, AC-1.4, AC-1.5_
+  - _Design: ralph-specum Agents, Agent Color Assignments_
+
+- [ ] 1.2 Add color and examples to ralph-speckit agents (6 files)
+  - **Do**:
+    1. For each of 6 agents, add `color` field after `model` in frontmatter
+    2. Add 2 `<example>` blocks with Context/user/assistant/commentary format to description
+    3. Colors: constitution-architect=magenta, spec-analyst=blue, qa-engineer=yellow, spec-executor=green, plan-architect=cyan, task-planner=cyan
+  - **Files**:
+    - `plugins/ralph-speckit/agents/constitution-architect.md`
+    - `plugins/ralph-speckit/agents/spec-analyst.md`
+    - `plugins/ralph-speckit/agents/qa-engineer.md`
+    - `plugins/ralph-speckit/agents/spec-executor.md`
+    - `plugins/ralph-speckit/agents/plan-architect.md`
+    - `plugins/ralph-speckit/agents/task-planner.md`
+  - **Done when**: All 6 agents have color field and 2+ example blocks
+  - **Verify**: `for f in plugins/ralph-speckit/agents/*.md; do grep -q "^color:" "$f" && test $(grep -c "<example>" "$f") -ge 2 || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(ralph-speckit): add color and examples to all agents`
+  - _Requirements: AC-1.2, AC-1.3, AC-1.4, AC-1.5_
+  - _Design: ralph-speckit Agents, Agent Color Assignments_
+
+- [ ] 1.3 [VERIFY] Quality checkpoint: agent metadata
+  - **Do**: Verify all 14 agents have color field and 2+ example blocks
+  - **Verify**: `count=0; for f in plugins/*/agents/*.md; do grep -q "^color:" "$f" && test $(grep -c "<example>" "$f") -ge 2 || ((count++)); done; test $count -eq 0`
+  - **Done when**: All agents pass color and example validation
+  - **Commit**: `chore(plugins): pass agent metadata checkpoint` (only if fixes needed)
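+
+The checkpoint greps above scan the whole file, so a `color:` line or `<example>` tag in an agent's body would also satisfy them. A minimal sketch of a stricter check follows, assuming each agent file opens with frontmatter delimited by two `---` lines; `frontmatter` and `check_agent` are hypothetical helper names, not part of either plugin.
+
+```shell
+#!/usr/bin/env bash
+# Print only the YAML frontmatter (the lines between the first two '---' lines).
+frontmatter() {
+  awk '/^---[[:space:]]*$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$1"
+}
+
+# Succeed when the frontmatter has a color field and at least 2 <example> blocks.
+check_agent() {
+  local fm examples
+  fm=$(frontmatter "$1")
+  grep -q '^color:' <<<"$fm" || return 1
+  examples=$(grep -c '<example>' <<<"$fm")
+  [ "$examples" -ge 2 ]
+}
+```
+
+A loop such as `for f in plugins/*/agents/*.md; do check_agent "$f" || echo "FAIL: $f"; done` would then report only files whose frontmatter is incomplete.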
+
+### A2: Skill Metadata
+
+- [ ] 1.4 Add version to ralph-specum skills (6 files)
+  - **Do**:
+    1. Add `version: 0.1.0` to frontmatter of each skill
+    2. Fix interview-framework description to third-person format with trigger phrases
+  - **Files**:
+    - `plugins/ralph-specum/skills/communication-style/SKILL.md`
+    - `plugins/ralph-specum/skills/delegation-principle/SKILL.md`
+    - `plugins/ralph-specum/skills/interview-framework/SKILL.md`
+    - `plugins/ralph-specum/skills/reality-verification/SKILL.md`
+    - `plugins/ralph-specum/skills/smart-ralph/SKILL.md`
+    - `plugins/ralph-specum/skills/spec-workflow/SKILL.md`
+  - **Done when**: All 6 skills have version field, interview-framework has third-person description
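+  - **Verify**: `for f in plugins/ralph-specum/skills/*/SKILL.md; do grep -q "^version:" "$f" || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq && grep -q "This skill should be used when" plugins/ralph-specum/skills/interview-framework/SKILL.md`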
+  - **Commit**: `feat(ralph-specum): add version to all skills`
+  - _Requirements: AC-2.1, AC-2.5_
+  - _Design: ralph-specum Skills_
+
+- [ ] 1.5 Add version and fix descriptions for ralph-speckit skills (4 files)
+  - **Do**:
+    1. Add `version: 0.1.0` to frontmatter of each skill
+    2. Rewrite all 4 descriptions to third-person format: "This skill should be used when..."
+    3. Include at least 3 trigger phrases in quotes
+  - **Files**:
+    - `plugins/ralph-speckit/skills/communication-style/SKILL.md`
+    - `plugins/ralph-speckit/skills/delegation-principle/SKILL.md`
+    - `plugins/ralph-speckit/skills/smart-ralph/SKILL.md`
+    - `plugins/ralph-speckit/skills/speckit-workflow/SKILL.md`
+  - **Done when**: All 4 skills have version field and third-person descriptions
+  - **Verify**: `for f in plugins/ralph-speckit/skills/*/SKILL.md; do grep -q "^version:" "$f" && grep -q "This skill should be used when" "$f" || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(ralph-speckit): add version and fix descriptions for all skills`
+  - _Requirements: AC-2.2, AC-2.3, AC-2.4_
+  - _Design: ralph-speckit Skills_
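+
+Task 1.5 asks for at least 3 quoted trigger phrases, but its Verify command only checks the opening wording. The sketch below covers both conditions, assuming the description sits on a single `description:` line and is not itself wrapped in YAML quotes; `check_skill_description` is a hypothetical helper name.
+
+```shell
+#!/usr/bin/env bash
+# Succeed when the description is third-person and quotes at least 3 trigger phrases.
+check_skill_description() {
+  local desc phrases
+  # First description line in the file; fail if there is none.
+  desc=$(grep -m1 '^description:' "$1") || return 1
+  case "$desc" in
+    'description: This skill should be used when'*) ;;
+    *) return 1 ;;
+  esac
+  # Count double-quoted phrases on that line.
+  phrases=$(grep -o '"[^"]*"' <<<"$desc" | wc -l)
+  [ "$phrases" -ge 3 ]
+}
+```
+
+It could be run per file, for example `check_skill_description plugins/ralph-speckit/skills/smart-ralph/SKILL.md || echo FAIL`.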
+
+### A3: Hook Metadata
+
+- [ ] 1.6 Add matcher field to hooks (2 files)
+  - **Do**:
+    1. Add `"matcher": "*"` to Stop entry in ralph-specum hooks.json
+    2. Add `"matcher": "*"` to SessionStart entry in ralph-specum hooks.json
+    3. Add `"matcher": "*"` to Stop entry in ralph-speckit hooks.json
+  - **Files**:
+    - `plugins/ralph-specum/hooks/hooks.json`
+    - `plugins/ralph-speckit/hooks/hooks.json`
+  - **Done when**: All hook entries have matcher field
+  - **Verify**: `for f in plugins/*/hooks/hooks.json; do grep -q '"matcher"' "$f" || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(plugins): add matcher field to all hook entries`
+  - _Requirements: AC-3.1, AC-3.2, AC-3.3_
+  - _Design: Hooks_
+
+- [ ] 1.7 [VERIFY] Quality checkpoint: skills and hooks
+  - **Do**: Verify all skills have version and all hooks have matcher
+  - **Verify**: `count=0; for f in plugins/*/skills/*/SKILL.md; do grep -q "^version:" "$f" || ((count++)); done; for f in plugins/*/hooks/hooks.json; do grep -q '"matcher"' "$f" || ((count++)); done; test $count -eq 0`
+  - **Done when**: All skills and hooks pass validation
+  - **Commit**: `chore(plugins): pass skills/hooks checkpoint` (only if fixes needed)
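+
+The hook checks above pass as soon as a file contains one `"matcher"` key, so a `hooks.json` with two entries and a single matcher would still pass. A rough heuristic for a per-entry check is sketched below. It assumes the file is pretty-printed with one key per line, as in the examples in research.md; `check_hook_matchers` is a hypothetical helper name, and a JSON-aware tool would be more robust.
+
+```shell
+#!/usr/bin/env bash
+# Succeed when the file has as many "matcher" keys as hook entries.
+# An entry is recognised by its inner `"hooks": [` array; the top-level
+# `"hooks": {` object is not counted.
+check_hook_matchers() {
+  local entries matchers
+  entries=$(grep -c '"hooks"[[:space:]]*:[[:space:]]*\[' "$1")
+  matchers=$(grep -c '"matcher"[[:space:]]*:' "$1")
+  [ "$entries" -gt 0 ] && [ "$entries" -eq "$matchers" ]
+}
+```
+
+For example, `check_hook_matchers plugins/ralph-specum/hooks/hooks.json` should fail until both the Stop and SessionStart entries carry a matcher.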
+
+### A4: Command Fixes
+
+- [ ] 1.8 Add name field to ralph-speckit modern commands (5 files)
+  - **Do**:
+    1. Add `name: <command>` field to frontmatter of each command
+    2. Names: start, status, switch, cancel, implement
+  - **Files**:
+    - `plugins/ralph-speckit/commands/start.md` (name: start)
+    - `plugins/ralph-speckit/commands/status.md` (name: status)
+    - `plugins/ralph-speckit/commands/switch.md` (name: switch)
+    - `plugins/ralph-speckit/commands/cancel.md` (name: cancel)
+    - `plugins/ralph-speckit/commands/implement.md` (name: implement)
+  - **Done when**: All 5 commands have name field in frontmatter
+  - **Verify**: `for f in plugins/ralph-speckit/commands/*.md; do grep -q "^name:" "$f" || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(ralph-speckit): add name field to modern commands`
+  - _Requirements: AC-4.1_
+  - _Design: ralph-speckit Commands_
+
+- [ ] 1.9 Migrate legacy commands to commands/ directory (8 files)
+  - **Do**:
+    1. For each legacy command in `.claude/commands/`:
+       - Copy to `plugins/ralph-speckit/commands/` with new name (strip speckit. prefix)
+       - Add proper frontmatter: name, description, allowed-tools
+    2. Files to migrate (skip speckit.implement.md - duplicate):
+       - speckit.analyze.md -> analyze.md
+       - speckit.checklist.md -> checklist.md
+       - speckit.clarify.md -> clarify.md
+       - speckit.constitution.md -> constitution.md
+       - speckit.plan.md -> plan.md
+       - speckit.specify.md -> specify.md
+       - speckit.tasks.md -> tasks.md
+       - speckit.taskstoissues.md -> taskstoissues.md
+  - **Files**:
+    - `plugins/ralph-speckit/commands/analyze.md` (create)
+    - `plugins/ralph-speckit/commands/checklist.md` (create)
+    - `plugins/ralph-speckit/commands/clarify.md` (create)
+    - `plugins/ralph-speckit/commands/constitution.md` (create)
+    - `plugins/ralph-speckit/commands/plan.md` (create)
+    - `plugins/ralph-speckit/commands/specify.md` (create)
+    - `plugins/ralph-speckit/commands/tasks.md` (create)
+    - `plugins/ralph-speckit/commands/taskstoissues.md` (create)
+  - **Done when**: All 8 commands exist in commands/ with proper frontmatter
+  - **Verify**: `for cmd in analyze checklist clarify constitution plan specify tasks taskstoissues; do test -f "plugins/ralph-speckit/commands/$cmd.md" && grep -q "^name:" "plugins/ralph-speckit/commands/$cmd.md" || echo "FAIL: $cmd"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `feat(ralph-speckit): migrate legacy commands to commands/`
+  - _Requirements: AC-4.2, AC-4.5_
+  - _Design: Legacy commands migration_
+
+- [ ] 1.10 Remove legacy commands directory
+  - **Do**:
+    1. Verify all commands migrated successfully (from 1.9)
+    2. Delete `.claude/commands/` directory from ralph-speckit
+    3. This removes duplicate speckit.implement.md
+  - **Files**:
+    - `plugins/ralph-speckit/.claude/commands/` (delete entire directory)
+  - **Done when**: Legacy directory no longer exists
+  - **Verify**: `test ! -d "plugins/ralph-speckit/.claude/commands"`
+  - **Commit**: `chore(ralph-speckit): remove legacy commands directory`
+  - _Requirements: AC-4.3, AC-4.4_
+  - _Design: Post-migration cleanup_
+
+- [ ] 1.11 [VERIFY] Quality checkpoint: commands
+  - **Do**: Verify all ralph-speckit commands have name field and legacy dir removed
+  - **Verify**: `count=0; for f in plugins/ralph-speckit/commands/*.md; do grep -q "^name:" "$f" || ((count++)); done; test ! -d "plugins/ralph-speckit/.claude/commands" || ((count++)); test $count -eq 0`
+  - **Done when**: All commands valid, legacy directory removed
+  - **Commit**: `chore(ralph-speckit): pass commands checkpoint` (only if fixes needed)
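+
+The copy-and-rename step of task 1.9 can be sketched as a small loop. This is an illustration only: `migrate_legacy_commands` is a hypothetical helper, it takes the plugin directory as its argument, and it never overwrites an existing target, which is how the duplicate `speckit.implement.md` gets skipped. Frontmatter still has to be added to each copied file afterwards.
+
+```shell
+#!/usr/bin/env bash
+# Copy legacy speckit.*.md commands into commands/, stripping the prefix.
+# Existing targets (such as implement.md) are left untouched.
+migrate_legacy_commands() {
+  local plugin_dir=$1 src name dest
+  mkdir -p "$plugin_dir/commands"
+  for src in "$plugin_dir"/.claude/commands/speckit.*.md; do
+    [ -e "$src" ] || continue   # the glob matched nothing
+    name=$(basename "$src")
+    dest="$plugin_dir/commands/${name#speckit.}"
+    if [ -e "$dest" ]; then
+      echo "SKIP (exists): $dest"
+      continue
+    fi
+    cp "$src" "$dest"
+    echo "COPIED: $src -> $dest"
+  done
+}
+```
+
+Calling `migrate_legacy_commands plugins/ralph-speckit` before task 1.10 leaves the legacy directory in place, so the copies can be reviewed before anything is deleted.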
+
+### A5: Validation and Documentation
+
+- [ ] 1.12 Create validation script
+  - **Do**:
+    1. Create `scripts/validate-plugins.sh` with compliance checks:
+       - Agents have color field
+       - Agents have 2+ example blocks
+       - Skills have version field
+       - Hooks have matcher field
+       - No legacy commands directory
+    2. Script exits 0 on pass, non-zero on failure
+    3. Make script executable
+  - **Files**:
+    - `scripts/validate-plugins.sh` (create)
+  - **Done when**: Script runs and validates all checks
+  - **Verify**: `test -x scripts/validate-plugins.sh && bash scripts/validate-plugins.sh`
+  - **Commit**: `feat(scripts): add plugin compliance validation script`
+  - _Requirements: AC-5.1, AC-5.2, AC-5.3, AC-5.4_
+  - _Design: Validation Script Design_
+
+- [ ] 1.13 Update CLAUDE.md with best practices
+  - **Do**:
+    1. Add section referencing plugin-dev skills for best practices
+    2. Include validation script usage
+    3. Document color conventions for agents
+  - **Files**:
+    - `CLAUDE.md` (edit)
+  - **Done when**: CLAUDE.md has plugin best practices reference
+  - **Verify**: `grep -q "validate-plugins" CLAUDE.md && grep -q "plugin-dev" CLAUDE.md`
+  - **Commit**: `docs: add plugin best practices reference to CLAUDE.md`
+  - _Requirements: AC-5.5_
+  - _Design: Documentation_
+
+- [ ] 1.14 [VERIFY] Phase A complete validation
+  - **Do**: Run full validation script to verify all Phase A changes
+  - **Verify**: `bash scripts/validate-plugins.sh && echo "Phase A PASS"`
+  - **Done when**: Validation script passes with 0 errors
+  - **Commit**: `chore(plugins): pass Phase A validation` (only if fixes needed)
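+
+A possible shape for the checks in `scripts/validate-plugins.sh` (task 1.12) is sketched below. It is not the actual script: `validate_plugins` and `fail` are hypothetical names, the plugins root is passed as an argument, and the checks are the same file-level greps used in the Verify commands above.
+
+```shell
+#!/usr/bin/env bash
+# Report every compliance failure under the given plugins root.
+# Returns 0 when nothing failed, 1 otherwise.
+validate_plugins() {
+  local root=${1:-plugins} failures=0 f d
+
+  # Helper: print a failure and bump the counter of the calling function.
+  fail() { echo "FAIL: $*"; failures=$((failures + 1)); }
+
+  for f in "$root"/*/agents/*.md; do
+    [ -e "$f" ] || continue
+    grep -q '^color:' "$f" || fail "$f: missing color"
+    [ "$(grep -c '<example>' "$f")" -ge 2 ] || fail "$f: fewer than 2 example blocks"
+  done
+  for f in "$root"/*/skills/*/SKILL.md; do
+    [ -e "$f" ] || continue
+    grep -q '^version:' "$f" || fail "$f: missing version"
+  done
+  for f in "$root"/*/hooks/hooks.json; do
+    [ -e "$f" ] || continue
+    grep -q '"matcher"' "$f" || fail "$f: missing matcher"
+  done
+  for d in "$root"/*/.claude/commands; do
+    [ -d "$d" ] && fail "$d: legacy commands directory still present"
+  done
+
+  echo "$failures failure(s)"
+  [ "$failures" -eq 0 ]
+}
+```
+
+Ending the script with `validate_plugins plugins` would make its exit status 0 on pass and non-zero on failure, as task 1.12 requires.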
+
+---
+
+## Phase 2: Refactoring - Phase B Skill Consolidation
+
+Focus: Extract procedural logic from commands/agents into reusable skills, then simplify sources.
+
+### B1: Create New Skills
+
+- [ ] 2.1 Create failure-recovery skill
+  - **Do**:
+    1. Extract recovery orchestration logic from implement.md sections 6b-6d
+    2. Create skill with proper frontmatter (name, description, version)
+    3. Document recovery loop pattern, fix task generation, recovery state management
+  - **Files**:
+    - `plugins/ralph-specum/skills/failure-recovery/SKILL.md` (create)
+  - **Done when**: Skill contains full recovery pattern, ~300-400 lines
+  - **Verify**: `test -f plugins/ralph-specum/skills/failure-recovery/SKILL.md && grep -q "^version:" plugins/ralph-specum/skills/failure-recovery/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add failure-recovery skill`
+  - _Design: New Skills - failure-recovery_
+
+- [ ] 2.2 Create verification-layers skill
+  - **Do**:
+    1. Extract 4-layer verification pattern from implement.md section 7
+    2. Document: contradiction check, uncommitted changes, checkmark verification, completion signal
+  - **Files**:
+    - `plugins/ralph-specum/skills/verification-layers/SKILL.md` (create)
+  - **Done when**: Skill contains all 4 verification layers
+  - **Verify**: `test -f plugins/ralph-specum/skills/verification-layers/SKILL.md && grep -q "contradiction" plugins/ralph-specum/skills/verification-layers/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add verification-layers skill`
+  - _Design: New Skills - verification-layers_
+
+- [ ] 2.3 Create coordinator-pattern skill
+  - **Do**:
+    1. Extract coordinator prompt pattern from implement.md
+    2. Document role definition, state reading, task delegation, completion signaling
+  - **Files**:
+    - `plugins/ralph-specum/skills/coordinator-pattern/SKILL.md` (create)
+  - **Done when**: Skill contains coordinator delegation pattern
+  - **Verify**: `test -f plugins/ralph-specum/skills/coordinator-pattern/SKILL.md && grep -q "COORDINATOR" plugins/ralph-specum/skills/coordinator-pattern/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add coordinator-pattern skill`
+  - _Design: New Skills - coordinator-pattern_
+
+- [ ] 2.4 [VERIFY] Quality checkpoint: new skills batch 1
+  - **Do**: Verify first 3 new skills have proper structure
+  - **Verify**: `count=0; for s in failure-recovery verification-layers coordinator-pattern; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
+  - **Done when**: All 3 skills exist with version field
+  - **Commit**: `chore(ralph-specum): pass new skills batch 1 checkpoint` (only if fixes needed)
+
+- [ ] 2.5 Create branch-management skill
+  - **Do**:
+    1. Extract branch management logic from start.md
+    2. Document branch creation, worktree setup, naming conventions, default branch detection
+  - **Files**:
+    - `plugins/ralph-specum/skills/branch-management/SKILL.md` (create)
+  - **Done when**: Skill contains branch workflow patterns
+  - **Verify**: `test -f plugins/ralph-specum/skills/branch-management/SKILL.md && grep -q "branch" plugins/ralph-specum/skills/branch-management/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add branch-management skill`
+  - _Design: New Skills - branch-management_
+
+- [ ] 2.6 Create intent-classification skill
+  - **Do**:
+    1. Extract intent classification logic from start.md
+    2. Document goal type detection, keyword matching, question count determination
+  - **Files**:
+    - `plugins/ralph-specum/skills/intent-classification/SKILL.md` (create)
+  - **Done when**: Skill contains intent detection patterns and keyword tables
+  - **Verify**: `test -f plugins/ralph-specum/skills/intent-classification/SKILL.md && grep -q "intent" plugins/ralph-specum/skills/intent-classification/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add intent-classification skill`
+  - _Design: New Skills - intent-classification_
+
+- [ ] 2.7 Create spec-scanner skill
+  - **Do**:
+    1. Extract spec discovery logic from start.md
+    2. Document related specs finding, status checking, recommendation logic
+  - **Files**:
+    - `plugins/ralph-specum/skills/spec-scanner/SKILL.md` (create)
+  - **Done when**: Skill contains spec discovery patterns
+  - **Verify**: `test -f plugins/ralph-specum/skills/spec-scanner/SKILL.md && grep -q "spec" plugins/ralph-specum/skills/spec-scanner/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add spec-scanner skill`
+  - _Design: New Skills - spec-scanner_
+
+- [ ] 2.8 Create parallel-research skill
+  - **Do**:
+    1. Extract parallel execution pattern from research.md
+    2. Document multi-agent spawning, parallel search, results merge algorithm
+  - **Files**:
+    - `plugins/ralph-specum/skills/parallel-research/SKILL.md` (create)
+  - **Done when**: Skill contains parallel execution and merge patterns
+  - **Verify**: `test -f plugins/ralph-specum/skills/parallel-research/SKILL.md && grep -q "parallel" plugins/ralph-specum/skills/parallel-research/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add parallel-research skill`
+  - _Design: New Skills - parallel-research_
+
+- [ ] 2.9 [VERIFY] Quality checkpoint: new skills batch 2
+  - **Do**: Verify skills 4-7 have proper structure
+  - **Verify**: `count=0; for s in branch-management intent-classification spec-scanner parallel-research; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
+  - **Done when**: All 4 skills exist with version field
+  - **Commit**: `chore(ralph-specum): pass new skills batch 2 checkpoint` (only if fixes needed)
+
+- [ ] 2.10 Create phase-rules skill
+  - **Do**:
+    1. Extract phase-specific rules from spec-executor.md
+    2. Document POC/Refactor/Testing/Quality phase behaviors, shortcuts allowed per phase
+  - **Files**:
+    - `plugins/ralph-specum/skills/phase-rules/SKILL.md` (create)
+  - **Done when**: Skill contains all 4 phase behavior definitions
+  - **Verify**: `test -f plugins/ralph-specum/skills/phase-rules/SKILL.md && grep -q "Phase" plugins/ralph-specum/skills/phase-rules/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add phase-rules skill`
+  - _Design: New Skills - phase-rules_
+
+- [ ] 2.11 Create commit-discipline skill
+  - **Do**:
+    1. Extract commit rules from spec-executor.md
+    2. Document commit message format, spec file inclusion, commit frequency rules
+  - **Files**:
+    - `plugins/ralph-specum/skills/commit-discipline/SKILL.md` (create)
+  - **Done when**: Skill contains commit conventions and rules
+  - **Verify**: `test -f plugins/ralph-specum/skills/commit-discipline/SKILL.md && grep -q "commit" plugins/ralph-specum/skills/commit-discipline/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add commit-discipline skill`
+  - _Design: New Skills - commit-discipline_
+
+- [ ] 2.12 Create quality-checkpoints skill
+  - **Do**:
+    1. Extract [VERIFY] task rules from task-planner.md
+    2. Document checkpoint frequency, format, verification commands
+  - **Files**:
+    - `plugins/ralph-specum/skills/quality-checkpoints/SKILL.md` (create)
+  - **Done when**: Skill contains checkpoint insertion rules and formats
+  - **Verify**: `test -f plugins/ralph-specum/skills/quality-checkpoints/SKILL.md && grep -q "VERIFY" plugins/ralph-specum/skills/quality-checkpoints/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add quality-checkpoints skill`
+  - _Design: New Skills - quality-checkpoints_
+
+- [ ] 2.13 Create quality-commands skill
+  - **Do**:
+    1. Extract quality command discovery from research-analyst.md
+    2. Document package.json/Makefile/CI discovery patterns, fallback commands
+  - **Files**:
+    - `plugins/ralph-specum/skills/quality-commands/SKILL.md` (create)
+  - **Done when**: Skill contains command discovery patterns
+  - **Verify**: `test -f plugins/ralph-specum/skills/quality-commands/SKILL.md && grep -q "package.json" plugins/ralph-specum/skills/quality-commands/SKILL.md`
+  - **Commit**: `feat(ralph-specum): add quality-commands skill`
+  - _Design: New Skills - quality-commands_
+
+- [ ] 2.14 [VERIFY] Quality checkpoint: new skills batch 3
+  - **Do**: Verify skills 8-11 have proper structure
+  - **Verify**: `count=0; for s in phase-rules commit-discipline quality-checkpoints quality-commands; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
+  - **Done when**: All 4 skills exist with version field
+  - **Commit**: `chore(ralph-specum): pass new skills batch 3 checkpoint` (only if fixes needed)
+
+### B2: Simplify Commands
+
+- [ ] 2.15 Simplify implement.md command
+  - **Do**:
+    1. Replace inline coordinator prompt with skill reference to coordinator-pattern
+    2. Replace inline recovery logic with skill reference to failure-recovery
+    3. Replace inline verification logic with skill reference to verification-layers
+    4. Target: ~150 lines (down from 1200+)
+  - **Files**:
+    - `plugins/ralph-specum/commands/implement.md` (edit)
+  - **Done when**: Command references skills, reduced to ~150-200 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/commands/implement.md) -lt 300 && grep -q "skill" plugins/ralph-specum/commands/implement.md`
+  - **Commit**: `refactor(ralph-specum): simplify implement.md to reference skills`
+  - _Design: Command Simplification Plan_
+
+- [ ] 2.16 Simplify start.md command
+  - **Do**:
+    1. Replace inline branch management with skill reference to branch-management
+    2. Replace inline intent classification with skill reference to intent-classification
+    3. Replace inline spec scanning with skill reference to spec-scanner
+    4. Target: ~200 lines (down from 980+)
+  - **Files**:
+    - `plugins/ralph-specum/commands/start.md` (edit)
+  - **Done when**: Command references skills, reduced to ~200-250 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/commands/start.md) -lt 350 && grep -q "skill" plugins/ralph-specum/commands/start.md`
+  - **Commit**: `refactor(ralph-specum): simplify start.md to reference skills`
+  - _Design: Command Simplification Plan_
+
+- [ ] 2.17 Simplify research.md command
+  - **Do**:
+    1. Replace inline parallel execution with skill reference to parallel-research
+    2. Target: ~150 lines (down from 700+)
+  - **Files**:
+    - `plugins/ralph-specum/commands/research.md` (edit)
+  - **Done when**: Command references skills, reduced to ~150-200 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/commands/research.md) -lt 250 && grep -q "skill" plugins/ralph-specum/commands/research.md`
+  - **Commit**: `refactor(ralph-specum): simplify research.md to reference skills`
+  - _Design: Command Simplification Plan_
+
+- [ ] 2.18 Simplify design.md, requirements.md, tasks.md commands
+  - **Do**:
+    1. Each command already uses interview-framework skill
+    2. Ensure explicit skill references are present
+    3. Target: ~80 lines each (down from ~300)
+  - **Files**:
+    - `plugins/ralph-specum/commands/design.md` (edit)
+    - `plugins/ralph-specum/commands/requirements.md` (edit)
+    - `plugins/ralph-specum/commands/tasks.md` (edit)
+  - **Done when**: Commands explicitly reference interview-framework skill
+  - **Verify**: `for f in design requirements tasks; do p="plugins/ralph-specum/commands/$f.md"; test $(wc -l < "$p") -lt 150 && grep -q "interview-framework" "$p" || echo "FAIL: $f"; done | grep -c FAIL | xargs test 0 -eq`
+  - **Commit**: `refactor(ralph-specum): simplify phase commands to reference skills`
+  - _Design: Command Simplification Plan_
+
+- [ ] 2.19 [VERIFY] Quality checkpoint: command simplification
+  - **Do**: Verify the major simplified commands (implement, start, research) are under their verification line limits
+  - **Verify**: `count=0; test $(wc -l < plugins/ralph-specum/commands/implement.md) -lt 300 || ((count++)); test $(wc -l < plugins/ralph-specum/commands/start.md) -lt 350 || ((count++)); test $(wc -l < plugins/ralph-specum/commands/research.md) -lt 250 || ((count++)); test $count -eq 0`
+  - **Done when**: All major commands under target line counts
+  - **Commit**: `chore(ralph-specum): pass command simplification checkpoint` (only if fixes needed)
+
+### B3: Simplify Agents
+
+- [ ] 2.20 Simplify spec-executor.md agent
+  - **Do**:
+    1. Replace inline phase rules with skill reference to phase-rules
+    2. Replace inline commit discipline with skill reference to commit-discipline
+    3. Add skill reference to verification-layers for [VERIFY] tasks
+    4. Target: ~200 lines (down from 440)
+  - **Files**:
+    - `plugins/ralph-specum/agents/spec-executor.md` (edit)
+  - **Done when**: Agent references skills, reduced to ~200-250 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/agents/spec-executor.md) -lt 300 && grep -q "skill" plugins/ralph-specum/agents/spec-executor.md`
+  - **Commit**: `refactor(ralph-specum): simplify spec-executor.md to reference skills`
+  - _Design: Agent Simplification Plan_
+
+- [ ] 2.21 Simplify task-planner.md agent
+  - **Do**:
+    1. Replace inline POC workflow with skill reference to phase-rules
+    2. Replace inline quality checkpoints with skill reference to quality-checkpoints
+    3. Target: ~250 lines (down from 520)
+  - **Files**:
+    - `plugins/ralph-specum/agents/task-planner.md` (edit)
+  - **Done when**: Agent references skills, reduced to ~250-300 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/agents/task-planner.md) -lt 350 && grep -q "skill" plugins/ralph-specum/agents/task-planner.md`
+  - **Commit**: `refactor(ralph-specum): simplify task-planner.md to reference skills`
+  - _Design: Agent Simplification Plan_
+
+- [ ] 2.22 Simplify research-analyst.md agent
+  - **Do**:
+    1. Replace inline quality command discovery with skill reference to quality-commands
+    2. Target: ~200 lines (down from 340)
+  - **Files**:
+    - `plugins/ralph-specum/agents/research-analyst.md` (edit)
+  - **Done when**: Agent references skills, reduced to ~200-250 lines
+  - **Verify**: `test $(wc -l < plugins/ralph-specum/agents/research-analyst.md) -lt 280 && grep -q "skill" plugins/ralph-specum/agents/research-analyst.md`
+  - **Commit**: `refactor(ralph-specum): simplify research-analyst.md to reference skills`
+  - _Design: Agent Simplification Plan_
+
+- [ ] 2.23 [VERIFY] Quality checkpoint: agent simplification
+  - **Do**: Verify all simplified agents are under target line counts
+  - **Verify**: `count=0; test $(wc -l < plugins/ralph-specum/agents/spec-executor.md) -lt 300 || ((count++)); test $(wc -l < plugins/ralph-specum/agents/task-planner.md) -lt 350 || ((count++)); test $(wc -l < plugins/ralph-specum/agents/research-analyst.md) -lt 280 || ((count++)); test $count -eq 0`
+  - **Done when**: All simplified agents under target line counts
+  - **Commit**: `chore(ralph-specum): pass agent simplification checkpoint` (only if fixes needed)
+
+---
+
+## Phase 3: Testing
+
+Minimal testing per interview context.
+
+- [ ] 3.1 Run full validation script
+  - **Do**: Execute validation script to verify all compliance requirements
+  - **Files**: (none - verification only)
+  - **Done when**: Validation script passes with 0 errors
+  - **Verify**: `bash scripts/validate-plugins.sh`
+  - **Commit**: None (verification only)
+  - _Requirements: AC-5.1, AC-5.2, AC-5.3, AC-5.4_
+
+---
+
+## Phase 4: Quality Gates
+
+- [ ] 4.1 [VERIFY] Full local validation
+  - **Do**: Run validation script and verify all components
+  - **Verify**: `bash scripts/validate-plugins.sh && echo "All checks pass"`
+  - **Done when**: Validation passes, no compliance issues
+  - **Commit**: `fix(plugins): address validation issues` (only if fixes needed)
+
+- [ ] 4.2 Create PR and verify
+  - **Do**:
+    1. Verify current branch is feature branch: `git branch --show-current`
+    2. Push branch: `git push -u origin $(git branch --show-current)`
+    3. Create PR: `gh pr create --title "refactor(plugins): apply plugin-dev best practices" --body "..."`
+    4. Wait for CI: `gh pr checks --watch`
+  - **Verify**: `gh pr checks | grep -v "pending\|in_progress" | grep -c "fail" | xargs test 0 -eq`
+  - **Done when**: PR created, CI passes
+  - **Commit**: None (PR creation)
+
+---
+
+## Phase 5: PR Lifecycle
+
+- [ ] 5.1 Monitor CI and fix failures
+  - **Do**:
+    1. Watch CI status: `gh pr checks --watch`
+    2. If failures, read logs and fix issues
+    3. Push fixes and re-verify
+  - **Verify**: `gh pr checks | grep -c "fail" | xargs test 0 -eq`
+  - **Done when**: All CI checks pass
+  - **Commit**: `fix(plugins): address CI failures` (only if fixes needed)
+
+- [ ] 5.2 [VERIFY] AC checklist verification
+  - **Do**: Programmatically verify each acceptance criterion
+  - **Verify**:
+    ```bash
+    # AC-1.1, AC-1.2: All agents have color
+    for f in plugins/*/agents/*.md; do grep -q "^color:" "$f" || exit 1; done
+    # AC-1.3: All agents have 2+ examples
+    for f in plugins/*/agents/*.md; do test $(grep -c "<example>" "$f") -ge 2 || exit 1; done
+    # AC-2.1, AC-2.2: All skills have version
+    for f in plugins/*/skills/*/SKILL.md; do grep -q "^version:" "$f" || exit 1; done
+    # AC-3.1, AC-3.2, AC-3.3: Each hooks.json contains a matcher (presence check only)
+    for f in plugins/*/hooks/hooks.json; do grep -q '"matcher"' "$f" || exit 1; done
+    # AC-4.1: ralph-speckit commands have name
+    for f in plugins/ralph-speckit/commands/*.md; do grep -q "^name:" "$f" || exit 1; done
+    # AC-4.4: Legacy directory removed
+    test ! -d "plugins/ralph-speckit/.claude/commands" || exit 1
+    # AC-5.1-5.4: Validation script works
+    bash scripts/validate-plugins.sh || exit 1
+    echo "All ACs verified"
+    ```
+  - **Done when**: All acceptance criteria confirmed met
+  - **Commit**: None (verification only)
+
+---
+
+## Notes
+
+### POC Shortcuts Taken
+
+- Validation script is bash-only (no test framework)
+- Manual Claude Code testing deferred to user
+- No automated E2E tests for plugin loading
+
+### Production TODOs
+
+- Consider adding CI integration for validation script
+- May want to add more sophisticated skill trigger phrase detection
+- Consider adding tools restrictions to agents in future iteration
+
+### File Counts
+
+| Phase | Files Changed | Files Created | Files Deleted |
+|-------|---------------|---------------|---------------|
+| Phase A | 32 | 9 | 9 |
+| Phase B | 10 | 11 | 0 |
+| **Total** | **42** | **20** | **9** |
+
+### Skill Reference Pattern
+
+Commands/agents reference skills using:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/failure-recovery/SKILL.md`
+Use the failure recovery pattern when spec-executor does not output TASK_COMPLETE.
+</skill-reference>
+```

From d8cf5965817e7fb5b46cdc3ee432994c21af5e8f Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:29:12 +0200
Subject: [PATCH 04/37] feat(ralph-specum): add color and examples to all
 agents

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
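Reviewer note, not part of the commit: the change in this patch (a `color` field plus two `<example>` blocks per agent) can be spot-checked locally. The following is a minimal sketch, assuming each `<example>` tag sits on its own line as it does in the hunks below. The function names `check_agent` and `check_agents` are hypothetical and do not exist in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this patch): report agent files that lack
# a top-level `color:` field or contain fewer than two <example> blocks.

check_agent() {
  local file="$1" examples
  # `color:` must start a line, as in the frontmatter added by this patch.
  grep -q '^color:' "$file" || { echo "missing color: $file"; return 1; }
  # grep -c counts matching lines; closing </example> tags do not match.
  examples=$(grep -c '<example>' "$file")
  [ "$examples" -ge 2 ] || { echo "only $examples example(s): $file"; return 1; }
}

check_agents() {
  local failed=0 file
  for file in "$@"; do
    # Keep going after a failure so every non-compliant file is reported.
    check_agent "$file" || failed=1
  done
  return "$failed"
}
```

For example, `check_agents plugins/ralph-specum/agents/*.md` would print one line per non-compliant file and return non-zero if any file fails.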
 .../ralph-specum/agents/architect-reviewer.md  | 18 +++++++++++++++++-
 .../ralph-specum/agents/plan-synthesizer.md    | 18 +++++++++++++++++-
 plugins/ralph-specum/agents/product-manager.md | 18 +++++++++++++++++-
 plugins/ralph-specum/agents/qa-engineer.md     | 18 +++++++++++++++++-
 .../ralph-specum/agents/refactor-specialist.md | 18 +++++++++++++++++-
 .../ralph-specum/agents/research-analyst.md    | 18 +++++++++++++++++-
 plugins/ralph-specum/agents/spec-executor.md   | 18 +++++++++++++++++-
 plugins/ralph-specum/agents/task-planner.md    | 18 +++++++++++++++++-
 specs/refactor-plugins/tasks.md                |  2 +-
 9 files changed, 137 insertions(+), 9 deletions(-)

diff --git a/plugins/ralph-specum/agents/architect-reviewer.md b/plugins/ralph-specum/agents/architect-reviewer.md
index 9035594d..5f5a9d40 100644
--- a/plugins/ralph-specum/agents/architect-reviewer.md
+++ b/plugins/ralph-specum/agents/architect-reviewer.md
@@ -1,7 +1,23 @@
 ---
 name: architect-reviewer
-description: This agent should be used to "create technical design", "define architecture", "design components", "create design.md", "analyze trade-offs". Expert systems architect that designs scalable, maintainable systems with clear component boundaries.
+description: |
+  This agent should be used to "create technical design", "define architecture", "design components", "create design.md", "analyze trade-offs". Expert systems architect that designs scalable, maintainable systems with clear component boundaries.
+
+  <example>
+  Context: User has approved requirements and needs technical design
+  user: Design the architecture for the authentication feature
+  assistant: [Reads requirements.md, creates mermaid diagrams, defines component interfaces, documents technical decisions table]
+  commentary: The agent produces design.md with architecture diagrams, component definitions, data flow, and explicit trade-off decisions.
+  </example>
+
+  <example>
+  Context: User needs design review before implementation
+  user: Review the proposed file structure for the new module
+  assistant: [Analyzes existing codebase patterns via Explore, validates proposed structure against conventions, recommends adjustments]
+  commentary: The agent ensures designs align with existing codebase patterns and explicitly documents deviations with rationale.
+  </example>
 model: inherit
+color: blue
 ---
 
 You are a senior systems architect with expertise in designing scalable, maintainable systems. Your focus is architecture decisions, component boundaries, patterns, and technical feasibility.
diff --git a/plugins/ralph-specum/agents/plan-synthesizer.md b/plugins/ralph-specum/agents/plan-synthesizer.md
index bb4500b2..aeba9a5f 100644
--- a/plugins/ralph-specum/agents/plan-synthesizer.md
+++ b/plugins/ralph-specum/agents/plan-synthesizer.md
@@ -1,7 +1,23 @@
 ---
 name: plan-synthesizer
-description: This agent should be used to "run quick mode", "generate all spec artifacts", "synthesize spec from goal", "auto-generate research, requirements, design, tasks". Rapid spec synthesizer that creates all four spec artifacts in one pass for quick mode execution.
+description: |
+  This agent should be used to "run quick mode", "generate all spec artifacts", "synthesize spec from goal", "auto-generate research, requirements, design, tasks". Rapid spec synthesizer that creates all four spec artifacts in one pass for quick mode execution.
+
+  <example>
+  Context: User wants to skip interactive phases
+  user: Quick mode: Add a new user profile endpoint
+  assistant: [Explores codebase briefly, generates research.md, requirements.md, design.md, tasks.md in sequence, commits specs, transitions to execution phase]
+  commentary: The agent generates all four artifacts with `generated: auto` frontmatter, then updates state to enable immediate task execution.
+  </example>
+
+  <example>
+  Context: User provides a fix-type goal in quick mode
+  user: Quick mode: Fix the broken login validation
+  assistant: [Detects fix goal, runs reproduction command to capture BEFORE state, documents in .progress.md, generates minimal fix-focused artifacts]
+  commentary: For fix goals, the agent diagnoses the issue first, documents the BEFORE state, and generates a VF task to verify the fix works.
+  </example>
 model: inherit
+color: green
 ---
 
 You are a rapid spec synthesizer that converts a user plan/goal into complete spec artifacts. Your purpose is to enable quick mode where all spec phases are completed automatically.
diff --git a/plugins/ralph-specum/agents/product-manager.md b/plugins/ralph-specum/agents/product-manager.md
index 6c721884..ece5486d 100644
--- a/plugins/ralph-specum/agents/product-manager.md
+++ b/plugins/ralph-specum/agents/product-manager.md
@@ -1,7 +1,23 @@
 ---
 name: product-manager
-description: This agent should be used to "generate requirements", "write user stories", "define acceptance criteria", "create requirements.md", "gather product requirements". Expert product manager that translates user goals into structured requirements.
+description: |
+  This agent should be used to "generate requirements", "write user stories", "define acceptance criteria", "create requirements.md", "gather product requirements". Expert product manager that translates user goals into structured requirements.
+
+  <example>
+  Context: User has completed research and needs requirements
+  user: Generate requirements for the new API authentication feature
+  assistant: [Reads research.md, creates user stories with acceptance criteria, outputs requirements.md with FR/NFR tables]
+  commentary: The agent transforms research findings into testable user stories and prioritized requirements.
+  </example>
+
+  <example>
+  Context: User provides a high-level goal
+  user: I need requirements for adding dark mode to the app
+  assistant: [Explores codebase for UI patterns, creates US-1 through US-N with specific ACs, defines out-of-scope items]
+  commentary: The agent ensures every requirement has testable acceptance criteria and clear scope boundaries.
+  </example>
 model: inherit
+color: cyan
 ---
 
 You are a senior product manager with expertise in translating user goals into structured requirements. Your focus is user empathy, business value framing, and creating testable acceptance criteria.
diff --git a/plugins/ralph-specum/agents/qa-engineer.md b/plugins/ralph-specum/agents/qa-engineer.md
index 1ba8c40b..9bb93428 100644
--- a/plugins/ralph-specum/agents/qa-engineer.md
+++ b/plugins/ralph-specum/agents/qa-engineer.md
@@ -1,7 +1,23 @@
 ---
 name: qa-engineer
-description: This agent should be used to "run verification task", "check quality gate", "verify acceptance criteria", "run [VERIFY] task", "execute quality checkpoint". QA engineer that runs verification commands and outputs VERIFICATION_PASS or VERIFICATION_FAIL.
+description: |
+  This agent should be used to "run verification task", "check quality gate", "verify acceptance criteria", "run [VERIFY] task", "execute quality checkpoint". QA engineer that runs verification commands and outputs VERIFICATION_PASS or VERIFICATION_FAIL.
+
+  <example>
+  Context: spec-executor delegates a [VERIFY] task
+  user: Execute V4 [VERIFY] Full local CI: pnpm lint && pnpm test
+  assistant: [Runs each command via Bash, captures exit codes, outputs VERIFICATION_PASS if all pass, or VERIFICATION_FAIL with specific error details]
+  commentary: The agent runs verification commands sequentially, stopping on first failure and providing actionable error information.
+  </example>
+
+  <example>
+  Context: VF task to verify a fix resolved the original issue
+  user: Execute VF: Verify original issue resolved
+  assistant: [Reads BEFORE state from .progress.md, re-runs reproduction command, compares BEFORE/AFTER, documents result, outputs VERIFICATION_PASS or VERIFICATION_FAIL]
+  commentary: For VF tasks, the agent proves the fix worked by comparing the BEFORE failure state with the AFTER success state.
+  </example>
 model: inherit
+color: yellow
 ---
 
 You are a QA engineer agent that executes [VERIFY] tasks. You run verification commands and check acceptance criteria, then output VERIFICATION_PASS or VERIFICATION_FAIL.
diff --git a/plugins/ralph-specum/agents/refactor-specialist.md b/plugins/ralph-specum/agents/refactor-specialist.md
index 2edea487..0a2cd51d 100644
--- a/plugins/ralph-specum/agents/refactor-specialist.md
+++ b/plugins/ralph-specum/agents/refactor-specialist.md
@@ -1,7 +1,23 @@
 ---
 name: refactor-specialist
-description: This agent should be used to "update spec files", "refactor requirements", "revise design", "modify tasks after execution", "incrementally update specifications". Expert at methodically reviewing and updating spec files section-by-section after execution.
+description: |
+  This agent should be used to "update spec files", "refactor requirements", "revise design", "modify tasks after execution", "incrementally update specifications". Expert at methodically reviewing and updating spec files section-by-section after execution.
+
+  <example>
+  Context: User wants to update requirements after implementation learnings
+  user: Update requirements.md based on what we learned during implementation
+  assistant: [Reads current requirements.md and .progress.md learnings, presents each section for review, makes targeted changes, preserves valuable context]
+  commentary: The agent reviews section-by-section, asking before changing, and preserves implementation learnings in the updated spec.
+  </example>
+
+  <example>
+  Context: Design needs revision after POC revealed issues
+  user: Revise the design based on POC findings
+  assistant: [Reads design.md and .progress.md, identifies sections needing updates, makes focused changes, detects cascade to tasks.md]
+  commentary: The agent detects when changes cascade to downstream files and reports CASCADE_NEEDED with specific files and reasons.
+  </example>
 model: inherit
+color: magenta
 ---
 
 You are a spec refactoring specialist. Your role is to help users update their specifications after execution in a methodical, section-by-section approach.
diff --git a/plugins/ralph-specum/agents/research-analyst.md b/plugins/ralph-specum/agents/research-analyst.md
index dc97c67b..63390620 100644
--- a/plugins/ralph-specum/agents/research-analyst.md
+++ b/plugins/ralph-specum/agents/research-analyst.md
@@ -1,7 +1,23 @@
 ---
 name: research-analyst
-description: This agent should be used to "research a feature", "analyze feasibility", "explore codebase", "find existing patterns", "gather context before requirements". Expert analyzer that verifies through web search, documentation, and codebase exploration before providing findings.
+description: |
+  This agent should be used to "research a feature", "analyze feasibility", "explore codebase", "find existing patterns", "gather context before requirements". Expert analyzer that verifies through web search, documentation, and codebase exploration before providing findings.
+
+  <example>
+  Context: User wants to understand if a feature is feasible
+  user: Research how authentication is handled in this codebase
+  assistant: [Uses WebSearch and Glob/Grep to find auth patterns, creates research.md with feasibility assessment]
+  commentary: The agent searches externally for best practices, then internally for existing patterns, cross-references, and outputs structured findings.
+  </example>
+
+  <example>
+  Context: User needs codebase analysis before starting a new feature
+  user: Find existing patterns for API endpoints before we add a new one
+  assistant: [Explores src/ for endpoint patterns, documents conventions found, recommends approach aligned with existing code]
+  commentary: The agent prioritizes internal research for pattern questions, documenting file paths and code examples as sources.
+  </example>
 model: inherit
+color: blue
 ---
 
 You are a senior analyzer and researcher with a strict "verify-first, assume-never" methodology. Your core principle: **never guess, always check**.
diff --git a/plugins/ralph-specum/agents/spec-executor.md b/plugins/ralph-specum/agents/spec-executor.md
index 566e2f90..748ecf15 100644
--- a/plugins/ralph-specum/agents/spec-executor.md
+++ b/plugins/ralph-specum/agents/spec-executor.md
@@ -1,7 +1,23 @@
 ---
 name: spec-executor
-description: This agent should be used to "execute a task", "implement task from tasks.md", "run spec task", "complete verification task". Autonomous executor that implements one task, verifies completion, commits changes, and signals TASK_COMPLETE.
+description: |
+  This agent should be used to "execute a task", "implement task from tasks.md", "run spec task", "complete verification task". Autonomous executor that implements one task, verifies completion, commits changes, and signals TASK_COMPLETE.
+
+  <example>
+  Context: Coordinator delegates a task during implementation
+  user: Execute task 1.2: Add authentication middleware
+  assistant: [Reads task details, implements Do steps exactly, runs Verify command, commits with specified message, outputs TASK_COMPLETE]
+  commentary: The agent executes exactly one task, never deviating from the Files list, and only outputs TASK_COMPLETE after verification passes.
+  </example>
+
+  <example>
+  Context: Task is a [VERIFY] checkpoint
+  user: Execute task 2.3 [VERIFY] Quality checkpoint
+  assistant: [Detects [VERIFY] tag, delegates to qa-engineer agent, handles VERIFICATION_PASS or VERIFICATION_FAIL result]
+  commentary: [VERIFY] tasks are always delegated to qa-engineer; the spec-executor never runs verification commands directly for these tasks.
+  </example>
 model: inherit
+color: green
 ---
 
 You are an autonomous execution agent that implements ONE task from a spec. You execute the task exactly as specified, verify completion, commit changes, update progress, and signal completion.
diff --git a/plugins/ralph-specum/agents/task-planner.md b/plugins/ralph-specum/agents/task-planner.md
index a199ee4c..d568a6cb 100644
--- a/plugins/ralph-specum/agents/task-planner.md
+++ b/plugins/ralph-specum/agents/task-planner.md
@@ -1,7 +1,23 @@
 ---
 name: task-planner
-description: This agent should be used to "create tasks", "break down design into tasks", "generate tasks.md", "plan implementation steps", "define quality checkpoints". Expert task planner that creates POC-first task breakdowns with verification steps.
+description: |
+  This agent should be used to "create tasks", "break down design into tasks", "generate tasks.md", "plan implementation steps", "define quality checkpoints". Expert task planner that creates POC-first task breakdowns with verification steps.
+
+  <example>
+  Context: User has approved design and needs implementation tasks
+  user: Create tasks for the authentication feature
+  assistant: [Reads design.md, creates 4-phase task breakdown with POC first, adds [VERIFY] checkpoints every 2-3 tasks, each task has Do/Files/Verify/Commit]
+  commentary: The agent generates tasks.md with POC-first workflow, ensuring each task is autonomous-executable with explicit verification commands.
+  </example>
+
+  <example>
+  Context: User needs tasks for a bugfix
+  user: Plan tasks to fix the login validation bug
+  assistant: [Detects fix-type goal, plans reproduction task first, adds VF verification task at end, keeps Phase 1 minimal]
+  commentary: For fix goals, the agent includes reality-check tasks that reproduce the issue before implementation and prove it is resolved afterward.
+  </example>
 model: inherit
+color: cyan
 ---
 
 You are a task planning specialist who breaks designs into executable implementation steps. Your focus is POC-first workflow, clear task definitions, and quality gates.
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 9fa231cf..76e3ee95 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -13,7 +13,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
 
 ### A1: Agent Metadata
 
-- [ ] 1.1 Add color and examples to ralph-specum agents (8 files)
+- [x] 1.1 Add color and examples to ralph-specum agents (8 files)
   - **Do**:
     1. For each of 8 agents, add `color` field after `model` in frontmatter
     2. Add 2 `<example>` blocks with Context/user/assistant/commentary format to description

From c7ca6d4d05d9297aeaa74323c8617f2aa95f676c Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:30:56 +0200
Subject: [PATCH 05/37] feat(ralph-speckit): add color and examples to all
 agents

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
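Reviewer note, not part of the commit: AC-1.4 asks that each example follow the Context/user/assistant/commentary format. The following is a rough sketch of one way to check that, assuming each label starts its own line (after optional indentation) inside the `<example>` block. The function name `check_example_format` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this patch): succeed only if the file has
# at least one <example> block and every block carries all four labels.

check_example_format() {
  awk '
    /<example>/ { inside = 1; split("", seen); next }   # start block, reset labels
    /<\/example>/ {
      if (!(("Context" in seen) && ("user" in seen) && ("assistant" in seen) && ("commentary" in seen))) bad++
      inside = 0; blocks++; next
    }
    inside && /^[ \t]*(Context|user|assistant|commentary):/ {
      label = $0
      sub(/^[ \t]*/, "", label)   # drop indentation
      sub(/:.*/, "", label)       # keep only the label name
      seen[label] = 1
    }
    END { if (blocks == 0 || bad > 0) exit 1 }
  ' "$1"
}
```

A loop such as `for f in plugins/ralph-speckit/agents/*.md; do check_example_format "$f" || echo "bad example format: $f"; done` would list files whose examples are missing a label.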
 .../agents/constitution-architect.md           | 18 +++++++++++++++++-
 plugins/ralph-speckit/agents/plan-architect.md | 18 +++++++++++++++++-
 plugins/ralph-speckit/agents/qa-engineer.md    | 18 +++++++++++++++++-
 plugins/ralph-speckit/agents/spec-analyst.md   | 18 +++++++++++++++++-
 plugins/ralph-speckit/agents/spec-executor.md  | 18 +++++++++++++++++-
 plugins/ralph-speckit/agents/task-planner.md   | 18 +++++++++++++++++-
 specs/refactor-plugins/tasks.md                |  2 +-
 7 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/plugins/ralph-speckit/agents/constitution-architect.md b/plugins/ralph-speckit/agents/constitution-architect.md
index d6d18979..09475f44 100644
--- a/plugins/ralph-speckit/agents/constitution-architect.md
+++ b/plugins/ralph-speckit/agents/constitution-architect.md
@@ -1,7 +1,23 @@
 ---
 name: constitution-architect
-description: Expert in creating and maintaining project constitutions. Establishes governance principles, technology standards, and quality guidelines.
+description: |
+  Expert in creating and maintaining project constitutions. Establishes governance principles, technology standards, and quality guidelines.
+
+  <example>
+  Context: User wants to establish project governance before starting feature development
+  user: /speckit:constitution
+  assistant: [Explores codebase to discover patterns, reads existing docs, then creates .specify/memory/constitution.md with MUST/SHOULD/MAY principles]
+  commentary: Triggered when user wants to establish or update project governance rules and standards
+  </example>
+
+  <example>
+  Context: User needs to update constitution after technology stack changes
+  user: We've switched from REST to GraphQL, update the constitution
+  assistant: [Reads current constitution, identifies affected sections, updates API patterns and examples, bumps version with changelog]
+  commentary: Triggered when project constraints or technology decisions change requiring constitution updates
+  </example>
 model: inherit
+color: magenta
 ---
 
 You are a constitution architect who establishes and maintains project governance documents. You create clear, actionable principles that guide all feature development.
diff --git a/plugins/ralph-speckit/agents/plan-architect.md b/plugins/ralph-speckit/agents/plan-architect.md
index 2c321cd2..a4eb1795 100644
--- a/plugins/ralph-speckit/agents/plan-architect.md
+++ b/plugins/ralph-speckit/agents/plan-architect.md
@@ -1,7 +1,23 @@
 ---
 name: plan-architect
-description: Technical architect for creating implementation plans from specifications. Designs architecture, data models, and API contracts aligned with constitution.
+description: |
+  Technical architect for creating implementation plans from specifications. Designs architecture, data models, and API contracts aligned with constitution.
+
+  <example>
+  Context: User has approved spec and wants technical design
+  user: /speckit:plan
+  assistant: [Reads spec.md and constitution, explores codebase architecture, creates plan.md with components, data models, API contracts, and constitution references]
+  commentary: Triggered when user wants to create technical implementation plan from approved specification
+  </example>
+
+  <example>
+  Context: Complex feature needs detailed architecture before implementation
+  user: The payment integration needs careful planning
+  assistant: [Parallel exploration for patterns, creates plan with security considerations per C§5.3, integration points, error handling per C§4.3]
+  commentary: Triggered when translating specifications into detailed technical architecture aligned with project constitution
+  </example>
 model: inherit
+color: cyan
 ---
 
 You are a technical architect who transforms feature specifications into detailed implementation plans. You design architectures, data models, and contracts that align with the project constitution.
diff --git a/plugins/ralph-speckit/agents/qa-engineer.md b/plugins/ralph-speckit/agents/qa-engineer.md
index e481d4e4..b19d7fe6 100644
--- a/plugins/ralph-speckit/agents/qa-engineer.md
+++ b/plugins/ralph-speckit/agents/qa-engineer.md
@@ -1,7 +1,23 @@
 ---
 name: qa-engineer
-description: QA engineer that runs verification commands and checks acceptance criteria for [VERIFY] tasks.
+description: |
+  QA engineer that runs verification commands and checks acceptance criteria for [VERIFY] tasks.
+
+  <example>
+  Context: spec-executor delegates a verification checkpoint task
+  user: [Task tool invocation] Execute this verification task - V4 [VERIFY] Full local CI: pnpm lint && pnpm test
+  assistant: [Runs each command, captures exit codes, analyzes test quality for mock-only anti-patterns, outputs VERIFICATION_PASS or VERIFICATION_FAIL]
+  commentary: Triggered by spec-executor when encountering [VERIFY] tasks - never executed directly by users
+  </example>
+
+  <example>
+  Context: spec-executor delegates AC checklist verification
+  user: [Task tool invocation] V6 [VERIFY] AC checklist - verify all acceptance criteria from spec.md
+  assistant: [Reads spec.md, extracts all AC-* entries, verifies each against implementation, outputs table with PASS/FAIL/SKIP and evidence]
+  commentary: Triggered for final acceptance criteria verification before marking feature complete
+  </example>
 model: inherit
+color: yellow
 ---
 
 You are a QA engineer agent that executes [VERIFY] tasks. You run verification commands and check acceptance criteria, then output VERIFICATION_PASS or VERIFICATION_FAIL.
diff --git a/plugins/ralph-speckit/agents/spec-analyst.md b/plugins/ralph-speckit/agents/spec-analyst.md
index 138b4d6e..1d22f9dd 100644
--- a/plugins/ralph-speckit/agents/spec-analyst.md
+++ b/plugins/ralph-speckit/agents/spec-analyst.md
@@ -1,7 +1,23 @@
 ---
 name: spec-analyst
-description: Expert specification analyst for creating feature specs aligned with project constitution. Generates user stories, acceptance criteria, and scope definitions.
+description: |
+  Expert specification analyst for creating feature specs aligned with project constitution. Generates user stories, acceptance criteria, and scope definitions.
+
+  <example>
+  Context: User wants to create a specification for a new feature
+  user: /speckit:specify user-auth
+  assistant: [Reads constitution, explores codebase for context, then creates .specify/specs/user-auth/spec.md with user stories, acceptance criteria, and scope]
+  commentary: Triggered when user wants to define feature requirements and acceptance criteria aligned with constitution
+  </example>
+
+  <example>
+  Context: User provides a feature goal that needs structured specification
+  user: I need to add webhook notifications when orders are placed
+  assistant: [Analyzes goal, maps to constitution principles, creates spec with US1: Order webhook delivery, AC-1.1 through AC-1.N with verifiable criteria]
+  commentary: Triggered when converting feature goals into structured specifications with testable acceptance criteria
+  </example>
 model: inherit
+color: blue
 ---
 
 You are a specification analyst who creates feature specifications grounded in the project constitution. You translate goals into structured specs with user stories and acceptance criteria.
diff --git a/plugins/ralph-speckit/agents/spec-executor.md b/plugins/ralph-speckit/agents/spec-executor.md
index bf2f138b..3be6921e 100644
--- a/plugins/ralph-speckit/agents/spec-executor.md
+++ b/plugins/ralph-speckit/agents/spec-executor.md
@@ -1,7 +1,23 @@
 ---
 name: spec-executor
-description: Autonomous task executor for spec-kit development. Executes a single task from tasks.md, verifies, commits, and signals completion.
+description: |
+  Autonomous task executor for spec-kit development. Executes a single task from tasks.md, verifies, commits, and signals completion.
+
+  <example>
+  Context: Coordinator delegates a task for implementation
+  user: [Task tool invocation] Execute task T003 from tasks.md - Implement core API endpoint
+  assistant: [Reads task details, executes Do steps, modifies only listed Files, runs Verify command, commits with exact message, outputs TASK_COMPLETE]
+  commentary: Triggered by coordinator during /speckit:implement - never invoked directly by users
+  </example>
+
+  <example>
+  Context: Coordinator delegates a [VERIFY] checkpoint task
+  user: [Task tool invocation] Execute task T004 [VERIFY] Quality checkpoint
+  assistant: [Detects [VERIFY] tag, delegates to qa-engineer via Task tool, handles VERIFICATION_PASS/FAIL result, updates progress]
+  commentary: Triggered for verification tasks - delegates to qa-engineer rather than executing directly
+  </example>
 model: inherit
+color: green
 ---
 
 You are an autonomous execution agent that implements ONE task from a spec. You execute the task exactly as specified, verify completion, commit changes, update progress, and signal completion.
diff --git a/plugins/ralph-speckit/agents/task-planner.md b/plugins/ralph-speckit/agents/task-planner.md
index d09088b5..4b2d123c 100644
--- a/plugins/ralph-speckit/agents/task-planner.md
+++ b/plugins/ralph-speckit/agents/task-planner.md
@@ -1,7 +1,23 @@
 ---
 name: task-planner
-description: Expert task planner for breaking plans into executable tasks. Masters POC-first workflow, task sequencing, quality gates, and constitution alignment.
+description: |
+  Expert task planner for breaking plans into executable tasks. Masters POC-first workflow, task sequencing, quality gates, and constitution alignment.
+
+  <example>
+  Context: User has approved plan and wants implementation tasks
+  user: /speckit:tasks
+  assistant: [Reads plan.md, spec.md, constitution, explores for commands, creates tasks.md with POC-first phases, [P] parallel markers, [VERIFY] checkpoints]
+  commentary: Triggered when user wants to break technical plan into executable task list for spec-executor
+  </example>
+
+  <example>
+  Context: Complex feature needs careful task sequencing
+  user: Generate tasks for the payment feature
+  assistant: [Creates tasks with Phase 1-5 structure, inserts [VERIFY] every 2-3 tasks, marks parallelizable tasks with [P], includes E2E validation tasks]
+  commentary: Triggered when converting technical plans into structured task sequences following POC-first workflow
+  </example>
 model: inherit
+color: cyan
 ---
 
 You are a task planning specialist who breaks technical plans into executable implementation steps. Your focus is POC-first workflow, clear task definitions, and quality gates aligned with the project constitution.
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 76e3ee95..e8d1c240 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -33,7 +33,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-1.1, AC-1.3, AC-1.4, AC-1.5_
   - _Design: ralph-specum Agents, Agent Color Assignments_
 
-- [ ] 1.2 Add color and examples to ralph-speckit agents (6 files)
+- [x] 1.2 Add color and examples to ralph-speckit agents (6 files)
   - **Do**:
     1. For each of 6 agents, add `color` field after `model` in frontmatter
     2. Add 2 `<example>` blocks with Context/user/assistant/commentary format to description

From 68538ab5eece032aa26e61fc6a44ee6ca11d3203 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:33:15 +0200
Subject: [PATCH 06/37] feat(ralph-specum): add version to all skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/skills/communication-style/SKILL.md  | 1 +
 plugins/ralph-specum/skills/delegation-principle/SKILL.md | 1 +
 plugins/ralph-specum/skills/interview-framework/SKILL.md  | 3 ++-
 plugins/ralph-specum/skills/reality-verification/SKILL.md | 1 +
 plugins/ralph-specum/skills/smart-ralph/SKILL.md          | 1 +
 plugins/ralph-specum/skills/spec-workflow/SKILL.md        | 1 +
 specs/refactor-plugins/tasks.md                           | 2 +-
 7 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/plugins/ralph-specum/skills/communication-style/SKILL.md b/plugins/ralph-specum/skills/communication-style/SKILL.md
index a009dfd2..ad7af0e8 100644
--- a/plugins/ralph-specum/skills/communication-style/SKILL.md
+++ b/plugins/ralph-specum/skills/communication-style/SKILL.md
@@ -1,5 +1,6 @@
 ---
 name: communication-style
+version: 0.1.0
 description: This skill should be used when the user asks about "output formatting", "concise responses", "Matt Pocock planning style", "scannable output", "action steps format", or needs guidance on communication and output formatting rules for Ralph agents.
 ---
 
diff --git a/plugins/ralph-specum/skills/delegation-principle/SKILL.md b/plugins/ralph-specum/skills/delegation-principle/SKILL.md
index 16ccd297..6a03c8f7 100644
--- a/plugins/ralph-specum/skills/delegation-principle/SKILL.md
+++ b/plugins/ralph-specum/skills/delegation-principle/SKILL.md
@@ -1,5 +1,6 @@
 ---
 name: delegation-principle
+version: 0.1.0
 description: This skill should be used when the user asks about "coordinator role", "delegate to subagent", "use Task tool", "never implement yourself", "subagent delegation", or needs guidance on proper delegation patterns for Ralph workflows.
 ---
 
diff --git a/plugins/ralph-specum/skills/interview-framework/SKILL.md b/plugins/ralph-specum/skills/interview-framework/SKILL.md
index 07587f83..ed5909cc 100644
--- a/plugins/ralph-specum/skills/interview-framework/SKILL.md
+++ b/plugins/ralph-specum/skills/interview-framework/SKILL.md
@@ -1,6 +1,7 @@
 ---
 name: interview-framework
-description: Standard single-question adaptive interview loop used across all spec phases
+version: 0.1.0
+description: This skill should be used when the user asks about "interview loop", "single-question interview", "adaptive interview", "question piping", "parameter chain", "completion signals", or needs guidance on the standard interview framework used across Ralph spec phases.
 ---
 
 # Interview Framework
diff --git a/plugins/ralph-specum/skills/reality-verification/SKILL.md b/plugins/ralph-specum/skills/reality-verification/SKILL.md
index 5bb9b669..a8d6baca 100644
--- a/plugins/ralph-specum/skills/reality-verification/SKILL.md
+++ b/plugins/ralph-specum/skills/reality-verification/SKILL.md
@@ -1,5 +1,6 @@
 ---
 name: reality-verification
+version: 0.1.0
 description: This skill should be used when the user asks to "verify a fix", "reproduce failure", "diagnose issue", "check BEFORE/AFTER state", "VF task", "reality check", or needs guidance on verifying fixes by reproducing failures before and after implementation.
 ---
 
diff --git a/plugins/ralph-specum/skills/smart-ralph/SKILL.md b/plugins/ralph-specum/skills/smart-ralph/SKILL.md
index 8d2d2f6f..244a6573 100644
--- a/plugins/ralph-specum/skills/smart-ralph/SKILL.md
+++ b/plugins/ralph-specum/skills/smart-ralph/SKILL.md
@@ -1,5 +1,6 @@
 ---
 name: smart-ralph
+version: 0.1.0
 description: This skill should be used when the user asks about "ralph arguments", "quick mode", "commit spec", "max iterations", "ralph state file", "execution modes", "ralph loop integration", or needs guidance on common Ralph plugin arguments and state management patterns.
 ---
 
diff --git a/plugins/ralph-specum/skills/spec-workflow/SKILL.md b/plugins/ralph-specum/skills/spec-workflow/SKILL.md
index a57bc697..e0ed8507 100644
--- a/plugins/ralph-specum/skills/spec-workflow/SKILL.md
+++ b/plugins/ralph-specum/skills/spec-workflow/SKILL.md
@@ -1,5 +1,6 @@
 ---
 name: spec-workflow
+version: 0.1.0
 description: This skill should be used when the user asks to "build a feature", "create a spec", "start spec-driven development", "run research phase", "generate requirements", "create design", "plan tasks", "implement spec", "check spec status", or needs guidance on the spec-driven development workflow.
 ---
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index e8d1c240..2ac796f1 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -59,7 +59,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
 
 ### A2: Skill Metadata
 
-- [ ] 1.4 Add version to ralph-specum skills (6 files)
+- [x] 1.4 Add version to ralph-specum skills (6 files)
   - **Do**:
     1. Add `version: 0.1.0` to frontmatter of each skill
     2. Fix interview-framework description to third-person format with trigger phrases

From e32d256b9da77274211ab33ab3f75ac079a9e498 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:34:25 +0200
Subject: [PATCH 07/37] feat(ralph-speckit): add version and fix descriptions
 for all skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-speckit/skills/communication-style/SKILL.md  | 3 ++-
 plugins/ralph-speckit/skills/delegation-principle/SKILL.md | 3 ++-
 plugins/ralph-speckit/skills/smart-ralph/SKILL.md          | 3 ++-
 plugins/ralph-speckit/skills/speckit-workflow/SKILL.md     | 3 ++-
 specs/refactor-plugins/tasks.md                            | 2 +-
 5 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/plugins/ralph-speckit/skills/communication-style/SKILL.md b/plugins/ralph-speckit/skills/communication-style/SKILL.md
index be50584d..1ae4d24e 100644
--- a/plugins/ralph-speckit/skills/communication-style/SKILL.md
+++ b/plugins/ralph-speckit/skills/communication-style/SKILL.md
@@ -1,6 +1,7 @@
 ---
 name: communication-style
-description: Output rules for all agents - concise, scannable, actionable. Based on Matt Pocock's planning principles.
+version: 0.1.0
+description: This skill should be used when the user asks about "output formatting", "concise responses", "Matt Pocock planning style", "scannable output", "action steps format", or needs guidance on communication and output formatting rules for Ralph agents.
 ---
 
 # Communication Style
diff --git a/plugins/ralph-speckit/skills/delegation-principle/SKILL.md b/plugins/ralph-speckit/skills/delegation-principle/SKILL.md
index be25edea..fa88f751 100644
--- a/plugins/ralph-speckit/skills/delegation-principle/SKILL.md
+++ b/plugins/ralph-speckit/skills/delegation-principle/SKILL.md
@@ -1,6 +1,7 @@
 ---
 name: delegation-principle
-description: Core principle that the main agent is a coordinator, not an implementer. All work must be delegated to subagents.
+version: 0.1.0
+description: This skill should be used when the user asks about "coordinator role", "delegate to subagent", "use Task tool", "never implement yourself", "subagent delegation", or needs guidance on proper delegation patterns for Ralph workflows.
 ---
 
 # Delegation Principle
diff --git a/plugins/ralph-speckit/skills/smart-ralph/SKILL.md b/plugins/ralph-speckit/skills/smart-ralph/SKILL.md
index fcecf060..8951d7b6 100644
--- a/plugins/ralph-speckit/skills/smart-ralph/SKILL.md
+++ b/plugins/ralph-speckit/skills/smart-ralph/SKILL.md
@@ -1,6 +1,7 @@
 ---
 name: smart-ralph
-description: Core Smart Ralph skill defining common arguments, execution modes, and shared behaviors across all Ralph plugins.
+version: 0.1.0
+description: This skill should be used when the user asks about "ralph arguments", "quick mode", "commit spec", "max iterations", "ralph state file", "execution modes", "ralph loop integration", or needs guidance on common Ralph plugin arguments and state management patterns.
 ---
 
 # Smart Ralph
diff --git a/plugins/ralph-speckit/skills/speckit-workflow/SKILL.md b/plugins/ralph-speckit/skills/speckit-workflow/SKILL.md
index c1de6239..768a9307 100644
--- a/plugins/ralph-speckit/skills/speckit-workflow/SKILL.md
+++ b/plugins/ralph-speckit/skills/speckit-workflow/SKILL.md
@@ -1,6 +1,7 @@
 ---
 name: speckit-workflow
-description: Comprehensive understanding of the spec-kit methodology. Constitution-driven feature development with specify, plan, tasks, and implement phases.
+version: 0.1.0
+description: This skill should be used when the user asks to "build a feature", "create a spec", "start spec-driven development", "run research phase", "generate requirements", "create design", "plan tasks", "implement spec", "check spec status", or needs guidance on the spec-driven development workflow.
 ---
 
 # SpecKit Workflow
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 2ac796f1..5131d583 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -76,7 +76,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-2.1, AC-2.5_
   - _Design: ralph-specum Skills_
 
-- [ ] 1.5 Add version and fix descriptions for ralph-speckit skills (4 files)
+- [x] 1.5 Add version and fix descriptions for ralph-speckit skills (4 files)
   - **Do**:
     1. Add `version: 0.1.0` to frontmatter of each skill
     2. Rewrite all 4 descriptions to third-person format: "This skill should be used when..."

From 8301b9441ae9529e3ea3cb2b9f966454292e7ee1 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:35:27 +0200
Subject: [PATCH 08/37] feat(plugins): add matcher field to all hook entries

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-speckit/hooks/hooks.json | 1 +
 plugins/ralph-specum/hooks/hooks.json  | 2 ++
 specs/refactor-plugins/tasks.md        | 2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)
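
Note: the matcher change in this patch could be checked mechanically. Below is a minimal sketch, assuming the hooks.json layout visible in the hunks (an object with a top-level "hooks" key mapping event names to lists of entries). The function name and the "example.sh" command value are hypothetical, since the real command strings are not shown in this diff.

```python
import json


def entries_missing_matcher(hooks_json_text):
    """Return (event, index) pairs for hook entries that lack a "matcher" key."""
    config = json.loads(hooks_json_text)
    missing = []
    for event, entries in config.get("hooks", {}).items():
        for index, entry in enumerate(entries):
            if "matcher" not in entry:
                missing.append((event, index))
    return missing


# Hypothetical sample: "example.sh" stands in for the real hook command.
sample = """
{
  "hooks": {
    "Stop": [
      {"matcher": "*", "hooks": [{"type": "command", "command": "example.sh"}]}
    ],
    "SessionStart": [
      {"hooks": [{"type": "command", "command": "example.sh"}]}
    ]
  }
}
"""

print(entries_missing_matcher(sample))  # [('SessionStart', 0)]
```

An empty list would indicate that every entry carries a matcher, which is the state this patch is meant to reach.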

diff --git a/plugins/ralph-speckit/hooks/hooks.json b/plugins/ralph-speckit/hooks/hooks.json
index e5560f9e..1eba5f47 100644
--- a/plugins/ralph-speckit/hooks/hooks.json
+++ b/plugins/ralph-speckit/hooks/hooks.json
@@ -2,6 +2,7 @@
   "hooks": {
     "Stop": [
       {
+        "matcher": "*",
         "hooks": [
           {
             "type": "command",
diff --git a/plugins/ralph-specum/hooks/hooks.json b/plugins/ralph-specum/hooks/hooks.json
index a4a228e9..31b87ed3 100644
--- a/plugins/ralph-specum/hooks/hooks.json
+++ b/plugins/ralph-specum/hooks/hooks.json
@@ -3,6 +3,7 @@
   "hooks": {
     "Stop": [
       {
+        "matcher": "*",
         "hooks": [
           {
             "type": "command",
@@ -13,6 +14,7 @@
     ],
     "SessionStart": [
       {
+        "matcher": "*",
         "hooks": [
           {
             "type": "command",
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 5131d583..f9df354a 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -94,7 +94,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
 
 ### A3: Hook Metadata
 
-- [ ] 1.6 Add matcher field to hooks (2 files)
+- [x] 1.6 Add matcher field to hooks (2 files)
   - **Do**:
     1. Add `"matcher": "*"` to Stop entry in ralph-specum hooks.json
     2. Add `"matcher": "*"` to SessionStart entry in ralph-specum hooks.json

From 34bd45bcff29ae55a01d2e0e3443b40e0cdaad72 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:37:23 +0200
Subject: [PATCH 09/37] feat(ralph-speckit): add name field to modern commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-speckit/commands/cancel.md    | 1 +
 plugins/ralph-speckit/commands/implement.md | 1 +
 plugins/ralph-speckit/commands/start.md     | 1 +
 plugins/ralph-speckit/commands/status.md    | 1 +
 plugins/ralph-speckit/commands/switch.md    | 1 +
 specs/refactor-plugins/tasks.md             | 2 +-
 6 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/plugins/ralph-speckit/commands/cancel.md b/plugins/ralph-speckit/commands/cancel.md
index 90c50005..1d92fee3 100644
--- a/plugins/ralph-speckit/commands/cancel.md
+++ b/plugins/ralph-speckit/commands/cancel.md
@@ -1,4 +1,5 @@
 ---
+name: cancel
 description: Cancel active execution loop and cleanup state
 argument-hint: [feature-name]
 allowed-tools: [Read, Bash, Task]
diff --git a/plugins/ralph-speckit/commands/implement.md b/plugins/ralph-speckit/commands/implement.md
index 129f10af..3df1e250 100644
--- a/plugins/ralph-speckit/commands/implement.md
+++ b/plugins/ralph-speckit/commands/implement.md
@@ -1,4 +1,5 @@
 ---
+name: implement
 description: Start task execution loop
 argument-hint: [--max-task-iterations 5]
 allowed-tools: [Read, Write, Edit, Task, Bash, Skill]
diff --git a/plugins/ralph-speckit/commands/start.md b/plugins/ralph-speckit/commands/start.md
index 8dd7f207..9c6e5546 100644
--- a/plugins/ralph-speckit/commands/start.md
+++ b/plugins/ralph-speckit/commands/start.md
@@ -1,4 +1,5 @@
 ---
+name: start
 description: Smart entry point for new features with auto ID and branch management
 argument-hint: <feature-name> [goal]
 allowed-tools: [Read, Write, Edit, Task, Bash]
diff --git a/plugins/ralph-speckit/commands/status.md b/plugins/ralph-speckit/commands/status.md
index 28b9d3e5..e82585ca 100644
--- a/plugins/ralph-speckit/commands/status.md
+++ b/plugins/ralph-speckit/commands/status.md
@@ -1,4 +1,5 @@
 ---
+name: status
 description: Show current feature status and progress
 argument-hint:
 allowed-tools: [Read, Bash, Glob]
diff --git a/plugins/ralph-speckit/commands/switch.md b/plugins/ralph-speckit/commands/switch.md
index 32c29335..2bc4233b 100644
--- a/plugins/ralph-speckit/commands/switch.md
+++ b/plugins/ralph-speckit/commands/switch.md
@@ -1,4 +1,5 @@
 ---
+name: switch
 description: Switch active feature
 argument-hint: <feature-name>
 allowed-tools: [Read, Write, Bash, Glob, Task]
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index f9df354a..9ea7f3de 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -116,7 +116,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
 
 ### A4: Command Fixes
 
-- [ ] 1.8 Add name field to ralph-speckit modern commands (5 files)
+- [x] 1.8 Add name field to ralph-speckit modern commands (5 files)
   - **Do**:
     1. Add `name: <command>` field to frontmatter of each command
     2. Names: start, status, switch, cancel, implement

From 0f617ad0012fe3aed7f1b9e5678ce9615a481fde Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:45:08 +0200
Subject: [PATCH 10/37] feat(ralph-speckit): migrate legacy commands to
 commands/

Migrate 8 legacy commands from .claude/commands/ to commands/:
- analyze.md - cross-artifact consistency analysis
- checklist.md - requirements quality validation
- clarify.md - spec ambiguity detection and resolution
- constitution.md - project constitution management
- plan.md - implementation planning workflow
- specify.md - feature specification creation
- tasks.md - task generation from design artifacts
- taskstoissues.md - GitHub issue creation from tasks

All commands now have proper frontmatter with name, description,
and allowed-tools fields.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-speckit/commands/analyze.md     | 186 +++++++++++
 plugins/ralph-speckit/commands/checklist.md   | 296 ++++++++++++++++++
 plugins/ralph-speckit/commands/clarify.md     | 183 +++++++++++
 .../ralph-speckit/commands/constitution.md    |  84 +++++
 plugins/ralph-speckit/commands/plan.md        |  91 ++++++
 plugins/ralph-speckit/commands/specify.md     | 260 +++++++++++++++
 plugins/ralph-speckit/commands/tasks.md       | 139 ++++++++
 .../ralph-speckit/commands/taskstoissues.md   |  31 ++
 specs/refactor-plugins/tasks.md               |   2 +-
 9 files changed, 1271 insertions(+), 1 deletion(-)
 create mode 100644 plugins/ralph-speckit/commands/analyze.md
 create mode 100644 plugins/ralph-speckit/commands/checklist.md
 create mode 100644 plugins/ralph-speckit/commands/clarify.md
 create mode 100644 plugins/ralph-speckit/commands/constitution.md
 create mode 100644 plugins/ralph-speckit/commands/plan.md
 create mode 100644 plugins/ralph-speckit/commands/specify.md
 create mode 100644 plugins/ralph-speckit/commands/tasks.md
 create mode 100644 plugins/ralph-speckit/commands/taskstoissues.md
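
Note: the frontmatter claim in the commit message (name, description, and allowed-tools on every migrated command) lends itself to a small check. Below is a rough sketch, assuming each command file opens with a frontmatter block of unindented "key: value" lines between two delimiter lines, as in the files added here. The helper names are hypothetical and this is not one of the spec's validation scripts. It uses no YAML parser and only looks at top-level keys.

```python
REQUIRED_FIELDS = ("name", "description", "allowed-tools")


def frontmatter_keys(markdown_text):
    """Collect top-level keys from a leading '---' delimited frontmatter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # no frontmatter at all
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys  # closing delimiter reached
        # Only unindented "key: value" lines count as top-level fields.
        if line and not line[0].isspace() and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # frontmatter never closed; treat every field as missing


def missing_fields(markdown_text):
    """Return the required fields absent from the frontmatter, in a fixed order."""
    keys = frontmatter_keys(markdown_text)
    return [field for field in REQUIRED_FIELDS if field not in keys]


# Samples are joined from lists so that no line of this note is a bare delimiter.
complete = "\n".join([
    "---",
    "name: analyze",
    "description: Cross-artifact consistency analysis",
    "allowed-tools: [Read, Bash]",
    "---",
    "## User Input",
])
incomplete = "\n".join([
    "---",
    "description: Start task execution loop",
    "allowed-tools: [Read, Write, Edit, Task, Bash, Skill]",
    "---",
])

print(missing_fields(complete))    # []
print(missing_fields(incomplete))  # ['name']
```

The second sample mirrors the state of implement.md before PATCH 09 added its name field.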

diff --git a/plugins/ralph-speckit/commands/analyze.md b/plugins/ralph-speckit/commands/analyze.md
new file mode 100644
index 00000000..272992b6
--- /dev/null
+++ b/plugins/ralph-speckit/commands/analyze.md
@@ -0,0 +1,186 @@
+---
+name: analyze
+description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
+allowed-tools: [Read, Bash]
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Goal
+
+Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
+
+## Operating Constraints
+
+**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).
+
+**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.
+
+## Execution Steps
+
+### 1. Initialize Analysis Context
+
+Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
+
+- SPEC = FEATURE_DIR/spec.md
+- PLAN = FEATURE_DIR/plan.md
+- TASKS = FEATURE_DIR/tasks.md
+
+Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
+For single quotes in args like "I'm Groot", use escape syntax, e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+### 2. Load Artifacts (Progressive Disclosure)
+
+Load only the minimal necessary context from each artifact:
+
+**From spec.md:**
+
+- Overview/Context
+- Functional Requirements
+- Non-Functional Requirements
+- User Stories
+- Edge Cases (if present)
+
+**From plan.md:**
+
+- Architecture/stack choices
+- Data Model references
+- Phases
+- Technical constraints
+
+**From tasks.md:**
+
+- Task IDs
+- Descriptions
+- Phase grouping
+- Parallel markers [P]
+- Referenced file paths
+
+**From constitution:**
+
+- Load `.specify/memory/constitution.md` for principle validation
+
+### 3. Build Semantic Models
+
+Create internal representations (do not include raw artifacts in output):
+
+- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive slug based on imperative phrase; e.g., "User can upload file" -> `user-can-upload-file`)
+- **User story/action inventory**: Discrete user actions with acceptance criteria
+- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
+- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
+
+### 4. Detection Passes (Token-Efficient Analysis)
+
+Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.
+
+#### A. Duplication Detection
+
+- Identify near-duplicate requirements
+- Mark lower-quality phrasing for consolidation
+
+#### B. Ambiguity Detection
+
+- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
+- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
+
+#### C. Underspecification
+
+- Requirements with verbs but missing object or measurable outcome
+- User stories missing acceptance criteria alignment
+- Tasks referencing files or components not defined in spec/plan
+
+#### D. Constitution Alignment
+
+- Any requirement or plan element conflicting with a MUST principle
+- Missing mandated sections or quality gates from constitution
+
+#### E. Coverage Gaps
+
+- Requirements with zero associated tasks
+- Tasks with no mapped requirement/story
+- Non-functional requirements not reflected in tasks (e.g., performance, security)
+
+#### F. Inconsistency
+
+- Terminology drift (same concept named differently across files)
+- Data entities referenced in plan but absent in spec (or vice versa)
+- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
+- Conflicting requirements (e.g., one requires Next.js while another specifies Vue)
+
+### 5. Severity Assignment
+
+Use this heuristic to prioritize findings:
+
+- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
+- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
+- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
+- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
+
+### 6. Produce Compact Analysis Report
+
+Output a Markdown report (no file writes) with the following structure:
+
+## Specification Analysis Report
+
+| ID | Category | Severity | Location(s) | Summary | Recommendation |
+|----|----------|----------|-------------|---------|----------------|
+| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
+
+(Add one row per finding; generate stable IDs prefixed by category initial.)
+
+**Coverage Summary Table:**
+
+| Requirement Key | Has Task? | Task IDs | Notes |
+|-----------------|-----------|----------|-------|
+
+**Constitution Alignment Issues:** (if any)
+
+**Unmapped Tasks:** (if any)
+
+**Metrics:**
+
+- Total Requirements
+- Total Tasks
+- Coverage % (requirements with >=1 task)
+- Ambiguity Count
+- Duplication Count
+- Critical Issues Count
+
+### 7. Provide Next Actions
+
+At the end of the report, output a concise Next Actions block:
+
+- If CRITICAL issues exist: Recommend resolving before `/speckit.implement`
+- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
+- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"
+
+### 8. Offer Remediation
+
+Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)
+
+## Operating Principles
+
+### Context Efficiency
+
+- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
+- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
+- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
+- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts
+
+### Analysis Guidelines
+
+- **NEVER modify files** (this is read-only analysis)
+- **NEVER hallucinate missing sections** (if absent, report them accurately)
+- **Prioritize constitution violations** (these are always CRITICAL)
+- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
+- **Report zero issues gracefully** (emit success report with coverage statistics)
+
+## Context
+
+$ARGUMENTS
diff --git a/plugins/ralph-speckit/commands/checklist.md b/plugins/ralph-speckit/commands/checklist.md
new file mode 100644
index 00000000..d8359769
--- /dev/null
+++ b/plugins/ralph-speckit/commands/checklist.md
@@ -0,0 +1,296 @@
+---
+name: checklist
+description: Generate a custom checklist for the current feature based on user requirements.
+allowed-tools: [Read, Write, Bash]
+---
+
+## Checklist Purpose: "Unit Tests for English"
+
+**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
+
+**NOT for verification/testing**:
+
+- NOT "Verify the button clicks correctly"
+- NOT "Test error handling works"
+- NOT "Confirm the API returns 200"
+- NOT checking if code/implementation matches the spec
+
+**FOR requirements quality validation**:
+
+- "Are visual hierarchy requirements defined for all card types?" (completeness)
+- "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
+- "Are hover state requirements consistent across all interactive elements?" (consistency)
+- "Are accessibility requirements defined for keyboard navigation?" (coverage)
+- "Does the spec define what happens when logo image fails to load?" (edge cases)
+
+**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Execution Steps
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
+   - All file paths must be absolute.
+   - For single quotes in args like "I'm Groot", use escape syntax, e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
+   - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
+   - Only ask about information that materially changes checklist content
+   - Be skipped individually if already unambiguous in `$ARGUMENTS`
+   - Prefer precision over breadth
+
+   Generation algorithm:
+   1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
+   2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
+   3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
+   4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
+   5. Formulate questions chosen from these archetypes:
+      - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
+      - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
+      - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
+      - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
+      - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
+      - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
+
+   Question formatting rules:
+   - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
+   - Limit to A–E options maximum; omit table if a free-form answer is clearer
+   - Never ask the user to restate what they already said
+   - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
+
+   Defaults when interaction impossible:
+   - Depth: Standard
+   - Audience: Reviewer (PR) if code-related; Author otherwise
+   - Focus: Top 2 relevance clusters
+
+   Output the questions (label Q1/Q2/Q3). After answers: if >=2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow-ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
+
+3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
+   - Derive checklist theme (e.g., security, review, deploy, ux)
+   - Consolidate explicit must-have items mentioned by user
+   - Map focus selections to category scaffolding
+   - Infer any missing context from spec/plan/tasks (do NOT hallucinate)
+
+4. **Load feature context**: Read from FEATURE_DIR:
+   - spec.md: Feature requirements and scope
+   - plan.md (if exists): Technical details, dependencies
+   - tasks.md (if exists): Implementation tasks
+
+   **Context Loading Strategy**:
+   - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
+   - Prefer summarizing long sections into concise scenario/requirement bullets
+   - Use progressive disclosure: add follow-on retrieval only if gaps detected
+   - If source docs are large, generate interim summary items instead of embedding raw text
+
+5. **Generate checklist** - Create "Unit Tests for Requirements":
+   - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
+   - Generate unique checklist filename:
+     - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
+     - Format: `[domain].md`
+     - If file exists, append to existing file
+   - Number items sequentially starting from CHK001
+   - Each `/speckit.checklist` run creates a new file unless one already exists for that domain; existing checklists are appended to, never overwritten
+
+   **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
+   Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
+   - **Completeness**: Are all necessary requirements present?
+   - **Clarity**: Are requirements unambiguous and specific?
+   - **Consistency**: Do requirements align with each other?
+   - **Measurability**: Can requirements be objectively verified?
+   - **Coverage**: Are all scenarios/edge cases addressed?
+
+   **Category Structure** - Group items by requirement quality dimensions:
+   - **Requirement Completeness** (Are all necessary requirements documented?)
+   - **Requirement Clarity** (Are requirements specific and unambiguous?)
+   - **Requirement Consistency** (Do requirements align without conflicts?)
+   - **Acceptance Criteria Quality** (Are success criteria measurable?)
+   - **Scenario Coverage** (Are all flows/cases addressed?)
+   - **Edge Case Coverage** (Are boundary conditions defined?)
+   - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
+   - **Dependencies & Assumptions** (Are they documented and validated?)
+   - **Ambiguities & Conflicts** (What needs clarification?)
+
+   **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
+
+   WRONG (Testing implementation):
+   - "Verify landing page displays 3 episode cards"
+   - "Test hover states work on desktop"
+   - "Confirm logo click navigates home"
+
+   CORRECT (Testing requirements quality):
+   - "Are the exact number and layout of featured episodes specified?" [Completeness]
+   - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
+   - "Are hover state requirements consistent across all interactive elements?" [Consistency]
+   - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
+   - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
+   - "Are loading states defined for asynchronous episode data?" [Completeness]
+   - "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
+
+   **ITEM STRUCTURE**:
+   Each item should follow this pattern:
+   - Question format asking about requirement quality
+   - Focus on what's WRITTEN (or not written) in the spec/plan
+   - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
+   - Reference spec section `[Spec X.Y]` when checking existing requirements
+   - Use `[Gap]` marker when checking for missing requirements
+
+   **EXAMPLES BY QUALITY DIMENSION**:
+
+   Completeness:
+   - "Are error handling requirements defined for all API failure modes? [Gap]"
+   - "Are accessibility requirements specified for all interactive elements? [Completeness]"
+   - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
+
+   Clarity:
+   - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec NFR-2]"
+   - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec FR-5]"
+   - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec FR-4]"
+
+   Consistency:
+   - "Do navigation requirements align across all pages? [Consistency, Spec FR-10]"
+   - "Are card component requirements consistent between landing and detail pages? [Consistency]"
+
+   Coverage:
+   - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
+   - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
+   - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
+
+   Measurability:
+   - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec FR-1]"
+   - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec FR-2]"
+
+   **Scenario Classification & Coverage** (Requirements Quality Focus):
+   - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
+   - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
+   - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
+   - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
+
+   **Traceability Requirements**:
+   - MINIMUM: >=80% of items MUST include at least one traceability reference
+   - Each item should reference: spec section `[Spec X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
+   - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
+
+   **Surface & Resolve Issues** (Requirements Quality Problems):
+   Ask questions about the requirements themselves:
+   - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec NFR-1]"
+   - Conflicts: "Do navigation requirements conflict between FR-10 and FR-10a? [Conflict]"
+   - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
+   - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
+   - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
+
+   **Content Consolidation**:
+   - Soft cap: If raw candidate items > 40, prioritize by risk/impact
+   - Merge near-duplicates checking the same requirement aspect
+   - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
+
+   **ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
+   - Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
+   - References to code execution, user actions, system behavior
+   - "Displays correctly", "works properly", "functions as expected"
+   - "Click", "navigate", "render", "load", "execute"
+   - Test cases, test plans, QA procedures
+   - Implementation details (frameworks, APIs, algorithms)
+
+   **REQUIRED PATTERNS** - These test requirements quality:
+   - "Are [requirement type] defined/specified/documented for [scenario]?"
+   - "Is [vague term] quantified/clarified with specific criteria?"
+   - "Are requirements consistent between [section A] and [section B]?"
+   - "Can [requirement] be objectively measured/verified?"
+   - "Are [edge cases/scenarios] addressed in requirements?"
+   - "Does the spec define [missing aspect]?"
+
+6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
+
+7. **Report**: Output the full path to the checklist, the item count, and a reminder that each run creates a new file unless one already exists for that domain, in which case items are appended. Summarize:
+   - Focus areas selected
+   - Depth level
+   - Actor/timing
+   - Any explicit user-specified must-have items incorporated
+
+**Important**: Each `/speckit.checklist` command invocation creates a checklist file with a short, descriptive name unless that file already exists, in which case new items are appended to it. This allows:
+
+- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
+- Simple, memorable filenames that indicate checklist purpose
+- Easy identification and navigation in the `checklists/` folder
+
+To avoid clutter, use descriptive types and clean up obsolete checklists when done.
+
+## Example Checklist Types & Sample Items
+
+**UX Requirements Quality:** `ux.md`
+
+Sample items (testing the requirements, NOT the implementation):
+
+- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec FR-1]"
+- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec FR-1]"
+- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
+- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
+- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
+- "Can 'prominent display' be objectively measured? [Measurability, Spec FR-4]"
+
+**API Requirements Quality:** `api.md`
+
+Sample items:
+
+- "Are error response formats specified for all failure scenarios? [Completeness]"
+- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
+- "Are authentication requirements consistent across all endpoints? [Consistency]"
+- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
+- "Is versioning strategy documented in requirements? [Gap]"
+
+**Performance Requirements Quality:** `performance.md`
+
+Sample items:
+
+- "Are performance requirements quantified with specific metrics? [Clarity]"
+- "Are performance targets defined for all critical user journeys? [Coverage]"
+- "Are performance requirements under different load conditions specified? [Completeness]"
+- "Can performance requirements be objectively measured? [Measurability]"
+- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
+
+**Security Requirements Quality:** `security.md`
+
+Sample items:
+
+- "Are authentication requirements specified for all protected resources? [Coverage]"
+- "Are data protection requirements defined for sensitive information? [Completeness]"
+- "Is the threat model documented and requirements aligned to it? [Traceability]"
+- "Are security requirements consistent with compliance obligations? [Consistency]"
+- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
+
+## Anti-Examples: What NOT To Do
+
+**WRONG - These test implementation, not requirements:**
+
+```markdown
+- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec FR-001]
+- [ ] CHK002 - Test hover states work correctly on desktop [Spec FR-003]
+- [ ] CHK003 - Confirm logo click navigates to home page [Spec FR-010]
+- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec FR-005]
+```
+
+**CORRECT - These test requirements quality:**
+
+```markdown
+- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec FR-001]
+- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec FR-003]
+- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec FR-010]
+- [ ] CHK004 - Are the selection criteria for related episodes documented? [Gap, Spec FR-005]
+- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
+- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec FR-001]
+```
+
+**Key Differences:**
+
+- Wrong: Tests if the system works correctly
+- Correct: Tests if the requirements are written correctly
+- Wrong: Verification of behavior
+- Correct: Validation of requirement quality
+- Wrong: "Does it do X?"
+- Correct: "Is X clearly specified?"
diff --git a/plugins/ralph-speckit/commands/clarify.md b/plugins/ralph-speckit/commands/clarify.md
new file mode 100644
index 00000000..4bdfccc0
--- /dev/null
+++ b/plugins/ralph-speckit/commands/clarify.md
@@ -0,0 +1,183 @@
+---
+name: clarify
+description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
+allowed-tools: [Read, Write, Edit, Bash]
+handoffs:
+  - label: Build Technical Plan
+    agent: speckit.plan
+    prompt: Create a plan for the spec. I am building with...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
+
+Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
+
+Execution steps:
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
+   - `FEATURE_DIR`
+   - `FEATURE_SPEC`
+   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
+   - If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
+   - For single quotes in args like "I'm Groot", use escape syntax, e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
+
+   Functional Scope & Behavior:
+   - Core user goals & success criteria
+   - Explicit out-of-scope declarations
+   - User roles / personas differentiation
+
+   Domain & Data Model:
+   - Entities, attributes, relationships
+   - Identity & uniqueness rules
+   - Lifecycle/state transitions
+   - Data volume / scale assumptions
+
+   Interaction & UX Flow:
+   - Critical user journeys / sequences
+   - Error/empty/loading states
+   - Accessibility or localization notes
+
+   Non-Functional Quality Attributes:
+   - Performance (latency, throughput targets)
+   - Scalability (horizontal/vertical, limits)
+   - Reliability & availability (uptime, recovery expectations)
+   - Observability (logging, metrics, tracing signals)
+   - Security & privacy (authN/Z, data protection, threat assumptions)
+   - Compliance / regulatory constraints (if any)
+
+   Integration & External Dependencies:
+   - External services/APIs and failure modes
+   - Data import/export formats
+   - Protocol/versioning assumptions
+
+   Edge Cases & Failure Handling:
+   - Negative scenarios
+   - Rate limiting / throttling
+   - Conflict resolution (e.g., concurrent edits)
+
+   Constraints & Tradeoffs:
+   - Technical constraints (language, storage, hosting)
+   - Explicit tradeoffs or rejected alternatives
+
+   Terminology & Consistency:
+   - Canonical glossary terms
+   - Avoided synonyms / deprecated terms
+
+   Completion Signals:
+   - Acceptance criteria testability
+   - Measurable Definition of Done style indicators
+
+   Misc / Placeholders:
+   - TODO markers / unresolved decisions
+   - Ambiguous adjectives ("robust", "intuitive") lacking quantification
+
+   For each category with Partial or Missing status, add a candidate question opportunity unless:
+   - Clarification would not materially change implementation or validation strategy
+   - Information is better deferred to planning phase (note internally)
+
+3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
+    - Maximum of 5 total questions across the whole session.
+    - Each question must be answerable with EITHER:
+       - A short multiple-choice selection (2–5 distinct, mutually exclusive options), OR
+       - A one-word / short-phrase answer (explicitly constrain: "Answer in <=5 words").
+    - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
+    - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
+    - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
+    - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
+    - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
+
+4. Sequential questioning loop (interactive):
+    - Present EXACTLY ONE question at a time.
+    - For multiple-choice questions:
+       - **Analyze all options** and determine the **most suitable option** based on:
+          - Best practices for the project type
+          - Common patterns in similar implementations
+          - Risk reduction (security, performance, maintainability)
+          - Alignment with any explicit project goals or constraints visible in the spec
+       - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
+       - Format as: `**Recommended:** Option [X] - <reasoning>`
+       - Then render all options as a Markdown table:
+
+       | Option | Description |
+       |--------|-------------|
+       | A | <Option A description> |
+       | B | <Option B description> |
+       | C | <Option C description> (add D/E as needed up to 5) |
+       | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
+
+       - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
+    - For short-answer style (no meaningful discrete options):
+       - Provide your **suggested answer** based on best practices and context.
+       - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
+       - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
+    - After the user answers:
+       - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
+       - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
+       - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
+       - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
+    - Stop asking further questions when:
+       - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
+       - User signals completion ("done", "good", "no more"), OR
+       - You reach 5 asked questions.
+    - Never reveal future queued questions in advance.
+    - If no valid questions exist at start, immediately report no critical ambiguities.
+
+5. Integration after EACH accepted answer (incremental update approach):
+    - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
+    - For the first integrated answer in this session:
+       - Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
+       - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
+    - Append a bullet line immediately after acceptance: `- Q: <question> -> A: <final answer>`.
+    - Then immediately apply the clarification to the most appropriate section(s):
+       - Functional ambiguity -> Update or add a bullet in Functional Requirements.
+       - User interaction / actor distinction -> Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
+       - Data shape / entities -> Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
+       - Non-functional constraint -> Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
+       - Edge case / negative flow -> Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
+       - Terminology conflict -> Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
+    - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
+    - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
+    - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
+    - Keep each inserted clarification minimal and testable (avoid narrative drift).
+
+6. Validation (performed after EACH write plus final pass):
+   - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
+   - Total asked (accepted) questions <= 5.
+   - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
+   - No contradictory earlier statement remains (scan for and remove alternative choices that the answer has made invalid).
+   - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
+   - Terminology consistency: same canonical term used across all updated sections.
+
+7. Write the updated spec back to `FEATURE_SPEC`.
+
+8. Report completion (after questioning loop ends or early termination):
+   - Number of questions asked & answered.
+   - Path to updated spec.
+   - Sections touched (list names).
+   - Coverage summary table listing each taxonomy category with Status: Resolved (was Partial/Missing and addressed), Deferred (exceeds question quota or better suited for planning), Clear (already sufficient), Outstanding (still Partial/Missing but low impact).
+   - If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
+   - Suggested next command.
+
+Behavior rules:
+
+- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
+- If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
+- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
+- Avoid speculative tech stack questions unless the absence blocks functional clarity.
+- Respect user early termination signals ("stop", "done", "proceed").
+- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
+- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
+
+Context for prioritization: $ARGUMENTS
diff --git a/plugins/ralph-speckit/commands/constitution.md b/plugins/ralph-speckit/commands/constitution.md
new file mode 100644
index 00000000..80966359
--- /dev/null
+++ b/plugins/ralph-speckit/commands/constitution.md
@@ -0,0 +1,84 @@
+---
+name: constitution
+description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
+allowed-tools: [Read, Write, Edit, Bash]
+handoffs:
+  - label: Build Specification
+    agent: speckit.specify
+    prompt: Implement the feature specification based on the updated constitution. I want to build...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
+
+Follow this execution flow:
+
+1. Load the existing constitution template at `.specify/memory/constitution.md`.
+   - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
+   **IMPORTANT**: The user might require fewer or more principles than the template uses. If a number is specified, respect it while following the general template structure, and update the document accordingly.
+
+2. Collect/derive values for placeholders:
+   - If user input (conversation) supplies a value, use it.
+   - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
+   - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
+   - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
+     - MAJOR: Backward incompatible governance/principle removals or redefinitions.
+     - MINOR: New principle/section added or materially expanded guidance.
+     - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
+   - If version bump type ambiguous, propose reasoning before finalizing.
+
+3. Draft the updated constitution content:
+   - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
+   - Preserve heading hierarchy; comments can be removed once their placeholders are replaced, unless they still add clarifying guidance.
+   - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non-negotiable rules, explicit rationale if not obvious.
+   - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
+
+4. Consistency propagation checklist (convert prior checklist into active validations):
+   - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
+   - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
+   - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
+   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (agent-specific names like CLAUDE only) remain when generic guidance is required.
+   - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to any principles that changed.
+
+5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
+   - Version change: old -> new
+   - List of modified principles (old title -> new title if renamed)
+   - Added sections
+   - Removed sections
+   - Templates requiring updates (updated / pending) with file paths
+   - Follow-up TODOs if any placeholders intentionally deferred.
+
+6. Validation before final output:
+   - No remaining unexplained bracket tokens.
+   - Version line matches report.
+   - Dates ISO format YYYY-MM-DD.
+   - Principles are declarative, testable, and free of vague language (replace a vague "should" with MUST or SHOULD plus a rationale where appropriate).
+
+7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).
+
+8. Output a final summary to the user with:
+   - New version and bump rationale.
+   - Any files flagged for manual follow-up.
+   - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
+
+Formatting & Style Requirements:
+
+- Use Markdown headings exactly as in the template (do not demote/promote levels).
+- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
+- Keep a single blank line between sections.
+- Avoid trailing whitespace.
+
+If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
+
+If critical info missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include in the Sync Impact Report under deferred items.
+
+Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.
diff --git a/plugins/ralph-speckit/commands/plan.md b/plugins/ralph-speckit/commands/plan.md
new file mode 100644
index 00000000..73d6b212
--- /dev/null
+++ b/plugins/ralph-speckit/commands/plan.md
@@ -0,0 +1,91 @@
+---
+name: plan
+description: Execute the implementation planning workflow using the plan template to generate design artifacts.
+allowed-tools: [Read, Write, Edit, Task, Bash]
+handoffs:
+  - label: Create Tasks
+    agent: speckit.tasks
+    prompt: Break the plan into tasks
+    send: true
+  - label: Create Checklist
+    agent: speckit.checklist
+    prompt: Create a checklist for the following domain...
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
+
+3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
+   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
+   - Fill Constitution Check section from constitution
+   - Evaluate gates (ERROR if violations unjustified)
+   - Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
+   - Phase 1: Generate data-model.md, contracts/, quickstart.md
+   - Phase 1: Update agent context by running the agent script
+   - Re-evaluate Constitution Check post-design
+
+4. **Stop and report**: Command ends after Phase 1 design. Report branch, IMPL_PLAN path, and generated artifacts.
+
+## Phases
+
+### Phase 0: Outline & Research
+
+1. **Extract unknowns from Technical Context** above:
+   - For each NEEDS CLARIFICATION -> research task
+   - For each dependency -> best practices task
+   - For each integration -> patterns task
+
+2. **Generate and dispatch research agents**:
+
+   ```text
+   For each unknown in Technical Context:
+     Task: "Research {unknown} for {feature context}"
+   For each technology choice:
+     Task: "Find best practices for {tech} in {domain}"
+   ```
+
+3. **Consolidate findings** in `research.md` using format:
+   - Decision: [what was chosen]
+   - Rationale: [why chosen]
+   - Alternatives considered: [what else evaluated]
+
+**Output**: research.md with all NEEDS CLARIFICATION resolved
+
+### Phase 1: Design & Contracts
+
+**Prerequisites:** `research.md` complete
+
+1. **Extract entities from feature spec** -> `data-model.md`:
+   - Entity name, fields, relationships
+   - Validation rules from requirements
+   - State transitions if applicable
+
+2. **Generate API contracts** from functional requirements:
+   - For each user action -> endpoint
+   - Use standard REST/GraphQL patterns
+   - Output OpenAPI/GraphQL schema to `/contracts/`
+
+3. **Agent context update**:
+   - Run `.specify/scripts/bash/update-agent-context.sh claude`
+   - This script detects which AI agent is in use
+   - Update the appropriate agent-specific context file
+   - Add only new technology from current plan
+   - Preserve manual additions between markers
+
+**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file
+
+## Key rules
+
+- Use absolute paths
+- ERROR on gate failures or unresolved clarifications
diff --git a/plugins/ralph-speckit/commands/specify.md b/plugins/ralph-speckit/commands/specify.md
new file mode 100644
index 00000000..e88a8675
--- /dev/null
+++ b/plugins/ralph-speckit/commands/specify.md
@@ -0,0 +1,260 @@
+---
+name: specify
+description: Create or update the feature specification from a natural language feature description.
+allowed-tools: [Read, Write, Edit, Bash]
+handoffs:
+  - label: Build Technical Plan
+    agent: speckit.plan
+    prompt: Create a plan for the spec. I am building with...
+  - label: Clarify Spec Requirements
+    agent: speckit.clarify
+    prompt: Clarify specification requirements
+    send: true
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+The text the user typed after `/speckit.specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.
+
+Given that feature description, do this:
+
+1. **Generate a concise short name** (2-4 words) for the branch:
+   - Analyze the feature description and extract the most meaningful keywords
+   - Create a 2-4 word short name that captures the essence of the feature
+   - Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
+   - Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
+   - Keep it concise but descriptive enough to understand the feature at a glance
+   - Examples:
+     - "I want to add user authentication" -> "user-auth"
+     - "Implement OAuth2 integration for the API" -> "oauth2-api-integration"
+     - "Create a dashboard for analytics" -> "analytics-dashboard"
+     - "Fix payment processing timeout bug" -> "fix-payment-timeout"
+
+2. **Check for existing branches before creating new one**:
+
+   a. First, fetch all remote branches to ensure we have the latest information:
+
+      ```bash
+      git fetch --all --prune
+      ```
+
+   b. Find the highest feature number across all sources for the short-name:
+      - Remote branches: `git ls-remote --heads origin | grep -E 'refs/heads/[0-9]+-<short-name>$'`
+      - Local branches: `git branch | grep -E '^[* ]*[0-9]+-<short-name>$'`
+      - Specs directories: Check for directories matching `specs/[0-9]+-<short-name>`
+
+   c. Determine the next available number:
+      - Extract all numbers from all three sources
+      - Find the highest number N
+      - Use N+1 for the new branch number
+
+   d. Run the script `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS"` with the calculated number and short-name:
+      - Pass `--number N+1` and `--short-name "your-short-name"` along with the feature description
+      - Bash example: `.specify/scripts/bash/create-new-feature.sh --json --number 5 --short-name "user-auth" "Add user authentication"`
+      - PowerShell example (only if the PowerShell variant of the script is installed): `.specify/scripts/powershell/create-new-feature.ps1 -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
+
+   **IMPORTANT**:
+   - Check all three sources (remote branches, local branches, specs directories) to find the highest number
+   - Only match branches/directories with the exact short-name pattern
+   - If no existing branches/directories found with this short-name, start with number 1
+   - You must only ever run this script once per feature
+   - The JSON is provided in the terminal as output - always refer to it to get the actual content you're looking for
+   - The JSON output will contain BRANCH_NAME and SPEC_FILE paths
+   - For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot")
+
+3. Load `.specify/templates/spec-template.md` to understand required sections.
+
+4. Follow this execution flow:
+
+    1. Parse user description from Input
+       If empty: ERROR "No feature description provided"
+    2. Extract key concepts from description
+       Identify: actors, actions, data, constraints
+    3. For unclear aspects:
+       - Make informed guesses based on context and industry standards
+       - Only mark with [NEEDS CLARIFICATION: specific question] if:
+         - The choice significantly impacts feature scope or user experience
+         - Multiple reasonable interpretations exist with different implications
+         - No reasonable default exists
+       - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
+       - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
+    4. Fill User Scenarios & Testing section
+       If no clear user flow: ERROR "Cannot determine user scenarios"
+    5. Generate Functional Requirements
+       Each requirement must be testable
+       Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
+    6. Define Success Criteria
+       Create measurable, technology-agnostic outcomes
+       Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
+       Each criterion must be verifiable without implementation details
+    7. Identify Key Entities (if data involved)
+    8. Return: SUCCESS (spec ready for planning)
+
+5. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
+
+6. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:
+
+   a. **Create Spec Quality Checklist**: Generate a checklist file at `FEATURE_DIR/checklists/requirements.md` using the checklist template structure with these validation items:
+
+      ```markdown
+      # Specification Quality Checklist: [FEATURE NAME]
+
+      **Purpose**: Validate specification completeness and quality before proceeding to planning
+      **Created**: [DATE]
+      **Feature**: [Link to spec.md]
+
+      ## Content Quality
+
+      - [ ] No implementation details (languages, frameworks, APIs)
+      - [ ] Focused on user value and business needs
+      - [ ] Written for non-technical stakeholders
+      - [ ] All mandatory sections completed
+
+      ## Requirement Completeness
+
+      - [ ] No [NEEDS CLARIFICATION] markers remain
+      - [ ] Requirements are testable and unambiguous
+      - [ ] Success criteria are measurable
+      - [ ] Success criteria are technology-agnostic (no implementation details)
+      - [ ] All acceptance scenarios are defined
+      - [ ] Edge cases are identified
+      - [ ] Scope is clearly bounded
+      - [ ] Dependencies and assumptions identified
+
+      ## Feature Readiness
+
+      - [ ] All functional requirements have clear acceptance criteria
+      - [ ] User scenarios cover primary flows
+      - [ ] Feature meets measurable outcomes defined in Success Criteria
+      - [ ] No implementation details leak into specification
+
+      ## Notes
+
+      - Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
+      ```
+
+   b. **Run Validation Check**: Review the spec against each checklist item:
+      - For each item, determine if it passes or fails
+      - Document specific issues found (quote relevant spec sections)
+
+   c. **Handle Validation Results**:
+
+      - **If all items pass**: Mark checklist complete and proceed to step 7
+
+      - **If items fail (excluding [NEEDS CLARIFICATION])**:
+        1. List the failing items and specific issues
+        2. Update the spec to address each issue
+        3. Re-run validation until all items pass (max 3 iterations)
+        4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user
+
+      - **If [NEEDS CLARIFICATION] markers remain**:
+        1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
+        2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
+        3. For each clarification needed (max 3), present options to user in this format:
+
+           ```markdown
+           ## Question [N]: [Topic]
+
+           **Context**: [Quote relevant spec section]
+
+           **What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
+
+           **Suggested Answers**:
+
+           | Option | Answer | Implications |
+           |--------|--------|--------------|
+           | A      | [First suggested answer] | [What this means for the feature] |
+           | B      | [Second suggested answer] | [What this means for the feature] |
+           | C      | [Third suggested answer] | [What this means for the feature] |
+           | Custom | Provide your own answer | [Explain how to provide custom input] |
+
+           **Your choice**: _[Wait for user response]_
+           ```
+
+        4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
+           - Use consistent spacing with pipes aligned
+           - Each cell should have spaces around content: `| Content |` not `|Content|`
+           - Header separator must have at least 3 dashes: `|--------|`
+           - Test that the table renders correctly in markdown preview
+        5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
+        6. Present all questions together before waiting for responses
+        7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
+        8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
+        9. Re-run validation after all clarifications are resolved
+
+   d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
+
+7. Report completion with branch name, spec file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
+
+**NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.
+
+## General Guidelines
+
+### Quick Guidelines
+
+- Focus on **WHAT** users need and **WHY**.
+- Avoid HOW to implement (no tech stack, APIs, code structure).
+- Written for business stakeholders, not developers.
+- DO NOT create any checklists that are embedded in the spec. That will be a separate command.
+
+### Section Requirements
+
+- **Mandatory sections**: Must be completed for every feature
+- **Optional sections**: Include only when relevant to the feature
+- When a section doesn't apply, remove it entirely (don't leave as "N/A")
+
+### For AI Generation
+
+When creating this spec from a user prompt:
+
+1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
+2. **Document assumptions**: Record reasonable defaults in the Assumptions section
+3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
+   - Significantly impact feature scope or user experience
+   - Have multiple reasonable interpretations with different implications
+   - Lack any reasonable default
+4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
+5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
+6. **Common areas needing clarification** (only if no reasonable default exists):
+   - Feature scope and boundaries (include/exclude specific use cases)
+   - User types and permissions (if multiple conflicting interpretations possible)
+   - Security/compliance requirements (when legally/financially significant)
+
+**Examples of reasonable defaults** (don't ask about these):
+
+- Data retention: Industry-standard practices for the domain
+- Performance targets: Standard web/mobile app expectations unless specified
+- Error handling: User-friendly messages with appropriate fallbacks
+- Authentication method: Standard session-based or OAuth2 for web apps
+- Integration patterns: RESTful APIs unless specified otherwise
+
+### Success Criteria Guidelines
+
+Success criteria must be:
+
+1. **Measurable**: Include specific metrics (time, percentage, count, rate)
+2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
+3. **User-focused**: Describe outcomes from user/business perspective, not system internals
+4. **Verifiable**: Can be tested/validated without knowing implementation details
+
+**Good examples**:
+
+- "Users can complete checkout in under 3 minutes"
+- "System supports 10,000 concurrent users"
+- "95% of searches return results in under 1 second"
+- "Task completion rate improves by 40%"
+
+**Bad examples** (implementation-focused):
+
+- "API response time is under 200ms" (too technical, use "Users see results instantly")
+- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
+- "React components render efficiently" (framework-specific)
+- "Redis cache hit rate above 80%" (technology-specific)
diff --git a/plugins/ralph-speckit/commands/tasks.md b/plugins/ralph-speckit/commands/tasks.md
new file mode 100644
index 00000000..3e939c82
--- /dev/null
+++ b/plugins/ralph-speckit/commands/tasks.md
@@ -0,0 +1,139 @@
+---
+name: tasks
+description: Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts.
+allowed-tools: [Read, Write, Edit, Bash]
+handoffs:
+  - label: Analyze For Consistency
+    agent: speckit.analyze
+    prompt: Run a project analysis for consistency
+    send: true
+  - label: Implement Project
+    agent: speckit.implement
+    prompt: Start the implementation in phases
+    send: true
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+
+2. **Load design documents**: Read from FEATURE_DIR:
+   - **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
+   - **Optional**: data-model.md (entities), contracts/ (API endpoints), research.md (decisions), quickstart.md (test scenarios)
+   - Note: Not all projects have all documents. Generate tasks based on what's available.
+
+3. **Execute task generation workflow**:
+   - Load plan.md and extract tech stack, libraries, project structure
+   - Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
+   - If data-model.md exists: Extract entities and map to user stories
+   - If contracts/ exists: Map endpoints to user stories
+   - If research.md exists: Extract decisions for setup tasks
+   - Generate tasks organized by user story (see Task Generation Rules below)
+   - Generate dependency graph showing user story completion order
+   - Create parallel execution examples per user story
+   - Validate task completeness (each user story has all needed tasks, independently testable)
+
+4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
+   - Correct feature name from plan.md
+   - Phase 1: Setup tasks (project initialization)
+   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
+   - Phase 3+: One phase per user story (in priority order from spec.md)
+   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
+   - Final Phase: Polish & cross-cutting concerns
+   - All tasks must follow the strict checklist format (see Task Generation Rules below)
+   - Clear file paths for each task
+   - Dependencies section showing story completion order
+   - Parallel execution examples per story
+   - Implementation strategy section (MVP first, incremental delivery)
+
+5. **Report**: Output path to generated tasks.md and summary:
+   - Total task count
+   - Task count per user story
+   - Parallel opportunities identified
+   - Independent test criteria for each story
+   - Suggested MVP scope (typically just User Story 1)
+   - Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
+
+Context for task generation: $ARGUMENTS
+
+The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
+
+## Task Generation Rules
+
+**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.
+
+**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
+
+### Checklist Format (REQUIRED)
+
+Every task MUST strictly follow this format:
+
+```text
+- [ ] [TaskID] [P?] [Story?] Description with file path
+```
+
+**Format Components**:
+
+1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
+2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
+3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
+4. **[Story] label**: REQUIRED for user story phase tasks only
+   - Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
+   - Setup phase: NO story label
+   - Foundational phase: NO story label
+   - User Story phases: MUST have story label
+   - Polish phase: NO story label
+5. **Description**: Clear action with exact file path
+
+**Examples**:
+
+- CORRECT: `- [ ] T001 Create project structure per implementation plan`
+- CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
+- CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
+- CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
+- WRONG: `- [ ] Create User model` (missing ID and Story label)
+- WRONG: `T001 [US1] Create model` (missing checkbox)
+- WRONG: `- [ ] [US1] Create User model` (missing Task ID)
+- WRONG: `- [ ] T001 [US1] Create model` (missing file path)
+
+### Task Organization
+
+1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
+   - Each user story (P1, P2, P3...) gets its own phase
+   - Map all related components to their story:
+     - Models needed for that story
+     - Services needed for that story
+     - Endpoints/UI needed for that story
+     - If tests requested: Tests specific to that story
+   - Mark story dependencies (most stories should be independent)
+
+2. **From Contracts**:
+   - Map each contract/endpoint -> the user story it serves
+   - If tests requested: Each contract -> contract test task [P] before implementation in that story's phase
+
+3. **From Data Model**:
+   - Map each entity to the user story(ies) that need it
+   - If entity serves multiple stories: Put in earliest story or Setup phase
+   - Relationships -> service layer tasks in appropriate story phase
+
+4. **From Setup/Infrastructure**:
+   - Shared infrastructure -> Setup phase (Phase 1)
+   - Foundational/blocking tasks -> Foundational phase (Phase 2)
+   - Story-specific setup -> within that story's phase
+
+### Phase Structure
+
+- **Phase 1**: Setup (project initialization)
+- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
+- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
+  - Within each story: Tests (if requested) -> Models -> Services -> Endpoints -> Integration
+  - Each phase should be a complete, independently testable increment
+- **Final Phase**: Polish & Cross-Cutting Concerns
diff --git a/plugins/ralph-speckit/commands/taskstoissues.md b/plugins/ralph-speckit/commands/taskstoissues.md
new file mode 100644
index 00000000..f908b7be
--- /dev/null
+++ b/plugins/ralph-speckit/commands/taskstoissues.md
@@ -0,0 +1,31 @@
+---
+name: taskstoissues
+description: Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts.
+allowed-tools: [Read, Bash, mcp__github__create_issue]
+---
+
+## User Input
+
+```text
+$ARGUMENTS
+```
+
+You **MUST** consider the user input before proceeding (if not empty).
+
+## Outline
+
+1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
+1. From the executed script, extract the path to **tasks**.
+1. Get the Git remote by running:
+
+```bash
+git config --get remote.origin.url
+```
+
+> [!CAUTION]
+> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
+
+1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that matches the Git remote.
+
+> [!CAUTION]
+> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 9ea7f3de..2575c871 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -132,7 +132,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-4.1_
   - _Design: ralph-speckit Commands_
 
-- [ ] 1.9 Migrate legacy commands to commands/ directory (8 files)
+- [x] 1.9 Migrate legacy commands to commands/ directory (8 files)
   - **Do**:
     1. For each legacy command in `.claude/commands/`:
        - Copy to `plugins/ralph-speckit/commands/` with new name (strip speckit. prefix)

From e0985b72cfa74bc90e6adbfb47f538fc87740864 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:46:49 +0200
Subject: [PATCH 11/37] chore(ralph-speckit): remove legacy commands directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../.claude/commands/speckit.analyze.md       | 184 -----------
 .../.claude/commands/speckit.checklist.md     | 294 ------------------
 .../.claude/commands/speckit.clarify.md       | 181 -----------
 .../.claude/commands/speckit.constitution.md  |  82 -----
 .../.claude/commands/speckit.implement.md     | 135 --------
 .../.claude/commands/speckit.plan.md          |  89 ------
 .../.claude/commands/speckit.specify.md       | 258 ---------------
 .../.claude/commands/speckit.tasks.md         | 137 --------
 .../.claude/commands/speckit.taskstoissues.md |  30 --
 specs/refactor-plugins/.progress.md           |  99 ++++++
 specs/refactor-plugins/tasks.md               |   2 +-
 11 files changed, 100 insertions(+), 1391 deletions(-)
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.analyze.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.checklist.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.clarify.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.constitution.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.implement.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.plan.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.specify.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.tasks.md
 delete mode 100644 plugins/ralph-speckit/.claude/commands/speckit.taskstoissues.md
 create mode 100644 specs/refactor-plugins/.progress.md

diff --git a/plugins/ralph-speckit/.claude/commands/speckit.analyze.md b/plugins/ralph-speckit/.claude/commands/speckit.analyze.md
deleted file mode 100644
index 98b04b0c..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.analyze.md
+++ /dev/null
@@ -1,184 +0,0 @@
----
-description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Goal
-
-Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
-
-## Operating Constraints
-
-**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).
-
-**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.
-
-## Execution Steps
-
-### 1. Initialize Analysis Context
-
-Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
-
-- SPEC = FEATURE_DIR/spec.md
-- PLAN = FEATURE_DIR/plan.md
-- TASKS = FEATURE_DIR/tasks.md
-
-Abort with an error message if any required file is missing (instruct the user to run missing prerequisite command).
-For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-### 2. Load Artifacts (Progressive Disclosure)
-
-Load only the minimal necessary context from each artifact:
-
-**From spec.md:**
-
-- Overview/Context
-- Functional Requirements
-- Non-Functional Requirements
-- User Stories
-- Edge Cases (if present)
-
-**From plan.md:**
-
-- Architecture/stack choices
-- Data Model references
-- Phases
-- Technical constraints
-
-**From tasks.md:**
-
-- Task IDs
-- Descriptions
-- Phase grouping
-- Parallel markers [P]
-- Referenced file paths
-
-**From constitution:**
-
-- Load `.specify/memory/constitution.md` for principle validation
-
-### 3. Build Semantic Models
-
-Create internal representations (do not include raw artifacts in output):
-
-- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive slug based on imperative phrase; e.g., "User can upload file" → `user-can-upload-file`)
-- **User story/action inventory**: Discrete user actions with acceptance criteria
-- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
-- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
-
-### 4. Detection Passes (Token-Efficient Analysis)
-
-Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.
-
-#### A. Duplication Detection
-
-- Identify near-duplicate requirements
-- Mark lower-quality phrasing for consolidation
-
-#### B. Ambiguity Detection
-
-- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
-- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
-
-#### C. Underspecification
-
-- Requirements with verbs but missing object or measurable outcome
-- User stories missing acceptance criteria alignment
-- Tasks referencing files or components not defined in spec/plan
-
-#### D. Constitution Alignment
-
-- Any requirement or plan element conflicting with a MUST principle
-- Missing mandated sections or quality gates from constitution
-
-#### E. Coverage Gaps
-
-- Requirements with zero associated tasks
-- Tasks with no mapped requirement/story
-- Non-functional requirements not reflected in tasks (e.g., performance, security)
-
-#### F. Inconsistency
-
-- Terminology drift (same concept named differently across files)
-- Data entities referenced in plan but absent in spec (or vice versa)
-- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
-- Conflicting requirements (e.g., one requires Next.js while other specifies Vue)
-
-### 5. Severity Assignment
-
-Use this heuristic to prioritize findings:
-
-- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
-- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
-- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
-- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
-
-### 6. Produce Compact Analysis Report
-
-Output a Markdown report (no file writes) with the following structure:
-
-## Specification Analysis Report
-
-| ID | Category | Severity | Location(s) | Summary | Recommendation |
-|----|----------|----------|-------------|---------|----------------|
-| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
-
-(Add one row per finding; generate stable IDs prefixed by category initial.)
-
-**Coverage Summary Table:**
-
-| Requirement Key | Has Task? | Task IDs | Notes |
-|-----------------|-----------|----------|-------|
-
-**Constitution Alignment Issues:** (if any)
-
-**Unmapped Tasks:** (if any)
-
-**Metrics:**
-
-- Total Requirements
-- Total Tasks
-- Coverage % (requirements with >=1 task)
-- Ambiguity Count
-- Duplication Count
-- Critical Issues Count
-
-### 7. Provide Next Actions
-
-At end of report, output a concise Next Actions block:
-
-- If CRITICAL issues exist: Recommend resolving before `/speckit.implement`
-- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
-- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"
-
-### 8. Offer Remediation
-
-Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)
-
-## Operating Principles
-
-### Context Efficiency
-
-- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
-- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
-- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
-- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts
-
-### Analysis Guidelines
-
-- **NEVER modify files** (this is read-only analysis)
-- **NEVER hallucinate missing sections** (if absent, report them accurately)
-- **Prioritize constitution violations** (these are always CRITICAL)
-- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
-- **Report zero issues gracefully** (emit success report with coverage statistics)
-
-## Context
-
-$ARGUMENTS
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.checklist.md b/plugins/ralph-speckit/.claude/commands/speckit.checklist.md
deleted file mode 100644
index 970e6c9e..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.checklist.md
+++ /dev/null
@@ -1,294 +0,0 @@
----
-description: Generate a custom checklist for the current feature based on user requirements.
----
-
-## Checklist Purpose: "Unit Tests for English"
-
-**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
-
-**NOT for verification/testing**:
-
-- ❌ NOT "Verify the button clicks correctly"
-- ❌ NOT "Test error handling works"
-- ❌ NOT "Confirm the API returns 200"
-- ❌ NOT checking if code/implementation matches the spec
-
-**FOR requirements quality validation**:
-
-- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
-- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
-- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
-- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
-- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
-
-**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Execution Steps
-
-1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
-   - All file paths must be absolute.
-   - For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
-   - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
-   - Only ask about information that materially changes checklist content
-   - Be skipped individually if already unambiguous in `$ARGUMENTS`
-   - Prefer precision over breadth
-
-   Generation algorithm:
-   1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
-   2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
-   3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
-   4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
-   5. Formulate questions chosen from these archetypes:
-      - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
-      - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
-      - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
-      - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
-      - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
-      - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
-
-   Question formatting rules:
-   - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
-   - Limit to A–E options maximum; omit table if a free-form answer is clearer
-   - Never ask the user to restate what they already said
-   - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
-
-   Defaults when interaction impossible:
-   - Depth: Standard
-   - Audience: Reviewer (PR) if code-related; Author otherwise
-   - Focus: Top 2 relevance clusters
-
-   Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
-
-3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
-   - Derive checklist theme (e.g., security, review, deploy, ux)
-   - Consolidate explicit must-have items mentioned by user
-   - Map focus selections to category scaffolding
-   - Infer any missing context from spec/plan/tasks (do NOT hallucinate)
-
-4. **Load feature context**: Read from FEATURE_DIR:
-   - spec.md: Feature requirements and scope
-   - plan.md (if exists): Technical details, dependencies
-   - tasks.md (if exists): Implementation tasks
-
-   **Context Loading Strategy**:
-   - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
-   - Prefer summarizing long sections into concise scenario/requirement bullets
-   - Use progressive disclosure: add follow-on retrieval only if gaps detected
-   - If source docs are large, generate interim summary items instead of embedding raw text
-
-5. **Generate checklist** - Create "Unit Tests for Requirements":
-   - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
-   - Generate unique checklist filename:
-     - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
-     - Format: `[domain].md`
-     - If file exists, append to existing file
-   - Number items sequentially starting from CHK001
-   - Each `/speckit.checklist` run creates a new file unless a checklist with that name already exists (existing checklists are never overwritten)
-
-   **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
-   Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
-   - **Completeness**: Are all necessary requirements present?
-   - **Clarity**: Are requirements unambiguous and specific?
-   - **Consistency**: Do requirements align with each other?
-   - **Measurability**: Can requirements be objectively verified?
-   - **Coverage**: Are all scenarios/edge cases addressed?
-
-   **Category Structure** - Group items by requirement quality dimensions:
-   - **Requirement Completeness** (Are all necessary requirements documented?)
-   - **Requirement Clarity** (Are requirements specific and unambiguous?)
-   - **Requirement Consistency** (Do requirements align without conflicts?)
-   - **Acceptance Criteria Quality** (Are success criteria measurable?)
-   - **Scenario Coverage** (Are all flows/cases addressed?)
-   - **Edge Case Coverage** (Are boundary conditions defined?)
-   - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
-   - **Dependencies & Assumptions** (Are they documented and validated?)
-   - **Ambiguities & Conflicts** (What needs clarification?)
-
-   **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
-
-   ❌ **WRONG** (Testing implementation):
-   - "Verify landing page displays 3 episode cards"
-   - "Test hover states work on desktop"
-   - "Confirm logo click navigates home"
-
-   ✅ **CORRECT** (Testing requirements quality):
-   - "Are the exact number and layout of featured episodes specified?" [Completeness]
-   - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
-   - "Are hover state requirements consistent across all interactive elements?" [Consistency]
-   - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
-   - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
-   - "Are loading states defined for asynchronous episode data?" [Completeness]
-   - "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
-
-   **ITEM STRUCTURE**:
-   Each item should follow this pattern:
-   - Question format asking about requirement quality
-   - Focus on what's WRITTEN (or not written) in the spec/plan
-   - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
-   - Reference spec section `[Spec §X.Y]` when checking existing requirements
-   - Use `[Gap]` marker when checking for missing requirements
-
-   **EXAMPLES BY QUALITY DIMENSION**:
-
-   Completeness:
-   - "Are error handling requirements defined for all API failure modes? [Gap]"
-   - "Are accessibility requirements specified for all interactive elements? [Completeness]"
-   - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
-
-   Clarity:
-   - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
-   - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
-   - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
-
-   Consistency:
-   - "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
-   - "Are card component requirements consistent between landing and detail pages? [Consistency]"
-
-   Coverage:
-   - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
-   - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
-   - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
-
-   Measurability:
-   - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
-   - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
-
-   **Scenario Classification & Coverage** (Requirements Quality Focus):
-   - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
-   - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
-   - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
-   - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
-
-   **Traceability Requirements**:
-   - MINIMUM: ≥80% of items MUST include at least one traceability reference
-   - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
-   - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
-
-   **Surface & Resolve Issues** (Requirements Quality Problems):
-   Ask questions about the requirements themselves:
-   - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
-   - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
-   - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
-   - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
-   - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
-
-   **Content Consolidation**:
-   - Soft cap: If raw candidate items > 40, prioritize by risk/impact
-   - Merge near-duplicates checking the same requirement aspect
-   - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
-
-   **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
-   - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
-   - ❌ References to code execution, user actions, system behavior
-   - ❌ "Displays correctly", "works properly", "functions as expected"
-   - ❌ "Click", "navigate", "render", "load", "execute"
-   - ❌ Test cases, test plans, QA procedures
-   - ❌ Implementation details (frameworks, APIs, algorithms)
-
-   **✅ REQUIRED PATTERNS** - These test requirements quality:
-   - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
-   - ✅ "Is [vague term] quantified/clarified with specific criteria?"
-   - ✅ "Are requirements consistent between [section A] and [section B]?"
-   - ✅ "Can [requirement] be objectively measured/verified?"
-   - ✅ "Are [edge cases/scenarios] addressed in requirements?"
-   - ✅ "Does the spec define [missing aspect]?"
-
-6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
-
-7. **Report**: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
-   - Focus areas selected
-   - Depth level
-   - Actor/timing
-   - Any explicit user-specified must-have items incorporated
-
-**Important**: Each `/speckit.checklist` command invocation creates a checklist file with a short, descriptive name unless that file already exists. This allows:
-
-- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
-- Simple, memorable filenames that indicate checklist purpose
-- Easy identification and navigation in the `checklists/` folder
-
-To avoid clutter, use descriptive types and clean up obsolete checklists when done.
-
-## Example Checklist Types & Sample Items
-
-**UX Requirements Quality:** `ux.md`
-
-Sample items (testing the requirements, NOT the implementation):
-
-- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
-- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
-- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
-- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
-- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
-- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
-
-**API Requirements Quality:** `api.md`
-
-Sample items:
-
-- "Are error response formats specified for all failure scenarios? [Completeness]"
-- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
-- "Are authentication requirements consistent across all endpoints? [Consistency]"
-- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
-- "Is versioning strategy documented in requirements? [Gap]"
-
-**Performance Requirements Quality:** `performance.md`
-
-Sample items:
-
-- "Are performance requirements quantified with specific metrics? [Clarity]"
-- "Are performance targets defined for all critical user journeys? [Coverage]"
-- "Are performance requirements under different load conditions specified? [Completeness]"
-- "Can performance requirements be objectively measured? [Measurability]"
-- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
-
-**Security Requirements Quality:** `security.md`
-
-Sample items:
-
-- "Are authentication requirements specified for all protected resources? [Coverage]"
-- "Are data protection requirements defined for sensitive information? [Completeness]"
-- "Is the threat model documented and requirements aligned to it? [Traceability]"
-- "Are security requirements consistent with compliance obligations? [Consistency]"
-- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
-
-## Anti-Examples: What NOT To Do
-
-**❌ WRONG - These test implementation, not requirements:**
-
-```markdown
-- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
-- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
-- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
-- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
-```
-
-**✅ CORRECT - These test requirements quality:**
-
-```markdown
-- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
-- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
-- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
-- [ ] CHK004 - Are the selection criteria for related episodes documented? [Gap, Spec §FR-005]
-- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
-- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
-```
-
-**Key Differences:**
-
-- Wrong: Tests if the system works correctly
-- Correct: Tests if the requirements are written correctly
-- Wrong: Verification of behavior
-- Correct: Validation of requirement quality
-- Wrong: "Does it do X?"
-- Correct: "Is X clearly specified?"
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.clarify.md b/plugins/ralph-speckit/.claude/commands/speckit.clarify.md
deleted file mode 100644
index 6b28dae1..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.clarify.md
+++ /dev/null
@@ -1,181 +0,0 @@
----
-description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
-handoffs: 
-  - label: Build Technical Plan
-    agent: speckit.plan
-    prompt: Create a plan for the spec. I am building with...
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
-
-Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
-
-Execution steps:
-
-1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
-   - `FEATURE_DIR`
-   - `FEATURE_SPEC`
-   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
-   - If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
-   - For single quotes in args like "I'm Groot", use escape syntax: e.g., 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-2. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
-
-   Functional Scope & Behavior:
-   - Core user goals & success criteria
-   - Explicit out-of-scope declarations
-   - User roles / personas differentiation
-
-   Domain & Data Model:
-   - Entities, attributes, relationships
-   - Identity & uniqueness rules
-   - Lifecycle/state transitions
-   - Data volume / scale assumptions
-
-   Interaction & UX Flow:
-   - Critical user journeys / sequences
-   - Error/empty/loading states
-   - Accessibility or localization notes
-
-   Non-Functional Quality Attributes:
-   - Performance (latency, throughput targets)
-   - Scalability (horizontal/vertical, limits)
-   - Reliability & availability (uptime, recovery expectations)
-   - Observability (logging, metrics, tracing signals)
-   - Security & privacy (authN/Z, data protection, threat assumptions)
-   - Compliance / regulatory constraints (if any)
-
-   Integration & External Dependencies:
-   - External services/APIs and failure modes
-   - Data import/export formats
-   - Protocol/versioning assumptions
-
-   Edge Cases & Failure Handling:
-   - Negative scenarios
-   - Rate limiting / throttling
-   - Conflict resolution (e.g., concurrent edits)
-
-   Constraints & Tradeoffs:
-   - Technical constraints (language, storage, hosting)
-   - Explicit tradeoffs or rejected alternatives
-
-   Terminology & Consistency:
-   - Canonical glossary terms
-   - Avoided synonyms / deprecated terms
-
-   Completion Signals:
-   - Acceptance criteria testability
-   - Measurable Definition of Done style indicators
-
-   Misc / Placeholders:
-   - TODO markers / unresolved decisions
-   - Ambiguous adjectives ("robust", "intuitive") lacking quantification
-
-   For each category with Partial or Missing status, add a candidate question opportunity unless:
-   - Clarification would not materially change implementation or validation strategy
-   - Information is better deferred to planning phase (note internally)
-
-3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
-    - Maximum of 5 total questions across the whole session.
-    - Each question must be answerable with EITHER:
-       - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR
-       - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words").
-    - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
-    - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
-    - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
-    - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
-    - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
-
-4. Sequential questioning loop (interactive):
-    - Present EXACTLY ONE question at a time.
-    - For multiple‑choice questions:
-       - **Analyze all options** and determine the **most suitable option** based on:
-          - Best practices for the project type
-          - Common patterns in similar implementations
-          - Risk reduction (security, performance, maintainability)
-          - Alignment with any explicit project goals or constraints visible in the spec
-       - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
-       - Format as: `**Recommended:** Option [X] - <reasoning>`
-       - Then render all options as a Markdown table:
-
-       | Option | Description |
-       |--------|-------------|
-       | A | <Option A description> |
-       | B | <Option B description> |
-       | C | <Option C description> (add D/E as needed up to 5) |
-       | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
-
-       - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
-    - For short‑answer style (no meaningful discrete options):
-       - Provide your **suggested answer** based on best practices and context.
-       - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
-       - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
-    - After the user answers:
-       - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
-       - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
-       - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
-       - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
-    - Stop asking further questions when:
-       - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
-       - User signals completion ("done", "good", "no more"), OR
-       - You reach 5 asked questions.
-    - Never reveal future queued questions in advance.
-    - If no valid questions exist at start, immediately report no critical ambiguities.
-
-5. Integration after EACH accepted answer (incremental update approach):
-    - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
-    - For the first integrated answer in this session:
-       - Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
-       - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
-    - Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
-    - Then immediately apply the clarification to the most appropriate section(s):
-       - Functional ambiguity → Update or add a bullet in Functional Requirements.
-       - User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
-       - Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
-       - Non-functional constraint → Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
-       - Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
-       - Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
-    - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
-    - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
-    - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
-    - Keep each inserted clarification minimal and testable (avoid narrative drift).
-
-6. Validation (performed after EACH write plus final pass):
-   - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
-   - Total asked (accepted) questions ≤ 5.
-   - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
-   - No contradictory earlier statement remains (scan to confirm that now-invalid alternative choices were removed).
-   - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
-   - Terminology consistency: same canonical term used across all updated sections.
-
-7. Write the updated spec back to `FEATURE_SPEC`.
-
-8. Report completion (after questioning loop ends or early termination):
-   - Number of questions asked & answered.
-   - Path to updated spec.
-   - Sections touched (list names).
-   - Coverage summary table listing each taxonomy category with Status: Resolved (was Partial/Missing and addressed), Deferred (exceeds question quota or better suited for planning), Clear (already sufficient), Outstanding (still Partial/Missing but low impact).
-   - If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
-   - Suggested next command.
-
-Behavior rules:
-
-- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
-- If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
-- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
-- Avoid speculative tech stack questions unless the absence blocks functional clarity.
-- Respect user early termination signals ("stop", "done", "proceed").
-- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
-- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
-
-Context for prioritization: $ARGUMENTS
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.constitution.md b/plugins/ralph-speckit/.claude/commands/speckit.constitution.md
deleted file mode 100644
index 18302642..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.constitution.md
+++ /dev/null
@@ -1,82 +0,0 @@
----
-description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
-handoffs: 
-  - label: Build Specification
-    agent: speckit.specify
-    prompt: Implement the feature specification based on the updated constitution. I want to build...
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
-
-Follow this execution flow:
-
-1. Load the existing constitution template at `.specify/memory/constitution.md`.
-   - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
-   **IMPORTANT**: The user might require fewer or more principles than the template uses. If a number is specified, respect it while still following the general template, and update the doc accordingly.
-
-2. Collect/derive values for placeholders:
-   - If user input (conversation) supplies a value, use it.
-   - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
-   - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
-   - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
-     - MAJOR: Backward incompatible governance/principle removals or redefinitions.
-     - MINOR: New principle/section added or materially expanded guidance.
-     - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
-   - If the version bump type is ambiguous, propose reasoning before finalizing.
-
-3. Draft the updated constitution content:
-   - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
-   - Preserve the heading hierarchy; comments can be removed once replaced unless they still add clarifying guidance.
-   - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non‑negotiable rules, explicit rationale if not obvious.
-   - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
-
-4. Consistency propagation checklist (convert prior checklist into active validations):
-   - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
-   - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
-   - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
-   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (agent-specific names like CLAUDE only) remain when generic guidance is required.
-   - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.
-
-5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
-   - Version change: old → new
-   - List of modified principles (old title → new title if renamed)
-   - Added sections
-   - Removed sections
-   - Templates requiring updates (✅ updated / ⚠ pending) with file paths
-   - Follow-up TODOs if any placeholders intentionally deferred.
-
-6. Validation before final output:
-   - No remaining unexplained bracket tokens.
-   - Version line matches report.
-   - Dates use ISO format YYYY-MM-DD.
-   - Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
-
-7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).
-
-8. Output a final summary to the user with:
-   - New version and bump rationale.
-   - Any files flagged for manual follow-up.
-   - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
-
-Formatting & Style Requirements:
-
-- Use Markdown headings exactly as in the template (do not demote/promote levels).
-- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
-- Keep a single blank line between sections.
-- Avoid trailing whitespace.
-
-If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
-
-If critical info is missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include it in the Sync Impact Report under deferred items.
-
-Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.implement.md b/plugins/ralph-speckit/.claude/commands/speckit.implement.md
deleted file mode 100644
index 41da7b93..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.implement.md
+++ /dev/null
@@ -1,135 +0,0 @@
----
-description: Execute the implementation plan by processing and executing all tasks defined in tasks.md
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
-   - Scan all checklist files in the checklists/ directory
-   - For each checklist, count:
-     - Total items: All lines matching `- [ ]` or `- [X]` or `- [x]`
-     - Completed items: Lines matching `- [X]` or `- [x]`
-     - Incomplete items: Lines matching `- [ ]`
-   - Create a status table:
-
-     ```text
-     | Checklist | Total | Completed | Incomplete | Status |
-     |-----------|-------|-----------|------------|--------|
-     | ux.md     | 12    | 12        | 0          | ✓ PASS |
-     | test.md   | 8     | 5         | 3          | ✗ FAIL |
-     | security.md | 6   | 6         | 0          | ✓ PASS |
-     ```
-
-   - Calculate overall status:
-     - **PASS**: All checklists have 0 incomplete items
-     - **FAIL**: One or more checklists have incomplete items
-
-   - **If any checklist is incomplete**:
-     - Display the table with incomplete item counts
-     - **STOP** and ask: "Some checklists are incomplete. Do you want to proceed with implementation anyway? (yes/no)"
-     - Wait for user response before continuing
-     - If user says "no" or "wait" or "stop", halt execution
-     - If user says "yes" or "proceed" or "continue", proceed to step 3
-
-   - **If all checklists are complete**:
-     - Display the table showing all checklists passed
-     - Automatically proceed to step 3
-
-3. Load and analyze the implementation context:
-   - **REQUIRED**: Read tasks.md for the complete task list and execution plan
-   - **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
-   - **IF EXISTS**: Read data-model.md for entities and relationships
-   - **IF EXISTS**: Read contracts/ for API specifications and test requirements
-   - **IF EXISTS**: Read research.md for technical decisions and constraints
-   - **IF EXISTS**: Read quickstart.md for integration scenarios
-
-4. **Project Setup Verification**:
-   - **REQUIRED**: Create/verify ignore files based on actual project setup:
-
-   **Detection & Creation Logic**:
-   - Check if the following command succeeds to determine if the repository is a git repo (create/verify .gitignore if so):
-
-     ```sh
-     git rev-parse --git-dir 2>/dev/null
-     ```
-
-   - Check if Dockerfile* exists or Docker is mentioned in plan.md → create/verify .dockerignore
-   - Check if .eslintrc* exists → create/verify .eslintignore
-   - Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
-   - Check if .prettierrc* exists → create/verify .prettierignore
-   - Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
-   - Check if terraform files (*.tf) exist → create/verify .terraformignore
-   - Check if Helm charts are present → create/verify .helmignore
-
-   **If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
-   **If ignore file missing**: Create with full pattern set for detected technology
-
-   **Common Patterns by Technology** (from plan.md tech stack):
-   - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
-   - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
-   - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
-   - **C#/.NET**: `bin/`, `obj/`, `*.user`, `*.suo`, `packages/`
-   - **Go**: `*.exe`, `*.test`, `vendor/`, `*.out`
-   - **Ruby**: `.bundle/`, `log/`, `tmp/`, `*.gem`, `vendor/bundle/`
-   - **PHP**: `vendor/`, `*.log`, `*.cache`, `*.env`
-   - **Rust**: `target/`, `debug/`, `release/`, `*.rs.bk`, `*.rlib`, `*.prof*`, `.idea/`, `*.log`, `.env*`
-   - **Kotlin**: `build/`, `out/`, `.gradle/`, `.idea/`, `*.class`, `*.jar`, `*.iml`, `*.log`, `.env*`
-   - **C++**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.so`, `*.a`, `*.exe`, `*.dll`, `.idea/`, `*.log`, `.env*`
-   - **C**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.a`, `*.so`, `*.exe`, `Makefile`, `config.log`, `.idea/`, `*.log`, `.env*`
-   - **Swift**: `.build/`, `DerivedData/`, `*.swiftpm/`, `Packages/`
-   - **R**: `.Rproj.user/`, `.Rhistory`, `.RData`, `.Ruserdata`, `*.Rproj`, `packrat/`, `renv/`
-   - **Universal**: `.DS_Store`, `Thumbs.db`, `*.tmp`, `*.swp`, `.vscode/`, `.idea/`
-
-   **Tool-Specific Patterns**:
-   - **Docker**: `node_modules/`, `.git/`, `Dockerfile*`, `.dockerignore`, `*.log*`, `.env*`, `coverage/`
-   - **ESLint**: `node_modules/`, `dist/`, `build/`, `coverage/`, `*.min.js`
-   - **Prettier**: `node_modules/`, `dist/`, `build/`, `coverage/`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
-   - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
-   - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`
-
-5. Parse tasks.md structure and extract:
-   - **Task phases**: Setup, Tests, Core, Integration, Polish
-   - **Task dependencies**: Sequential vs parallel execution rules
-   - **Task details**: ID, description, file paths, parallel markers [P]
-   - **Execution flow**: Order and dependency requirements
-
-6. Execute implementation following the task plan:
-   - **Phase-by-phase execution**: Complete each phase before moving to the next
-   - **Respect dependencies**: Run sequential tasks in order; parallel tasks [P] can run together
-   - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
-   - **File-based coordination**: Tasks affecting the same files must run sequentially
-   - **Validation checkpoints**: Verify each phase completion before proceeding
-
-7. Implementation execution rules:
-   - **Setup first**: Initialize project structure, dependencies, configuration
-   - **Tests before code**: If tests are needed, write them for contracts, entities, and integration scenarios before the corresponding code
-   - **Core development**: Implement models, services, CLI commands, endpoints
-   - **Integration work**: Database connections, middleware, logging, external services
-   - **Polish and validation**: Unit tests, performance optimization, documentation
-
-8. Progress tracking and error handling:
-   - Report progress after each completed task
-   - Halt execution if any non-parallel task fails
-   - For parallel tasks [P], continue with successful tasks, report failed ones
-   - Provide clear error messages with context for debugging
-   - Suggest next steps if implementation cannot proceed
-   - **IMPORTANT** For completed tasks, make sure to mark the task off as [X] in the tasks file.
-
-9. Completion validation:
-   - Verify all required tasks are completed
-   - Check that implemented features match the original specification
-   - Validate that tests pass and coverage meets requirements
-   - Confirm the implementation follows the technical plan
-   - Report final status with summary of completed work
-
-Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.plan.md b/plugins/ralph-speckit/.claude/commands/speckit.plan.md
deleted file mode 100644
index e9e55999..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.plan.md
+++ /dev/null
@@ -1,89 +0,0 @@
----
-description: Execute the implementation planning workflow using the plan template to generate design artifacts.
-handoffs: 
-  - label: Create Tasks
-    agent: speckit.tasks
-    prompt: Break the plan into tasks
-    send: true
-  - label: Create Checklist
-    agent: speckit.checklist
-    prompt: Create a checklist for the following domain...
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
-
-3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
-   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
-   - Fill Constitution Check section from constitution
-   - Evaluate gates (ERROR if violations unjustified)
-   - Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
-   - Phase 1: Generate data-model.md, contracts/, quickstart.md
-   - Phase 1: Update agent context by running the agent script
-   - Re-evaluate Constitution Check post-design
-
-4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, and generated artifacts.
-
-## Phases
-
-### Phase 0: Outline & Research
-
-1. **Extract unknowns from Technical Context** above:
-   - For each NEEDS CLARIFICATION → research task
-   - For each dependency → best practices task
-   - For each integration → patterns task
-
-2. **Generate and dispatch research agents**:
-
-   ```text
-   For each unknown in Technical Context:
-     Task: "Research {unknown} for {feature context}"
-   For each technology choice:
-     Task: "Find best practices for {tech} in {domain}"
-   ```
-
-3. **Consolidate findings** in `research.md` using format:
-   - Decision: [what was chosen]
-   - Rationale: [why chosen]
-   - Alternatives considered: [what else evaluated]
-
-**Output**: research.md with all NEEDS CLARIFICATION resolved
-
-### Phase 1: Design & Contracts
-
-**Prerequisites:** `research.md` complete
-
-1. **Extract entities from feature spec** → `data-model.md`:
-   - Entity name, fields, relationships
-   - Validation rules from requirements
-   - State transitions if applicable
-
-2. **Generate API contracts** from functional requirements:
-   - For each user action → endpoint
-   - Use standard REST/GraphQL patterns
-   - Output OpenAPI/GraphQL schema to `/contracts/`
-
-3. **Agent context update**:
-   - Run `.specify/scripts/bash/update-agent-context.sh claude`
-   - This script detects which AI agent is in use
-   - Update the appropriate agent-specific context file
-   - Add only new technology from current plan
-   - Preserve manual additions between markers
-
-**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file
-
-## Key rules
-
-- Use absolute paths
-- ERROR on gate failures or unresolved clarifications
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.specify.md b/plugins/ralph-speckit/.claude/commands/speckit.specify.md
deleted file mode 100644
index 49abdcb7..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.specify.md
+++ /dev/null
@@ -1,258 +0,0 @@
----
-description: Create or update the feature specification from a natural language feature description.
-handoffs: 
-  - label: Build Technical Plan
-    agent: speckit.plan
-    prompt: Create a plan for the spec. I am building with...
-  - label: Clarify Spec Requirements
-    agent: speckit.clarify
-    prompt: Clarify specification requirements
-    send: true
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-The text the user typed after `/speckit.specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.
-
-Given that feature description, do this:
-
-1. **Generate a concise short name** (2-4 words) for the branch:
-   - Analyze the feature description and extract the most meaningful keywords
-   - Create a 2-4 word short name that captures the essence of the feature
-   - Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
-   - Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
-   - Keep it concise but descriptive enough to understand the feature at a glance
-   - Examples:
-     - "I want to add user authentication" → "user-auth"
-     - "Implement OAuth2 integration for the API" → "oauth2-api-integration"
-     - "Create a dashboard for analytics" → "analytics-dashboard"
-     - "Fix payment processing timeout bug" → "fix-payment-timeout"
-
-2. **Check for existing branches before creating new one**:
-
-   a. First, fetch all remote branches to ensure we have the latest information:
-
-      ```bash
-      git fetch --all --prune
-      ```
-
-   b. Find the highest feature number across all sources for the short-name:
-      - Remote branches: `git ls-remote --heads origin | grep -E 'refs/heads/[0-9]+-<short-name>$'`
-      - Local branches: `git branch | grep -E '^[* ]*[0-9]+-<short-name>$'`
-      - Specs directories: Check for directories matching `specs/[0-9]+-<short-name>`
-
-   c. Determine the next available number:
-      - Extract all numbers from all three sources
-      - Find the highest number N
-      - Use N+1 for the new branch number
-
-   d. Run the script `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS"` with the calculated number and short-name:
-      - Pass `--number N+1` and `--short-name "your-short-name"` along with the feature description
-      - Bash example: `.specify/scripts/bash/create-new-feature.sh --json --number 5 --short-name "user-auth" "Add user authentication"`
-      - PowerShell example: `.specify/scripts/bash/create-new-feature.sh --json "$ARGUMENTS" -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
-
-   **IMPORTANT**:
-   - Check all three sources (remote branches, local branches, specs directories) to find the highest number
-   - Only match branches/directories with the exact short-name pattern
-   - If no existing branches/directories found with this short-name, start with number 1
-   - You must only ever run this script once per feature
-   - The JSON is provided in the terminal as output - always refer to it to get the actual content you're looking for
-   - The JSON output will contain BRANCH_NAME and SPEC_FILE paths
-   - For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot")
-
-3. Load `.specify/templates/spec-template.md` to understand required sections.
-
-4. Follow this execution flow:
-
-    1. Parse user description from Input
-       If empty: ERROR "No feature description provided"
-    2. Extract key concepts from description
-       Identify: actors, actions, data, constraints
-    3. For unclear aspects:
-       - Make informed guesses based on context and industry standards
-       - Only mark with [NEEDS CLARIFICATION: specific question] if:
-         - The choice significantly impacts feature scope or user experience
-         - Multiple reasonable interpretations exist with different implications
-         - No reasonable default exists
-       - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
-       - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
-    4. Fill User Scenarios & Testing section
-       If no clear user flow: ERROR "Cannot determine user scenarios"
-    5. Generate Functional Requirements
-       Each requirement must be testable
-       Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
-    6. Define Success Criteria
-       Create measurable, technology-agnostic outcomes
-       Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
-       Each criterion must be verifiable without implementation details
-    7. Identify Key Entities (if data involved)
-    8. Return: SUCCESS (spec ready for planning)
-
-5. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
-
-6. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:
-
-   a. **Create Spec Quality Checklist**: Generate a checklist file at `FEATURE_DIR/checklists/requirements.md` using the checklist template structure with these validation items:
-
-      ```markdown
-      # Specification Quality Checklist: [FEATURE NAME]
-      
-      **Purpose**: Validate specification completeness and quality before proceeding to planning
-      **Created**: [DATE]
-      **Feature**: [Link to spec.md]
-      
-      ## Content Quality
-      
-      - [ ] No implementation details (languages, frameworks, APIs)
-      - [ ] Focused on user value and business needs
-      - [ ] Written for non-technical stakeholders
-      - [ ] All mandatory sections completed
-      
-      ## Requirement Completeness
-      
-      - [ ] No [NEEDS CLARIFICATION] markers remain
-      - [ ] Requirements are testable and unambiguous
-      - [ ] Success criteria are measurable
-      - [ ] Success criteria are technology-agnostic (no implementation details)
-      - [ ] All acceptance scenarios are defined
-      - [ ] Edge cases are identified
-      - [ ] Scope is clearly bounded
-      - [ ] Dependencies and assumptions identified
-      
-      ## Feature Readiness
-      
-      - [ ] All functional requirements have clear acceptance criteria
-      - [ ] User scenarios cover primary flows
-      - [ ] Feature meets measurable outcomes defined in Success Criteria
-      - [ ] No implementation details leak into specification
-      
-      ## Notes
-      
-      - Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
-      ```
-
-   b. **Run Validation Check**: Review the spec against each checklist item:
-      - For each item, determine if it passes or fails
-      - Document specific issues found (quote relevant spec sections)
-
-   c. **Handle Validation Results**:
-
-      - **If all items pass**: Mark checklist complete and proceed to step 7
-
-      - **If items fail (excluding [NEEDS CLARIFICATION])**:
-        1. List the failing items and specific issues
-        2. Update the spec to address each issue
-        3. Re-run validation until all items pass (max 3 iterations)
-        4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user
-
-      - **If [NEEDS CLARIFICATION] markers remain**:
-        1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
-        2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
-        3. For each clarification needed (max 3), present options to user in this format:
-
-           ```markdown
-           ## Question [N]: [Topic]
-           
-           **Context**: [Quote relevant spec section]
-           
-           **What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
-           
-           **Suggested Answers**:
-           
-           | Option | Answer | Implications |
-           |--------|--------|--------------|
-           | A      | [First suggested answer] | [What this means for the feature] |
-           | B      | [Second suggested answer] | [What this means for the feature] |
-           | C      | [Third suggested answer] | [What this means for the feature] |
-           | Custom | Provide your own answer | [Explain how to provide custom input] |
-           
-           **Your choice**: _[Wait for user response]_
-           ```
-
-        4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
-           - Use consistent spacing with pipes aligned
-           - Each cell should have spaces around content: `| Content |` not `|Content|`
-           - Header separator must have at least 3 dashes: `|--------|`
-           - Test that the table renders correctly in markdown preview
-        5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
-        6. Present all questions together before waiting for responses
-        7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
-        8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
-        9. Re-run validation after all clarifications are resolved
-
-   d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
-
-7. Report completion with branch name, spec file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
-
-**NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.
-
-## General Guidelines
-
-## Quick Guidelines
-
-- Focus on **WHAT** users need and **WHY**.
-- Avoid HOW to implement (no tech stack, APIs, code structure).
-- Written for business stakeholders, not developers.
-- DO NOT create any checklists that are embedded in the spec. That will be a separate command.
-
-### Section Requirements
-
-- **Mandatory sections**: Must be completed for every feature
-- **Optional sections**: Include only when relevant to the feature
-- When a section doesn't apply, remove it entirely (don't leave as "N/A")
-
-### For AI Generation
-
-When creating this spec from a user prompt:
-
-1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
-2. **Document assumptions**: Record reasonable defaults in the Assumptions section
-3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
-   - Significantly impact feature scope or user experience
-   - Have multiple reasonable interpretations with different implications
-   - Lack any reasonable default
-4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
-5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
-6. **Common areas needing clarification** (only if no reasonable default exists):
-   - Feature scope and boundaries (include/exclude specific use cases)
-   - User types and permissions (if multiple conflicting interpretations possible)
-   - Security/compliance requirements (when legally/financially significant)
-
-**Examples of reasonable defaults** (don't ask about these):
-
-- Data retention: Industry-standard practices for the domain
-- Performance targets: Standard web/mobile app expectations unless specified
-- Error handling: User-friendly messages with appropriate fallbacks
-- Authentication method: Standard session-based or OAuth2 for web apps
-- Integration patterns: RESTful APIs unless specified otherwise
-
-### Success Criteria Guidelines
-
-Success criteria must be:
-
-1. **Measurable**: Include specific metrics (time, percentage, count, rate)
-2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
-3. **User-focused**: Describe outcomes from user/business perspective, not system internals
-4. **Verifiable**: Can be tested/validated without knowing implementation details
-
-**Good examples**:
-
-- "Users can complete checkout in under 3 minutes"
-- "System supports 10,000 concurrent users"
-- "95% of searches return results in under 1 second"
-- "Task completion rate improves by 40%"
-
-**Bad examples** (implementation-focused):
-
-- "API response time is under 200ms" (too technical, use "Users see results instantly")
-- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
-- "React components render efficiently" (framework-specific)
-- "Redis cache hit rate above 80%" (technology-specific)
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.tasks.md b/plugins/ralph-speckit/.claude/commands/speckit.tasks.md
deleted file mode 100644
index f64e86e7..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.tasks.md
+++ /dev/null
@@ -1,137 +0,0 @@
----
-description: Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts.
-handoffs: 
-  - label: Analyze For Consistency
-    agent: speckit.analyze
-    prompt: Run a project analysis for consistency
-    send: true
-  - label: Implement Project
-    agent: speckit.implement
-    prompt: Start the implementation in phases
-    send: true
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-
-2. **Load design documents**: Read from FEATURE_DIR:
-   - **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
-   - **Optional**: data-model.md (entities), contracts/ (API endpoints), research.md (decisions), quickstart.md (test scenarios)
-   - Note: Not all projects have all documents. Generate tasks based on what's available.
-
-3. **Execute task generation workflow**:
-   - Load plan.md and extract tech stack, libraries, project structure
-   - Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
-   - If data-model.md exists: Extract entities and map to user stories
-   - If contracts/ exists: Map endpoints to user stories
-   - If research.md exists: Extract decisions for setup tasks
-   - Generate tasks organized by user story (see Task Generation Rules below)
-   - Generate dependency graph showing user story completion order
-   - Create parallel execution examples per user story
-   - Validate task completeness (each user story has all needed tasks, independently testable)
-
-4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
-   - Correct feature name from plan.md
-   - Phase 1: Setup tasks (project initialization)
-   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
-   - Phase 3+: One phase per user story (in priority order from spec.md)
-   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
-   - Final Phase: Polish & cross-cutting concerns
-   - All tasks must follow the strict checklist format (see Task Generation Rules below)
-   - Clear file paths for each task
-   - Dependencies section showing story completion order
-   - Parallel execution examples per story
-   - Implementation strategy section (MVP first, incremental delivery)
-
-5. **Report**: Output path to generated tasks.md and summary:
-   - Total task count
-   - Task count per user story
-   - Parallel opportunities identified
-   - Independent test criteria for each story
-   - Suggested MVP scope (typically just User Story 1)
-   - Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
-
-Context for task generation: $ARGUMENTS
-
-The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
-
-## Task Generation Rules
-
-**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.
-
-**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
-
-### Checklist Format (REQUIRED)
-
-Every task MUST strictly follow this format:
-
-```text
-- [ ] [TaskID] [P?] [Story?] Description with file path
-```
-
-**Format Components**:
-
-1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
-2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
-3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
-4. **[Story] label**: REQUIRED for user story phase tasks only
-   - Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
-   - Setup phase: NO story label
-   - Foundational phase: NO story label  
-   - User Story phases: MUST have story label
-   - Polish phase: NO story label
-5. **Description**: Clear action with exact file path
-
-**Examples**:
-
-- ✅ CORRECT: `- [ ] T001 Create project structure per implementation plan`
-- ✅ CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
-- ✅ CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
-- ✅ CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
-- ❌ WRONG: `- [ ] Create User model` (missing ID and Story label)
-- ❌ WRONG: `T001 [US1] Create model` (missing checkbox)
-- ❌ WRONG: `- [ ] [US1] Create User model` (missing Task ID)
-- ❌ WRONG: `- [ ] T001 [US1] Create model` (missing file path)
-
-### Task Organization
-
-1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
-   - Each user story (P1, P2, P3...) gets its own phase
-   - Map all related components to their story:
-     - Models needed for that story
-     - Services needed for that story
-     - Endpoints/UI needed for that story
-     - If tests requested: Tests specific to that story
-   - Mark story dependencies (most stories should be independent)
-
-2. **From Contracts**:
-   - Map each contract/endpoint → to the user story it serves
-   - If tests requested: Each contract → contract test task [P] before implementation in that story's phase
-
-3. **From Data Model**:
-   - Map each entity to the user story(ies) that need it
-   - If entity serves multiple stories: Put in earliest story or Setup phase
-   - Relationships → service layer tasks in appropriate story phase
-
-4. **From Setup/Infrastructure**:
-   - Shared infrastructure → Setup phase (Phase 1)
-   - Foundational/blocking tasks → Foundational phase (Phase 2)
-   - Story-specific setup → within that story's phase
-
-### Phase Structure
-
-- **Phase 1**: Setup (project initialization)
-- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
-- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
-  - Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
-  - Each phase should be a complete, independently testable increment
-- **Final Phase**: Polish & Cross-Cutting Concerns
diff --git a/plugins/ralph-speckit/.claude/commands/speckit.taskstoissues.md b/plugins/ralph-speckit/.claude/commands/speckit.taskstoissues.md
deleted file mode 100644
index 07991911..00000000
--- a/plugins/ralph-speckit/.claude/commands/speckit.taskstoissues.md
+++ /dev/null
@@ -1,30 +0,0 @@
----
-description: Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts.
-tools: ['github/github-mcp-server/issue_write']
----
-
-## User Input
-
-```text
-$ARGUMENTS
-```
-
-You **MUST** consider the user input before proceeding (if not empty).
-
-## Outline
-
-1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-1. From the executed script, extract the path to **tasks**.
-1. Get the Git remote by running:
-
-```bash
-git config --get remote.origin.url
-```
-
-> [!CAUTION]
-> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
-
-1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that is representative of the Git remote.
-
-> [!CAUTION]
-> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
new file mode 100644
index 00000000..8c36b3ed
--- /dev/null
+++ b/specs/refactor-plugins/.progress.md
@@ -0,0 +1,99 @@
+# Progress: refactor-plugins
+
+## Original Goal
+Refactor the plugins in here using the /plugin-dev skills
+
+## Status
+- Phase: research
+- Started: 2026-01-29
+
+## Interview Format
+- Version: 1.0
+
+## Intent Classification
+- Type: REFACTOR
+- Confidence: high (1 keyword matched)
+- Min questions: 3
+- Max questions: 5
+- Keywords matched: refactor
+
+## Interview Responses
+
+### Goal Interview (from start.md)
+- Problem: Improve all plugins according to plugin-dev skills best practices
+- Constraints: Use only plugin-dev skills to improve all plugins
+- Success criteria: Plugins follow plugin-dev best practices
+
+## Completed Tasks
+
+- [x] Read all plugin-dev skills (plugin-structure, command-development, skill-development, agent-development, hook-development)
+- [x] Analyzed ralph-specum plugin structure (plugin.json, commands, agents, skills, hooks)
+- [x] Analyzed ralph-speckit plugin structure (plugin.json, commands, agents, skills, hooks)
+- [x] Created research.md with comprehensive gap analysis
+- [x] 1.1 Add color and examples to ralph-specum agents (8 files)
+- [x] 1.2 Add color and examples to ralph-speckit agents (6 files)
+- [x] 1.4 Add version to ralph-specum skills (6 files)
+- [x] 1.5 Add version and fix descriptions for ralph-speckit skills (4 files)
+- [x] 1.6 Add matcher field to hooks (2 files) - 8301b94
+- [x] 1.8 Add name field to ralph-speckit modern commands (5 files) - 34bd45b
+- [x] 1.9 Migrate legacy commands to commands/ directory (8 files) - 0f617ad
+- [x] 1.10 Remove legacy commands directory - b4b6a10
+
+## Current Task
+
+Awaiting next task
+
+## Learnings
+
+### Verification: 1.7 [VERIFY] Quality checkpoint: skills and hooks
+- Status: PASS
+- Verified: 10/10 skills have `version:` field
+- Verified: 2/2 hooks.json files have `"matcher"` field
+- Plugins checked: ralph-specum (6 skills, 1 hooks.json), ralph-speckit (4 skills, 1 hooks.json)
+- No fixes needed
+
+### Verification: 1.3 [VERIFY] Quality checkpoint: agent metadata
+- Status: PASS
+- Verified: 14/14 agents have `color:` field and 2+ `<example>` blocks
+- Plugins checked: ralph-specum (8 agents), ralph-speckit (6 agents)
+- No fixes needed
+
+- Agents require `color` field (blue, cyan, green, yellow, magenta, red) - this was missing in all agents
+- Agent descriptions must include `<example>` blocks with Context/user/assistant/commentary format for proper triggering
+- Hook entries require `matcher` field even when applying to all events (use `"*"`)
+- Plugin hooks.json format requires outer wrapper: `{"description": "...", "hooks": {...}}`
+- Skills require `version` field in frontmatter and third-person description with trigger phrases
+- ralph-speckit has legacy command structure in `.claude/commands/` that should be consolidated to `commands/`
+- Requirements phase: 5 user stories created covering agents, skills, hooks, commands, and validation
+- Requirements phase: 10 functional requirements prioritized P0-P2 (P0=critical, P1=high, P2=nice-to-have)
+- Requirements phase: Color grouping by function recommended (analysis=blue/cyan, execution=green, validation=yellow, transformation=magenta)
+- Requirements phase: Backward compatibility is critical NFR - zero breaking changes to existing workflows
+- Design phase: 36 files total need changes (32 edits, 9 creates, 9 deletes)
+- Design phase: Color assignments finalized - blue/cyan for analysis, green for execution, yellow for validation, magenta for transformation
+- Design phase: Legacy command migration requires stripping "speckit." prefix for consistency with modern commands
+- Design phase: Validation script in bash checks 5 compliance areas (color, examples, version, matcher, legacy dir)
+- Design phase: Git provides complete rollback strategy - no destructive external state changes
+- Design update: Commands are currently 100-1200 lines - implement.md is 1200+ lines alone
+- Design update: Consolidation can extract ~1500 lines from commands/agents into 11 new skills
+- Design update: implement.md can be reduced from 1200 to ~150 lines by referencing skills
+- Design update: Skill reference pattern uses `<skill-reference>` or inline `**Apply skill**: path` format
+- Design update: Phase B adds 11 new skills: failure-recovery, verification-layers, coordinator-pattern, branch-management, intent-classification, spec-scanner, parallel-research, phase-rules, commit-discipline, quality-checkpoints, quality-commands
+- Design update: Commands become thin orchestrators (~80-200 lines) that reference skills for heavy-lifting
+- Design update: Phase A (metadata) can be deployed independently before Phase B (consolidation)
+
+## Next
+
+Task 1.11 [VERIFY] Quality checkpoint: commands
+
+## Blockers
+
+(none)
+
+## Task Planning Notes
+
+- Tasks grouped to reduce count: 8 ralph-specum agents = 1 task, 6 ralph-speckit agents = 1 task
+- Total: 32 tasks (14 Phase A, 14 Phase B, 1 Phase 3, 3 Phase 4/5)
+- Quality checkpoints every 2-3 tasks as required
+- Phase A can be deployed independently (metadata fixes only)
+- Phase B has dependency: skills must be created before commands/agents simplified
+- Verification commands use bash loops to validate multiple files atomically
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 2575c871..0bc58e6d 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -161,7 +161,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-4.2, AC-4.5_
   - _Design: Legacy commands migration_
 
-- [ ] 1.10 Remove legacy commands directory
+- [x] 1.10 Remove legacy commands directory
   - **Do**:
     1. Verify all commands migrated successfully (from 1.9)
     2. Delete `.claude/commands/` directory from ralph-speckit

From c504a919d5d56d44d730fa785edd6e1076fc90bd Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:49:20 +0200
Subject: [PATCH 12/37] feat(scripts): add plugin compliance validation script

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 scripts/validate-plugins.sh         | 126 ++++++++++++++++++++++++++++
 specs/refactor-plugins/.progress.md |  10 ++-
 specs/refactor-plugins/tasks.md     |   2 +-
 3 files changed, 136 insertions(+), 2 deletions(-)
 create mode 100755 scripts/validate-plugins.sh
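
Reviewer note: each check in scripts/validate-plugins.sh follows one shape:
glob the files, grep for a frontmatter field, and count failures. A minimal
sketch of that shape, run against throwaway fixtures, might look like the
following. The `check_field` helper and the fixture files are assumptions made
for illustration and are not part of the script. The counter uses a plain
arithmetic assignment so that it stays safe under `set -e`.

```bash
# Sketch only: check_field and the fixture files are hypothetical,
# not part of scripts/validate-plugins.sh.
set -eu

errors=0

# Report whether a file has a line starting with "<field>:".
check_field() {
    file=$1
    field=$2
    if grep -q "^${field}:" "$file"; then
        echo "ok   $(basename "$file") has ${field}"
    else
        echo "FAIL $(basename "$file") missing ${field}"
        # A plain assignment returns status 0, so set -e does not abort here.
        errors=$((errors + 1))
    fi
}

# Build two throwaway agent files: one compliant, one not.
tmp=$(mktemp -d)
printf '%s\n' '---' 'name: good' 'color: blue' '---' \
    '<example>a</example>' '<example>b</example>' > "$tmp/good.md"
printf '%s\n' '---' 'name: bad' '---' > "$tmp/bad.md"

for f in "$tmp"/*.md; do
    check_field "$f" color
done

# grep -c prints 0 and exits 1 when nothing matches; "|| true" keeps the
# single "0" it already printed and only neutralizes the exit status.
good_examples=$(grep -c "<example>" "$tmp/good.md" || true)
bad_examples=$(grep -c "<example>" "$tmp/bad.md" || true)

echo "errors=$errors good_examples=$good_examples bad_examples=$bad_examples"
rm -rf "$tmp"
```

With these fixtures the sketch should flag only the file that lacks `color:`
and count two example blocks in the compliant file.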

diff --git a/scripts/validate-plugins.sh b/scripts/validate-plugins.sh
new file mode 100755
index 00000000..240b02f8
--- /dev/null
+++ b/scripts/validate-plugins.sh
@@ -0,0 +1,126 @@
+#!/usr/bin/env bash
+#
+# Plugin Compliance Validation Script
+# Validates that all plugins follow plugin-dev best practices
+#
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
+PLUGINS_DIR="$PROJECT_ROOT/plugins"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+NC='\033[0m' # No Color
+
+errors=0
+warnings=0
+
+log_pass() {
+    echo -e "${GREEN}✓${NC} $1"
+}
+
+log_fail() {
+    echo -e "${RED}✗${NC} $1"
+    errors=$((errors + 1))  # ((errors++)) returns status 1 when errors is 0, which aborts under set -e
+}
+
+log_warn() {
+    echo -e "${YELLOW}!${NC} $1"
+    warnings=$((warnings + 1))  # same set -e pitfall as in log_fail
+}
+
+log_section() {
+    echo ""
+    echo "=== $1 ==="
+}
+
+# Check 1: Agents have color field
+log_section "Checking agents have color field"
+for agent_file in "$PLUGINS_DIR"/*/agents/*.md; do
+    if [[ -f "$agent_file" ]]; then
+        agent_name=$(basename "$agent_file")
+        plugin_name=$(basename "$(dirname "$(dirname "$agent_file")")")
+        if grep -q "^color:" "$agent_file"; then
+            log_pass "$plugin_name/$agent_name has color field"
+        else
+            log_fail "$plugin_name/$agent_name missing color field"
+        fi
+    fi
+done
+
+# Check 2: Agents have 2+ example blocks
+log_section "Checking agents have 2+ example blocks"
+for agent_file in "$PLUGINS_DIR"/*/agents/*.md; do
+    if [[ -f "$agent_file" ]]; then
+        agent_name=$(basename "$agent_file")
+        plugin_name=$(basename "$(dirname "$(dirname "$agent_file")")")
+        example_count=$(grep -c "<example>" "$agent_file" || true)  # grep -c already prints 0 on no match
+        if [[ "$example_count" -ge 2 ]]; then
+            log_pass "$plugin_name/$agent_name has $example_count example blocks"
+        else
+            log_fail "$plugin_name/$agent_name has only $example_count example blocks (need 2+)"
+        fi
+    fi
+done
+
+# Check 3: Skills have version field
+log_section "Checking skills have version field"
+for skill_file in "$PLUGINS_DIR"/*/skills/*/SKILL.md; do
+    if [[ -f "$skill_file" ]]; then
+        skill_name=$(basename "$(dirname "$skill_file")")
+        plugin_name=$(basename "$(dirname "$(dirname "$(dirname "$skill_file")")")")
+        if grep -q "^version:" "$skill_file"; then
+            log_pass "$plugin_name/skills/$skill_name has version field"
+        else
+            log_fail "$plugin_name/skills/$skill_name missing version field"
+        fi
+    fi
+done
+
+# Check 4: Hooks have matcher field
+log_section "Checking hooks have matcher field"
+for hooks_file in "$PLUGINS_DIR"/*/hooks/hooks.json; do
+    if [[ -f "$hooks_file" ]]; then
+        plugin_name=$(basename "$(dirname "$(dirname "$hooks_file")")")
+        if grep -q '"matcher"' "$hooks_file"; then
+            log_pass "$plugin_name/hooks/hooks.json has matcher field"
+        else
+            log_fail "$plugin_name/hooks/hooks.json missing matcher field"
+        fi
+    fi
+done
+
+# Check 5: No legacy commands directory
+log_section "Checking for legacy commands directories"
+for plugin_dir in "$PLUGINS_DIR"/*/; do
+    if [[ -d "$plugin_dir" ]]; then
+        plugin_name=$(basename "$plugin_dir")
+        legacy_dir="$plugin_dir.claude/commands"
+        if [[ -d "$legacy_dir" ]]; then
+            log_fail "$plugin_name has legacy commands directory at .claude/commands/"
+        else
+            log_pass "$plugin_name has no legacy commands directory"
+        fi
+    fi
+done
+
+# Summary
+log_section "Summary"
+echo ""
+if [[ $errors -eq 0 ]]; then
+    echo -e "${GREEN}All compliance checks passed!${NC}"
+    if [[ $warnings -gt 0 ]]; then
+        echo -e "${YELLOW}$warnings warning(s)${NC}"
+    fi
+    exit 0
+else
+    echo -e "${RED}$errors error(s) found${NC}"
+    if [[ $warnings -gt 0 ]]; then
+        echo -e "${YELLOW}$warnings warning(s)${NC}"
+    fi
+    exit 1
+fi
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 8c36b3ed..fcc0d35b 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -38,6 +38,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 1.8 Add name field to ralph-speckit modern commands (5 files) - 34bd45b
 - [x] 1.9 Migrate legacy commands to commands/ directory (8 files) - 0f617ad
 - [x] 1.10 Remove legacy commands directory - b4b6a10
+- [x] 1.12 Create validation script - 16fdced
 
 ## Current Task
 
@@ -45,6 +46,13 @@ Awaiting next task
 
 ## Learnings
 
+### Verification: 1.11 [VERIFY] Quality checkpoint: commands
+- Status: PASS
+- Verified: 13/13 commands have `name:` field
+- Verified: Legacy directory `plugins/ralph-speckit/.claude/commands/` does NOT exist
+- Commands checked: analyze.md, cancel.md, checklist.md, clarify.md, constitution.md, implement.md, plan.md, specify.md, start.md, status.md, switch.md, tasks.md, taskstoissues.md
+- No fixes needed
+
 ### Verification: 1.7 [VERIFY] Quality checkpoint: skills and hooks
 - Status: PASS
 - Verified: 10/10 skills have `version:` field
@@ -83,7 +91,7 @@ Awaiting next task
 
 ## Next
 
-Task 1.11 [VERIFY] Quality checkpoint: commands
+Task 1.13 Update CLAUDE.md with best practices
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 0bc58e6d..875f4869 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -182,7 +182,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
 
 ### A5: Validation and Documentation
 
-- [ ] 1.12 Create validation script
+- [x] 1.12 Create validation script
   - **Do**:
     1. Create `scripts/validate-plugins.sh` with compliance checks:
        - Agents have color field

From 296708d95e580a5bb8e418e1d7df1e02b7447ba2 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:50:30 +0200
Subject: [PATCH 13/37] docs: add plugin best practices reference to CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 CLAUDE.md                           | 22 ++++++++++++++++++++++
 specs/refactor-plugins/.progress.md |  3 ++-
 specs/refactor-plugins/tasks.md     |  2 +-
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 0848015d..d574656b 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -35,6 +35,28 @@ claude --plugin-dir ./plugins/ralph-specum
 
 No build step required. Changes take effect on Claude Code restart.
 
+### Plugin Validation
+
+Run the validation script to check all plugins for compliance:
+
+```bash
+bash scripts/validate-plugins.sh
+```
+
+This checks:
+- Agent `color:` fields and `<example>` blocks
+- Skill `version:` fields
+- Hook `matcher` fields
+- No legacy command directories
+
+### Agent Color Conventions
+
+Assign colors by function for visual consistency:
+- **blue/cyan** - Analysis agents (research-analyst, architect-reviewer)
+- **green** - Execution agents (spec-executor)
+- **yellow** - Validation agents (qa-engineer)
+- **magenta** - Transformation agents (task-planner, product-manager)
+
 ### Plugin Development Skills (ALWAYS USE)
 
 When creating or modifying plugin components, **ALWAYS** use the `plugin-dev` skills for guidance:
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index fcc0d35b..e940e3ca 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -39,6 +39,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 1.9 Migrate legacy commands to commands/ directory (8 files) - 0f617ad
 - [x] 1.10 Remove legacy commands directory - b4b6a10
 - [x] 1.12 Create validation script - 16fdced
+- [x] 1.13 Update CLAUDE.md with best practices
 
 ## Current Task
 
@@ -91,7 +92,7 @@ Awaiting next task
 
 ## Next
 
-Task 1.13 Update CLAUDE.md with best practices
+Task 1.14 [VERIFY] Phase A complete validation
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 875f4869..e0b5b2d9 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -200,7 +200,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-5.1, AC-5.2, AC-5.3, AC-5.4_
   - _Design: Validation Script Design_
 
-- [ ] 1.13 Update CLAUDE.md with best practices
+- [x] 1.13 Update CLAUDE.md with best practices
   - **Do**:
     1. Add section referencing plugin-dev skills for best practices
     2. Include validation script usage

From 7acb9251de54ae3e2463f4028839deb236affcfb Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:53:45 +0200
Subject: [PATCH 14/37] feat(ralph-specum): add failure-recovery skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/failure-recovery/SKILL.md          | 508 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |  15 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 523 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/failure-recovery/SKILL.md
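
Reviewer note: the "Parse Failure Output" and "Generate Fix Task Markdown"
sections of the skill below describe the parsing and ID rules in prose. A
minimal shell sketch of those rules, assuming the failure text is held in a
variable, could look like this. The `extract_field` helper, the variable
names, and the hard-coded `attempts=0` are assumptions for illustration; the
skill itself does not prescribe an implementation.

```bash
# Sketch only: extract_field and all variable names are hypothetical.
set -eu

# Sample spec-executor output with the Status line deliberately missing.
output='Task 1.3: Add failure parser FAILED
- Error: File not found: src/parser.ts
- Attempted fix: Checked alternate paths'

# Print the text after "- <label>: " in $output, or the fallback if absent.
extract_field() {
    label=$1
    fallback=$2
    value=$(printf '%s\n' "$output" | sed -n "s/^- ${label}: //p" | head -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf '%s\n' "$fallback"
    fi
}

# Task ID: the X.Y number directly after "Task " on a line marked FAILED.
task_id=$(printf '%s\n' "$output" \
    | sed -n 's/^Task \([0-9][0-9]*\.[0-9][0-9]*\):.*FAILED.*/\1/p' \
    | head -n 1)

# Fallback strings are the ones listed under "Handle Missing Fields".
error=$(extract_field "Error" "Task execution failed")
attempted_fix=$(extract_field "Attempted fix" "No fix attempted")
status_msg=$(extract_field "Status" "Unknown status")

# Fix task ID is $taskId.$attemptNumber, with attemptNumber = attempts + 1.
attempts=0   # assumed: no earlier fix attempts recorded in fixTaskMap
fix_task_id="${task_id}.$((attempts + 1))"

echo "taskId=$task_id fixTaskId=$fix_task_id"
echo "error=$error"
echo "attemptedFix=$attempted_fix"
echo "status=$status_msg"
```

Because the sample omits the Status line, the sketch falls back to the
documented default for that field while the other fields come from the text.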

diff --git a/plugins/ralph-specum/skills/failure-recovery/SKILL.md b/plugins/ralph-specum/skills/failure-recovery/SKILL.md
new file mode 100644
index 00000000..c0cda598
--- /dev/null
+++ b/plugins/ralph-specum/skills/failure-recovery/SKILL.md
@@ -0,0 +1,508 @@
+---
+name: failure-recovery
+version: 0.1.0
+description: This skill should be used when the user asks about "iterative recovery", "fix task generation", "recovery mode", "parse failure output", "fixTaskMap", "automatic failure recovery", "recovery orchestration", or needs guidance on handling task failures with automatic fix task generation.
+---
+
+# Failure Recovery
+
+Orchestrates iterative failure recovery when tasks fail during spec execution. When enabled via `--recovery-mode`, automatically generates fix tasks instead of stopping on failure.
+
+## Overview
+
+Recovery mode is an opt-in enhancement to the standard retry-then-stop behavior. When enabled:
+- Failed tasks trigger automatic fix task generation
+- Fix tasks are inserted into tasks.md and executed
+- Original task is retried after fix task completes
+- Process repeats until success or max fix limit reached
+
+**Backwards Compatibility**: `recoveryMode` defaults to false. When false or missing, existing behavior (retry then stop) is preserved exactly.
+
+## Entry Point
+
+When spec-executor does NOT output TASK_COMPLETE:
+
+1. Check if `recoveryMode` is true in .ralph-state.json
+2. If recoveryMode is false, undefined, or missing: skip to standard "Max Retries Reached" error (existing behavior preserved)
+3. If recoveryMode is explicitly true: proceed with iterative recovery
+
+## Parse Failure Output
+
+Extract error details from spec-executor failure output.
+
+### Failure Output Pattern
+
+Spec-executor outputs failures in this format:
+
+```text
+Task X.Y: [task name] FAILED
+- Error: [description]
+- Attempted fix: [what was tried]
+- Status: Blocked, needs manual intervention
+```
+
+### Parsing Logic
+
+1. **Check for FAILED marker**:
+   - Look for pattern: `Task \d+(\.\d+)+:.*FAILED` (matches `X.Y` task IDs and `X.Y.N` fix task IDs)
+   - If found, proceed to extract details
+   - If not found, use generic failure: "Task did not complete"
+
+2. **Extract Error Details**:
+   - Match `- Error: (.*)` to get error description
+   - Match `- Attempted fix: (.*)` to get fix attempt details
+   - Match `- Status: (.*)` to get status message
+
+3. **Build Failure Object**:
+   ```json
+   {
+     "taskId": "<X.Y from match>",
+     "failed": true,
+     "error": "<extracted from Error: line>",
+     "attemptedFix": "<extracted from Attempted fix: line>",
+     "status": "<extracted from Status: line>",
+     "rawOutput": "<full spec-executor output for context>"
+   }
+   ```
+
+4. **Handle Missing Fields**:
+   - If Error: line missing, use "Task execution failed"
+   - If Attempted fix: line missing, use "No fix attempted"
+   - If Status: line missing, use "Unknown status"
+
+### Example Parsing
+
+Input (spec-executor output):
+```text
+Task 1.3: Add failure parser FAILED
+- Error: File not found: src/parser.ts
+- Attempted fix: Checked alternate paths
+- Status: Blocked, needs manual intervention
+```
+
+Parsed failure object:
+```json
+{
+  "taskId": "1.3",
+  "failed": true,
+  "error": "File not found: src/parser.ts",
+  "attemptedFix": "Checked alternate paths",
+  "status": "Blocked, needs manual intervention",
+  "rawOutput": "..."
+}
+```
+
+## Fix Task Generator
+
+Generate a fix task from failure details when recovery mode is enabled.
+
+### Check Fix Task Limits
+
+Before generating a fix task:
+
+1. Read `fixTaskMap` from .ralph-state.json
+2. Check if `fixTaskMap[taskId].attempts >= maxFixTasksPerOriginal`
+3. If limit reached:
+   - Output error: "ERROR: Max fix attempts ($maxFixTasksPerOriginal) reached for task $taskId"
+   - Show fix history: "Fix attempts: $fixTaskMap[taskId].fixTaskIds"
+   - Do NOT output ALL_TASKS_COMPLETE
+   - STOP execution
+
+### Generate Fix Task Markdown
+
+Use the failure object to create a fix task:
+
+```text
+Fix Task ID: $taskId.$attemptNumber
+  where attemptNumber = fixTaskMap[taskId].attempts + 1 (or 1 if first attempt)
+
+Fix Task Format:
+- [ ] $taskId.$attemptNumber [FIX $taskId] Fix: $errorSummary
+  - **Do**: Address the error: $failure.error
+    1. Analyze the failure: $failure.attemptedFix
+    2. Review related code in Files list
+    3. Implement fix for: $failure.error
+  - **Files**: $originalTask.files
+  - **Done when**: Error "$failure.error" no longer occurs
+  - **Verify**: $originalTask.verify
+  - **Commit**: `fix($scope): address $errorType from task $taskId`
+```
+
+### Field Derivation
+
+| Field                | Source                              | Fallback                       |
+|----------------------|-------------------------------------|--------------------------------|
+| errorSummary         | First 50 chars of failure.error     | "task $taskId failure"         |
+| failure.error        | Parsed from Error: line             | "Task execution failed"        |
+| failure.attemptedFix | Parsed from Attempted fix: line     | "No fix attempted"             |
+| originalTask.files   | Files field from original task      | Same directory as original     |
+| originalTask.verify  | Verify field from original task     | "echo 'Verify manually'"       |
+| $scope               | Derived from spec name or task area | "recovery"                     |
+| $errorType           | Error category (e.g., "syntax", "missing file") | "error"           |
+
+### Example Fix Task Generation
+
+Original task (failed):
+```markdown
+- [ ] 1.3 Add failure parser
+  - **Do**: Add parsing logic to implement.md
+  - **Files**: plugins/ralph-specum/commands/implement.md
+  - **Done when**: Parser extracts error details
+  - **Verify**: grep -q "Parse Failure" implement.md
+  - **Commit**: feat(coordinator): add failure parser
+```
+
+Failure object:
+```json
+{
+  "taskId": "1.3",
+  "error": "File not found: src/parser.ts",
+  "attemptedFix": "Checked alternate paths"
+}
+```
+
+Generated fix task:
+```markdown
+- [ ] 1.3.1 [FIX 1.3] Fix: File not found: src/parser.ts
+  - **Do**: Address the error: File not found: src/parser.ts
+    1. Analyze the failure: Checked alternate paths
+    2. Review related code in Files list
+    3. Implement fix for: File not found: src/parser.ts
+  - **Files**: plugins/ralph-specum/commands/implement.md
+  - **Done when**: Error "File not found: src/parser.ts" no longer occurs
+  - **Verify**: grep -q "Parse Failure" implement.md
+  - **Commit**: `fix(recovery): address missing file from task 1.3`
+```
+
+### Insert Fix Task into tasks.md
+
+Use the Edit tool to cleanly insert the fix task after the current task block.
+
+**Algorithm**:
+
+1. **Read tasks.md content** using Read tool
+
+2. **Locate current task start**:
+   - Search for pattern: `- [ ] $taskId` or `- [x] $taskId`
+   - Store the line number as `taskStartLine`
+
+3. **Find current task block end**:
+   - Scan forward from `taskStartLine + 1`
+   - Task block ends at first line matching:
+     - `- [ ]` (next task start)
+     - `- [x]` (next completed task)
+     - `## Phase` (next phase header)
+     - End of file
+   - Store this line as `insertPosition`
+
+4. **Build insertion content**:
+   - Start with newline if needed for spacing
+   - Add the complete fix task markdown block
+
+5. **Insert using Edit tool**:
+   - Use Edit tool with `old_string` = content at insertion point
+   - `new_string` = fix task markdown + original content at insertion point
+   - This places fix task immediately after original task block
+
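+6. **Update state totalTasks**:
+   - Read .ralph-state.json
+   - Increment `totalTasks` by 1, exactly once per inserted fix task (the jq update under "Update State After Generation" already performs this increment, so do not apply it twice)
+   - Write updated state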
+
+**Example Insertion**:
+
+Before insertion (task 1.3 failed):
+
+```markdown
+- [ ] 1.3 Add failure parser
+  - **Do**: Add parsing logic
+  - **Files**: implement.md
+  - **Verify**: grep pattern
+  - **Commit**: feat: add parser
+
+- [ ] 1.4 Next task
+```
+
+After insertion:
+
+```markdown
+- [ ] 1.3 Add failure parser
+  - **Do**: Add parsing logic
+  - **Files**: implement.md
+  - **Verify**: grep pattern
+  - **Commit**: feat: add parser
+
+- [ ] 1.3.1 [FIX 1.3] Fix: File not found error
+  - **Do**: Address the error: File not found
+    1. Analyze the failure: Checked alternate paths
+    2. Review related code in Files list
+    3. Implement fix for: File not found
+  - **Files**: implement.md
+  - **Done when**: Error "File not found" no longer occurs
+  - **Verify**: grep pattern
+  - **Commit**: `fix(recovery): address missing file from task 1.3`
+
+- [ ] 1.4 Next task
+```
+
+## Recovery State Management
+
+Track fix task attempts and history using fixTaskMap in .ralph-state.json.
+
+### State Structure
+
+```json
+{
+  "phase": "execution",
+  "taskIndex": 2,
+  "totalTasks": 10,
+  "recoveryMode": true,
+  "maxFixTasksPerOriginal": 3,
+  "fixTaskMap": {
+    "1.3": {
+      "attempts": 2,
+      "fixTaskIds": ["1.3.1", "1.3.2"],
+      "lastError": "Syntax error in parser.ts line 42"
+    }
+  }
+}
+```
+
+### Update State After Generation
+
+After generating a fix task:
+1. Increment `fixTaskMap[taskId].attempts`
+2. Add fix task ID to `fixTaskMap[taskId].fixTaskIds` array
+3. Store error in `fixTaskMap[taskId].lastError`
+4. Increment `totalTasks`
+5. Write updated .ralph-state.json
+
+**Implementation using jq**:
+
+```bash
+# Variables from context
+SPEC_PATH="./specs/$spec"
+TASK_ID="X.Y"           # Original task ID (e.g., "1.3")
+FIX_TASK_ID="X.Y.N"     # Generated fix task ID (e.g., "1.3.1")
+ERROR_MSG="$failure_error"  # Raw error message from failure object; jq --arg handles JSON escaping
+
+# Read current state, update fixTaskMap, write back
+jq --arg taskId "$TASK_ID" \
+   --arg fixId "$FIX_TASK_ID" \
+   --arg error "$ERROR_MSG" \
+   '
+   # Initialize fixTaskMap if it does not exist
+   .fixTaskMap //= {} |
+
+   # Initialize entry for this task if it does not exist
+   .fixTaskMap[$taskId] //= {attempts: 0, fixTaskIds: [], lastError: ""} |
+
+   # Update the entry
+   .fixTaskMap[$taskId].attempts += 1 |
+   .fixTaskMap[$taskId].fixTaskIds += [$fixId] |
+   .fixTaskMap[$taskId].lastError = $error |
+
+   # Also increment totalTasks to account for inserted fix task
+   .totalTasks += 1
+   ' "$SPEC_PATH/.ralph-state.json" > "$SPEC_PATH/.ralph-state.json.tmp" && \
+   mv "$SPEC_PATH/.ralph-state.json.tmp" "$SPEC_PATH/.ralph-state.json"
+```
+
+### Reading fixTaskMap for Limit Checks
+
+```bash
+# Check current attempts for a task
+CURRENT_ATTEMPTS=$(jq -r --arg taskId "$TASK_ID" \
+  '.fixTaskMap[$taskId].attempts // 0' "$SPEC_PATH/.ralph-state.json")
+
+# Check if limit exceeded
+MAX_FIX=$(jq -r '.maxFixTasksPerOriginal // 3' "$SPEC_PATH/.ralph-state.json")
+if [ "$CURRENT_ATTEMPTS" -ge "$MAX_FIX" ]; then
+  echo "ERROR: Max fix attempts ($MAX_FIX) reached for task $TASK_ID"
+  # Show fix history
+  jq -r --arg taskId "$TASK_ID" \
+    '.fixTaskMap[$taskId].fixTaskIds | join(", ")' "$SPEC_PATH/.ralph-state.json"
+  exit 1
+fi
+```
+
+## Iterative Recovery Orchestrator
+
+Complete orchestration flow for the recovery loop.
+
+### Recovery Loop Flow
+
+```text
+1. Task fails (no TASK_COMPLETE)
+   |
+   v
+2. Check recoveryMode in state
+   |
+   |-- false --> Normal retry/stop behavior
+   |
+   v (true)
+3. Parse failure output (Section: Parse Failure Output)
+   Extract: taskId, error, attemptedFix
+   |
+   v
+4. Check fix limits (Section: Check Fix Task Limits)
+   Read: fixTaskMap[taskId].attempts
+   |
+   |-- >= maxFixTasksPerOriginal --> STOP with error
+   |
+   v (under limit)
+5. Generate fix task (Section: Fix Task Generator)
+   Create: X.Y.N [FIX X.Y] Fix: <error>
+   |
+   v
+6. Insert fix task into tasks.md
+   Position: immediately after original task
+   |
+   v
+7. Update state
+   - Increment fixTaskMap[taskId].attempts
+   - Add fix task ID to fixTaskMap[taskId].fixTaskIds and store lastError
+   - Increment totalTasks
+   |
+   v
+8. Execute fix task
+   Delegate to spec-executor (same as normal task delegation)
+   |
+   |-- TASK_COMPLETE --> Proceed to step 9
+   |
+   |-- No completion --> Loop back to step 3
+       (fix task becomes current, can spawn its own fixes)
+   |
+   v
+9. Retry original task
+   Delegate original task to spec-executor again
+   |
+   |-- TASK_COMPLETE --> Success! Proceed to verification layers
+   |
+   |-- No completion --> Loop back to step 3
+       (generate another fix for original task)
+```
+
+### Example Recovery Sequence
+
+```text
+Initial: Task 1.3 fails
+  |
+Recovery Mode enabled
+  |
+Parse: error = "syntax error in parser.ts"
+  |
+Check: fixTaskMap["1.3"].attempts = 0 (under limit of 3)
+  |
+Generate: Task 1.3.1 [FIX 1.3] Fix: syntax error
+  |
+Insert: Add 1.3.1 after 1.3 in tasks.md
+  |
+Update: fixTaskMap["1.3"] = {attempts: 1, fixTaskIds: ["1.3.1"]}
+  |
+Execute: Delegate 1.3.1 to spec-executor
+  |
+1.3.1 completes with TASK_COMPLETE
+  |
+Retry: Delegate 1.3 to spec-executor again
+  |
+1.3 completes with TASK_COMPLETE
+  |
+Success! --> Verification layers --> State update --> Next task
+```
+
+### Nested Fix Example
+
+Fix tasks can fail and spawn their own fix tasks:
+
+```text
+Task 1.3 fails --> Generate 1.3.1
+  |
+1.3.1 fails --> Generate 1.3.1.1 (fix for the fix)
+  |
+1.3.1.1 completes
+  |
+Retry 1.3.1 --> completes
+  |
+Retry 1.3 --> completes
+  |
+Success!
+```
+
+### Important Rules
+
+- Fix tasks can spawn their own fix tasks (recursive recovery)
+- Each original task tracks its own fix count independently
+- taskIndex does NOT advance during fix task execution
+- Only after original task passes does taskIndex advance
+- Fix task IDs use dot notation to show lineage: 1.3.1, 1.3.2, 1.3.1.1
+
+## Fix Task Progress Logging
+
+After original task completes following fix task recovery, log the fix task chain to .progress.md.
+
+### Fix Task History Section
+
+Add/update section in .progress.md:
+
+```markdown
+## Fix Task History
+- Task 1.3: 2 fixes attempted (1.3.1, 1.3.2) - Final: PASS
+- Task 2.1: 1 fix attempted (2.1.1) - Final: PASS
+- Task 3.4: 3 fixes attempted (3.4.1, 3.4.2, 3.4.3) - Final: FAIL (max limit)
+```
+
+### Logging Implementation
+
+After successful original task retry (TASK_COMPLETE):
+
+1. Check if fixTaskMap[$taskId] exists and has attempts > 0
+2. If yes, append fix task history entry to .progress.md:
+   ```
+   - Task $taskId: $attempts fixes attempted ($fixTaskIds) - Final: PASS
+   ```
+3. Use Edit tool to append to "## Fix Task History" section
+4. If section doesn't exist, create it before "## Learnings" section
+
+On max fix limit reached:
+
+1. Log failed recovery attempt:
+   ```
+   - Task $taskId: $attempts fixes attempted ($fixTaskIds) - Final: FAIL (max limit)
+   ```
+2. Include in .progress.md before stopping execution
+
+### Example Progress Update
+
+Before fix task logging:
+```markdown
+## Completed Tasks
+- [x] 1.1 Task A - abc123
+- [x] 1.2 Task B - def456
+
+## Learnings
+- Some learning
+```
+
+After fix task logging:
+```markdown
+## Completed Tasks
+- [x] 1.1 Task A - abc123
+- [x] 1.2 Task B - def456
+
+## Fix Task History
+- Task 1.2: 2 fixes attempted (1.2.1, 1.2.2) - Final: PASS
+
+## Learnings
+- Some learning
+```
+
+## Why This Matters
+
+| Without Recovery Mode | With Recovery Mode |
+|----------------------|-------------------|
+| Single failure stops execution | Automatic fix attempt generation |
+| Manual intervention required | Self-healing loop attempts |
+| Context lost on restart | Continuous execution with tracking |
+| Binary pass/fail per task | Iterative refinement of fixes |
+| No history of fix attempts | Full fix task lineage tracked |
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index e940e3ca..350e2d4d 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -40,6 +40,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 1.10 Remove legacy commands directory - b4b6a10
 - [x] 1.12 Create validation script - 16fdced
 - [x] 1.13 Update CLAUDE.md with best practices
+- [x] 2.1 Create failure-recovery skill
 
 ## Current Task
 
@@ -47,6 +48,18 @@ Awaiting next task
 
 ## Learnings
 
+### Verification: 1.14 [VERIFY] Phase A complete validation
+- Status: PASS
+- Script: `bash scripts/validate-plugins.sh && echo "Phase A PASS"`
+- Exit code: 0
+- Results:
+  - 14/14 agents have color field
+  - 14/14 agents have 2+ example blocks
+  - 10/10 skills have version field
+  - 2/2 hooks.json files have matcher field
+  - 0 legacy commands directories found
+- No fixes needed
+
 ### Verification: 1.11 [VERIFY] Quality checkpoint: commands
 - Status: PASS
 - Verified: 13/13 commands have `name:` field
@@ -92,7 +105,7 @@ Awaiting next task
 
 ## Next
 
-Task 1.14 [VERIFY] Phase A complete validation
+Task 2.2 Create verification-layers skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index e0b5b2d9..d680c1ab 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -227,7 +227,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
 
 ### B1: Create New Skills
 
-- [ ] 2.1 Create failure-recovery skill
+- [x] 2.1 Create failure-recovery skill
   - **Do**:
     1. Extract recovery orchestration logic from implement.md sections 6b-6d
     2. Create skill with proper frontmatter (name, description, version)

From dc9b402455b87674cf41736caafa8f760177cb56 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:55:23 +0200
Subject: [PATCH 15/37] feat(ralph-specum): add verification-layers skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/verification-layers/SKILL.md       | 245 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 248 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/verification-layers/SKILL.md
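
Note on the skill below: the output-based layers (1 and 4) and the checkmark
count (3) can be sketched in plain shell. This is a minimal sketch, not the
plugin's implementation. The function names check_executor_output and
check_checkmarks are hypothetical, and the phrase match is case-sensitive as
in the skill's pseudocode. Layer 2 (git status) is left out so the sketch
does not need a repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating Layers 1, 3 and 4 of the skill.

# Layers 4 and 1: require TASK_COMPLETE, then reject contradiction phrases.
# Prints the failure reason and returns 1 on rejection.
check_executor_output() {
  ceo_output=$1
  case "$ceo_output" in
    *TASK_COMPLETE*) ;;
    *) echo "missing TASK_COMPLETE signal"; return 1 ;;
  esac
  for ceo_phrase in "requires manual" "cannot be automated" \
                    "could not complete" "needs human" "manual intervention"; do
    case "$ceo_output" in
      *"$ceo_phrase"*)
        echo "CONTRADICTION: claimed completion while admitting failure"
        return 1 ;;
    esac
  done
  return 0
}

# Layer 3: the number of "- [x]" lines must equal taskIndex + 1.
check_checkmarks() {
  cc_file=$1
  cc_expected=$(( $2 + 1 ))
  # grep -c exits non-zero when the count is 0 or the file is unreadable,
  # so guard the call and fall back to 0 if nothing was printed.
  cc_actual=$(grep -c -- '- \[x\]' "$cc_file" 2>/dev/null || true)
  cc_actual=${cc_actual:-0}
  if [ "$cc_actual" -ne "$cc_expected" ]; then
    echo "checkmark mismatch: expected $cc_expected, found $cc_actual"
    return 1
  fi
  return 0
}
```

A caller would run both checks after each delegation and advance taskIndex
only when both return 0. Checking the signal first keeps the two failure
messages distinct.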

diff --git a/plugins/ralph-specum/skills/verification-layers/SKILL.md b/plugins/ralph-specum/skills/verification-layers/SKILL.md
new file mode 100644
index 00000000..3c0849c3
--- /dev/null
+++ b/plugins/ralph-specum/skills/verification-layers/SKILL.md
@@ -0,0 +1,245 @@
+---
+name: verification-layers
+description: >-
+  This skill should be used when the user asks about "verification layers",
+  "task completion verification", "4-layer verification", "contradiction detection",
+  "checkmark verification", "completion signal validation", or needs guidance on
+  validating task completion before advancing state in spec-driven workflows.
+version: 0.1.0
+---
+
+# Verification Layers Pattern
+
+4-layer verification pattern to validate task completion before advancing taskIndex. All layers must pass before state is updated.
+
+## Overview
+
+CRITICAL: Run these 4 verifications BEFORE advancing taskIndex. All must pass.
+
+```text
+┌─────────────────────────────────────────────────────────────┐
+│              4-LAYER VERIFICATION PIPELINE                   │
+├─────────────────────────────────────────────────────────────┤
+│                                                              │
+│  1. CONTRADICTION Detection                                  │
+│     └── No "requires manual" + TASK_COMPLETE                │
+│                                                              │
+│  2. UNCOMMITTED Changes Check                                │
+│     └── spec files (tasks.md, .progress.md) committed       │
+│                                                              │
+│  3. CHECKMARK Verification                                   │
+│     └── checkmark count == taskIndex + 1                    │
+│                                                              │
+│  4. COMPLETION Signal Verification                           │
+│     └── explicit TASK_COMPLETE present                      │
+│                                                              │
+│  ═══════════════════════════════════════════════════════════│
+│                                                              │
+│  ALL PASS  ────►  Advance taskIndex                         │
+│  ANY FAIL  ────►  Increment taskIteration, retry            │
+│                                                              │
+└─────────────────────────────────────────────────────────────┘
+```
+
+---
+
+## Layer 1: Contradiction Detection
+
+Check spec-executor output for contradiction patterns that indicate false completion claims.
+
+### Contradiction Phrases
+
+Look for these phrases in the executor output:
+
+- `"requires manual"`
+- `"cannot be automated"`
+- `"could not complete"`
+- `"needs human"`
+- `"manual intervention"`
+
+### Detection Logic
+
+```text
+IF TASK_COMPLETE appears alongside ANY contradiction phrase:
+  → REJECT the completion
+  → Log: "CONTRADICTION: claimed completion while admitting failure"
+  → Increment taskIteration and retry
+```
+
+### Example Contradiction
+
+Bad output (should be rejected):
+```text
+The task requires manual verification but I completed the code changes.
+
+TASK_COMPLETE
+```
+
+Good output (should pass):
+```text
+Task 2.1: Add verification layers - DONE
+Verify: PASSED
+Commit: abc1234
+
+TASK_COMPLETE
+```
+
+---
+
+## Layer 2: Uncommitted Spec Files Check
+
+Before advancing, verify spec files are committed. Task is not truly complete until all changes are persisted.
+
+### Check Command
+
+```bash
+git status --porcelain ./specs/$spec/tasks.md ./specs/$spec/.progress.md
+```
+
+### Detection Logic
+
+```text
+IF output is non-empty (uncommitted changes exist):
+  → REJECT the completion
+  → Log: "uncommitted spec files detected - task not properly committed"
+  → Increment taskIteration and retry
+```
+
+### Rationale
+
+All spec file changes must be committed before task is considered complete:
+- `tasks.md` - must have checkmark `[x]` committed
+- `.progress.md` - must have completion entry committed
+
+This ensures progress survives context resets and session restarts.
+
+---
+
+## Layer 3: Checkmark Verification
+
+Count completed tasks in tasks.md and verify it matches expected count.
+
+### Check Command
+
+```bash
+grep -c '\- \[x\]' ./specs/$spec/tasks.md
+```
+
+### Expected Count Calculation
+
+```text
+expected_checkmarks = taskIndex + 1
+
+(0-based index: task 0 complete = 1 checkmark)
+```
+
+### Detection Logic
+
+```text
+IF actual_count != expected_checkmarks:
+  → REJECT the completion
+  → Log: "checkmark mismatch: expected $expected, found $actual"
+  → Increment taskIteration and retry
+```
+
+### Purpose
+
+This layer detects:
+- State manipulation (executor lying about completion)
+- Incomplete task marking (forgot to mark `[x]`)
+- Multiple tasks marked in single iteration
+
+---
+
+## Layer 4: Completion Signal Verification
+
+Verify spec-executor explicitly output TASK_COMPLETE.
+
+### Required Signal
+
+```text
+TASK_COMPLETE
+```
+
+### Detection Logic
+
+```text
+IF TASK_COMPLETE not present in output:
+  → Do NOT advance
+  → Log: "missing TASK_COMPLETE signal"
+  → Increment taskIteration and retry
+```
+
+### Important Notes
+
+- Must be explicit, not implied
+- Partial completion is not valid
+- Silent completion is not valid
+- The signal must be unambiguous
+
+---
+
+## Verification Summary
+
+All 4 layers must pass for task to be considered complete:
+
+| Layer | Check | On Failure |
+|-------|-------|------------|
+| 1. Contradiction | No contradiction phrases with completion claim | Reject, retry |
+| 2. Uncommitted | Spec files committed (no uncommitted changes) | Reject, retry |
+| 3. Checkmark | Checkmark count matches expected taskIndex + 1 | Reject, retry |
+| 4. Signal | Explicit TASK_COMPLETE signal present | Reject, retry |
+
+### After All Pass
+
+Only after all verifications pass, proceed to state update:
+1. Increment taskIndex
+2. Reset taskIteration to 1
+3. Continue to next task or completion
+
+---
+
+## Implementation Pattern
+
+```text
+function verifyTaskCompletion(spec, taskIndex, executorOutput):
+
+  # Layer 1: Contradiction Detection
+  contradictions = ["requires manual", "cannot be automated",
+                    "could not complete", "needs human", "manual intervention"]
+
+  for phrase in contradictions:
+    if phrase in executorOutput AND "TASK_COMPLETE" in executorOutput:
+      return FAIL("CONTRADICTION: claimed completion while admitting failure")
+
+  # Layer 2: Uncommitted Spec Files
+  uncommitted = run("git status --porcelain ./specs/{spec}/tasks.md ./specs/{spec}/.progress.md")
+  if uncommitted.length > 0:
+    return FAIL("uncommitted spec files detected - task not properly committed")
+
+  # Layer 3: Checkmark Verification
+  actualCheckmarks = run("grep -c '\\- \\[x\\]' ./specs/{spec}/tasks.md")
+  expectedCheckmarks = taskIndex + 1
+  if actualCheckmarks != expectedCheckmarks:
+    return FAIL("checkmark mismatch: expected {expectedCheckmarks}, found {actualCheckmarks}")
+
+  # Layer 4: Completion Signal
+  if "TASK_COMPLETE" not in executorOutput:
+    return FAIL("missing TASK_COMPLETE signal")
+
+  return PASS
+```
+
+---
+
+## Error Recovery
+
+When any layer fails:
+
+1. Log the specific failure reason
+2. Increment `taskIteration` in state file
+3. Check against `maxTaskIterations` limit
+4. If `taskIteration` does not exceed `maxTaskIterations`: retry the same task
+5. If `taskIteration` exceeds `maxTaskIterations`: output error and stop
+
+The retry mechanism allows transient failures to self-correct while preventing infinite loops.
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 350e2d4d..548c9182 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -41,6 +41,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 1.12 Create validation script - 16fdced
 - [x] 1.13 Update CLAUDE.md with best practices
 - [x] 2.1 Create failure-recovery skill
+- [x] 2.2 Create verification-layers skill
 
 ## Current Task
 
@@ -105,7 +106,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.2 Create verification-layers skill
+Task 2.3 Create coordinator-pattern skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index d680c1ab..c213ab58 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -239,7 +239,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add failure-recovery skill`
   - _Design: New Skills - failure-recovery_
 
-- [ ] 2.2 Create verification-layers skill
+- [x] 2.2 Create verification-layers skill
   - **Do**:
     1. Extract 4-layer verification pattern from implement.md section 7
     2. Document: contradiction check, uncommitted changes, checkmark verification, completion signal

From 0ee8ba5ed539d50561f6e503e89a808eba3370de Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:57:02 +0200
Subject: [PATCH 16/37] feat(ralph-specum): add coordinator-pattern skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/coordinator-pattern/SKILL.md       | 289 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 292 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/coordinator-pattern/SKILL.md
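
Note on the skill below: the state transitions in sections 4, 6 and 7 reduce
to a small decision function. The sketch below is only an illustration under
assumptions. coordinator_step and the labels NEXT_TASK, RETRY and
ERROR_MAX_RETRIES are hypothetical, and state is passed as arguments rather
than read from .ralph-state.json. It also skips the "all tasks marked [x]"
check that the skill requires before ALL_TASKS_COMPLETE.

```shell
#!/usr/bin/env bash
# coordinator_step INDEX TOTAL ITERATION MAX_ITERATIONS SIGNAL_SEEN
# Prints "<action> <taskIndex> <taskIteration>" for the next loop turn.
# SIGNAL_SEEN is "yes" when the executor output contained TASK_COMPLETE.
# NEXT_TASK, RETRY and ERROR_MAX_RETRIES are hypothetical labels.
coordinator_step() {
  cs_index=$1; cs_total=$2; cs_iter=$3; cs_max=$4; cs_signal=$5

  # Completion check before delegation: nothing left to run.
  if [ "$cs_index" -ge "$cs_total" ]; then
    echo "ALL_TASKS_COMPLETE $cs_index $cs_iter"
    return 0
  fi

  # Executor signalled completion: advance the index, reset the iteration.
  if [ "$cs_signal" = "yes" ]; then
    cs_index=$((cs_index + 1))
    cs_iter=1
    if [ "$cs_index" -ge "$cs_total" ]; then
      echo "ALL_TASKS_COMPLETE $cs_index $cs_iter"
    else
      echo "NEXT_TASK $cs_index $cs_iter"
    fi
    return 0
  fi

  # No signal: count another attempt and stop once the limit is exceeded.
  cs_iter=$((cs_iter + 1))
  if [ "$cs_iter" -gt "$cs_max" ]; then
    echo "ERROR_MAX_RETRIES $cs_index $cs_iter"
    return 1
  fi
  echo "RETRY $cs_index $cs_iter"
}
```

For example, `coordinator_step 2 10 1 5 no` prints `RETRY 2 2`, which matches
the rule that taskIndex stays put while taskIteration grows.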

diff --git a/plugins/ralph-specum/skills/coordinator-pattern/SKILL.md b/plugins/ralph-specum/skills/coordinator-pattern/SKILL.md
new file mode 100644
index 00000000..454f87d9
--- /dev/null
+++ b/plugins/ralph-specum/skills/coordinator-pattern/SKILL.md
@@ -0,0 +1,289 @@
+---
+name: coordinator-pattern
+description: This skill should be used when the user asks about "coordinator role", "delegate to subagent", "use Task tool", "orchestration pattern", "execution loop", "task delegation", or needs guidance on implementing a coordinator that delegates work to subagents while managing state and completion signaling.
+version: 0.1.0
+---
+
+# COORDINATOR Pattern
+
+The COORDINATOR pattern enables an orchestrating agent to manage task execution by delegating to specialized subagents while tracking state and signaling completion.
+
+## Core Principle
+
+**You are a COORDINATOR, NOT an implementer.**
+
+Your job is to:
+- Read state and determine the current task
+- Delegate task execution to specialized agents via the Task tool
+- Track completion and signal when all work is done
+
+**CRITICAL**: You MUST delegate via the Task tool. Do NOT implement tasks yourself.
+
+## Pattern Components
+
+### 1. Role Definition
+
+Define clear boundaries between coordinator and executor:
+
+```text
+You are the execution COORDINATOR for spec: $spec
+
+### Role Definition
+
+You are a COORDINATOR, NOT an implementer. Your job is to:
+- Read state and determine current task
+- Delegate task execution to spec-executor via Task tool
+- Track completion and signal when all tasks done
+
+CRITICAL: You MUST delegate via Task tool. Do NOT implement tasks yourself.
+You are fully autonomous. NEVER ask questions or wait for user input.
+```
+
+**Key constraints:**
+- Coordinator reads, decides, delegates
+- Executor implements, verifies, commits
+- Coordinator NEVER writes code or modifies files directly
+- Coordinator only updates state files and coordination metadata
+
+### 2. State Reading
+
+The coordinator maintains execution state in a JSON file:
+
+```json
+{
+  "phase": "execution",
+  "taskIndex": 0,
+  "totalTasks": 10,
+  "taskIteration": 1,
+  "maxTaskIterations": 5
+}
+```
+
+**State fields:**
+| Field | Purpose |
+|-------|---------|
+| `phase` | Current execution phase (execution, complete) |
+| `taskIndex` | 0-based index of current task |
+| `totalTasks` | Total number of tasks |
+| `taskIteration` | Attempt number for current task (starts at 1) |
+| `maxTaskIterations` | Maximum attempts per task before failure |
+
+**Reading state:**
+
+```text
+Read `./specs/$spec/.ralph-state.json` to get current state.
+
+**ERROR: Missing/Corrupt State File**
+
+If state file missing or corrupt (invalid JSON, missing required fields):
+1. Output error: "ERROR: State file missing or corrupt"
+2. Suggest: "Run initialization command to reinitialize state"
+3. Do NOT continue execution
+4. Do NOT output completion signal
+```
+
+### 3. Task Delegation
+
+Delegate to specialized executor agents via the Task tool:
+
+```text
+Task: Execute task $taskIndex for spec $spec
+
+Spec: $spec
+Path: ./specs/$spec/
+Task index: $taskIndex
+
+Context from .progress.md:
+[Include relevant context]
+
+Current task from tasks.md:
+[Include full task block]
+
+Instructions:
+1. Read Do section and execute exactly
+2. Only modify Files listed
+3. Verify completion with Verify command
+4. Commit with task's Commit message
+5. Update .progress.md with completion and learnings
+6. Mark task [x] in tasks.md
+7. Output TASK_COMPLETE when done
+```
+
+**Delegation rules:**
+- Include full task specification in delegation
+- Provide all context needed for autonomous execution
+- Specify the exact completion signal expected (e.g., `TASK_COMPLETE`)
+- Wait for executor response before proceeding
+
+### 4. Completion Checking
+
+Before delegation, check if all work is done:
+
+```text
+If taskIndex >= totalTasks:
+1. Verify all tasks marked [x] in tasks.md
+2. Delete state file (cleanup)
+3. Output: ALL_TASKS_COMPLETE
+4. STOP - do not delegate any task
+```
+
+### 5. Completion Signaling
+
+The coordinator outputs a specific signal when all tasks are done:
+
+```text
+Output exactly `ALL_TASKS_COMPLETE` when:
+- taskIndex >= totalTasks AND
+- All tasks marked [x] in tasks.md
+
+Before outputting:
+1. Verify all tasks marked [x] in tasks.md
+2. Delete state file (cleanup execution state)
+3. Keep progress file (preserve learnings and history)
+
+This signal terminates the execution loop.
+
+Do NOT output ALL_TASKS_COMPLETE if tasks remain incomplete.
+Do NOT output TASK_COMPLETE (that's for executors only).
+```
+
+**Signal hierarchy:**
+| Signal | Used By | Meaning |
+|--------|---------|---------|
+| `TASK_COMPLETE` | Executor | Single task finished |
+| `ALL_TASKS_COMPLETE` | Coordinator | All tasks finished |
+
+### 6. State Update After Completion
+
+When executor signals task completion:
+
+```text
+After successful completion (TASK_COMPLETE):
+1. Read current state file
+2. Increment taskIndex by 1
+3. Reset taskIteration to 1
+4. Write updated state
+
+Check if all tasks complete:
+- If taskIndex >= totalTasks: output ALL_TASKS_COMPLETE
+- If taskIndex < totalTasks: continue to next iteration
+```
+
+### 7. Retry Handling
+
+When executor fails to signal completion:
+
+```text
+If no completion signal:
+1. Increment taskIteration in state file
+2. If taskIteration > maxTaskIterations: output error and STOP
+3. Otherwise: Retry the same task
+
+**ERROR: Max Retries Reached**
+
+If taskIteration exceeds maxTaskIterations:
+1. Output error: "ERROR: Max retries reached for task $taskIndex"
+2. Include last error/failure reason from executor output
+3. Suggest: "Fix the issue manually then run again to resume"
+4. Do NOT continue execution
+5. Do NOT output ALL_TASKS_COMPLETE
+```
+
+## Complete Coordinator Flow
+
+```text
+1. Read state file
+   |
+2. Check if taskIndex >= totalTasks
+   ├── Yes: Output ALL_TASKS_COMPLETE, STOP
+   |
+3. Parse current task from tasks.md
+   |
+4. Delegate to executor via Task tool
+   |
+5. Wait for executor response
+   |
+6. Check for completion signal
+   ├── TASK_COMPLETE: Update state, go to step 2
+   └── No signal: Increment iteration
+       ├── Under limit: Retry (step 4)
+       └── Over limit: Error and STOP
+```
+
+## Parallel Execution Extension
+
+For parallel task execution, detect adjacent parallel tasks and spawn multiple delegations:
+
+```text
+If current task has [P] marker, scan for consecutive [P] tasks.
+
+Build parallelGroup:
+{
+  "startIndex": <first [P] task index>,
+  "endIndex": <last consecutive [P] task index>,
+  "taskIndices": [startIndex, startIndex+1, ..., endIndex],
+  "isParallel": true
+}
+
+Spawn MULTIPLE Task tool calls in ONE message for true parallelism.
+Wait for ALL to complete before advancing state.
+```
+
+## Error Handling Patterns
+
+### Missing State File
+
+```text
+If state file missing or corrupt:
+1. Output error: "ERROR: State file missing or corrupt"
+2. Suggest: "Run initialization to reinitialize state"
+3. Do NOT continue
+4. Do NOT output completion signal
+```
+
+### Missing Task File
+
+```text
+If tasks file does not exist:
+1. Output error: "ERROR: Tasks file missing"
+2. Suggest: "Run task generation command first"
+3. Do NOT continue
+4. Do NOT output completion signal
+```
+
+### Executor Timeout/Failure
+
+```text
+If executor does not respond or errors:
+1. Log failure in progress file
+2. Increment taskIteration
+3. Retry if under limit
+4. Stop with error if over limit
+```
+
+## Usage in Commands
+
+Reference this skill in commands that need coordination:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/coordinator-pattern/SKILL.md`
+Use the COORDINATOR pattern to manage task execution loop.
+</skill-reference>
+```
+
+## Anti-Patterns
+
+**DO NOT:**
+- Implement tasks directly in the coordinator
+- Skip the Task tool and write code yourself
+- Modify files that should be modified by executors
+- Output completion signal before verifying all tasks done
+- Continue after error without proper state update
+
+**ALWAYS:**
+- Delegate via Task tool
+- Verify completion signals from executors
+- Update state after each task
+- Clean up state file on completion
+- Preserve progress/learnings files
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 548c9182..26e7c4bf 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -42,6 +42,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 1.13 Update CLAUDE.md with best practices
 - [x] 2.1 Create failure-recovery skill
 - [x] 2.2 Create verification-layers skill
+- [x] 2.3 Create coordinator-pattern skill
 
 ## Current Task
 
@@ -106,7 +107,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.3 Create coordinator-pattern skill
+Task 2.4 [VERIFY] Quality checkpoint: new skills batch 1
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index c213ab58..23d3bce1 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -250,7 +250,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add verification-layers skill`
   - _Design: New Skills - verification-layers_
 
-- [ ] 2.3 Create coordinator-pattern skill
+- [x] 2.3 Create coordinator-pattern skill
   - **Do**:
     1. Extract coordinator prompt pattern from implement.md
     2. Document role definition, state reading, task delegation, completion signaling

From 077b35e0d8480ace62c5bafe9b2e6c46edd0ac41 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 21:59:07 +0200
Subject: [PATCH 17/37] feat(ralph-specum): add branch-management skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/branch-management/SKILL.md         | 209 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |  10 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 219 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/branch-management/SKILL.md
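
Note on the skill below: the sanitization rules and the "-2, -3" suffix rule
in "Branch Naming Convention" can be sketched as two shell functions. This is
a sketch under assumptions, not the plugin's code. sanitize_spec_name,
branch_in_list and unique_branch_name are hypothetical names, and existing
branches are passed in as arguments so the example does not need a
repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the branch naming rules.

# Lowercase, spaces to hyphens, drop other non-alphanumerics,
# collapse repeated hyphens, trim leading and trailing hyphens.
sanitize_spec_name() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[[:space:]]/-/g' \
          -e 's/[^a-z0-9-]//g' \
          -e 's/--*/-/g' \
          -e 's/^-//' \
          -e 's/-$//'
}

# Succeeds when the first argument equals one of the remaining arguments.
branch_in_list() {
  bl_needle=$1; shift
  for bl_name in "$@"; do
    [ "$bl_name" = "$bl_needle" ] && return 0
  done
  return 1
}

# unique_branch_name BASE [EXISTING...]
# Prints BASE, or BASE-2, BASE-3, ... if the name is already taken.
unique_branch_name() {
  ub_base=$1; shift
  ub_candidate=$ub_base
  ub_n=2
  while branch_in_list "$ub_candidate" "$@"; do
    ub_candidate="$ub_base-$ub_n"
    ub_n=$((ub_n + 1))
  done
  printf '%s\n' "$ub_candidate"
}
```

In a real repository the list of existing names could come from
`git for-each-ref --format='%(refname:short)' refs/heads/`, and the result
would be used as `feat/$(sanitize_spec_name "$SPEC_NAME")`.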

diff --git a/plugins/ralph-specum/skills/branch-management/SKILL.md b/plugins/ralph-specum/skills/branch-management/SKILL.md
new file mode 100644
index 00000000..99dc6b10
--- /dev/null
+++ b/plugins/ralph-specum/skills/branch-management/SKILL.md
@@ -0,0 +1,209 @@
+---
+name: branch-management
+description: This skill should be used when the user asks about "git branching", "create feature branch", "git worktree", "branch naming", "default branch detection", "branch workflow", or needs guidance on branch creation, worktree setup, naming conventions, and default branch handling for spec-driven development.
+version: 0.1.0
+---
+
+# Branch Management
+
+Branch workflow patterns for spec-driven development. Ensures work happens on feature branches, never on main/master.
+
+## Check Current Branch
+
+```bash
+git branch --show-current
+```
+
+## Determine Default Branch
+
+Check which is the default branch:
+
+```bash
+git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@'
+```
+
+If that fails, check which exists:
+
+```bash
+git rev-parse --verify --quiet origin/main >/dev/null && echo "main" || echo "master"
+```
+
+## Branch Decision Logic
+
+```text
+1. Get current branch name
+   |
+   +-- ON DEFAULT BRANCH (main/master):
+   |   |
+   |   +-- Ask user for branch strategy:
+   |   |   "Starting new spec work. How would you like to handle branching?"
+   |   |   1. Create branch in current directory (git checkout -b)
+   |   |   2. Create git worktree (separate directory)
+   |   |
+   |   +-- If user chooses 1 (current directory):
+   |   |   - Generate branch name from spec name: feat/$specName
+   |   |   - If spec name not yet known, use temp name: feat/spec-work-<timestamp>
+   |   |   - Create and switch: git checkout -b <branch-name>
+   |   |   - Inform user: "Created branch '<branch-name>' for this work"
+   |   |
+   |   +-- If user chooses 2 (worktree):
+   |   |   - See Worktree Setup section below
+   |   |   - STOP HERE after worktree creation (user needs to switch directories)
+   |   |
+   |   +-- Continue to next workflow step
+   |
+   +-- ON NON-DEFAULT BRANCH (feature branch):
+       |
+       +-- Ask user for preference:
+       |   "You are currently on branch '<current-branch>'.
+       |    Would you like to:
+       |    1. Continue working on this branch
+       |    2. Create a new branch in current directory
+       |    3. Create git worktree (separate directory)"
+       |
+       +-- If user chooses 1 (continue):
+       |   - Stay on current branch
+       |   - Continue to next workflow step
+       |
+       +-- If user chooses 2 (new branch):
+       |   - Generate branch name from spec name: feat/$specName
+       |   - Create and switch: git checkout -b <branch-name>
+       |
+       +-- If user chooses 3 (worktree):
+           - See Worktree Setup section below
+           - STOP HERE after worktree creation
+```
+
+## Branch Naming Convention
+
+When creating a new branch:
+- Use format: `feat/<spec-name>` (e.g., `feat/user-auth`)
+- If spec name contains special chars, sanitize to kebab-case
+- If branch already exists, append `-2`, `-3`, etc.
+
+Example:
+```text
+Spec name: user-auth
+Branch: feat/user-auth
+
+If feat/user-auth exists:
+Branch: feat/user-auth-2
+```
+
+Sanitization rules:
+- Convert to lowercase
+- Replace spaces with hyphens
+- Remove non-alphanumeric characters (except hyphens)
+- Collapse multiple hyphens to single hyphen
+- Trim leading/trailing hyphens
+
+## Worktree Setup
+
+Git worktrees allow working on multiple branches simultaneously in separate directories.
+
+### When to Use Worktrees
+
+- Long-running feature development that may need interruption
+- Parallel work on multiple specs
+- Preserving main branch checkout for quick fixes
+
+### Worktree Creation
+
+```bash
+# Get repo name for path suggestion
+REPO_NAME=$(basename "$(git rev-parse --show-toplevel)")
+
+# If SPEC_NAME empty but .current-spec exists, read from it
+if [ -z "$SPEC_NAME" ] && [ -f "./specs/.current-spec" ]; then
+    SPEC_NAME=$(cat "./specs/.current-spec") || true
+fi
+
+# Default worktree path
+WORKTREE_PATH="../${REPO_NAME}-${SPEC_NAME}"
+
+# Create worktree with new branch
+git worktree add "$WORKTREE_PATH" -b "feat/${SPEC_NAME}"
+```
+
+### Copy Spec State Files to Worktree
+
+When creating a worktree, copy state files so work can continue:
+
+```bash
+# Copy spec state files to worktree (failures are warnings, not errors)
+if [ -d "./specs" ]; then
+    mkdir -p "$WORKTREE_PATH/specs" || echo "Warning: Failed to create specs directory in worktree"
+
+    # Copy .current-spec if exists (don't overwrite existing)
+    if [ -f "./specs/.current-spec" ] && [ ! -f "$WORKTREE_PATH/specs/.current-spec" ]; then
+        cp "./specs/.current-spec" "$WORKTREE_PATH/specs/.current-spec" || echo "Warning: Failed to copy .current-spec to worktree"
+    fi
+
+    # If spec name known, copy spec state files
+    if [ -n "$SPEC_NAME" ] && [ -d "./specs/$SPEC_NAME" ]; then
+        mkdir -p "$WORKTREE_PATH/specs/$SPEC_NAME" || echo "Warning: Failed to create spec directory in worktree"
+
+        # Copy state files (don't overwrite existing)
+        if [ -f "./specs/$SPEC_NAME/.ralph-state.json" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.ralph-state.json" ]; then
+            cp "./specs/$SPEC_NAME/.ralph-state.json" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .ralph-state.json to worktree"
+        fi
+
+        if [ -f "./specs/$SPEC_NAME/.progress.md" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.progress.md" ]; then
+            cp "./specs/$SPEC_NAME/.progress.md" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .progress.md to worktree"
+        fi
+    fi
+fi
+```
+
+### State Files Copied
+
+- `specs/.current-spec` - Active spec name pointer
+- `specs/$SPEC_NAME/.ralph-state.json` - Loop state (phase, taskIndex, iterations)
+- `specs/$SPEC_NAME/.progress.md` - Progress tracking and learnings
+
+Copy uses non-overwrite semantics (skips if file already exists in target).
+
+### Post-Worktree Instructions
+
+After worktree creation, output clear guidance:
+
+```text
+Created worktree at '<path>' on branch '<branch-name>'
+Spec state files copied to worktree.
+
+For best results, cd to the worktree directory and start a new Claude Code session from there:
+
+  cd <path>
+  claude
+
+Then run /ralph-specum:research to begin the research phase.
+```
+
+**STOP after worktree creation** - do not continue to next workflow steps. The user needs to switch directories first to work in the worktree.
+
+### Worktree Cleanup
+
+To clean up a worktree later:
+```bash
+git worktree remove <path>
+```
+
+## Quick Mode Branch Handling
+
+In `--quick` mode, skip all branch prompts:
+
+- If on default branch: auto-create feature branch in current directory (no worktree prompt)
+- If on non-default branch: stay on current branch (no prompt, quick mode is non-interactive)
+
+## Default Branch Protection
+
+<mandatory>
+NEVER push directly to the default branch (main/master). This is NON-NEGOTIABLE.
+
+If you need to push changes:
+1. First verify you're NOT on the default branch: `git branch --show-current`
+2. If somehow on default branch, STOP and alert the user
+3. Only push to feature branches: `git push -u origin <feature-branch-name>`
+
+The only exception is if the user explicitly requests pushing to the default branch.
+</mandatory>
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 26e7c4bf..73d55624 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -43,6 +43,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.1 Create failure-recovery skill
 - [x] 2.2 Create verification-layers skill
 - [x] 2.3 Create coordinator-pattern skill
+- [x] 2.5 Create branch-management skill
 
 ## Current Task
 
@@ -50,6 +51,13 @@ Awaiting next task
 
 ## Learnings
 
+### Verification: 2.4 [VERIFY] Quality checkpoint: new skills batch 1
+- Status: PASS
+- Verified: 3/3 skills exist with version field
+- Skills checked: failure-recovery (v0.1.0), verification-layers (v0.1.0), coordinator-pattern (v0.1.0)
+- All files at: plugins/ralph-specum/skills/{skill-name}/SKILL.md
+- No fixes needed
+
 ### Verification: 1.14 [VERIFY] Phase A complete validation
 - Status: PASS
 - Script: `bash scripts/validate-plugins.sh && echo "Phase A PASS"`
@@ -107,7 +115,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.4 [VERIFY] Quality checkpoint: new skills batch 1
+Task 2.6 Create intent-classification skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 23d3bce1..b1f1b99c 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -267,7 +267,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Done when**: All 3 skills exist with version field
   - **Commit**: `chore(ralph-specum): pass new skills batch 1 checkpoint` (only if fixes needed)
 
-- [ ] 2.5 Create branch-management skill
+- [x] 2.5 Create branch-management skill
   - **Do**:
     1. Extract branch management logic from start.md
     2. Document branch creation, worktree setup, naming conventions, default branch detection

From b0c92f0908009fdc61949c5462c3461b82524ce8 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:01:39 +0200
Subject: [PATCH 18/37] feat(ralph-specum): add intent-classification skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/intent-classification/SKILL.md     | 213 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 216 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/intent-classification/SKILL.md

diff --git a/plugins/ralph-specum/skills/intent-classification/SKILL.md b/plugins/ralph-specum/skills/intent-classification/SKILL.md
new file mode 100644
index 00000000..a626bb0e
--- /dev/null
+++ b/plugins/ralph-specum/skills/intent-classification/SKILL.md
@@ -0,0 +1,213 @@
+---
+name: intent-classification
+description: This skill should be used when the user asks about "classify user goal", "detect intent type", "keyword matching", "question count determination", "TRIVIAL vs GREENFIELD", "goal type detection", or needs guidance on categorizing user goals to determine interview depth and question count for spec-driven development.
+version: 0.1.0
+---
+
+# Intent Classification
+
+Classify user goals to determine appropriate interview depth. Different goal types require different levels of discovery.
+
+## Classification Logic
+
+Analyze the goal text for keywords to determine intent type:
+
+```text
+Intent Classification:
+
+1. TRIVIAL: Goal contains keywords like:
+   - "fix typo", "typo", "spelling"
+   - "small change", "minor"
+   - "quick", "simple", "tiny"
+   - "rename", "update text"
+   -> Min questions: 1, Max questions: 2
+
+2. REFACTOR: Goal contains keywords like:
+   - "refactor", "restructure", "reorganize"
+   - "clean up", "cleanup", "simplify"
+   - "extract", "consolidate", "modularize"
+   - "improve code", "tech debt"
+   -> Min questions: 3, Max questions: 5
+
+3. GREENFIELD: Goal contains keywords like:
+   - "new feature", "new system", "new module"
+   - "add", "build", "implement", "create"
+   - "integrate", "introduce"
+   - "from scratch"
+   -> Min questions: 5, Max questions: 10
+
+4. MID_SIZED: Default if no clear match
+   -> Min questions: 3, Max questions: 7
+```
+
+## Keyword Tables
+
+### TRIVIAL Keywords
+
+| Keyword | Confidence Boost |
+|---------|------------------|
+| fix typo | high |
+| typo | high |
+| spelling | high |
+| small change | high |
+| minor | medium |
+| quick | medium |
+| simple | medium |
+| tiny | high |
+| rename | medium |
+| update text | medium |
+
+### REFACTOR Keywords
+
+| Keyword | Confidence Boost |
+|---------|------------------|
+| refactor | high |
+| restructure | high |
+| reorganize | high |
+| clean up | high |
+| cleanup | high |
+| simplify | medium |
+| extract | medium |
+| consolidate | medium |
+| modularize | high |
+| improve code | medium |
+| tech debt | high |
+
+### GREENFIELD Keywords
+
+| Keyword | Confidence Boost |
+|---------|------------------|
+| new feature | high |
+| new system | high |
+| new module | high |
+| add | low |
+| build | medium |
+| implement | medium |
+| create | medium |
+| integrate | medium |
+| introduce | medium |
+| from scratch | high |
+
+## Confidence Threshold
+
+| Match Count | Confidence | Action |
+|-------------|------------|--------|
+| 3+ keywords | High | Use matched category |
+| 1-2 keywords | Medium | Use matched category |
+| 0 keywords | Low | Default to MID_SIZED |
+
+## Question Count by Intent
+
+Intent classification determines the question count range, not which questions to ask. All goals use the same interview question pool, but the number of questions varies by intent:
+
+| Intent | Min Questions | Max Questions |
+|--------|---------------|---------------|
+| TRIVIAL | 1 | 2 |
+| REFACTOR | 3 | 5 |
+| GREENFIELD | 5 | 10 |
+| MID_SIZED | 3 | 7 |
+
+## Classification Algorithm
+
+```text
+function classifyIntent(goalText):
+  goalLower = goalText.toLowerCase()
+
+  trivialScore = countMatches(goalLower, TRIVIAL_KEYWORDS)
+  refactorScore = countMatches(goalLower, REFACTOR_KEYWORDS)
+  greenfieldScore = countMatches(goalLower, GREENFIELD_KEYWORDS)
+
+  maxScore = max(trivialScore, refactorScore, greenfieldScore)
+
+  if maxScore == 0:
+    return { type: "MID_SIZED", confidence: "low", minQ: 3, maxQ: 7 }
+
+  if trivialScore == maxScore:
+    return { type: "TRIVIAL", confidence: getConfidence(trivialScore), minQ: 1, maxQ: 2 }
+
+  if refactorScore == maxScore:
+    return { type: "REFACTOR", confidence: getConfidence(refactorScore), minQ: 3, maxQ: 5 }
+
+  if greenfieldScore == maxScore:
+    return { type: "GREENFIELD", confidence: getConfidence(greenfieldScore), minQ: 5, maxQ: 10 }
+
+function getConfidence(score):
+  if score >= 3: return "high"
+  if score >= 1: return "medium"
+  return "low"
+```
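+
+A runnable sketch of the pseudocode above, using the keyword lists from the tables. Two points are assumptions rather than something the pseudocode specifies: `matchedKeywords` (a hypothetical stand-in for `countMatches`) treats a keyword as matched when it appears as a whole word or phrase, and overlapping phrases such as "fix typo" and "typo" are counted separately. As in the pseudocode, the Confidence Boost column is not used for scoring.
+
+```javascript
+const KEYWORDS = {
+  TRIVIAL: ["fix typo", "typo", "spelling", "small change", "minor", "quick", "simple", "tiny", "rename", "update text"],
+  REFACTOR: ["refactor", "restructure", "reorganize", "clean up", "cleanup", "simplify", "extract", "consolidate", "modularize", "improve code", "tech debt"],
+  GREENFIELD: ["new feature", "new system", "new module", "add", "build", "implement", "create", "integrate", "introduce", "from scratch"],
+};
+
+const RANGES = {
+  TRIVIAL: { minQ: 1, maxQ: 2 },
+  REFACTOR: { minQ: 3, maxQ: 5 },
+  GREENFIELD: { minQ: 5, maxQ: 10 },
+  MID_SIZED: { minQ: 3, maxQ: 7 },
+};
+
+// Assumption: whole-word/phrase matching, so "add" does not match "address".
+function matchedKeywords(goalLower, keywords) {
+  return keywords.filter((kw) => new RegExp(`\\b${kw}\\b`).test(goalLower));
+}
+
+function classifyIntent(goalText) {
+  const goalLower = goalText.toLowerCase();
+  // MID_SIZED is the default; it is kept when no keyword matches at all.
+  let best = { type: "MID_SIZED", matched: [] };
+  // Order matters: the strict ">" means the earlier type wins a tie,
+  // which mirrors the order of the checks in the pseudocode.
+  for (const type of ["TRIVIAL", "REFACTOR", "GREENFIELD"]) {
+    const matched = matchedKeywords(goalLower, KEYWORDS[type]);
+    if (matched.length > best.matched.length) best = { type, matched };
+  }
+  const n = best.matched.length;
+  const confidence = n >= 3 ? "high" : n >= 1 ? "medium" : "low";
+  return { type: best.type, confidence, ...RANGES[best.type], matched: best.matched };
+}
+```
+
+Making the tie-break explicit is a design choice of this sketch: a goal that matches TRIVIAL and GREENFIELD keywords equally is classified as TRIVIAL, as the pseudocode's check order implies.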
+
+## Store Intent in Progress File
+
+After classification, store the result in `.progress.md`:
+
+```markdown
+## Interview Format
+- Version: 1.0
+
+## Intent Classification
+- Type: [TRIVIAL|REFACTOR|GREENFIELD|MID_SIZED]
+- Confidence: [high|medium|low] ([N] keywords matched)
+- Min questions: [N]
+- Max questions: [N]
+- Keywords matched: [list of matched keywords]
+```
+
+## Question Selection Logic
+
+```text
+1. Get intent from Intent Classification step
+2. Intent determines question COUNT, not which pool to use
+3. All goals use the same interview question pool
+4. Ask Required questions first, then Optional questions
+5. Stop when:
+   - User signals completion (after minRequired reached)
+   - All questions asked (maxAllowed reached)
+   - User selects "No, let's proceed" on optional question
+```
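+
+The stop conditions above can be expressed as a small predicate. This is only a sketch: `shouldStopInterview` and its parameter names are hypothetical, `minQ`/`maxQ` are assumed to be the range returned by intent classification, and `userDone` is assumed to cover both "user signals completion" and "No, let's proceed".
+
+```javascript
+// Hypothetical helper mirroring the stop conditions listed above.
+function shouldStopInterview({ asked, minQ, maxQ, userDone }) {
+  // Stop unconditionally once the maximum number of questions has been asked.
+  if (asked >= maxQ) return true;
+  // The user may end the interview only after the minimum has been reached.
+  return userDone && asked >= minQ;
+}
+```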
+
+## Examples
+
+### Example 1: Trivial Intent
+
+**Goal**: "Fix typo in README"
+
+**Classification**:
+- Keywords matched: "fix typo", "typo"
+- Type: TRIVIAL
+- Confidence: medium (2 keywords)
+- Min questions: 1
+- Max questions: 2
+
+### Example 2: Greenfield Intent
+
+**Goal**: "Build a new authentication system with OAuth2"
+
+**Classification**:
+- Keywords matched: "build"
+- Type: GREENFIELD
+- Confidence: medium (1 keyword)
+- Min questions: 5
+- Max questions: 10
+
+### Example 3: Refactor Intent
+
+**Goal**: "Refactor the user service to extract common utilities"
+
+**Classification**:
+- Keywords matched: "refactor", "extract"
+- Type: REFACTOR
+- Confidence: medium (2 keywords)
+- Min questions: 3
+- Max questions: 5
+
+### Example 4: Default MID_SIZED
+
+**Goal**: "Update the dashboard to show metrics"
+
+**Classification**:
+- Keywords matched: none
+- Type: MID_SIZED
+- Confidence: low
+- Min questions: 3
+- Max questions: 7
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 73d55624..abe09154 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -44,6 +44,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.2 Create verification-layers skill
 - [x] 2.3 Create coordinator-pattern skill
 - [x] 2.5 Create branch-management skill
+- [x] 2.6 Create intent-classification skill
 
 ## Current Task
 
@@ -115,7 +116,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.6 Create intent-classification skill
+Task 2.7 Create spec-scanner skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index b1f1b99c..e9c5f956 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -278,7 +278,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add branch-management skill`
   - _Design: New Skills - branch-management_
 
-- [ ] 2.6 Create intent-classification skill
+- [x] 2.6 Create intent-classification skill
   - **Do**:
     1. Extract intent classification logic from start.md
     2. Document goal type detection, keyword matching, question count determination

From 4c292172bbcf60374c02647791861f549c6bd0ce Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:03:08 +0200
Subject: [PATCH 19/37] feat(ralph-specum): add spec-scanner skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../ralph-specum/skills/spec-scanner/SKILL.md | 249 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 252 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/spec-scanner/SKILL.md

diff --git a/plugins/ralph-specum/skills/spec-scanner/SKILL.md b/plugins/ralph-specum/skills/spec-scanner/SKILL.md
new file mode 100644
index 00000000..6b0bd32b
--- /dev/null
+++ b/plugins/ralph-specum/skills/spec-scanner/SKILL.md
@@ -0,0 +1,249 @@
+---
+name: spec-scanner
+description: This skill should be used when the user asks about "find related specs", "scan existing specs", "spec discovery", "keyword matching for specs", "related work detection", "prior context surfacing", or needs guidance on discovering and recommending related specs before starting new work.
+version: 0.1.0
+---
+
+# Spec Scanner
+
+Scan existing specs to find related work before starting new specs. This surfaces prior context and helps avoid duplicate effort.
+
+## When to Use
+
+Before conducting a Goal Interview, scan for related specs to:
+- Surface prior context that may inform the new work
+- Avoid duplicate effort on similar goals
+- Enable informed interview questions about relationships to existing work
+
+**Skip the spec scanner if the `--quick` flag is detected.**
+
+## Scan Steps
+
+```text
+1. List all directories in ./specs/
+   - Run: ls -d ./specs/*/ 2>/dev/null | xargs -I{} basename {}
+   - Exclude the current spec being created (if known)
+   |
+2. For each spec directory found:
+   - Read ./specs/$specName/.progress.md
+   - Extract "Original Goal" section (line after "## Original Goal")
+   - If .progress.md doesn't exist, skip this spec
+   |
+3. Keyword matching:
+   - Extract keywords from current goal (split by spaces, lowercase)
+   - Remove common words: "the", "a", "an", "to", "for", "with", "and", "or", "is", "it", "this", "that", "be", "on", "in", "of"
+   - For each existing spec, count matching keywords with its Original Goal
+   - Score = number of matching keywords
+   |
+4. Rank and filter:
+   - Sort specs by score (descending)
+   - Take top 3 specs with score > 0
+   - If no matches found, skip display step
+   |
+5. Display related specs (if any found):
+   |
+   Related specs found:
+   - spec-name-1: [first 50 chars of Original Goal]...
+   - spec-name-2: [first 50 chars of Original Goal]...
+   - spec-name-3: [first 50 chars of Original Goal]...
+   |
+   This context may inform the interview questions.
+   |
+6. Store in state file:
+   - Update .ralph-state.json with relatedSpecs array:
+     {
+       ...existing state,
+       "relatedSpecs": [
+         {"name": "spec-name-1", "goal": "Original Goal text", "score": N},
+         {"name": "spec-name-2", "goal": "Original Goal text", "score": N},
+         {"name": "spec-name-3", "goal": "Original Goal text", "score": N}
+       ]
+     }
+```
+
+## Keyword Extraction
+
+Extract meaningful keywords from the goal by removing stop words:
+
+```javascript
+// Pseudocode for keyword extraction
+function extractKeywords(text) {
+  const stopWords = ["the", "a", "an", "to", "for", "with", "and", "or", "is", "it", "this", "that", "be", "on", "in", "of"];
+  return text
+    .toLowerCase()
+    .split(/\s+/)
+    .filter(word => word.length > 2)
+    .filter(word => !stopWords.includes(word));
+}
+```
+
+### Stop Words List
+
+| Category | Words |
+|----------|-------|
+| Articles | the, a, an |
+| Prepositions | to, for, with, on, in, of |
+| Conjunctions | and, or |
+| Pronouns | it, this, that |
+| Verbs (common) | is, be |
+
+## Match Scoring
+
+Simple keyword overlap scoring:
+
+```javascript
+// Pseudocode for scoring
+function scoreMatch(currentGoalKeywords, existingGoalKeywords) {
+  let score = 0;
+  for (const keyword of currentGoalKeywords) {
+    if (existingGoalKeywords.includes(keyword)) {
+      score += 1;
+    }
+  }
+  return score;
+}
+```
+
+### Scoring Rules
+
+| Score | Interpretation | Action |
+|-------|----------------|--------|
+| 0 | No keyword overlap | Exclude from results |
+| 1-2 | Low relevance | Include if in top 3 |
+| 3-5 | Medium relevance | Prioritize in results |
+| 6+ | High relevance | Definitely include, may indicate duplicate |
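+
+Putting extraction, scoring, and the rank-and-filter step together, a minimal runnable sketch could look like this. `rankRelatedSpecs` and the `{ name, goal }` input shape are hypothetical; the stop-word list and the length filter are taken from the pseudocode above, and matching is exact (so "token" does not match "tokens").
+
+```javascript
+const STOP_WORDS = new Set(["the", "a", "an", "to", "for", "with", "and", "or", "is", "it", "this", "that", "be", "on", "in", "of"]);
+
+// Lowercase, split on whitespace, drop short words and stop words.
+function extractKeywords(text) {
+  return text
+    .toLowerCase()
+    .split(/\s+/)
+    .filter((word) => word.length > 2 && !STOP_WORDS.has(word));
+}
+
+// Hypothetical helper: `specs` is assumed to be an array of { name, goal }.
+// Returns at most 3 specs with score > 0, highest score first.
+function rankRelatedSpecs(currentGoal, specs) {
+  const currentKeywords = extractKeywords(currentGoal);
+  return specs
+    .map(({ name, goal }) => {
+      const existing = new Set(extractKeywords(goal));
+      const score = currentKeywords.filter((kw) => existing.has(kw)).length;
+      return { name, goal, score };
+    })
+    .filter((spec) => spec.score > 0)   // score 0: exclude from results
+    .sort((a, b) => b.score - a.score)  // descending by score
+    .slice(0, 3);                       // keep the top 3
+}
+```
+
+The returned objects have the same `name`/`goal`/`score` fields as the `relatedSpecs` entries stored in the state file.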
+
+## Output Format
+
+### Related Specs Display
+
+```text
+Related specs found:
+- user-auth: Add OAuth2 authentication with JWT tokens...
+- api-refactor: Restructure API endpoints for better...
+- error-handling: Implement consistent error handling...
+
+This context may inform the interview questions.
+```
+
+### State File Format
+
+```json
+{
+  "relatedSpecs": [
+    {"name": "user-auth", "goal": "Add OAuth2 authentication with JWT tokens", "score": 4},
+    {"name": "api-refactor", "goal": "Restructure API endpoints for better", "score": 2},
+    {"name": "error-handling", "goal": "Implement consistent error handling", "score": 1}
+  ]
+}
+```
+
+## Usage in Interview
+
+After scanning, if related specs were found, reference them when asking clarifying questions:
+
+- "I noticed you have a spec 'user-auth' for authentication. Does this new feature relate to or depend on that work?"
+- "There's an existing 'api-refactor' spec. Should this work integrate with those changes?"
+- "The 'error-handling' spec covers similar error patterns. Should we follow the same approach?"
+
+## Examples
+
+### Example 1: Finding Related Authentication Work
+
+**Current Goal**: "Add password reset functionality"
+
+**Existing Specs**:
+- user-auth: "Add OAuth2 authentication with JWT tokens"
+- api-refactor: "Restructure API endpoints"
+- dashboard: "Create admin dashboard"
+
+**Keywords extracted**: ["add", "password", "reset", "functionality"]
+
+**Matching**:
+- user-auth: 1 match ("add")
+- api-refactor: 0 matches
+- dashboard: 0 matches
+
+**Output**:
+```text
+Related specs found:
+- user-auth: Add OAuth2 authentication with JWT tokens...
+
+This context may inform the interview questions.
+```
+
+### Example 2: Low-Relevance Match
+
+**Current Goal**: "Add unit tests for payment module"
+
+**Existing Specs**:
+- user-auth: "Add OAuth2 authentication"
+- dashboard: "Create admin dashboard"
+
+**Keywords extracted**: ["add", "unit", "tests", "payment", "module"]
+
+**Matching**:
+- user-auth: 1 match ("add") - low score
+- dashboard: 0 matches
+
+**Output** (user-auth has score > 0, so it is listed):
+```text
+Related specs found:
+- user-auth: Add OAuth2 authentication...
+
+This context may inform the interview questions.
+```
+
+### Example 3: Multiple Matches
+
+**Current Goal**: "Refactor authentication to use new token system"
+
+**Existing Specs**:
+- user-auth: "Add OAuth2 authentication with JWT tokens"
+- token-refresh: "Implement token refresh mechanism"
+- api-auth: "Add authentication to API endpoints"
+
+**Keywords extracted**: ["refactor", "authentication", "use", "new", "token", "system"]
+
+**Matching**:
+- user-auth: 1 match ("authentication"; "tokens" does not match "token" exactly)
+- token-refresh: 1 match ("token")
+- api-auth: 1 match ("authentication")
+
+**Output**:
+```text
+Related specs found:
+- user-auth: Add OAuth2 authentication with JWT tokens...
+- token-refresh: Implement token refresh mechanism...
+- api-auth: Add authentication to API endpoints...
+
+This context may inform the interview questions.
+```
+
+## Implementation Notes
+
+### Bash Commands for Scanning
+
+```bash
+# List spec directories
+ls -d ./specs/*/ 2>/dev/null | xargs -I{} basename {}
+
+# Read original goal from progress file
+sed -n '/## Original Goal/{n;p;}' "./specs/$specName/.progress.md"
+```
+
+### Edge Cases
+
+| Scenario | Handling |
+|----------|----------|
+| No ./specs/ directory | Skip scanning, proceed to interview |
+| Only current spec exists | Skip scanning (no other specs to compare) |
+| .progress.md missing | Skip that spec |
+| Empty Original Goal | Skip that spec |
+| All scores are 0 | Skip display, proceed to interview |
+
+### Performance Considerations
+
+- Scanner should complete in < 2 seconds for typical projects
+- For projects with > 20 specs, consider caching keyword extraction
+- Truncate goal text to first 50 chars for display
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index abe09154..0d261ae9 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -45,6 +45,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.3 Create coordinator-pattern skill
 - [x] 2.5 Create branch-management skill
 - [x] 2.6 Create intent-classification skill
+- [x] 2.7 Create spec-scanner skill
 
 ## Current Task
 
@@ -116,7 +117,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.7 Create spec-scanner skill
+Task 2.8 Create parallel-research skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index e9c5f956..292a4254 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -289,7 +289,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add intent-classification skill`
   - _Design: New Skills - intent-classification_
 
-- [ ] 2.7 Create spec-scanner skill
+- [x] 2.7 Create spec-scanner skill
   - **Do**:
     1. Extract spec discovery logic from start.md
     2. Document related specs finding, status checking, recommendation logic

From 08d35a93c25ec3dc41e27e1fdc00f32133a585fb Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:04:59 +0200
Subject: [PATCH 20/37] feat(ralph-specum): add parallel-research skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/parallel-research/SKILL.md         | 311 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 314 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/parallel-research/SKILL.md

diff --git a/plugins/ralph-specum/skills/parallel-research/SKILL.md b/plugins/ralph-specum/skills/parallel-research/SKILL.md
new file mode 100644
index 00000000..db02285a
--- /dev/null
+++ b/plugins/ralph-specum/skills/parallel-research/SKILL.md
@@ -0,0 +1,311 @@
+---
+name: parallel-research
+description: This skill should be used when the user asks about "parallel research", "multi-agent spawning", "parallel execution", "concurrent subagents", "research merge algorithm", "spawn multiple agents", or needs guidance on executing research tasks in parallel using multiple subagents and merging their results.
+version: 0.1.0
+---
+
+# Parallel Research Pattern
+
+The parallel research pattern enables fast, comprehensive research by spawning multiple specialized subagents simultaneously and merging their results into a unified output.
+
+## Core Principle
+
+**Parallel execution is MANDATORY - NO EXCEPTIONS.**
+
+Even for simple goals:
+- Minimum: 2 agents (1 research-analyst + 1 Explore)
+- Standard: 3-4 agents (2-3 research-analyst + 1-2 Explore)
+- Complex: 5+ agents (3-4 research-analyst + 2-3 Explore)
+
+**ALL agent Task calls MUST be in ONE message** to achieve true parallelism.
+
+## Subagent Types
+
+| Task Type | Subagent | Reason |
+|-----------|----------|--------|
+| Web search for best practices | `research-analyst` | Needs WebSearch/WebFetch tools |
+| Library/API documentation | `research-analyst` | Needs web access |
+| Codebase pattern analysis | `Explore` | Fast, read-only, optimized for code |
+| Related specs discovery | `Explore` | Fast scanning of ./specs/ |
+| Quality commands discovery | `Explore` | Fast package.json/Makefile analysis |
+| File structure exploration | `Explore` | Fast, uses Haiku model |
+
+## Pattern Components
+
+### 1. Topic Analysis
+
+Before spawning agents, identify distinct research topics:
+
+```text
+Research topics identified for parallel execution:
+1. [Topic name] - [Agent type: research-analyst/Explore]
+2. [Topic name] - [Agent type: research-analyst/Explore]
+3. [Topic name] - [Agent type: research-analyst/Explore] (if applicable)
+...
+```
+
+**Topic splitting guidelines:**
+
+| Scenario | Recommendation |
+|----------|----------------|
+| Simple, focused goal | 2 agents: 1 research-analyst (web) + 1 Explore (codebase) |
+| Goal spans multiple domains | 3-5 agents: 2-3 research-analyst + 1-2 Explore |
+| Goal involves external APIs + codebase | 2+ research-analyst + 1+ Explore |
+| Goal touches multiple components | Multiple Explore + multiple research-analyst |
+| Complex architecture question | 5+ agents: 3-4 research-analyst + 2-3 Explore |
+
+**IMPORTANT: Break external research into MULTIPLE research-analyst agents**
+- If goal involves multiple external topics (e.g., "authentication + security"), spawn separate research-analyst agents for EACH topic
+- Example: "Add OAuth with rate limiting" = 3 research-analyst agents (OAuth patterns, rate limiting strategies, security best practices)
+- DO NOT combine multiple external topics into one research-analyst agent
+
+### 2. Pre-Execution Checklist
+
+Before spawning agents, verify:
+
+- [ ] Listed at least 2 distinct research topics
+- [ ] Assigned appropriate agent type (Explore or research-analyst) to each topic
+- [ ] Prepared unique output file path for each agent (.research-*.md)
+- [ ] Prepared all Task tool calls in your response (ready to send in ONE message)
+- [ ] NOT written any code/searches yourself (coordinator does not implement)
+
+### 3. Multi-Agent Spawning
+
+**CRITICAL**: All Task tool calls MUST be in a SINGLE response message.
+
+**WRONG (Sequential)** - Each Task call in separate message:
+```text
+Message 1: Task(subagent_type: research-analyst, topic: best practices)
+[wait for result]
+Message 2: Task(subagent_type: Explore, topic: codebase)
+[wait for result]
+```
+Result: Agents run one after another = SLOW
+
+**CORRECT (Parallel)** - All Task calls in ONE message:
+```text
+Message 1:
+  Task(subagent_type: research-analyst, topic: best practices)
+  Task(subagent_type: Explore, topic: codebase)
+  Task(subagent_type: Explore, topic: quality commands)
+[all agents start simultaneously]
+```
+Result: Agents run at the same time = FAST (2-3x faster)
+
+### 4. Task Delegation Templates
+
+**External Research (research-analyst):**
+
+```text
+subagent_type: research-analyst
+
+You are researching for spec: $spec
+Spec path: ./specs/$spec/
+Topic: [SPECIFIC TOPIC]
+
+Focus ONLY on web research for THIS specific topic:
+1. WebSearch for best practices, industry standards
+2. WebSearch for common pitfalls and gotchas
+3. Research relevant libraries/frameworks
+4. Document findings in ./specs/$spec/.research-[topic-name].md
+
+Do NOT explore codebase - Explore agents handle that in parallel.
+Do NOT research other topics - other research-analyst agents handle those.
+```
+
+**Codebase Analysis (Explore):**
+
+```text
+subagent_type: Explore
+thoroughness: very thorough
+
+Analyze codebase for spec: $spec
+Output file: ./specs/$spec/.research-codebase.md
+
+Tasks:
+1. Find existing patterns related to [goal]
+2. Identify dependencies and constraints
+3. Check for similar implementations
+4. Document architectural patterns used
+
+Write findings to the output file with sections:
+- Existing Patterns (with file paths)
+- Dependencies
+- Constraints
+- Recommendations
+```
+
+**Quality Commands Discovery (Explore):**
+
+```text
+subagent_type: Explore
+thoroughness: quick
+
+Discover quality commands for spec: $spec
+Output file: ./specs/$spec/.research-quality.md
+
+Tasks:
+1. Read package.json scripts section
+2. Check for Makefile targets
+3. Scan .github/workflows/*.yml for CI commands
+4. Document lint, test, build, typecheck commands
+
+Write findings as table: | Type | Command | Source |
+```
+
+**Related Specs Discovery (Explore):**
+
+```text
+subagent_type: Explore
+thoroughness: medium
+
+Scan related specs for: $spec
+Output file: ./specs/$spec/.research-related-specs.md
+
+Tasks:
+1. List all directories in ./specs/ (each is a spec)
+2. For each spec, read .progress.md for Original Goal
+3. Read research.md/requirements.md summaries if exist
+4. Identify overlaps, conflicts, specs needing updates
+
+Write findings as table: | Name | Relevance | Relationship | mayNeedUpdate |
+```
+
+### 5. Results Merge Algorithm
+
+After ALL parallel subagent tasks complete, merge results into unified output:
+
+**Step 1: Collect partial files**
+
+Read all files created by subagents:
+- `.research-[topic-1].md`, `.research-[topic-2].md`, etc. (from research-analyst agents)
+- `.research-codebase.md` (from Explore)
+- `.research-quality.md` (from Explore)
+- `.research-related-specs.md` (from Explore)
+
+**Step 2: Create unified structure**
+
+```markdown
+# Research: $spec
+
+## Executive Summary
+[Synthesize key findings from ALL agents - 2-3 sentences]
+
+## External Research
+[Merge from ALL .research-[topic].md files from research-analyst agents]
+### Best Practices
+[From all research-analyst agents]
+### Prior Art
+[From all research-analyst agents]
+### Pitfalls to Avoid
+[From all research-analyst agents]
+
+## Codebase Analysis
+[From .research-codebase.md]
+### Existing Patterns
+### Dependencies
+### Constraints
+
+## Related Specs
+[From .research-related-specs.md]
+| Spec | Relevance | Relationship | May Need Update |
+
+## Quality Commands
+[From .research-quality.md]
+| Type | Command | Source |
+
+## Feasibility Assessment
+[Synthesize from all sources]
+| Aspect | Assessment | Notes |
+
+## Recommendations for Requirements
+[Consolidated recommendations]
+
+## Open Questions
+[Consolidated from all agents]
+
+## Sources
+[All URLs and file paths from all agents]
+```
+
+**Step 3: Cleanup**
+
+Delete partial research files after successful merge:
+```bash
+rm "./specs/$spec"/.research-*.md
+```
+
+**Step 4: Quality check**
+
+Ensure:
+- No duplicate information across sections
+- Consistent formatting throughout
+- All agent contributions represented
+- Sources properly attributed
+
+## Example: Complex Goal Pattern
+
+**Goal**: "Add GraphQL API with caching"
+
+This goal has TWO distinct external topics (GraphQL + Caching), so spawn 5 agents:
+
+| Agent # | Type | Focus | Output File |
+|---------|------|-------|-------------|
+| 1 | research-analyst | GraphQL best practices | .research-graphql.md |
+| 2 | research-analyst | Caching strategies | .research-caching.md |
+| 3 | Explore | Existing API patterns | .research-codebase.md |
+| 4 | Explore | Quality commands | .research-quality.md |
+| 5 | Explore | Related specs | .research-related-specs.md |
+
+All 5 Task calls in ONE message for true parallel execution.
+
+## Fail-Safe Rules
+
+**"But This Goal is Simple..."**
+
+Even trivial goals require parallel research:
+- You're wrong - spawn at least 2 agents anyway
+- Minimum: 1 Explore (codebase) + 1 research-analyst (web)
+- Parallel execution is about SPEED, not complexity
+- 2 agents in parallel = up to 2x faster than sequential
+
+**There are ZERO exceptions to the parallel requirement.**
+
+## Benefits
+
+- 3-5 agents in parallel = up to 90% faster research
+- Explore agents use Haiku model = very fast codebase analysis
+- Each agent has focused context = better depth
+- Results synthesized for comprehensive coverage
+
+## When NOT to Split
+
+- Topics are tightly coupled and depend on each other
+- Splitting would create redundant searches
+
+## Usage in Commands
+
+Reference this skill in commands that need parallel research:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/parallel-research/SKILL.md`
+Use the parallel research pattern to spawn multiple subagents for comprehensive research.
+</skill-reference>
+```
+
+## Anti-Patterns
+
+**DO NOT:**
+- Spawn agents one at a time in separate messages (sequential execution)
+- Combine multiple external topics into one research-analyst agent
+- Skip codebase analysis even for "simple" goals
+- Perform research yourself instead of delegating
+- Merge results before ALL agents complete
+
+**ALWAYS:**
+- Analyze topics before spawning
+- Use ONE message for ALL Task calls
+- Assign unique output files to each agent
+- Wait for ALL agents to complete before merging
+- Clean up partial files after merge
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 0d261ae9..9db69035 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -46,6 +46,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.5 Create branch-management skill
 - [x] 2.6 Create intent-classification skill
 - [x] 2.7 Create spec-scanner skill
+- [x] 2.8 Create parallel-research skill
 
 ## Current Task
 
@@ -117,7 +118,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.8 Create parallel-research skill
+Task 2.9 [VERIFY] Quality checkpoint: new skills batch 2
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 292a4254..17555425 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -300,7 +300,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add spec-scanner skill`
   - _Design: New Skills - spec-scanner_
 
-- [ ] 2.8 Create parallel-research skill
+- [x] 2.8 Create parallel-research skill
   - **Do**:
     1. Extract parallel execution pattern from research.md
     2. Document multi-agent spawning, parallel search, results merge algorithm

From abd964ad9d76dca5273b97ca10822da13318e545 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:07:05 +0200
Subject: [PATCH 21/37] feat(ralph-specum): add phase-rules skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../ralph-specum/skills/phase-rules/SKILL.md  | 234 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |  10 +-
 specs/refactor-plugins/tasks.md               |   4 +-
 3 files changed, 245 insertions(+), 3 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/phase-rules/SKILL.md

diff --git a/plugins/ralph-specum/skills/phase-rules/SKILL.md b/plugins/ralph-specum/skills/phase-rules/SKILL.md
new file mode 100644
index 00000000..a00e1be7
--- /dev/null
+++ b/plugins/ralph-specum/skills/phase-rules/SKILL.md
@@ -0,0 +1,234 @@
+---
+name: phase-rules
+description: This skill should be used when the user asks about "POC phase", "refactoring phase", "testing phase", "quality gates phase", "PR lifecycle phase", "phase-specific rules", "shortcuts allowed", "phase behaviors", or needs guidance on what behaviors, shortcuts, and requirements apply during each development phase.
+version: 0.1.0
+---
+
+# Phase-Specific Rules
+
+The spec workflow progresses through 5 distinct phases. Each phase has specific goals, allowed shortcuts, and requirements that must be followed.
+
+## Phase Overview
+
+| Phase | Goal | Shortcuts Allowed | Must Pass |
+|-------|------|-------------------|-----------|
+| 1 - POC | Working prototype | Skipping tests, hardcoded values | Type check |
+| 2 - Refactoring | Clean code | None | Type check |
+| 3 - Testing | Complete tests | None | All tests |
+| 4 - Quality Gates | CI ready | None | All local checks |
+| 5 - PR Lifecycle | Merged PR | None | CI + reviews |
+
+## Phase 1: POC (Proof of Concept)
+
+**Goal**: Get a working prototype as fast as possible.
+
+**Allowed Shortcuts**:
+- Skip writing tests (deferred to Phase 3)
+- Accept hardcoded values (cleaned up in Phase 2)
+- Minimal error handling (enhanced in Phase 2)
+- Ignore code style/lint issues temporarily
+- Use simple implementations over elegant ones
+
+**Must Pass**:
+- Type check (if applicable to project)
+- Basic functionality works
+
+**Mindset**: "Make it work first, make it right later."
+
+**What NOT to Skip**:
+- Core functionality
+- Security-critical code
+- Existing tests (they must still pass; do not break them)
+
+## Phase 2: Refactoring
+
+**Goal**: Clean up POC code, add robustness.
+
+**Allowed Shortcuts**:
+- None - this is the cleanup phase
+
+**Must Pass**:
+- Type check
+- Lint checks (if not too burdensome)
+- Code follows project patterns
+
+**Tasks**:
+- Replace hardcoded values with configuration
+- Add proper error handling
+- Follow project naming conventions
+- Reduce code duplication
+- Add missing type annotations
+- Improve code organization
+
+**Mindset**: "Make it right."
+
+## Phase 3: Testing
+
+**Goal**: Comprehensive test coverage.
+
+**Allowed Shortcuts**:
+- None
+
+**Must Pass**:
+- All new tests pass
+- All existing tests pass
+- Test coverage meets project requirements (if specified)
+
+**Tasks**:
+- Write unit tests for new code
+- Write integration tests if applicable
+- Add edge case tests
+- Ensure mocking is appropriate
+- Verify test isolation
+
+**Mindset**: "Make it reliable."
+
+## Phase 4: Quality Gates
+
+**Goal**: Ready for CI/CD pipeline.
+
+**Allowed Shortcuts**:
+- None
+
+**Must Pass**:
+- All local checks: lint, type, test, build
+- PR created successfully
+- CI pipeline passes
+
+**Tasks**:
+1. Run all local quality checks
+2. Fix any issues found
+3. Create PR with proper description
+4. Verify CI passes
+
+**Commands to Run** (project-specific):
+```bash
+# Discover from package.json, Makefile, or CI config
+npm run lint        # or equivalent
+npm run typecheck   # or equivalent
+npm run test        # or equivalent
+npm run build       # or equivalent
+```
+
+**Mindset**: "Make it pass all gates."
+
+## Phase 5: PR Lifecycle
+
+**Goal**: Get PR merged with all checks passing.
+
+**Allowed Shortcuts**:
+- None - this is the final validation
+
+**Must Pass**:
+- Zero test regressions
+- Code is modular/reusable
+- CI green
+- Review comments resolved
+
+**Execution Pattern**: Wait-and-iterate loop
+
+1. Push changes
+2. Wait 3-5 minutes for CI
+3. Check CI status via `gh pr checks`
+4. If failures: read logs, fix issues, push again
+5. Monitor review comments via `gh api`
+6. Address feedback, push updates
+7. Repeat until all criteria met
+
+**Tools**:
+```bash
+# Monitor CI
+gh pr checks --watch
+
+# Get PR review comments (gh fills in {owner} and {repo}; substitute the PR number yourself)
+gh api repos/{owner}/{repo}/pulls/<pr-number>/comments
+
+# Show only checks that have finished (filters out pending/in-progress rows)
+gh pr checks | grep -v "pending\|in_progress"
+```
+
+**Mindset**: "Iterate until done."
+
+**Completion Criteria**:
+- All CI checks pass (green)
+- All review comments addressed
+- No test regressions
+- Ready for merge
+
+## Phase Detection
+
+Tasks in `tasks.md` are grouped by phase:
+
+```markdown
+## Phase 1: Make It Work (POC)
+- [ ] 1.1 Core functionality
+- [ ] 1.2 Basic integration
+
+## Phase 2: Refactoring
+- [ ] 2.1 Clean up hardcoded values
+- [ ] 2.2 Add error handling
+
+## Phase 3: Testing
+- [ ] 3.1 Unit tests
+- [ ] 3.2 Integration tests
+
+## Phase 4: Quality Gates
+- [ ] 4.1 Local validation
+- [ ] 4.2 Create PR
+
+## Phase 5: PR Lifecycle
+- [ ] 5.1 Fix CI failures
+- [ ] 5.2 Address review comments
+```
+
+Read the phase header to determine current phase behavior rules.
+
+## Quality Checkpoints
+
+Quality checkpoints (`[VERIFY]` tasks) appear throughout all phases, typically every 2-3 tasks. These are NOT phase-specific but ensure ongoing quality.
+
+See `skills/quality-checkpoints/SKILL.md` for checkpoint handling.
+
+## Shortcuts Summary Table
+
+| Action | Phase 1 | Phase 2 | Phase 3 | Phase 4 | Phase 5 |
+|--------|---------|---------|---------|---------|---------|
+| Skip tests | OK | NO | NO | NO | NO |
+| Hardcoded values | OK | NO | NO | NO | NO |
+| Minimal error handling | OK | NO | NO | NO | NO |
+| Skip lint | OK | OK | OK | NO | NO |
+| Skip type check | NO | NO | NO | NO | NO |
+| Skip existing tests | NO | NO | NO | NO | NO |
+
+## Usage in Agents
+
+Reference this skill when phase-aware behavior is needed:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/phase-rules/SKILL.md`
+Follow phase-specific rules for allowed shortcuts and requirements.
+</skill-reference>
+```
+
+## Common Mistakes
+
+**Phase 1 Mistakes**:
+- Over-engineering the POC
+- Writing tests too early
+- Perfecting code style before it works
+
+**Phase 2 Mistakes**:
+- Skipping error handling
+- Not removing hardcoded values
+- Changing functionality (should only refactor)
+
+**Phase 3 Mistakes**:
+- Writing tests that test implementation details
+- Insufficient coverage
+- Flaky tests
+
+**Phase 4/5 Mistakes**:
+- Pushing without running local checks
+- Not monitoring CI results
+- Ignoring review comments
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 9db69035..9410a30c 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -47,6 +47,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.6 Create intent-classification skill
 - [x] 2.7 Create spec-scanner skill
 - [x] 2.8 Create parallel-research skill
+- [x] 2.10 Create phase-rules skill
 
 ## Current Task
 
@@ -54,6 +55,13 @@ Awaiting next task
 
 ## Learnings
 
+### Verification: 2.9 [VERIFY] Quality checkpoint: new skills batch 2
+- Status: PASS
+- Verified: 4/4 skills exist with version field
+- Skills checked: branch-management (v0.1.0), intent-classification (v0.1.0), spec-scanner (v0.1.0), parallel-research (v0.1.0)
+- All files at: plugins/ralph-specum/skills/{skill-name}/SKILL.md
+- No fixes needed
+
 ### Verification: 2.4 [VERIFY] Quality checkpoint: new skills batch 1
 - Status: PASS
 - Verified: 3/3 skills exist with version field
@@ -118,7 +126,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.9 [VERIFY] Quality checkpoint: new skills batch 2
+Task 2.11 Create commit-discipline skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 17555425..097fd7a1 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -311,13 +311,13 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add parallel-research skill`
   - _Design: New Skills - parallel-research_
 
-- [ ] 2.9 [VERIFY] Quality checkpoint: new skills batch 2
+- [x] 2.9 [VERIFY] Quality checkpoint: new skills batch 2
   - **Do**: Verify skills 4-7 have proper structure
   - **Verify**: `count=0; for s in branch-management intent-classification spec-scanner parallel-research; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
   - **Done when**: All 4 skills exist with version field
   - **Commit**: `chore(ralph-specum): pass new skills batch 2 checkpoint` (only if fixes needed)
 
-- [ ] 2.10 Create phase-rules skill
+- [x] 2.10 Create phase-rules skill
   - **Do**:
     1. Extract phase-specific rules from spec-executor.md
     2. Document POC/Refactor/Testing/Quality phase behaviors, shortcuts allowed per phase

From 0302ddb127ecb9620ededc89d94db0e40ae45efb Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:08:30 +0200
Subject: [PATCH 22/37] feat(ralph-specum): add commit-discipline skill

---
 .../skills/commit-discipline/SKILL.md         | 246 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 249 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/commit-discipline/SKILL.md

diff --git a/plugins/ralph-specum/skills/commit-discipline/SKILL.md b/plugins/ralph-specum/skills/commit-discipline/SKILL.md
new file mode 100644
index 00000000..8e34f16c
--- /dev/null
+++ b/plugins/ralph-specum/skills/commit-discipline/SKILL.md
@@ -0,0 +1,246 @@
+---
+name: commit-discipline
+description: This skill should be used when the user asks about "commit message format", "spec file commits", "commit frequency", "task commit rules", "parallel commit locking", "git commit discipline", or needs guidance on how to properly commit changes during spec task execution.
+version: 0.1.0
+---
+
+# Commit Discipline
+
+Commit discipline ensures consistent, traceable progress through spec execution. Each task produces exactly one commit with a specific format and required files.
+
+## Core Rules
+
+<mandatory>
+ALWAYS commit spec files with every task commit. This is NON-NEGOTIABLE.
+</mandatory>
+
+| Rule | Description |
+|------|-------------|
+| One task = one commit | Each completed task produces exactly one commit |
+| Commit AFTER verify | Only commit after the Verify command passes |
+| Use EXACT message | Use the commit message from the task's Commit line |
+| Never commit failing code | All checks must pass before committing |
+| Include spec files | Always stage tasks.md and progress file |
+
+## Required Spec Files
+
+Every task commit MUST include these spec files:
+
+```bash
+# Standard (sequential) execution:
+git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress.md
+
+# Parallel execution (when progressFile provided):
+git add ./specs/<spec>/tasks.md ./specs/<spec>/<progressFile>
+```
+
+### Why Spec Files Matter
+
+- `tasks.md` - Contains the `[x]` checkmark marking task complete
+- `.progress.md` - Contains learnings, completed task history, context for future tasks
+
+**Failure to commit spec files breaks progress tracking across sessions.**
+
+The coordinator and stop-hook verify task completion by reading these files. If they're not committed, the task appears incomplete despite implementation being done.
+
+## Commit Message Format
+
+Use the **exact** message from the task's Commit line:
+
+```markdown
+- [ ] 1.1 Task name
+  - **Commit**: `feat(component): add new feature`
+```
+
+Commit with:
+```bash
+git commit -m "feat(component): add new feature"
+```
+
+### Conventional Commit Prefixes
+
+| Prefix | Use Case |
+|--------|----------|
+| `feat` | New functionality |
+| `fix` | Bug fixes |
+| `refactor` | Code changes without behavior change |
+| `chore` | Maintenance, cleanup |
+| `docs` | Documentation only |
+| `test` | Test additions/changes |
+
+### Optional: Task Reference in Body
+
+For traceability, include task reference in commit body:
+
+```bash
+git commit -m "feat(auth): add login validation
+
+Task: 1.3 from specs/user-auth/tasks.md"
+```
+
+## Commit Workflow
+
+```
+1. Task execution complete
+   |
+2. Run Verify command → must PASS
+   |
+3. Mark task [x] in tasks.md
+   |
+4. Update progress file with completion
+   |
+5. Stage ALL files:
+   - Task implementation files
+   - ./specs/<spec>/tasks.md
+   - Progress file
+   |
+6. Commit with exact message from task
+   |
+7. Output TASK_COMPLETE
+```
+
+## Standard Commit Example
+
+```bash
+# After task 1.2 passes verification
+
+# 1. Stage implementation files
+git add src/components/Button.tsx src/styles/button.css
+
+# 2. Stage spec files (REQUIRED)
+git add ./specs/ui-components/tasks.md ./specs/ui-components/.progress.md
+
+# 3. Commit with task message
+git commit -m "feat(ui): add Button component"
+```
+
+## Parallel Execution Locking
+
+When running in parallel mode (progressFile provided), multiple executors may commit simultaneously. Use `flock` to serialize git operations.
+
+### Why Locking
+
+- Multiple executors can race to update tasks.md
+- Git operations are not atomic
+- Without locking, commits can conflict or corrupt state
+
+### tasks.md Updates
+
+```bash
+(
+  flock -x 200
+  # Update the checkmark for task X.Y (-i.bak works with both GNU and BSD sed)
+  sed -i.bak 's/- \[ \] X\.Y/- [x] X.Y/' "./specs/<spec>/tasks.md" && rm -f "./specs/<spec>/tasks.md.bak"
+) 200>"./specs/<spec>/.tasks.lock"
+```
+
+### Git Commit Operations
+
+```bash
+(
+  flock -x 200
+  git add <files>
+  git commit -m "<message>"
+) 200>"./specs/<spec>/.git-commit.lock"
+```
+
+### Flock Explanation
+
+| Element | Purpose |
+|---------|---------|
+| `flock -x 200` | Exclusive lock on file descriptor 200 |
+| `200>file.lock` | Connect fd 200 to lock file |
+| Subshell `(...)` | Lock released when subshell exits |
+
+### When to Use Locking
+
+| Mode | Locking Required |
+|------|------------------|
+| Sequential execution (no progressFile) | No |
+| Parallel execution (progressFile set) | Yes |
+
+### Lock Files
+
+| Lock File | Protects |
+|-----------|----------|
+| `.tasks.lock` | tasks.md writes |
+| `.git-commit.lock` | git add/commit operations |
+
+Lock files are cleaned up by the coordinator after batch completion.
+
+## VERIFY Task Commits
+
+`[VERIFY]` checkpoint tasks have special commit rules:
+
+1. Always include spec files in commits
+2. If qa-engineer made fixes, commit those files too
+3. Use commit message from task, or `chore(qa): pass quality checkpoint` if fixes were needed
+
+```bash
+# After VERIFICATION_PASS with fixes
+git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress.md
+git add src/fixed-file.ts  # if qa-engineer fixed something
+git commit -m "chore(qa): pass quality checkpoint"
+```
+
+## Common Mistakes
+
+**Mistake 1: Forgetting spec files**
+```bash
+# WRONG - missing spec files
+git add src/feature.ts
+git commit -m "feat: add feature"
+
+# CORRECT - includes spec files
+git add src/feature.ts ./specs/my-spec/tasks.md ./specs/my-spec/.progress.md
+git commit -m "feat: add feature"
+```
+
+**Mistake 2: Committing before verify passes**
+```
+# WRONG workflow
+Implement → Commit → Run Verify (fails) → Fix → Commit again
+
+# CORRECT workflow
+Implement → Run Verify (fails) → Fix → Run Verify (passes) → Commit once
+```
+
+**Mistake 3: Wrong commit message**
+```bash
+# Task says: Commit: `feat(auth): add login`
+
+# WRONG
+git commit -m "Added login feature"
+
+# CORRECT
+git commit -m "feat(auth): add login"
+```
+
+**Mistake 4: Committing failing code**
+```bash
+# WRONG - committing without verify
+git commit -m "feat: partial implementation"  # tests fail
+
+# CORRECT - only commit after verify passes
+npm test && git commit -m "feat: complete implementation"
+```
+
+## Usage in Agents
+
+Reference this skill for commit guidance:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/commit-discipline/SKILL.md`
+Follow commit discipline rules for message format and required files.
+</skill-reference>
+```
+
+## Verification
+
+The stop-hook enforces commit discipline through:
+
+1. **Uncommitted files check** - Rejects if spec files not committed
+2. **Checkmark verification** - Validates task is marked `[x]` in tasks.md
+
+False completion (claiming TASK_COMPLETE without proper commit) WILL be caught and retried.
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 9410a30c..70b512c9 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -48,6 +48,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.7 Create spec-scanner skill
 - [x] 2.8 Create parallel-research skill
 - [x] 2.10 Create phase-rules skill
+- [x] 2.11 Create commit-discipline skill
 
 ## Current Task
 
@@ -126,7 +127,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.11 Create commit-discipline skill
+Task 2.12 Create quality-checkpoints skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 097fd7a1..10863a81 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -328,7 +328,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add phase-rules skill`
   - _Design: New Skills - phase-rules_
 
-- [ ] 2.11 Create commit-discipline skill
+- [x] 2.11 Create commit-discipline skill
   - **Do**:
     1. Extract commit rules from spec-executor.md
     2. Document commit message format, spec file inclusion, commit frequency rules

From d04c8bc96eb25d40be47f7c4e55785cc27d97e9c Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:10:04 +0200
Subject: [PATCH 23/37] feat(ralph-specum): add quality-checkpoints skill

---
 .../skills/quality-checkpoints/SKILL.md       | 220 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   3 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 223 insertions(+), 2 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/quality-checkpoints/SKILL.md

diff --git a/plugins/ralph-specum/skills/quality-checkpoints/SKILL.md b/plugins/ralph-specum/skills/quality-checkpoints/SKILL.md
new file mode 100644
index 00000000..c4faf527
--- /dev/null
+++ b/plugins/ralph-specum/skills/quality-checkpoints/SKILL.md
@@ -0,0 +1,220 @@
+---
+name: quality-checkpoints
+description: This skill should be used when the user asks about "quality checkpoints", "VERIFY tasks", "checkpoint frequency", "quality gate format", "intermediate validation", "task verification", or needs guidance on inserting and formatting quality checkpoints in task lists.
+version: 0.1.0
+---
+
+# Quality Checkpoints
+
+Quality checkpoints are `[VERIFY]` tagged tasks inserted throughout a spec to catch issues early. They ensure type checking, lint, and tests pass incrementally rather than finding many errors at the end.
+
+## Checkpoint Frequency Rules
+
+Insert quality checkpoints based on task complexity:
+
+| Task Complexity | Checkpoint Frequency |
+|-----------------|---------------------|
+| Small/simple tasks | Every 3 tasks |
+| Medium tasks | Every 2-3 tasks |
+| Large/complex tasks | Every 2 tasks |
+
+**Rationale:**
+- Catch type errors, lint issues, and regressions early
+- Prevent accumulation of technical debt
+- Make debugging easier by limiting scope of potential issues
+- Ensure each batch of work maintains code quality
+
+## What Checkpoints Verify
+
+Each checkpoint runs available quality commands:
+
+1. Type checking: `pnpm check-types` or equivalent
+2. Lint: `pnpm lint` or equivalent
+3. Existing tests: `pnpm test` or equivalent (if tests exist)
+4. E2E tests: `pnpm test:e2e` or equivalent (if E2E exists)
+5. Build: Verify code compiles/builds successfully
+
+**Important:** Discover actual commands from research.md. Do NOT assume `pnpm lint` or `npm test` exists.
+
+## [VERIFY] Task Format
+
+All quality checkpoints use the `[VERIFY]` tag prefix.
+
+### Standard Checkpoint (Every 2-3 Tasks)
+
+```markdown
+- [ ] V1 [VERIFY] Quality check: <discovered lint cmd> && <discovered typecheck cmd>
+  - **Do**: Run quality commands and verify all pass
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors
+  - **Commit**: `chore(scope): pass quality checkpoint` (if fixes needed)
+```
+
+### Numbered Checkpoint in Phase
+
+```markdown
+- [ ] 1.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
+```
+
+### With Tests
+
+```markdown
+- [ ] 2.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd> && <test cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors, tests pass
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
+```
+
+## Final Verification Sequence
+
+The last 3 tasks of a spec should be a final verification sequence:
+
+### Full Local CI (V4)
+
+```markdown
+- [ ] V4 [VERIFY] Full local CI: <lint> && <typecheck> && <test> && <e2e> && <build>
+  - **Do**: Run complete local CI suite including E2E
+  - **Verify**: All commands pass
+  - **Done when**: Build succeeds, all tests pass, E2E green
+  - **Commit**: `chore(scope): pass local CI` (if fixes needed)
+```
+
+### CI Pipeline (V5)
+
+```markdown
+- [ ] V5 [VERIFY] CI pipeline passes
+  - **Do**: Verify GitHub Actions/CI passes after push
+  - **Verify**: `gh pr checks` shows all green
+  - **Done when**: CI pipeline passes
+  - **Commit**: None
+```
+
+### Acceptance Criteria (V6)
+
+```markdown
+- [ ] V6 [VERIFY] AC checklist
+  - **Do**: Read requirements.md, programmatically verify each AC-* is satisfied by checking code/tests/behavior
+  - **Verify**: Grep codebase for AC implementation, run relevant test commands
+  - **Done when**: All acceptance criteria confirmed met via automated checks
+  - **Commit**: None
+```
+
+## Phase-Specific Placement
+
+| Phase | Checkpoint After |
+|-------|------------------|
+| Phase 1 (POC) | Tasks 1.2, 1.5 (end of POC) |
+| Phase 2 (Refactor) | Tasks 2.2, 2.4 |
+| Phase 3 (Testing) | Tasks 3.2, 3.4 |
+| Phase 4 (Quality) | V4, V5, V6 as final sequence |
+
+## Example Task List with Checkpoints
+
+```markdown
+## Phase 1: Make It Work (POC)
+
+- [ ] 1.1 Create component structure
+  - **Do**: ...
+  - **Verify**: `test -f src/Component.tsx`
+  - **Commit**: `feat(ui): add component structure`
+
+- [ ] 1.2 Add component logic
+  - **Do**: ...
+  - **Verify**: Component renders
+  - **Commit**: `feat(ui): add component logic`
+
+- [ ] 1.3 [VERIFY] Quality checkpoint: pnpm lint && pnpm check-types
+  - **Do**: Run quality commands
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint/type errors
+  - **Commit**: `chore(ui): pass quality checkpoint` (if fixes needed)
+
+- [ ] 1.4 Add API integration
+  - **Do**: ...
+  - **Verify**: API call succeeds
+  - **Commit**: `feat(ui): add API integration`
+
+- [ ] 1.5 POC validation
+  - **Do**: End-to-end test of feature
+  - **Verify**: Feature works via automated test
+  - **Commit**: `feat(ui): complete POC`
+```
+
+## VF Task for Fix Goals
+
+When `.progress.md` contains `## Reality Check (BEFORE)`, the goal is fix-type and requires a VF (Verification Final) task at the end of Phase 4:
+
+```markdown
+- [ ] VF [VERIFY] Goal verification: original failure now passes
+  - **Do**:
+    1. Read BEFORE state from .progress.md
+    2. Re-run reproduction command from Reality Check (BEFORE)
+    3. Compare output with BEFORE failure
+    4. Document AFTER state in .progress.md
+  - **Verify**: Exit code 0 for reproduction command
+  - **Done when**: Command that failed before now passes
+  - **Commit**: `chore(<spec>): verify fix resolves original issue`
+```
+
+## Checkpoint Execution by spec-executor
+
+When spec-executor receives a `[VERIFY]` task:
+
+1. **Detection**: Check if task description contains `[VERIFY]` tag
+2. **Delegation**: Delegate to qa-engineer subagent
+3. **Result handling**:
+   - `VERIFICATION_PASS`: Mark complete, commit if fixes made
+   - `VERIFICATION_FAIL`: Keep task open, log details in .progress.md Learnings
+
+### qa-engineer Invocation
+
+```
+Task: Execute this verification task
+
+Spec: <spec-name>
+Path: <spec-path>
+
+Task: <full task description>
+
+Task Body:
+<Do/Verify/Done when sections>
+```
+
+## Commit Rules for Checkpoints
+
+| Scenario | Commit Required |
+|----------|-----------------|
+| All checks pass, no fixes | No commit |
+| Fixes were needed | Yes: `chore(scope): pass quality checkpoint` |
+| Verification-only (V5, V6) | No commit |
+
+Always include spec files in commits:
+```bash
+git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress.md
+```
+
+## Usage in Agents
+
+Reference this skill for checkpoint guidance:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/quality-checkpoints/SKILL.md`
+Use checkpoint format and frequency rules when planning quality gates.
+</skill-reference>
+```
+
+## Quality Checklist for Task Planning
+
+Before completing task list, verify:
+
+- [ ] Checkpoint inserted after every 2-3 tasks
+- [ ] Checkpoints use actual commands from research.md
+- [ ] Final verification sequence includes V4, V5, V6
+- [ ] VF task included if goal is fix-type
+- [ ] All checkpoints follow [VERIFY] format
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 70b512c9..f64be6ce 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -49,6 +49,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.8 Create parallel-research skill
 - [x] 2.10 Create phase-rules skill
 - [x] 2.11 Create commit-discipline skill
+- [x] 2.12 Create quality-checkpoints skill - pending commit
 
 ## Current Task
 
@@ -127,7 +128,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.12 Create quality-checkpoints skill
+Task 2.13 Create quality-commands skill
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 10863a81..6cb447cf 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -339,7 +339,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add commit-discipline skill`
   - _Design: New Skills - commit-discipline_
 
-- [ ] 2.12 Create quality-checkpoints skill
+- [x] 2.12 Create quality-checkpoints skill
   - **Do**:
     1. Extract [VERIFY] task rules from task-planner.md
     2. Document checkpoint frequency, format, verification commands

From e8b4f1a2311a5bc5daca9f38b6528ca3f5d2a55e Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:11:25 +0200
Subject: [PATCH 24/37] feat(ralph-specum): add quality-commands skill

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../skills/quality-commands/SKILL.md          | 195 ++++++++++++++++++
 specs/refactor-plugins/.progress.md           |   5 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 199 insertions(+), 3 deletions(-)
 create mode 100644 plugins/ralph-specum/skills/quality-commands/SKILL.md

diff --git a/plugins/ralph-specum/skills/quality-commands/SKILL.md b/plugins/ralph-specum/skills/quality-commands/SKILL.md
new file mode 100644
index 00000000..5c7c4ee5
--- /dev/null
+++ b/plugins/ralph-specum/skills/quality-commands/SKILL.md
@@ -0,0 +1,195 @@
+---
+name: quality-commands
+description: This skill should be used when the user asks about "discover quality commands", "find lint command", "check package.json scripts", "Makefile targets", "CI workflow commands", "project test commands", or needs guidance on discovering the actual quality commands available in a project.
+version: 0.1.0
+---
+
+# Quality Command Discovery
+
+Quality command discovery is essential because projects use different tools and scripts. Never assume `npm test` or `pnpm lint` exists - always discover actual commands from project sources.
+
+## Why Discovery Matters
+
+- Projects use different package managers (npm, pnpm, yarn, bun)
+- Script names vary (lint, eslint, check-types, typecheck, tsc)
+- Some projects use Makefiles instead of package.json
+- CI configs reveal the authoritative commands actually run
+
+## Sources to Check
+
+Check these sources in order of preference:
+
+### 1. package.json (Primary Source)
+
+```bash
+jq '.scripts' package.json
+```
+
+Look for keywords: `lint`, `typecheck`, `type-check`, `check-types`, `test`, `build`, `e2e`, `integration`, `unit`, `verify`, `validate`, `check`
+
+Common patterns:
+- `lint` - ESLint/linting
+- `typecheck` or `check-types` - TypeScript checking
+- `test` - All tests or unit tests
+- `test:unit` - Unit tests specifically
+- `test:integration` - Integration tests
+- `test:e2e` or `e2e` - End-to-end tests
+- `build` - Build/compile
+
+### 2. Makefile (If Exists)
+
+```bash
+grep -E '^[a-zA-Z0-9_-]+:([^=]|$)' Makefile
+```
+
+Look for targets named: `lint`, `test`, `check`, `build`, `e2e`, `integration`, `unit`, `verify`
+
+### 3. CI Configs (.github/workflows/*.yml)
+
+```bash
+grep -E 'run:' .github/workflows/*.yml
+```
+
+Extract actual commands from CI steps - these are authoritative.
+
+## Discovery Commands to Run
+
+Run these during research phase:
+
+```bash
+# Check package.json scripts
+jq -r '.scripts // {} | keys[]' package.json 2>/dev/null || echo "No package.json"
+
+# Check Makefile targets
+[ -f Makefile ] && grep -E '^[a-zA-Z0-9_-]+:([^=]|$)' Makefile | head -20 || echo "No Makefile"
+
+# Check CI workflow commands
+ls .github/workflows/*.yml >/dev/null 2>&1 && grep -h 'run:' .github/workflows/*.yml | head -20 || echo "No CI configs"
+```
+
+## Package Manager Detection
+
+Detect the correct package manager:
+
+| File Exists | Package Manager | Run Prefix |
+|-------------|-----------------|------------|
+| `pnpm-lock.yaml` | pnpm | `pnpm run` |
+| `yarn.lock` | yarn | `yarn` |
+| `bun.lockb` or `bun.lock` | bun | `bun run` |
+| `package-lock.json` | npm | `npm run` |
+
+```bash
+# Detection command
+if [ -f pnpm-lock.yaml ]; then echo "pnpm";
+elif [ -f yarn.lock ]; then echo "yarn";
+elif [ -f bun.lockb ] || [ -f bun.lock ]; then echo "bun";
+else echo "npm"; fi
+```
+
+## Output Format
+
+Document discovered commands in research.md:
+
+```markdown
+## Quality Commands
+
+| Type | Command | Source |
+|------|---------|--------|
+| Lint | `pnpm run lint` | package.json scripts.lint |
+| TypeCheck | `pnpm run check-types` | package.json scripts.check-types |
+| Unit Test | `pnpm run test:unit` | package.json scripts.test:unit |
+| Integration Test | `pnpm run test:integration` | package.json scripts.test:integration |
+| E2E Test | `pnpm run test:e2e` | package.json scripts.test:e2e |
+| Test (all) | `pnpm test` | package.json scripts.test |
+| Build | `pnpm run build` | package.json scripts.build |
+
+**Local CI**: `pnpm run lint && pnpm run check-types && pnpm test && pnpm run build`
+```
+
+## Handling Missing Commands
+
+If a command type is not found in the project:
+
+```markdown
+| Type | Command | Source |
+|------|---------|--------|
+| Lint | Not found | - |
+| TypeCheck | `pnpm run check-types` | package.json |
+| E2E Test | Not found | - |
+```
+
+Mark as "Not found" so task-planner knows to skip that check in `[VERIFY]` tasks.
+
+## Fallback Commands
+
+When the project lacks explicit scripts, use these fallbacks:
+
+| Type | Fallback | Condition |
+|------|----------|-----------|
+| TypeCheck | `npx tsc --noEmit` | tsconfig.json exists |
+| Lint | `npx eslint .` | .eslintrc* or eslint.config.* exists |
+| Test | `npx jest` | jest.config.* exists |
+| Test | `npx vitest run` | vitest.config.* exists |
+
+```bash
+# TypeScript check fallback
+if [ -f tsconfig.json ] && ! grep -Eq '"(typecheck|type-check|check-types)"[[:space:]]*:' package.json 2>/dev/null; then
+  echo "Fallback: npx tsc --noEmit"
+fi
+```
+
+## Quality Command Categories
+
+### Required (Always Check)
+
+1. **TypeCheck** - Must pass for code to be valid
+2. **Lint** - Must pass for code style compliance
+3. **Build** - Must pass for deployment
+
+### Optional (When Available)
+
+4. **Unit Test** - Run if exists
+5. **Integration Test** - Run if exists
+6. **E2E Test** - Run if exists (often slow)
+
+## Local CI Command
+
+Construct a "Local CI" command that mirrors what CI runs:
+
+```bash
+# Template
+<lint-cmd> && <typecheck-cmd> && <test-cmd> && <build-cmd>
+
+# Example
+pnpm run lint && pnpm run check-types && pnpm test && pnpm run build
+```
+
+Skip unavailable commands:
+
+```bash
+# If no lint script
+pnpm run check-types && pnpm test && pnpm run build
+```
+
+## Usage in Agents
+
+Reference this skill when discovering quality commands:
+
+```markdown
+<skill-reference>
+**Apply skill**: `skills/quality-commands/SKILL.md`
+Discover actual quality commands before creating [VERIFY] tasks.
+</skill-reference>
+```
+
+## Quality Checklist
+
+Before completing discovery:
+
+- [ ] Checked package.json scripts
+- [ ] Checked Makefile (if exists)
+- [ ] Checked CI workflow commands
+- [ ] Detected correct package manager
+- [ ] Documented all found commands in table format
+- [ ] Marked missing command types as "Not found"
+- [ ] Constructed Local CI command
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index f64be6ce..f0041832 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -49,7 +49,8 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.8 Create parallel-research skill
 - [x] 2.10 Create phase-rules skill
 - [x] 2.11 Create commit-discipline skill
-- [x] 2.12 Create quality-checkpoints skill - pending commit
+- [x] 2.12 Create quality-checkpoints skill
+- [x] 2.13 Create quality-commands skill
 
 ## Current Task
 
@@ -128,7 +129,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.13 Create quality-commands skill
+Task 2.14 [VERIFY] Quality checkpoint: new skills batch 3
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 6cb447cf..3b93d107 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -350,7 +350,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add quality-checkpoints skill`
   - _Design: New Skills - quality-checkpoints_
 
-- [ ] 2.13 Create quality-commands skill
+- [x] 2.13 Create quality-commands skill
   - **Do**:
     1. Extract quality command discovery from research-analyst.md
     2. Document package.json/Makefile/CI discovery patterns, fallback commands

From 1fede3f1b407c13b2ec3d171fd15dd0f60b47a2a Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:14:04 +0200
Subject: [PATCH 25/37] refactor(ralph-specum): simplify implement.md to
 reference skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/commands/implement.md | 1091 +-------------------
 specs/refactor-plugins/.progress.md        |   17 +-
 specs/refactor-plugins/tasks.md            |    4 +-
 3 files changed, 63 insertions(+), 1049 deletions(-)

diff --git a/plugins/ralph-specum/commands/implement.md b/plugins/ralph-specum/commands/implement.md
index 10177422..eb6736b7 100644
--- a/plugins/ralph-specum/commands/implement.md
+++ b/plugins/ralph-specum/commands/implement.md
@@ -86,66 +86,23 @@ Write this prompt to `./specs/$spec/.coordinator-prompt.md`:
 ```text
 You are the execution COORDINATOR for spec: $spec
 
-### 1. Role Definition
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/coordinator-pattern/SKILL.md`
 
-You are a COORDINATOR, NOT an implementer. Your job is to:
-- Read state and determine current task
-- Delegate task execution to spec-executor via Task tool
-- Track completion and signal when all tasks done
+Use this skill for:
+- Role definition (coordinator vs implementer)
+- State reading from .ralph-state.json
+- Task delegation via Task tool
+- Completion checking and signaling
+- State updates after task completion
+- Retry handling logic
+- Parallel execution patterns
+</skill-reference>
 
-CRITICAL: You MUST delegate via Task tool. Do NOT implement tasks yourself.
-You are fully autonomous. NEVER ask questions or wait for user input.
-
-### 2. Read State
-
-Read `./specs/$spec/.ralph-state.json` to get current state:
-
-```json
-{
-  "phase": "execution",
-  "taskIndex": <current task index, 0-based>,
-  "totalTasks": <total task count>,
-  "taskIteration": <retry count for current task>,
-  "maxTaskIterations": <max retries>
-}
-```
-
-**ERROR: Missing/Corrupt State File**
-
-If state file missing or corrupt (invalid JSON, missing required fields):
-1. Output error: "ERROR: State file missing or corrupt at ./specs/$spec/.ralph-state.json"
-2. Suggest: "Run /ralph-specum:implement to reinitialize execution state"
-3. Do NOT continue execution
-4. Do NOT output ALL_TASKS_COMPLETE
-
-### 3. Check Completion
-
-If taskIndex >= totalTasks:
-1. Verify all tasks marked [x] in tasks.md
-2. Delete .ralph-state.json (cleanup)
-3. Output: ALL_TASKS_COMPLETE
-4. STOP - do not delegate any task
-
-### 4. Parse Current Task
+### Task Parsing
 
 Read `./specs/$spec/tasks.md` and find the task at taskIndex (0-based).
 
-**ERROR: Missing tasks.md**
-
-If tasks.md does not exist:
-1. Output error: "ERROR: Tasks file missing at ./specs/$spec/tasks.md"
-2. Suggest: "Run /ralph-specum:tasks to generate task list"
-3. Do NOT continue execution
-4. Do NOT output ALL_TASKS_COMPLETE
-
-**ERROR: Missing Spec Directory**
-
-If spec directory does not exist (./specs/$spec/):
-1. Output error: "ERROR: Spec directory missing at ./specs/$spec/"
-2. Suggest: "Run /ralph-specum:new <spec-name> to create a new spec"
-3. Do NOT continue execution
-4. Do NOT output ALL_TASKS_COMPLETE
-
 Tasks follow this format:
 ```markdown
 - [ ] X.Y Task description
@@ -163,38 +120,7 @@ Detect markers in task description:
 - [VERIFY] = verification task (delegate to qa-engineer)
 - No marker = sequential task
 
-### 5. Parallel Group Detection
-
-If current task has [P] marker, scan for consecutive [P] tasks starting from taskIndex.
-
-Build parallelGroup structure:
-```json
-{
-  "startIndex": <first [P] task index>,
-  "endIndex": <last consecutive [P] task index>,
-  "taskIndices": [startIndex, startIndex+1, ..., endIndex],
-  "isParallel": true
-}
-```
-
-Rules:
-- Adjacent [P] tasks form a single parallel batch
-- Non-[P] task breaks the sequence
-- Single [P] task treated as sequential (no parallelism benefit)
-
-If no [P] marker on current task, set:
-```json
-{
-  "startIndex": <taskIndex>,
-  "endIndex": <taskIndex>,
-  "taskIndices": [taskIndex],
-  "isParallel": false
-}
-```
-
-### 6. Task Delegation
-
-**[VERIFY] Task Detection**:
+### [VERIFY] Task Detection
 
 Before standard delegation, check if current task has [VERIFY] marker.
 Look for `[VERIFY]` in task description line (e.g., `- [ ] 1.4 [VERIFY] Quality checkpoint`).
@@ -204,880 +130,49 @@ If [VERIFY] marker present:
 2. Delegate to qa-engineer via Task tool instead
 3. [VERIFY] tasks are ALWAYS sequential (break parallel groups)
 
-Delegate [VERIFY] task to qa-engineer:
-```text
-Task: Execute verification task $taskIndex for spec $spec
-
-Spec: $spec
-Path: ./specs/$spec/
-
-Task: [Full task description]
-
-Task Body:
-[Include Do, Verify, Done when sections]
-
-Instructions:
-1. Execute the verification as specified
-2. If issues found, attempt to fix them
-3. Output VERIFICATION_PASS if verification succeeds
-4. Output VERIFICATION_FAIL if verification fails and cannot be fixed
-```
-
-Handle qa-engineer response:
-- VERIFICATION_PASS: Treat as TASK_COMPLETE, mark task [x], update .progress.md
-- VERIFICATION_FAIL: Do NOT mark complete, increment taskIteration, retry or error if max reached
-
-**Sequential Execution** (parallelGroup.isParallel = false, no [VERIFY]):
-
-Delegate ONE task to spec-executor via Task tool:
-
-```text
-Task: Execute task $taskIndex for spec $spec
-
-Spec: $spec
-Path: ./specs/$spec/
-Task index: $taskIndex
-
-Context from .progress.md:
-[Include relevant context]
-
-Current task from tasks.md:
-[Include full task block]
-
-Instructions:
-1. Read Do section and execute exactly
-2. Only modify Files listed
-3. Verify completion with Verify command
-4. Commit with task's Commit message
-5. Update .progress.md with completion and learnings
-6. Mark task [x] in tasks.md
-7. Output TASK_COMPLETE when done
-```
-
-Wait for spec-executor to complete. It will output TASK_COMPLETE on success.
-
-**Parallel Execution** (parallelGroup.isParallel = true):
-
-CRITICAL: Spawn MULTIPLE Task tool calls in ONE message. This enables true parallelism.
-
-For each task index in parallelGroup.taskIndices, create a Task tool call with:
-- Unique progressFile: `.progress-task-$taskIndex.md`
-- Full task block from tasks.md
-- Same instructions as sequential but writing to temp progress file
-
-Example for parallel batch of tasks 3, 4, 5:
-```text
-[Task tool call 1]
-Task: Execute task 3 for spec $spec
-progressFile: .progress-task-3.md
-...
-
-[Task tool call 2]
-Task: Execute task 4 for spec $spec
-progressFile: .progress-task-4.md
-...
-
-[Task tool call 3]
-Task: Execute task 5 for spec $spec
-progressFile: .progress-task-5.md
-...
-```
-
-All parallel tasks execute simultaneously. Wait for ALL to complete.
-
-**After Delegation**:
-
-If spec-executor outputs TASK_COMPLETE (or qa-engineer outputs VERIFICATION_PASS):
-1. Run verification layers (section 7) before advancing
-2. If all verifications pass, proceed to state update
-
-If no completion signal:
-1. First, parse the failure output (section 6b)
-2. Increment taskIteration in state file
-3. If taskIteration > maxTaskIterations: proceed to max retries error handling
-4. Otherwise: Retry the same task
-
-### 6b. Parse Failure Output
-
-When spec-executor does not output TASK_COMPLETE, parse the failure output to extract error details.
-
-**Failure Output Pattern**:
-Spec-executor outputs failures in this format:
-```text
-Task X.Y: [task name] FAILED
-- Error: [description]
-- Attempted fix: [what was tried]
-- Status: Blocked, needs manual intervention
-```
-
-**Parsing Logic**:
-
-1. **Check for FAILED marker**:
-   - Look for pattern: `Task \d+\.\d+:.*FAILED`
-   - If found, proceed to extract details
-   - If not found, use generic failure: "Task did not complete"
-
-2. **Extract Error Details**:
-   - Match `- Error: (.*)` to get error description
-   - Match `- Attempted fix: (.*)` to get fix attempt details
-   - Match `- Status: (.*)` to get status message
-
-3. **Build Failure Object**:
-   ```json
-   {
-     "taskId": "<X.Y from match>",
-     "failed": true,
-     "error": "<extracted from Error: line>",
-     "attemptedFix": "<extracted from Attempted fix: line>",
-     "status": "<extracted from Status: line>",
-     "rawOutput": "<full spec-executor output for context>"
-   }
-   ```
-
-4. **Handle Missing Fields**:
-   - If Error: line missing, use "Task execution failed"
-   - If Attempted fix: line missing, use "No fix attempted"
-   - If Status: line missing, use "Unknown status"
-
-**Example Parsing**:
-
-Input (spec-executor output):
-```text
-Task 1.3: Add failure parser FAILED
-- Error: File not found: src/parser.ts
-- Attempted fix: Checked alternate paths
-- Status: Blocked, needs manual intervention
-```
-
-Parsed failure object:
-```json
-{
-  "taskId": "1.3",
-  "failed": true,
-  "error": "File not found: src/parser.ts",
-  "attemptedFix": "Checked alternate paths",
-  "status": "Blocked, needs manual intervention",
-  "rawOutput": "..."
-}
-```
-
-This failure object is used by the recovery orchestrator (section 6c) to generate fix tasks when recoveryMode is enabled.
-
-### 6c. Fix Task Generator
-
-When recoveryMode is enabled and a task fails, generate a fix task from the failure details.
-
-**Check Recovery Mode**:
-
-First, verify recovery mode is enabled:
-1. Read `recoveryMode` from .ralph-state.json
-2. If `recoveryMode` is false or missing, skip to "ERROR: Max Retries Reached"
-3. If `recoveryMode` is true, proceed with fix task generation
-
-**Check Fix Task Limits**:
-
-Before generating a fix task:
-1. Read `fixTaskMap` from .ralph-state.json
-2. Check if `fixTaskMap[taskId].attempts >= maxFixTasksPerOriginal`
-3. If limit reached:
-   - Output error: "ERROR: Max fix attempts ($maxFixTasksPerOriginal) reached for task $taskId"
-   - Show fix history: "Fix attempts: $fixTaskMap[taskId].fixTaskIds"
-   - Do NOT output ALL_TASKS_COMPLETE
-   - STOP execution
-
-**Generate Fix Task Markdown**:
-
-Use the failure object from section 6b to create a fix task:
-
-```text
-Fix Task ID: $taskId.$attemptNumber
-  where attemptNumber = fixTaskMap[taskId].attempts + 1 (or 1 if first attempt)
-
-Fix Task Format:
-- [ ] $taskId.$attemptNumber [FIX $taskId] Fix: $errorSummary
-  - **Do**: Address the error: $failure.error
-    1. Analyze the failure: $failure.attemptedFix
-    2. Review related code in Files list
-    3. Implement fix for: $failure.error
-  - **Files**: $originalTask.files
-  - **Done when**: Error "$failure.error" no longer occurs
-  - **Verify**: $originalTask.verify
-  - **Commit**: `fix($scope): address $errorType from task $taskId`
-```
-
-**Field Derivation**:
-
-| Field                | Source                              | Fallback                       |
-|----------------------|-------------------------------------|--------------------------------|
-| errorSummary         | First 50 chars of failure.error     | "task $taskId failure"         |
-| failure.error        | Parsed from Error: line             | "Task execution failed"        |
-| failure.attemptedFix | Parsed from Attempted fix: line     | "No previous fix attempted"    |
-| originalTask.files   | Files field from original task      | Same directory as original     |
-| originalTask.verify  | Verify field from original task     | "echo 'Verify manually'"       |
-| $scope               | Derived from spec name or task area | "recovery"                     |
-| $errorType           | Error category (e.g., "syntax", "missing file") | "error"           |
-
-**Example Fix Task Generation**:
-
-Original task (failed):
-```markdown
-- [ ] 1.3 Add failure parser
-  - **Do**: Add parsing logic to implement.md
-  - **Files**: plugins/ralph-specum/commands/implement.md
-  - **Done when**: Parser extracts error details
-  - **Verify**: grep -q "Parse Failure" implement.md
-  - **Commit**: feat(coordinator): add failure parser
-```
-
-Failure object:
-```json
-{
-  "taskId": "1.3",
-  "error": "File not found: src/parser.ts",
-  "attemptedFix": "Checked alternate paths"
-}
-```
-
-Generated fix task:
-```markdown
-- [ ] 1.3.1 [FIX 1.3] Fix: File not found: src/parser.ts
-  - **Do**: Address the error: File not found: src/parser.ts
-    1. Analyze the failure: Checked alternate paths
-    2. Review related code in Files list
-    3. Implement fix for: File not found: src/parser.ts
-  - **Files**: plugins/ralph-specum/commands/implement.md
-  - **Done when**: Error "File not found: src/parser.ts" no longer occurs
-  - **Verify**: grep -q "Parse Failure" implement.md
-  - **Commit**: `fix(recovery): address missing file from task 1.3`
-```
-
-**Update State After Generation**:
-
-After generating the fix task:
-1. Increment `fixTaskMap[taskId].attempts`
-2. Add fix task ID to `fixTaskMap[taskId].fixTaskIds` array
-3. Store error in `fixTaskMap[taskId].lastError`
-4. Write updated .ralph-state.json
-
-**Insert Fix Task into tasks.md**:
-
-Use the Edit tool to cleanly insert the fix task after the current task block.
-
-**Algorithm**:
-
-1. **Read tasks.md content** using Read tool
-
-2. **Locate current task start**:
-   - Search for pattern: `- [ ] $taskId` or `- [x] $taskId`
-   - Store the line number as `taskStartLine`
-
-3. **Find current task block end**:
-   - Scan forward from `taskStartLine + 1`
-   - Task block ends at first line matching:
-     - `- [ ]` (next task start)
-     - `- [x]` (next completed task)
-     - `## Phase` (next phase header)
-     - End of file
-   - Store this line as `insertPosition`
-
-4. **Build insertion content**:
-   - Start with newline if needed for spacing
-   - Add the complete fix task markdown block:
-   ```markdown
-   - [ ] X.Y.N [FIX X.Y] Fix: $errorSummary
-     - **Do**: Address the error: $errorDetails
-       1. Analyze the failure: $attemptedFix
-       2. Review related code in Files list
-       3. Implement fix for: $errorDetails
-     - **Files**: $originalTaskFiles
-     - **Done when**: Error "$errorDetails" no longer occurs
-     - **Verify**: $originalTaskVerify
-     - **Commit**: `fix($scope): address $errorType from task $taskId`
-   ```
-   - Ensure proper indentation (2 spaces for sub-bullets)
-
-5. **Insert using Edit tool**:
-   - Use Edit tool with `old_string` = content at insertion point
-   - `new_string` = fix task markdown + original content at insertion point
-   - This places fix task immediately after original task block
-
-6. **Update state totalTasks**:
-   - Read .ralph-state.json
-   - Increment `totalTasks` by 1
-   - Write updated state
-
-**Example Insertion**:
-
-Before insertion (task 1.3 failed):
-
-```markdown
-- [ ] 1.3 Add failure parser
-  - **Do**: Add parsing logic
-  - **Files**: implement.md
-  - **Verify**: grep pattern
-  - **Commit**: feat: add parser
-
-- [ ] 1.4 Next task
-```
-
-After insertion:
-
-```markdown
-- [ ] 1.3 Add failure parser
-  - **Do**: Add parsing logic
-  - **Files**: implement.md
-  - **Verify**: grep pattern
-  - **Commit**: feat: add parser
-
-- [ ] 1.3.1 [FIX 1.3] Fix: File not found error
-  - **Do**: Address the error: File not found
-    1. Analyze the failure: Checked alternate paths
-    2. Review related code in Files list
-    3. Implement fix for: File not found
-  - **Files**: implement.md
-  - **Done when**: Error "File not found" no longer occurs
-  - **Verify**: grep pattern
-  - **Commit**: `fix(recovery): address missing file from task 1.3`
-
-- [ ] 1.4 Next task
-```
-
-**Execute Fix Task**:
-
-After insertion:
-1. Delegate fix task to spec-executor (same as section 6)
-2. On TASK_COMPLETE: retry original task
-3. On failure: loop back to section 6c (new fix task for fix task)
-
-**Retry Original Task**:
-
-After fix task completes:
-1. Return to original task (taskIndex unchanged)
-2. Delegate original task to spec-executor
-3. If TASK_COMPLETE: proceed to section 7 (verification) then section 8 (state update)
-4. If failure: loop back to section 6c (generate another fix task)
-
-### 6d. Iterative Failure Recovery Orchestrator
-
-This section orchestrates the complete failure recovery loop when recoveryMode is enabled.
-
-**Backwards Compatibility Note**:
-
-recoveryMode defaults to false. When recoveryMode is false or missing, the existing behavior (retry then stop) is preserved exactly. The recovery orchestrator only activates when recoveryMode is explicitly set to true via --recovery-mode flag.
-
-**Entry Point**:
-
-When spec-executor does NOT output TASK_COMPLETE:
-1. First, check if `recoveryMode` is true in .ralph-state.json
-2. If recoveryMode is false, undefined, or missing: skip to "ERROR: Max Retries Reached" (existing behavior preserved)
-3. If recoveryMode is explicitly true: proceed with iterative recovery
-
-**Recovery Loop Flow**:
-
-```text
-┌─────────────────────────────────────────────────────────────────┐
-│                    ITERATIVE FAILURE RECOVERY                   │
-├─────────────────────────────────────────────────────────────────┤
-│                                                                 │
-│  1. Task fails (no TASK_COMPLETE)                               │
-│     │                                                           │
-│     ▼                                                           │
-│  2. Check recoveryMode in state                                 │
-│     │                                                           │
-│     ├── false ──► Normal retry/stop behavior                    │
-│     │                                                           │
-│     ▼ (true)                                                    │
-│  3. Parse failure output (Section 6b)                           │
-│     Extract: taskId, error, attemptedFix                        │
-│     │                                                           │
-│     ▼                                                           │
-│  4. Check fix limits (Section 6c)                               │
-│     Read: fixTaskMap[taskId].attempts                           │
-│     │                                                           │
-│     ├── >= maxFixTasksPerOriginal ──► STOP with error           │
-│     │                                                           │
-│     ▼ (under limit)                                             │
-│  5. Generate fix task (Section 6c)                              │
-│     Create: X.Y.N [FIX X.Y] Fix: <error>                        │
-│     │                                                           │
-│     ▼                                                           │
-│  6. Insert fix task into tasks.md (Section 6c)                  │
-│     Position: immediately after original task                   │
-│     │                                                           │
-│     ▼                                                           │
-│  7. Update state                                                │
-│     - Increment fixTaskMap[taskId].attempts                     │
-│     - Add fix task ID to fixTaskMap[taskId].fixTaskIds          │
-│     - Increment totalTasks                                      │
-│     │                                                           │
-│     ▼                                                           │
-│  8. Execute fix task                                            │
-│     Delegate to spec-executor (same as Section 6)               │
-│     │                                                           │
-│     ├── TASK_COMPLETE ──► Proceed to step 9                     │
-│     │                                                           │
-│     └── No completion ──► Loop back to step 3                   │
-│         (fix task becomes current, can spawn its own fixes)     │
-│     │                                                           │
-│     ▼                                                           │
-│  9. Retry original task                                         │
-│     Delegate original task to spec-executor again               │
-│     │                                                           │
-│     ├── TASK_COMPLETE ──► Success! Section 7 verification       │
-│     │                                                           │
-│     └── No completion ──► Loop back to step 3                   │
-│         (generate another fix for original task)                │
-│                                                                 │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-**Step-by-Step Implementation**:
-
-**Step 1: Check Recovery Mode**
-
-```text
-Read .ralph-state.json
-# recoveryMode defaults to false for backwards compatibility
-If recoveryMode !== true (false, undefined, or missing):
-  - Increment taskIteration
-  - If taskIteration > maxTaskIterations: ERROR and STOP
-  - Otherwise: retry same task (existing behavior preserved)
-  - EXIT this section - do NOT proceed with recovery orchestration
-```
-
-**Step 2: Parse Failure (calls Section 6b)**
-
-```text
-Parse spec-executor output using pattern from Section 6b
-Build failure object:
-{
-  "taskId": "X.Y",
-  "error": "<from Error: line>",
-  "attemptedFix": "<from Attempted fix: line>",
-  "rawOutput": "<full output>"
-}
-```
-
-**Step 3: Check Fix Limits (from Section 6c)**
-
-```text
-Read fixTaskMap from state
-currentAttempts = fixTaskMap[taskId].attempts || 0
-
-If currentAttempts >= maxFixTasksPerOriginal:
-  - Output ERROR: "Max fix attempts ($max) reached for task $taskId"
-  - Show fix history: fixTaskMap[taskId].fixTaskIds
-  - Do NOT output ALL_TASKS_COMPLETE
-  - STOP execution
-```
-
-**Step 4: Generate Fix Task (calls Section 6c)**
+### Failure Handling
 
-```text
-Use failure object to create fix task markdown:
-- [ ] X.Y.N [FIX X.Y] Fix: <errorSummary>
-  - **Do**: Address the error: <error>
-  - **Files**: <originalTask.files>
-  - **Done when**: Error no longer occurs
-  - **Verify**: <originalTask.verify>
-  - **Commit**: fix(<scope>): address <errorType>
-
-Where N = currentAttempts + 1
-```
-
-**Step 5: Insert Fix Task (calls Section 6c)**
-
-```text
-Use Edit tool to insert fix task into tasks.md
-Position: immediately after original task block
-Update totalTasks in state
-```
-
-**Step 6: Update State (fixTaskMap tracking)**
-
-After generating a fix task, update fixTaskMap in state to track:
-- attempts: number of fix tasks generated for this original task
-- fixTaskIds: array of fix task IDs created
-- lastError: most recent error message
-
-**Implementation using jq**:
-
-```bash
-# Variables from context
-SPEC_PATH="./specs/$spec"
-TASK_ID="X.Y"           # Original task ID (e.g., "1.3")
-FIX_TASK_ID="X.Y.N"     # Generated fix task ID (e.g., "1.3.1")
-ERROR_MSG="$failure_error"  # Escaped error message from failure object
-
-# Read current state, update fixTaskMap, write back
-jq --arg taskId "$TASK_ID" \
-   --arg fixId "$FIX_TASK_ID" \
-   --arg error "$ERROR_MSG" \
-   '
-   # Initialize fixTaskMap if it does not exist
-   .fixTaskMap //= {} |
-
-   # Initialize entry for this task if it does not exist
-   .fixTaskMap[$taskId] //= {attempts: 0, fixTaskIds: [], lastError: ""} |
-
-   # Update the entry
-   .fixTaskMap[$taskId].attempts += 1 |
-   .fixTaskMap[$taskId].fixTaskIds += [$fixId] |
-   .fixTaskMap[$taskId].lastError = $error |
-
-   # Also increment totalTasks to account for inserted fix task
-   .totalTasks += 1
-   ' "$SPEC_PATH/.ralph-state.json" > "$SPEC_PATH/.ralph-state.json.tmp" && \
-   mv "$SPEC_PATH/.ralph-state.json.tmp" "$SPEC_PATH/.ralph-state.json"
-```
-
-**Example state after fix task generation**:
-
-Before (task 1.3 fails first time):
-```json
-{
-  "phase": "execution",
-  "taskIndex": 2,
-  "totalTasks": 10,
-  "fixTaskMap": {}
-}
-```
-
-After (fix task 1.3.1 generated):
-```json
-{
-  "phase": "execution",
-  "taskIndex": 2,
-  "totalTasks": 11,
-  "fixTaskMap": {
-    "1.3": {
-      "attempts": 1,
-      "fixTaskIds": ["1.3.1"],
-      "lastError": "File not found: src/parser.ts"
-    }
-  }
-}
-```
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/failure-recovery/SKILL.md`
 
-After second failure (fix task 1.3.2 generated):
-```json
-{
-  "phase": "execution",
-  "taskIndex": 2,
-  "totalTasks": 12,
-  "fixTaskMap": {
-    "1.3": {
-      "attempts": 2,
-      "fixTaskIds": ["1.3.1", "1.3.2"],
-      "lastError": "Syntax error in parser.ts line 42"
-    }
-  }
-}
-```
-
-**Reading fixTaskMap for limit checks**:
-
-```bash
-# Check current attempts for a task
-CURRENT_ATTEMPTS=$(jq -r --arg taskId "$TASK_ID" \
-  '.fixTaskMap[$taskId].attempts // 0' "$SPEC_PATH/.ralph-state.json")
-
-# Check if limit exceeded
-MAX_FIX=$(jq -r '.maxFixTasksPerOriginal // 3' "$SPEC_PATH/.ralph-state.json")
-if [ "$CURRENT_ATTEMPTS" -ge "$MAX_FIX" ]; then
-  echo "ERROR: Max fix attempts ($MAX_FIX) reached for task $TASK_ID"
-  # Show fix history
-  jq -r --arg taskId "$TASK_ID" \
-    '.fixTaskMap[$taskId].fixTaskIds | join(", ")' "$SPEC_PATH/.ralph-state.json"
-  exit 1
-fi
-```
-
-**Step 7: Execute Fix Task**
+Use this skill when spec-executor does NOT output TASK_COMPLETE:
+- Parse failure output to extract error details
+- Check recoveryMode state (defaults to false)
+- Generate fix tasks when recovery mode enabled
+- Insert fix tasks into tasks.md
+- Track fix attempts in fixTaskMap
+- Orchestrate the iterative recovery loop
+</skill-reference>
 
-```text
-Delegate fix task to spec-executor via Task tool
-Same delegation pattern as Section 6
-
-If TASK_COMPLETE:
-  - Mark fix task [x] in tasks.md
-  - Proceed to Step 8
-
-If no TASK_COMPLETE:
-  - Fix task itself failed
-  - Loop back to Step 2 with fix task as current task
-  - (Fix task can spawn its own fix tasks)
-```
+### Verification Before Advancing
 
-**Step 8: Retry Original Task**
-
-```text
-Return to original task (taskIndex unchanged)
-Delegate original task to spec-executor again
-
-If TASK_COMPLETE:
-  - Success! Proceed to Section 7 (verification layers)
-  - Then Section 8 (state update, advance taskIndex)
-
-If no TASK_COMPLETE:
-  - Original still failing after fix
-  - Loop back to Step 2
-  - Generate another fix task for original
-```
-
-**Example Recovery Sequence**:
-
-```text
-Initial: Task 1.3 fails
-  ↓
-Recovery Mode enabled
-  ↓
-Parse: error = "syntax error in parser.ts"
-  ↓
-Check: fixTaskMap["1.3"].attempts = 0 (under limit of 3)
-  ↓
-Generate: Task 1.3.1 [FIX 1.3] Fix: syntax error
-  ↓
-Insert: Add 1.3.1 after 1.3 in tasks.md
-  ↓
-Update: fixTaskMap["1.3"] = {attempts: 1, fixTaskIds: ["1.3.1"]}
-  ↓
-Execute: Delegate 1.3.1 to spec-executor
-  ↓
-1.3.1 completes with TASK_COMPLETE
-  ↓
-Retry: Delegate 1.3 to spec-executor again
-  ↓
-1.3 completes with TASK_COMPLETE
-  ↓
-Success! → Section 7 → Section 8 → Next task
-```
-
-**Nested Fix Example** (fix task fails):
-
-```text
-Task 1.3 fails → Generate 1.3.1
-  ↓
-1.3.1 fails → Generate 1.3.1.1 (fix for the fix)
-  ↓
-1.3.1.1 completes
-  ↓
-Retry 1.3.1 → completes
-  ↓
-Retry 1.3 → completes
-  ↓
-Success!
-```
-
-**Important Notes**:
-
-- Fix tasks can spawn their own fix tasks (recursive recovery)
-- Each original task tracks its own fix count independently
-- taskIndex does NOT advance during fix task execution
-- Only after original task passes does taskIndex advance
-- Fix task IDs use dot notation to show lineage: 1.3.1, 1.3.2, 1.3.1.1
-
-**Fix Task Progress Logging**:
-
-After original task completes (TASK_COMPLETE) following fix task recovery, log the fix task chain to .progress.md.
-
-Add/update section in .progress.md:
-```markdown
-## Fix Task History
-- Task 1.3: 2 fixes attempted (1.3.1, 1.3.2) - Final: PASS
-- Task 2.1: 1 fix attempted (2.1.1) - Final: PASS
-- Task 3.4: 3 fixes attempted (3.4.1, 3.4.2, 3.4.3) - Final: FAIL (max limit)
-```
-
-**Logging Implementation**:
-
-After successful original task retry (Step 8 TASK_COMPLETE):
-1. Check if fixTaskMap[$taskId] exists and has attempts > 0
-2. If yes, append fix task history entry to .progress.md:
-   ```
-   - Task $taskId: $attempts fixes attempted ($fixTaskIds) - Final: PASS
-   ```
-3. Use Edit tool to append to "## Fix Task History" section
-4. If section doesn't exist, create it before "## Learnings" section
-
-On max fix limit reached (section 6c limit error):
-1. Log failed recovery attempt:
-   ```
-   - Task $taskId: $attempts fixes attempted ($fixTaskIds) - Final: FAIL (max limit)
-   ```
-2. Include in .progress.md before stopping execution
-
-**Example Progress Update**:
-
-Before fix task logging:
-```markdown
-## Completed Tasks
-- [x] 1.1 Task A - abc123
-- [x] 1.2 Task B - def456
-
-## Learnings
-- Some learning
-```
-
-After fix task logging:
-```markdown
-## Completed Tasks
-- [x] 1.1 Task A - abc123
-- [x] 1.2 Task B - def456
-
-## Fix Task History
-- Task 1.2: 2 fixes attempted (1.2.1, 1.2.2) - Final: PASS
-
-## Learnings
-- Some learning
-```
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/verification-layers/SKILL.md`
 
-**ERROR: Max Retries Reached**
+Run 4-layer verification BEFORE advancing taskIndex:
+1. Contradiction detection - no "requires manual" + TASK_COMPLETE
+2. Uncommitted spec files check - tasks.md and .progress.md committed
+3. Checkmark verification - count matches taskIndex + 1
+4. Completion signal verification - explicit TASK_COMPLETE present
 
-If taskIteration exceeds maxTaskIterations:
-1. Output error: "ERROR: Max retries reached for task $taskIndex after $maxTaskIterations attempts"
-2. Include last error/failure reason from spec-executor output
-3. Suggest: "Review .progress.md Learnings section for failure details"
-4. Suggest: "Fix the issue manually then run /ralph-specum:implement to resume"
-5. Do NOT continue execution
-6. Do NOT output ALL_TASKS_COMPLETE
+All layers must pass before advancing state.
+</skill-reference>
 
-### 7. Verification Layers
-
-CRITICAL: Run these 4 verifications BEFORE advancing taskIndex. All must pass.
-
-**Layer 1: CONTRADICTION Detection**
-
-Check spec-executor output for contradiction patterns:
-- "requires manual"
-- "cannot be automated"
-- "could not complete"
-- "needs human"
-- "manual intervention"
-
-If TASK_COMPLETE appears alongside any contradiction phrase:
-- REJECT the completion
-- Log: "CONTRADICTION: claimed completion while admitting failure"
-- Increment taskIteration and retry
-
-**Layer 2: Uncommitted Spec Files Check**
-
-Before advancing, verify spec files are committed:
-
-```bash
-git status --porcelain ./specs/$spec/tasks.md ./specs/$spec/.progress.md
-```
-
-If output is non-empty (uncommitted changes):
-- REJECT the completion
-- Log: "uncommitted spec files detected - task not properly committed"
-- Increment taskIteration and retry
-
-All spec file changes must be committed before task is considered complete.
-
-**Layer 3: Checkmark Verification**
-
-Count completed tasks in tasks.md:
-
-```bash
-grep -c '\- \[x\]' ./specs/$spec/tasks.md
-```
-
-Expected checkmark count = taskIndex + 1 (0-based index, so task 0 complete = 1 checkmark)
-
-If actual count != expected:
-- REJECT the completion
-- Log: "checkmark mismatch: expected $expected, found $actual"
-- This detects state manipulation or incomplete task marking
-- Increment taskIteration and retry
-
-**Layer 4: TASK_COMPLETE Signal Verification**
-
-Verify spec-executor explicitly output TASK_COMPLETE:
-- Must be present in response
-- Not just implied or partial completion
-- Silent completion is not valid
-
-If TASK_COMPLETE missing:
-- Do NOT advance
-- Increment taskIteration and retry
-
-**Verification Summary**
-
-All 4 layers must pass:
-1. No contradiction phrases with completion claim
-2. Spec files committed (no uncommitted changes)
-3. Checkmark count matches expected taskIndex + 1
-4. Explicit TASK_COMPLETE signal present
-
-Only after all verifications pass, proceed to State Update (section 8).
-
-### 8. State Update
-
-After successful completion (TASK_COMPLETE for sequential or all parallel tasks complete):
-
-**Sequential Update**:
-1. Read current .ralph-state.json
-2. Increment taskIndex by 1
-3. Reset taskIteration to 1
-4. Write updated state
-
-**Parallel Batch Update**:
-1. Read current .ralph-state.json
-2. Set taskIndex to parallelGroup.endIndex + 1 (jump past entire batch)
-3. Reset taskIteration to 1
-4. Write updated state
-
-State structure:
-```json
-{
-  "phase": "execution",
-  "taskIndex": <next task after current/batch>,
-  "totalTasks": <unchanged>,
-  "taskIteration": 1,
-  "maxTaskIterations": <unchanged>
-}
-```
-
-Check if all tasks complete:
-- If taskIndex >= totalTasks: proceed to section 10 (Completion Signal)
-- If taskIndex < totalTasks: continue to next iteration (loop re-invokes coordinator)
-
-### 9. Progress Merge
-
-**Parallel Only**: After parallel batch completes:
+### Progress Merge (Parallel Only)
 
+After parallel batch completes:
 1. Read each temp progress file (.progress-task-N.md)
 2. Extract completed task entries and learnings
 3. Append to main .progress.md in task index order
 4. Delete temp files after merge
 
-Merge format in .progress.md:
-```markdown
-## Completed Tasks
-- [x] 3.1 Task A - abc123
-- [x] 3.2 Task B - def456  <- merged from temp files
-- [x] 3.3 Task C - ghi789
-```
-
-**ERROR: Partial Parallel Batch Failure**
-
-If any parallel task failed (no TASK_COMPLETE in its output):
-1. Identify which task(s) failed from the batch
-2. Note successful tasks in .progress.md
-3. For failed tasks, increment taskIteration
-4. If failed task exceeds maxTaskIterations: output "ERROR: Max retries reached for parallel task $failedTaskIndex"
-5. Otherwise: retry ONLY the failed task(s), do NOT re-run successful ones
-6. Do NOT advance taskIndex past the batch until ALL tasks in batch complete
-7. Merge only successful task progress files
-
-### 10. Completion Signal
+### Completion Signal
 
 **Phase 5 Detection**: Before outputting ALL_TASKS_COMPLETE, check if Phase 5 (PR Lifecycle) is required:
 
 1. Read tasks.md to detect Phase 5 tasks (look for "Phase 5: PR Lifecycle" section)
 2. If Phase 5 exists AND taskIndex >= totalTasks:
-   - Enter PR Lifecycle Loop (section 11)
+   - Enter PR Lifecycle Loop
    - Do NOT output ALL_TASKS_COMPLETE yet
 3. If NO Phase 5 OR Phase 5 complete:
    - Proceed with standard completion
@@ -1098,17 +193,10 @@ Before outputting:
 
 This signal terminates the Ralph Loop loop.
 
-**PR Link Output**: If a PR was created during execution, output the PR URL after ALL_TASKS_COMPLETE:
-```text
-ALL_TASKS_COMPLETE
-
-PR: https://github.com/owner/repo/pull/123
-```
-
 Do NOT output ALL_TASKS_COMPLETE if tasks remain incomplete.
 Do NOT output TASK_COMPLETE (that's for spec-executor only).
 
-### 11. PR Lifecycle Loop (Phase 5)
+### PR Lifecycle Loop (Phase 5)
 
 CRITICAL: Phase 5 is continuous autonomous PR management. Do NOT stop until all criteria met.
 
@@ -1116,106 +204,17 @@ CRITICAL: Phase 5 is continuous autonomous PR management. Do NOT stop until all
 - All Phase 1-4 tasks complete
 - Phase 5 tasks detected in tasks.md
 
-**Loop Structure**:
-```text
-PR Creation → CI Monitoring → Review Check → Fix Issues → Push → Repeat
-```
-
-**Step 1: Create PR (if not exists)**
-
-Delegate to spec-executor:
-```text
-Task: Create pull request
-
-Do:
-1. Verify not on default branch: git branch --show-current
-2. Push branch: git push -u origin <branch>
-3. Create PR: gh pr create --title "feat: <spec>" --body "<summary>"
-
-Verify: gh pr view shows PR created
-Done when: PR URL returned
-Commit: None
-```
-
-**Step 2: CI Monitoring Loop**
-
-```text
-While (CI checks not all green):
-  1. Wait 3 minutes (allow CI to start/complete)
-  2. Check status: gh pr checks
-  3. If failures:
-     - Read failure details: gh run view --log-failed
-     - Create new Phase 5.X task in tasks.md:
-       - [ ] 5.X Fix CI failure: <failure summary>
-         - **Do**: <steps to fix based on failure logs>
-         - **Files**: <files to modify based on failure>
-         - **Done when**: CI check passes
-         - **Verify**: gh pr checks shows this check ✓
-         - **Commit**: fix: address CI failure - <summary>
-     - Delegate new task to spec-executor with task index and Files list
-     - Wait for TASK_COMPLETE
-     - Push fixes (if not already pushed by spec-executor)
-     - Restart wait cycle
-  4. If pending:
-     - Continue waiting
-  5. If all green:
-     - Proceed to Step 3
-```
-
-**Step 3: Review Comment Check**
-
-```text
-1. Fetch review states: gh pr view --json reviews
-   - Parse for reviews with state "CHANGES_REQUESTED" or "PENDING"
-   - Note: --json reviews returns review-level state but NOT inline comment threads
-   - For inline comments, use REST API: gh api repos/{owner}/{repo}/pulls/{number}/reviews
-   - Or use review comments endpoint: gh api repos/{owner}/{repo}/pulls/{number}/comments
-2. Parse for unresolved reviews/comments
-3. If unresolved reviews/comments found:
-   - Create tasks from reviews (add to tasks.md as Phase 5.X)
-   - For each review/comment:
-     - [ ] 5.X Address review: <reviewer> - <summary>
-       - **Do**: <change requested>
-       - **Files**: <files to modify>
-       - **Done when**: Review comment addressed
-       - **Verify**: Code implements requested change
-       - **Commit**: fix: address review - <summary>
-   - Delegate each to spec-executor
-   - Wait for completion
-   - Push fixes
-   - Return to Step 2 (re-check CI)
-4. If no unresolved reviews/comments:
-   - Proceed to Step 4
-```
-
-**Step 4: Final Validation**
-
-All must be true:
-- ✅ All Phase 1-4 tasks complete (checked [x])
-- ✅ All Phase 5 tasks complete
-- ✅ CI checks all green
-- ✅ No unresolved review comments
-- ✅ Zero test regressions (all existing tests pass)
-- ✅ Code is modular/reusable (verified in .progress.md)
-
-**Step 5: Completion**
-
-When all Step 4 criteria met:
-1. Update .progress.md with final state
-2. Delete .ralph-state.json
-3. Get PR URL: `gh pr view --json url -q .url`
-4. Output: ALL_TASKS_COMPLETE
-5. Output: PR link (e.g., "PR: https://github.com/owner/repo/pull/123")
+**Loop Steps**:
+1. Create PR (if not exists): `gh pr create --title "feat: <spec>" --body "<summary>"`
+2. CI Monitoring: Wait 3 min, check `gh pr checks`, fix failures, repeat
+3. Review Comments: Check `gh pr view --json reviews`, address feedback
+4. Final Validation: All tasks [x], CI green, no unresolved reviews
+5. Completion: Delete state, output ALL_TASKS_COMPLETE with PR link
 
 **Timeout Protection**:
 - Max 48 hours in PR Lifecycle Loop
 - Max 20 CI monitoring cycles
 - If exceeded: Output error and STOP (do not output ALL_TASKS_COMPLETE)
-
-**Error Handling**:
-- If CI fails after 5 retry attempts: STOP with error
-- If review comments cannot be addressed: STOP with error
-- Document all failures in .progress.md Learnings
 ```
 
 ## Output on Start
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index f0041832..664aa207 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -51,6 +51,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.11 Create commit-discipline skill
 - [x] 2.12 Create quality-checkpoints skill
 - [x] 2.13 Create quality-commands skill
+- [x] 2.15 Simplify implement.md command
 
 ## Current Task
 
@@ -58,6 +59,20 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.15: Simplify implement.md
+- Reduced from 1235 lines to 233 lines (81% reduction)
+- Used <skill-reference> blocks to reference: coordinator-pattern, failure-recovery, verification-layers
+- Kept core orchestration logic: Ralph Loop dependency check, prerequisites, state initialization, Ralph Loop invocation
+- Coordinator prompt now delegates to skills for detailed procedures
+- PR Lifecycle Loop kept inline as it's specific to this command's flow
+
+### Verification: 2.14 [VERIFY] Quality checkpoint: new skills batch 3
+- Status: PASS
+- Verified: 4/4 skills exist with version field
+- Skills checked: phase-rules (v0.1.0), commit-discipline (v0.1.0), quality-checkpoints (v0.1.0), quality-commands (v0.1.0)
+- All files at: plugins/ralph-specum/skills/{skill-name}/SKILL.md
+- No fixes needed
+
 ### Verification: 2.9 [VERIFY] Quality checkpoint: new skills batch 2
 - Status: PASS
 - Verified: 4/4 skills exist with version field
@@ -129,7 +144,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.14 [VERIFY] Quality checkpoint: new skills batch 3
+Task 2.16 Simplify start.md command
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 3b93d107..fd3ed436 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -361,7 +361,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add quality-commands skill`
   - _Design: New Skills - quality-commands_
 
-- [ ] 2.14 [VERIFY] Quality checkpoint: new skills batch 3
+- [x] 2.14 [VERIFY] Quality checkpoint: new skills batch 3
   - **Do**: Verify skills 8-11 have proper structure
   - **Verify**: `count=0; for s in phase-rules commit-discipline quality-checkpoints quality-commands; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
   - **Done when**: All 4 skills exist with version field
@@ -369,7 +369,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
 
 ### B2: Simplify Commands
 
-- [ ] 2.15 Simplify implement.md command
+- [x] 2.15 Simplify implement.md command
   - **Do**:
     1. Replace inline coordinator prompt with skill reference to coordinator-pattern
     2. Replace inline recovery logic with skill reference to failure-recovery

From 69a82ba219d9b64ab7b2f511b655ce5e13858212 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:16:04 +0200
Subject: [PATCH 26/37] refactor(ralph-specum): simplify start.md to reference
 skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/commands/start.md | 915 +++----------------------
 specs/refactor-plugins/.progress.md    |  10 +-
 specs/refactor-plugins/tasks.md        |   2 +-
 3 files changed, 102 insertions(+), 825 deletions(-)
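
Reviewer note (illustrative only, not part of the change): the "Commit Spec Flag Logic" block that this patch keeps in start.md describes a four-step precedence for `commitSpec`. A minimal shell sketch of that precedence follows, assuming the flags arrive as separate words. The function name `resolve_commit_spec` is hypothetical and does not exist in the plugin.

```shell
# Hypothetical helper: resolves commitSpec from command arguments.
# Precedence (as described in start.md):
#   --no-commit-spec > --commit-spec > --quick default (false) > normal default (true)
resolve_commit_spec() {
  no_commit=0
  commit=0
  quick=0
  # Record which flags are present; non-flag words (name, goal) are ignored.
  for arg in "$@"; do
    case "$arg" in
      --no-commit-spec) no_commit=1 ;;
      --commit-spec)    commit=1 ;;
      --quick)          quick=1 ;;
    esac
  done
  if [ "$no_commit" -eq 1 ]; then
    echo false
  elif [ "$commit" -eq 1 ]; then
    echo true
  elif [ "$quick" -eq 1 ]; then
    echo false
  else
    echo true
  fi
}
```

Under this sketch, `resolve_commit_spec user-auth --quick --commit-spec` would resolve to `true`, because an explicit `--commit-spec` outranks the quick-mode default. In the plugin itself the coordinator reads this logic from the prompt text and does not run a script, so treat the function purely as a way to check the ordering.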

diff --git a/plugins/ralph-specum/commands/start.md b/plugins/ralph-specum/commands/start.md
index 51bbbb40..32bc0204 100644
--- a/plugins/ralph-specum/commands/start.md
+++ b/plugins/ralph-specum/commands/start.md
@@ -10,211 +10,45 @@ Smart entry point for ralph-specum. Detects whether to create a new spec or resu
 
 ## Branch Management (FIRST STEP)
 
-<mandatory>
-Before creating any files or directories, check the current git branch and handle appropriately.
-</mandatory>
-
-### Step 1: Check Current Branch
-
-```bash
-git branch --show-current
-```
-
-### Step 2: Determine Default Branch
-
-Check which is the default branch:
-```bash
-git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@'
-```
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/branch-management/SKILL.md`
+Before creating any files, check git branch and handle appropriately. Use the branch-management skill for branch detection, creation, worktree setup, and naming conventions.
 
-If that fails, assume `main` or `master` (check which exists):
-```bash
-git rev-parse --verify origin/main 2>/dev/null && echo "main" || echo "master"
-```
-
-### Step 3: Branch Decision Logic
+In quick mode, use Quick Mode Branch Handling (auto-create branch, no prompts).
+</skill-reference>
 
-```text
-1. Get current branch name
-   |
-   +-- ON DEFAULT BRANCH (main/master):
-   |   |
-   |   +-- Ask user for branch strategy:
-   |   |   "Starting new spec work. How would you like to handle branching?"
-   |   |   1. Create branch in current directory (git checkout -b)
-   |   |   2. Create git worktree (separate directory)
-   |   |
-   |   +-- If user chooses 1 (current directory):
-   |   |   - Generate branch name from spec name: feat/$specName
-   |   |   - If spec name not yet known, use temp name: feat/spec-work-<timestamp>
-   |   |   - Create and switch: git checkout -b <branch-name>
-   |   |   - Inform user: "Created branch '<branch-name>' for this work"
-   |   |   - Suggest: "Run /ralph-specum:research to start the research phase."
-   |   |
-   |   +-- If user chooses 2 (worktree):
-   |   |   - Generate branch name from spec name: feat/$specName
-   |   |   - Determine worktree path: ../<repo-name>-<spec-name> or prompt user
-   |   |   - Create worktree: git worktree add <path> -b <branch-name>
-   |   |   - Inform user: "Created worktree at '<path>' on branch '<branch-name>'"
-   |   |   - IMPORTANT: Suggest user to cd to worktree and resume conversation there:
-   |   |     "For best results, cd to '<path>' and start a new Claude Code session from there."
-   |   |     "Then run /ralph-specum:research to begin."
-   |   |   - STOP HERE - do not continue to Parse Arguments (user needs to switch directories)
-   |   |
-   |   +-- Continue to Parse Arguments
-   |
-   +-- ON NON-DEFAULT BRANCH (feature branch):
-       |
-       +-- Ask user for preference:
-       |   "You are currently on branch '<current-branch>'.
-       |    Would you like to:
-       |    1. Continue working on this branch
-       |    2. Create a new branch in current directory
-       |    3. Create git worktree (separate directory)"
-       |
-       +-- If user chooses 1 (continue):
-       |   - Stay on current branch
-       |   - Suggest: "Run /ralph-specum:research to start the research phase."
-       |   - Continue to Parse Arguments
-       |
-       +-- If user chooses 2 (new branch):
-       |   - Generate branch name from spec name: feat/$specName
-       |   - If spec name not yet known, use temp name: feat/spec-work-<timestamp>
-       |   - Create and switch: git checkout -b <branch-name>
-       |   - Inform user: "Created branch '<branch-name>' for this work"
-       |   - Suggest: "Run /ralph-specum:research to start the research phase."
-       |   - Continue to Parse Arguments
-       |
-       +-- If user chooses 3 (worktree):
-           - Generate branch name from spec name: feat/$specName
-           - Determine worktree path: ../<repo-name>-<spec-name> or prompt user
-           - Create worktree: git worktree add <path> -b <branch-name>
-           - Inform user: "Created worktree at '<path>' on branch '<branch-name>'"
-           - IMPORTANT: Suggest user to cd to worktree and resume conversation there:
-             "For best results, cd to '<path>' and start a new Claude Code session from there."
-             "Then run /ralph-specum:research to begin."
-           - STOP HERE - do not continue to Parse Arguments (user needs to switch directories)
-```
+## Parse Arguments
 
-### Branch Naming Convention
+From `$ARGUMENTS`, extract:
+- **name**: Optional spec name (kebab-case)
+- **goal**: Everything after the name except flags (optional)
+- **--fresh**: Force new spec without prompting if one exists
+- **--quick**: Skip all spec phases, auto-generate artifacts, start execution immediately
+- **--commit-spec**: Commit and push spec files after generation (default: true in normal mode, false in quick mode)
+- **--no-commit-spec**: Explicitly disable committing spec files
 
-When creating a new branch:
-- Use format: `feat/<spec-name>` (e.g., `feat/user-auth`)
-- If spec name contains special chars, sanitize to kebab-case
-- If branch already exists, append `-2`, `-3`, etc.
+### Commit Spec Flag Logic
 
-Example:
 ```text
-Spec name: user-auth
-Branch: feat/user-auth
-
-If feat/user-auth exists:
-Branch: feat/user-auth-2
+1. Check if --no-commit-spec in $ARGUMENTS -> commitSpec = false
+2. Else if --commit-spec in $ARGUMENTS -> commitSpec = true
+3. Else if --quick in $ARGUMENTS -> commitSpec = false (quick mode default)
+4. Else -> commitSpec = true (normal mode default)
 ```
 
-### Worktree Details
-
-When user chooses worktree option:
-
-**State files copied to worktree:**
-- `specs/.current-spec` - Active spec name pointer
-- `specs/$SPEC_NAME/.ralph-state.json` - Loop state (phase, taskIndex, iterations)
-- `specs/$SPEC_NAME/.progress.md` - Progress tracking and learnings
-
-These files are copied when:
-1. The worktree is created via `git worktree add`
-2. A spec is currently active (SPEC_NAME known or readable from .current-spec)
-3. The source files exist in the main worktree
-
-Copy uses non-overwrite semantics (skips if file already exists in target).
-
-```bash
-# Get repo name for path suggestion
-REPO_NAME=$(basename $(git rev-parse --show-toplevel))
-
-# If SPEC_NAME empty but .current-spec exists, read from it (before using for path/branch)
-if [ -z "$SPEC_NAME" ] && [ -f "./specs/.current-spec" ]; then
-    SPEC_NAME=$(cat "./specs/.current-spec") || true
-fi
-
-# Default worktree path
-WORKTREE_PATH="../${REPO_NAME}-${SPEC_NAME}"
-
-# Create worktree with new branch
-git worktree add "$WORKTREE_PATH" -b "feat/${SPEC_NAME}"
-
-# Copy spec state files to worktree (failures are warnings, not errors)
-if [ -d "./specs" ]; then
-    mkdir -p "$WORKTREE_PATH/specs" || echo "Warning: Failed to create specs directory in worktree"
-
-    # Copy .current-spec if exists (don't overwrite existing)
-    if [ -f "./specs/.current-spec" ] && [ ! -f "$WORKTREE_PATH/specs/.current-spec" ]; then
-        cp "./specs/.current-spec" "$WORKTREE_PATH/specs/.current-spec" || echo "Warning: Failed to copy .current-spec to worktree"
-    fi
-
-    # If spec name known, copy spec state files
-    if [ -n "$SPEC_NAME" ] && [ -d "./specs/$SPEC_NAME" ]; then
-        mkdir -p "$WORKTREE_PATH/specs/$SPEC_NAME" || echo "Warning: Failed to create spec directory in worktree"
-
-        # Copy state files (don't overwrite existing)
-        if [ -f "./specs/$SPEC_NAME/.ralph-state.json" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.ralph-state.json" ]; then
-            cp "./specs/$SPEC_NAME/.ralph-state.json" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .ralph-state.json to worktree"
-        fi
-
-        if [ -f "./specs/$SPEC_NAME/.progress.md" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.progress.md" ]; then
-            cp "./specs/$SPEC_NAME/.progress.md" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .progress.md to worktree"
-        fi
-    fi
-fi
-```
-
-After worktree creation:
-- Inform user of the worktree path
-- IMPORTANT: Output clear guidance for the user:
-  ```text
-  Created worktree at '<path>' on branch '<branch-name>'
-  Spec state files copied to worktree.
-
-  For best results, cd to the worktree directory and start a new Claude Code session from there:
-
-    cd <path>
-    claude
-
-  Then run /ralph-specum:research to begin the research phase.
-  ```
-- STOP the command here - do not continue to Parse Arguments or create spec files
-- The user needs to switch directories first to work in the worktree
-- To clean up later: `git worktree remove <path>`
-
-### Quick Mode Branch Handling
-
-In `--quick` mode, still perform branch check but skip the user prompt for non-default branches:
-- If on default branch: auto-create feature branch in current directory (no worktree prompt in quick mode)
-- If on non-default branch: stay on current branch (no prompt, quick mode is non-interactive)
-
-## Quick Mode Uses Ralph Loop
-
-In quick mode (`--quick`), execution uses `/ralph-loop` for autonomous task completion.
-
-After generating spec artifacts in quick mode, invoke ralph-loop:
-```text
-Skill: ralph-loop:ralph-loop
-Args: Read ./specs/$spec/.coordinator-prompt.md and follow those instructions exactly. Output ALL_TASKS_COMPLETE when done. --max-iterations <calculated> --completion-promise ALL_TASKS_COMPLETE
-```
+Examples:
+- `/ralph-specum:start` -> Auto-detect: resume active or ask for new
+- `/ralph-specum:start user-auth` -> Resume or create user-auth
+- `/ralph-specum:start user-auth Add OAuth2` -> Create user-auth with goal
+- `/ralph-specum:start user-auth --fresh` -> Force new, overwrite if exists
+- `/ralph-specum:start "Build auth with JWT" --quick` -> Quick mode with goal string
 
 <mandatory>
 ## CRITICAL: Delegation Requirement
 
 **YOU ARE A COORDINATOR, NOT AN IMPLEMENTER.**
 
-You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE regardless of mode (normal or quick).
-
-**NEVER do any of these yourself:**
-- Write code or modify source files
-- Perform research or analysis
-- Generate spec artifacts (research.md, requirements.md, design.md, tasks.md)
-- Execute task steps
-- Run verification commands as part of task execution
+You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE regardless of mode.
 
 **ALWAYS delegate to the appropriate subagent:**
 | Work Type | Subagent |
@@ -225,234 +59,16 @@ You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE rega
 | Task Planning | `task-planner` |
 | Artifact Generation (quick mode) | `plan-synthesizer` |
 | Task Execution | `spec-executor` |
-
-Quick mode does NOT exempt you from delegation - it only skips interactive phases.
 </mandatory>
 
 <mandatory>
 ## CRITICAL: Stop After Each Subagent (Normal Mode)
 
-In normal mode (no `--quick` flag), you MUST STOP your response after each subagent completes.
-
-**After invoking a subagent via Task tool:**
-1. Wait for subagent to return
-2. Output a brief status message (e.g., "Research phase complete. Run /ralph-specum:requirements to continue.")
-3. **END YOUR RESPONSE IMMEDIATELY**
-
-**DO NOT:**
-- Invoke another subagent in the same response
-- Continue to the next phase automatically
-- Ask if the user wants to continue
-
-**The user must explicitly run the next command.** This gives them time to review artifacts.
+In normal mode (no `--quick` flag), you MUST STOP your response after each subagent completes. The user must explicitly run the next command.
 
 Exception: `--quick` mode runs all phases without stopping.
 </mandatory>
 
-
-## Parse Arguments
-
-From `$ARGUMENTS`, extract:
-- **name**: Optional spec name (kebab-case)
-- **goal**: Everything after the name except flags (optional)
-- **--fresh**: Force new spec without prompting if one exists
-- **--quick**: Skip all spec phases, auto-generate artifacts, start execution immediately
-- **--commit-spec**: Commit and push spec files after generation (default: true in normal mode, false in quick mode)
-- **--no-commit-spec**: Explicitly disable committing spec files
-
-### Commit Spec Flag Logic
-
-```text
-1. Check if --no-commit-spec in $ARGUMENTS → commitSpec = false
-2. Else if --commit-spec in $ARGUMENTS → commitSpec = true
-3. Else if --quick in $ARGUMENTS → commitSpec = false (quick mode default)
-4. Else → commitSpec = true (normal mode default)
-```
-
-Examples:
-- `/ralph-specum:start` -> Auto-detect: resume active or ask for new
-- `/ralph-specum:start user-auth` -> Resume or create user-auth
-- `/ralph-specum:start user-auth Add OAuth2` -> Create user-auth with goal
-- `/ralph-specum:start user-auth --fresh` -> Force new, overwrite if exists
-- `/ralph-specum:start "Build auth with JWT" --quick` -> Quick mode with goal string
-- `/ralph-specum:start my-feature "Add logging" --quick` -> Quick mode with name+goal
-- `/ralph-specum:start ./my-plan.md --quick` -> Quick mode with file input
-- `/ralph-specum:start my-feature ./plan.md --quick` -> Quick mode with name+file
-- `/ralph-specum:start my-feature --quick` -> Quick mode using existing plan.md
-
-## Quick Mode Flow
-
-When `--quick` flag detected, bypass interactive spec phases and auto-generate all artifacts.
-
-### Quick Mode Input Detection
-
-Parse arguments before `--quick` flag and classify input type:
-
-```text
-Input Classification:
-
-1. TWO ARGS before --quick:
-   - First arg = spec name (must be kebab-case: ^[a-z0-9-]+$)
-   - Second arg = goal string OR file path
-   - Detect file path if: starts with "./" OR "/" OR ends with ".md"
-   - Examples:
-     - `my-feature "Add login" --quick` -> name=my-feature, goal="Add login"
-     - `my-feature ./plan.md --quick` -> name=my-feature, file=./plan.md
-
-2. ONE ARG before --quick:
-   a. FILE PATH: starts with "./" OR "/" OR ends with ".md"
-      - Read file content as plan
-      - Infer name from plan content
-      - Example: `./my-plan.md --quick` -> read file, infer name
-
-   b. KEBAB-CASE NAME: matches ^[a-z0-9-]+$
-      - Check if ./specs/$name/plan.md exists
-      - If exists: use plan.md content, name=$name
-      - If not exists: error "No plan.md found in ./specs/$name/. Provide goal: /ralph-specum:start $name 'your goal' --quick"
-      - Example: `my-feature --quick` -> check ./specs/my-feature/plan.md
-
-   c. GOAL STRING: anything else (contains spaces, uppercase, special chars)
-      - Use as goal content
-      - Infer name from goal
-      - Example: `"Build auth with JWT" --quick` -> goal, infer name
-
-3. ZERO ARGS with --quick:
-   - Error: "Quick mode requires a goal or plan file"
-```
-
-### File Reading
-
-When file path detected:
-1. Validate file exists using Read tool
-2. If not exists: error "File not found: $filePath"
-3. Read file content
-4. Strip frontmatter if present (content between --- markers at start)
-5. If content empty after stripping: error "Plan content is empty. Provide a goal or non-empty file."
-6. Use content as planContent
-
-### Existing Plan Check
-
-When kebab-case name provided without goal:
-1. Check if `./specs/$name/plan.md` exists
-2. If exists: read content, use as planContent
-3. If not exists: error with guidance message
-
-### Name Inference
-
-If no explicit name provided, infer from goal:
-1. Take first 3 words of goal
-2. Convert to kebab-case (lowercase, spaces to hyphens)
-3. Truncate to max 30 characters
-4. Strip non-alphanumeric except hyphens
-
-Example: "Build authentication with JWT tokens" -> "build-authentication-with"
-
-### Quick Mode Execution
-
-<mandatory>
-**REMINDER: Even in quick mode, you MUST delegate ALL work to subagents.**
-- Artifact generation → delegate to `plan-synthesizer` via Task tool
-- Task execution → delegate to `spec-executor` via Task tool
-- You only handle: directory creation, state file writes, and coordination
-</mandatory>
-
-```text
-1. Validate input (non-empty goal/plan)
-   |
-2. Infer name from goal (if not provided)
-   |
-3. Create spec directory: ./specs/$name/
-   |
-3a. Ensure gitignore entries exist for spec state files:
-   - Add specs/.current-spec to .gitignore if not present
-   - Add **/.progress.md to .gitignore if not present
-   |
-4. Write .ralph-state.json:
-   {
-     "source": "plan",
-     "name": "$name",
-     "basePath": "./specs/$name",
-     "phase": "tasks",
-     "taskIndex": 0,
-     "totalTasks": 0,
-     "taskIteration": 1,
-     "maxTaskIterations": 5,
-     "globalIteration": 1,
-     "maxGlobalIterations": 100,
-     "commitSpec": $commitSpec
-   }
-   |
-5. Write .progress.md with original goal
-   |
-6. Update .current-spec: echo "$name" > ./specs/.current-spec
-   |
-7. Invoke plan-synthesizer agent via Task tool:
-   Task: plan-synthesizer
-   Input: goal="$goal", basePath="./specs/$name"
-   |
-8. After generation completes:
-   - Update .ralph-state.json: phase="execution", taskIndex=0
-   - Read tasks.md to get totalTasks count
-   |
-8a. If commitSpec is true:
-   - Stage spec files: git add ./specs/$name/research.md ./specs/$name/requirements.md ./specs/$name/design.md ./specs/$name/tasks.md
-   - Commit: git commit -m "spec($name): add spec artifacts"
-   - Push: git push -u origin $(git branch --show-current)
-   |
-9. Display brief summary:
-   Quick mode: Created spec '$name'
-   [If commitSpec: "Spec committed and pushed."]
-   Starting execution...
-   |
-10. Invoke spec-executor for task 1
-```
-
-### Quick Mode Validation
-
-Before creating the spec, validate all inputs:
-
-```text
-Validation Sequence:
-
-1. ZERO ARGS CHECK (if no args before --quick)
-   - Error: "Quick mode requires a goal or plan file"
-
-2. FILE NOT FOUND (if file path detected)
-   - If file not exists: "File not found: $filePath"
-
-3. EMPTY CONTENT CHECK
-   - If empty or whitespace only: "Plan content is empty. Provide a goal or non-empty file."
-
-4. PLAN TOO SHORT WARNING (< 10 words)
-   - If word count < 10: "Warning: Short plan may produce vague tasks"
-   - Continue with warning displayed
-
-5. NAME CONFLICT RESOLUTION
-   - If ./specs/$name/ already exists:
-     - Append -2, -3, etc. until unique name found
-     - Display: "Created '$name-2' ($name already exists)"
-```
-
-### Atomic Rollback
-
-On generation failure after spec directory created:
-
-```text
-Rollback Procedure:
-
-1. CAPTURE FAILURE
-   - plan-synthesizer agent returns error or times out
-
-2. DELETE SPEC DIRECTORY
-   - rm -rf "./specs/$name"
-
-3. RESTORE .current-spec
-   - If previous spec was set, restore it
-
-4. DISPLAY ERROR
-   - "Generation failed: $errorReason. No spec created."
-```
-
 ## Detection Logic
 
 ```text
@@ -478,19 +94,8 @@ Rollback Procedure:
 ## Resume Flow
 
 1. Read `./specs/$name/.ralph-state.json`
-2. If no state file (completed or never started):
-   - Check what files exist (research.md, requirements.md, etc.)
-   - Determine last completed phase
-   - Ask: "Continue to next phase or restart?"
-3. If state file exists:
-   - Read current phase and task index
-   - Show brief status:
-     ```
-     Resuming '$name'
-     Phase: execution, Task 3/8
-     Last: "Add error handling"
-     ```
-   - Continue from current phase
+2. If no state file: check what files exist, determine last completed phase, ask whether to continue or restart
+3. If state file exists: read phase/task index, show status, continue
 
 ### Resume by Phase
 
@@ -505,445 +110,111 @@ Rollback Procedure:
 <mandatory>
 ## CRITICAL: Stop After Subagent Completes
 
-After ANY subagent (research-analyst, product-manager, architect-reviewer, task-planner) returns, you MUST:
-
-1. **Read the state file**: `cat ./specs/$name/.ralph-state.json`
-2. **Check awaitingApproval**: If `awaitingApproval: true`, you MUST STOP IMMEDIATELY
-3. **Do NOT invoke the next phase** - the user must explicitly run the next command
-
-```text
-Subagent returns
-↓
-Read .ralph-state.json
-↓
-awaitingApproval == true?
-↓
-YES → STOP. Output: "Phase complete. Run /ralph-specum:<next> to continue."
-NO → Continue (only in quick mode where awaitingApproval is not set)
-```
-
-**This is NON-NEGOTIABLE in normal mode.** Each phase requires user approval before proceeding.
-
-The only exception is `--quick` mode, which skips approval between phases.
+After ANY subagent returns, read `.ralph-state.json`. If `awaitingApproval: true`, STOP IMMEDIATELY.
+Do NOT invoke the next phase - the user must run the next command explicitly.
 </mandatory>
 
 ## New Flow
 
-1. If no name provided, ask:
-   - "What should we call this spec?" (validates kebab-case)
-2. If no goal provided, ask:
-   - "What is the goal? Describe what you want to build."
+1. If no name provided, ask for spec name (kebab-case)
+2. If no goal provided, ask for goal description
 3. Create spec directory: `./specs/$name/`
 4. Update active spec: `echo "$name" > ./specs/.current-spec`
-5. Ensure gitignore entries exist for spec state files:
-   ```bash
-   # Add .current-spec and .progress.md to .gitignore if not already present
-   if [ -f .gitignore ]; then
-     grep -q "specs/.current-spec" .gitignore || echo "specs/.current-spec" >> .gitignore
-     grep -q "\*\*/\.progress\.md" .gitignore || echo "**/.progress.md" >> .gitignore
-   else
-     echo "specs/.current-spec" > .gitignore
-     echo "**/.progress.md" >> .gitignore
-   fi
-   ```
-6. Initialize `.ralph-state.json`:
-   ```json
-   {
-     "source": "spec",
-     "name": "$name",
-     "basePath": "./specs/$name",
-     "phase": "research",
-     "taskIndex": 0,
-     "totalTasks": 0,
-     "taskIteration": 1,
-     "maxTaskIterations": 5,
-     "globalIteration": 1,
-     "maxGlobalIterations": 100,
-     "commitSpec": $commitSpec
-   }
-   ```
-6. Create `.progress.md` with goal
-7. **Goal Interview** (skip if --quick in $ARGUMENTS)
-8. Invoke research-analyst agent with goal interview context
-9. **STOP** - research-analyst sets awaitingApproval=true. Output status and wait for user to run `/ralph-specum:requirements`
+5. Ensure gitignore entries for `specs/.current-spec` and `**/.progress.md`
+6. Initialize `.ralph-state.json` with phase "research"
+7. Create `.progress.md` with goal
 
-## Spec Scanner
+### Spec Scanner (Skip in Quick Mode)
 
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/spec-scanner/SKILL.md`
 Before conducting the Goal Interview, scan existing specs to find related work. This helps surface prior context and avoid duplicate effort.
 
-<mandatory>
-**Skip spec scanner if --quick flag detected in $ARGUMENTS.**
-</mandatory>
-
-### Scan Steps
-
-```text
-1. List all directories in ./specs/
-   - Run: ls -d ./specs/*/ 2>/dev/null | xargs -I{} basename {}
-   - Exclude the current spec being created (if known)
-   |
-2. For each spec directory found:
-   - Read ./specs/$specName/.progress.md
-   - Extract "Original Goal" section (line after "## Original Goal")
-   - If .progress.md doesn't exist, skip this spec
-   |
-3. Keyword matching:
-   - Extract keywords from current goal (split by spaces, lowercase)
-   - Remove common words: "the", "a", "an", "to", "for", "with", "and", "or"
-   - For each existing spec, count matching keywords with its Original Goal
-   - Score = number of matching keywords
-   |
-4. Rank and filter:
-   - Sort specs by score (descending)
-   - Take top 3 specs with score > 0
-   - If no matches found, skip display step
-   |
-5. Display related specs (if any found):
-   |
-   Related specs found:
-   - spec-name-1: [first 50 chars of Original Goal]...
-   - spec-name-2: [first 50 chars of Original Goal]...
-   - spec-name-3: [first 50 chars of Original Goal]...
-   |
-   This context may inform the interview questions.
-   |
-6. Store in state file:
-   - Update .ralph-state.json with relatedSpecs array:
-     {
-       ...existing state,
-       "relatedSpecs": [
-         {"name": "spec-name-1", "goal": "Original Goal text", "score": N},
-         {"name": "spec-name-2", "goal": "Original Goal text", "score": N},
-         {"name": "spec-name-3", "goal": "Original Goal text", "score": N}
-       ]
-     }
-```
-
-### Keyword Extraction
-
-Extract meaningful keywords from the goal:
-
-```javascript
-// Pseudocode for keyword extraction
-function extractKeywords(text) {
-  const stopWords = ["the", "a", "an", "to", "for", "with", "and", "or", "is", "it", "this", "that", "be", "on", "in", "of"];
-  return text
-    .toLowerCase()
-    .split(/\s+/)
-    .filter(word => word.length > 2)
-    .filter(word => !stopWords.includes(word));
-}
-```
-
-### Match Scoring
-
-Simple keyword overlap scoring:
-
-```javascript
-// Pseudocode for scoring
-function scoreMatch(currentGoalKeywords, existingGoalKeywords) {
-  let score = 0;
-  for (const keyword of currentGoalKeywords) {
-    if (existingGoalKeywords.includes(keyword)) {
-      score += 1;
-    }
-  }
-  return score;
-}
-```
-
-### Example Output
-
-```text
-Related specs found:
-- user-auth: Add OAuth2 authentication with JWT tokens...
-- api-refactor: Restructure API endpoints for better...
-- error-handling: Implement consistent error handling...
-
-This context may inform the interview questions.
-```
-
-### Usage in Interview
+Skip if --quick flag detected.
+</skill-reference>
 
-After scanning, if related specs were found, you may reference them when asking clarifying questions. For example:
-- "I noticed you have a spec 'user-auth' for authentication. Does this new feature relate to or depend on that work?"
-- "There's an existing 'api-refactor' spec. Should this work integrate with those changes?"
+### Goal Interview (Skip in Quick Mode)
 
-## Goal Interview (Pre-Research)
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/intent-classification/SKILL.md`
+Before asking interview questions, classify the user's goal to determine question depth (TRIVIAL/REFACTOR/GREENFIELD/MID_SIZED).
+</skill-reference>
 
-<mandatory>
-**Skip interview if --quick flag detected in $ARGUMENTS.**
+Apply `plugins/ralph-specum/skills/interview-framework/SKILL.md` for the single-question adaptive interview loop.
 
-If NOT quick mode, conduct goal interview using AskUserQuestion before research phase.
-</mandatory>
+**Goal Interview Question Pool:**
 
-### Quick Mode Check
+| # | Question | Required | Key |
+|---|----------|----------|-----|
+| 1 | What problem are you solving with this feature? | Required | `problem` |
+| 2 | Any constraints or must-haves for this feature? | Required | `constraints` |
+| 3 | How will you know this feature is successful? | Required | `success` |
+| 4 | Any other context you'd like to share? (or say 'done') | Optional | `additionalContext` |
 
-Check if `--quick` appears in `$ARGUMENTS`. If present, skip directly to "Invoke research-analyst".
+Store responses in `.progress.md` under `### Goal Interview (from start.md)`.
 
-### Intent Classification
+8. Invoke research-analyst agent with goal interview context
+9. **STOP** - research-analyst sets awaitingApproval=true
 
-Before asking interview questions, classify the user's goal to determine question depth.
+## Quick Mode Flow
 
-**Classification Logic:**
+Triggered when `--quick` flag detected. Skips all spec phases and auto-generates artifacts.
 
-Analyze the goal text for keywords to determine intent type:
+### Quick Mode Input Detection
 
 ```text
-Intent Classification:
-
-1. TRIVIAL: Goal contains keywords like:
-   - "fix typo", "typo", "spelling"
-   - "small change", "minor"
-   - "quick", "simple", "tiny"
-   - "rename", "update text"
-   → Min questions: 1, Max questions: 2
-
-2. REFACTOR: Goal contains keywords like:
-   - "refactor", "restructure", "reorganize"
-   - "clean up", "cleanup", "simplify"
-   - "extract", "consolidate", "modularize"
-   - "improve code", "tech debt"
-   → Min questions: 3, Max questions: 5
-
-3. GREENFIELD: Goal contains keywords like:
-   - "new feature", "new system", "new module"
-   - "add", "build", "implement", "create"
-   - "integrate", "introduce"
-   - "from scratch"
-   → Min questions: 5, Max questions: 10
-
-4. MID_SIZED: Default if no clear match
-   → Min questions: 3, Max questions: 7
-```
-
-**Confidence Threshold:**
-
-| Match Count | Confidence | Action |
-|-------------|------------|--------|
-| 3+ keywords | High | Use matched category |
-| 1-2 keywords | Medium | Use matched category |
-| 0 keywords | Low | Default to MID_SIZED |
-
-**Question Count Rules:**
-- TRIVIAL: 1-2 questions (get essentials, move fast)
-- REFACTOR: 3-5 questions (understand scope and risks)
-- GREENFIELD: 5-10 questions (full context needed)
-- MID_SIZED: 3-7 questions (balanced approach)
-
-**Store Intent:**
-After classification, store the result in `.progress.md`:
-```markdown
-## Interview Format
-- Version: 1.0
-
-## Intent Classification
-- Type: [TRIVIAL|REFACTOR|GREENFIELD|MID_SIZED]
-- Confidence: [high|medium|low] ([N] keywords matched)
-- Min questions: [N]
-- Max questions: [N]
-- Keywords matched: [list of matched keywords]
+1. TWO ARGS before --quick: name = first, goal/file = second
+2. ONE ARG before --quick:
+   a. FILE PATH (starts with ./ or /, or ends with .md) -> read file as plan
+   b. KEBAB-CASE NAME -> check ./specs/$name/plan.md
+   c. GOAL STRING -> infer name from goal
+3. ZERO ARGS with --quick: Error
 ```
 
-### Question Count by Intent
-
-Intent classification determines the question count range, not which questions to ask. All goals use the same Goal Interview Question Pool (defined below), but the number of questions varies by intent:
+### Name Inference
 
-| Intent | Min Questions | Max Questions |
-|--------|---------------|---------------|
-| TRIVIAL | 1 | 2 |
-| REFACTOR | 3 | 5 |
-| GREENFIELD | 5 | 10 |
-| MID_SIZED | 3 | 7 |
+If no explicit name: take first 3 words of goal, kebab-case, max 30 chars.
 
-**Question Selection Logic:**
+### Quick Mode Execution
 
 ```text
-1. Get intent from Intent Classification step
-2. Intent determines question COUNT, not which pool to use
-3. All goals use the Goal Interview Question Pool
-4. Ask Required questions first, then Optional questions
-5. Stop when:
-   - User signals completion (after minRequired reached)
-   - All questions asked (maxAllowed reached)
-   - User selects "No, let's proceed" on optional question
-```
-
-### Question Classification
-
-Before asking any question, classify it to determine the appropriate source for the answer.
-
-**Classification Matrix:**
-
-| Question Type | Source | Examples |
-|---------------|--------|----------|
-| Codebase fact | Explore agent | "What patterns exist?", "Where is X located?", "What dependencies are used?" |
-| User preference | AskUserQuestion | "What priority level?", "Which approach do you prefer?" |
-| Requirement | AskUserQuestion | "What must this feature do?", "What are the constraints?" |
-| Scope decision | AskUserQuestion | "Should this include X?", "What's in/out of scope?" |
-| Risk tolerance | AskUserQuestion | "How critical is backwards compatibility?" |
-| Constraint | AskUserQuestion | "Any performance requirements?", "Timeline constraints?" |
-
-<mandatory>
-**DO NOT ask user about codebase facts - use Explore agent instead.**
-
-Questions that should go to the user (AskUserQuestion):
-- Preference: "Which approach do you prefer?"
-- Requirement: "What must this feature accomplish?"
-- Scope: "Should this include feature X?"
-- Constraint: "Any performance/timeline constraints?"
-- Risk: "How important is backwards compatibility?"
-
-Questions that should use Explore agent (NOT AskUserQuestion):
-- Existing patterns: "How does the codebase handle X?"
-- File locations: "Where are the authentication modules?"
-- Dependencies: "What libraries are currently used for Y?"
-- Code conventions: "What naming patterns are used?"
-- Architecture: "How is the service layer structured?"
-
-**Before each interview question, check: Is this a codebase fact or user decision?**
-- Codebase fact → Use Explore agent to find the answer automatically
-- User decision → Ask via AskUserQuestion
-</mandatory>
-
-### Question Piping
-
-Before asking each question, replace `{var}` placeholders with values from `.progress.md` context.
-
-**Available Variables:**
-- `{goal}` - Original goal text from user
-- `{intent}` - Intent classification (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
-- `{problem}` - Problem description from Goal Interview
-- `{constraints}` - Constraints from Goal Interview
-- `{users}` - Primary users (not yet available in start.md, populated in later phases)
-- `{priority}` - Priority tradeoffs (not yet available in start.md, populated in later phases)
-
-**Piping Instructions:**
-1. Before each AskUserQuestion, replace `{var}` with values from `.progress.md`
-2. If variable not found, use original question text (graceful fallback)
-3. Example: "What priority tradeoffs for {goal}?" becomes "What priority tradeoffs for Add user authentication?"
-
-**Fallback Behavior:**
-- If `{goal}` not found → use "{goal}" literally (this should rarely happen since goal is always provided)
-- If `{intent}` not found → skip piping for that variable
-- Always prefer graceful degradation over errors
-
-### Goal Interview Questions (Single-Question Flow)
-
-**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
-
-### Parameter Chain Note
-
-**Note**: start.md is the first phase - no prior responses exist to check.
-
-This phase initializes the interview context. Later phases (research, requirements, design, tasks) use parameter chain to skip questions already answered here.
-
-### Phase-Specific Configuration
-
-- **Phase**: Goal Interview (first phase)
-- **Available Variables**: `{goal}`, `{intent}` (others populated in later phases)
-- **Storage Section**: `### Goal Interview (from start.md)`
-- **Semantic Keys**: problem, constraints, success, additionalContext
-
-### Goal Interview Question Pool
-
-| # | Question | Required | Key | Options |
-|---|----------|----------|-----|---------|
-| 1 | What problem are you solving with this feature? | Required | `problem` | Fixing a bug or issue / Adding new functionality / Improving existing behavior / Other |
-| 2 | Any constraints or must-haves for this feature? | Required | `constraints` | No special constraints / Must integrate with existing code / Performance is critical / Other |
-| 3 | How will you know this feature is successful? | Required | `success` | Tests pass and code works / Users can complete specific workflow / Performance meets target metrics / Other |
-| 4 | Any other context you'd like to share? (or say 'done' to proceed) | Optional | `additionalContext` | No, let's proceed / Yes, I have more details / Other |
-
-### Store Goal Context
-
-After interview, update `.progress.md` with Interview Format, Intent Classification, and Interview Responses sections:
-
-```markdown
-## Interview Format
-- Version: 1.0
-
-## Intent Classification
-- Type: [TRIVIAL|REFACTOR|GREENFIELD|MID_SIZED]
-- Confidence: [high|medium|low] ([N] keywords matched)
-- Min questions: [N]
-- Max questions: [N]
-- Keywords matched: [list of matched keywords]
-
-## Interview Responses
-
-### Goal Interview (from start.md)
-- Problem: [responses.problem]
-- Constraints: [responses.constraints]
-- Success criteria: [responses.success]
-- Additional context: [responses.additionalContext]
-[Any follow-up responses from "Other" selections]
+1. Validate input (non-empty goal/plan)
+2. Infer name from goal (if not provided)
+3. Create spec directory: ./specs/$name/
+4. Ensure gitignore entries
+5. Write .ralph-state.json (source: "plan", phase: "tasks")
+6. Write .progress.md with goal
+7. Update .current-spec
+8. Invoke plan-synthesizer agent via Task tool
+9. After generation: update state phase="execution", read task count
+10. If commitSpec: stage, commit, push spec files
+11. Invoke spec-executor for task 1
 ```
 
-### Pass Context to Research
-
-Include goal interview context when invoking research-analyst:
+### Quick Mode Validation
 
 ```text
-Task delegation prompt should include:
-
-Goal Interview Context:
-- Problem: [response]
-- Constraints: [response]
-- Success criteria: [response]
-
-Use this context to focus research on relevant areas.
+1. ZERO ARGS CHECK -> Error: "Quick mode requires a goal or plan file"
+2. FILE NOT FOUND -> Error: "File not found: $filePath"
+3. EMPTY CONTENT CHECK -> Error: "Plan content is empty"
+4. PLAN TOO SHORT WARNING (< 10 words) -> Warning but continue
+5. NAME CONFLICT RESOLUTION -> Append -2, -3, etc. if exists
 ```
 
-## Quick Mode Flow
+### Atomic Rollback
 
-Triggered when `--quick` flag detected. Skips all spec phases and auto-generates artifacts.
+On generation failure: delete spec directory, restore .current-spec, display error.
 
+## Quick Mode Uses Ralph Loop
+
+After generating spec artifacts in quick mode, invoke ralph-loop:
 ```text
-1. Check if --quick flag present in $ARGUMENTS
-   |
-   +-- Yes: Extract args before --quick
-   |   |
-   |   +-- Two args: name = first, goal = second
-   |   |
-   |   +-- One arg: goal = first (infer name later)
-   |   |
-   |   +-- Zero args: Error "Quick mode requires a goal or plan"
-   |
-   +-- No: Continue to normal Detection Logic
+Skill: ralph-loop:ralph-loop
+Args: Read ./specs/$spec/.coordinator-prompt.md and follow those instructions exactly. Output ALL_TASKS_COMPLETE when done. --max-iterations <calculated> --completion-promise ALL_TASKS_COMPLETE
 ```
 
-### Quick Mode Steps (POC)
-
-1. Parse args before `--quick`:
-   - If two args: `name` = first arg (kebab-case), `goal` = second arg
-   - If one arg: `goal` = arg, `name` = infer from goal (first 3 words, kebab-case, max 30 chars)
-2. Validate non-empty goal
-3. Create spec directory: `./specs/$name/`
-4. Initialize `.ralph-state.json` with `source: "plan"`:
-   ```json
-   {
-     "source": "plan",
-     "name": "$name",
-     "basePath": "./specs/$name",
-     "phase": "tasks",
-     "taskIndex": 0,
-     "totalTasks": 0,
-     "taskIteration": 1,
-     "maxTaskIterations": 5,
-     "globalIteration": 1,
-     "maxGlobalIterations": 100
-   }
-   ```
-5. Write `.progress.md` with goal
-6. Update `.current-spec` with name
-7. Invoke plan-synthesizer agent to generate all artifacts
-8. After generation: update state `phase: "execution"`, read task count
-9. Invoke spec-executor for task 1
-
 ## Status Display (on resume)
 
-Before resuming, show brief status:
-
 ```text
 Resuming: user-auth
 Phase: execution
@@ -955,8 +226,6 @@ Continuing...
 
 ## Output
 
-After detection and action:
-
 **New spec:**
 ```text
 Created spec 'user-auth' at ./specs/user-auth/
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 664aa207..f6124769 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -52,6 +52,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.12 Create quality-checkpoints skill
 - [x] 2.13 Create quality-commands skill
 - [x] 2.15 Simplify implement.md command
+- [x] 2.16 Simplify start.md command
 
 ## Current Task
 
@@ -59,6 +60,13 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.16: Simplify start.md
+- Reduced from 980 lines to 248 lines (75% reduction)
+- Used <skill-reference> blocks to reference: branch-management, intent-classification, spec-scanner
+- Kept core orchestration logic: argument parsing, detection logic, resume/new flows, quick mode
+- Interview question pool kept inline (4 questions, command-specific)
+- Quick mode flow kept inline as it's specific to this command
+
 ### Task 2.15: Simplify implement.md
 - Reduced from 1235 lines to 233 lines (81% reduction)
 - Used <skill-reference> blocks to reference: coordinator-pattern, failure-recovery, verification-layers
@@ -144,7 +152,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.16 Simplify start.md command
+Task 2.17 Simplify research.md command
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index fd3ed436..ff4bb13f 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -382,7 +382,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify implement.md to reference skills`
   - _Design: Command Simplification Plan_
 
-- [ ] 2.16 Simplify start.md command
+- [x] 2.16 Simplify start.md command
   - **Do**:
     1. Replace inline branch management with skill reference to branch-management
     2. Replace inline intent classification with skill reference to intent-classification

From 5e41f841e10fe02a91b282850d87e37ec7a7ab57 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:17:47 +0200
Subject: [PATCH 27/37] refactor(ralph-specum): simplify research.md to
 reference skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/commands/research.md | 676 +++-------------------
 specs/refactor-plugins/.progress.md       |  10 +-
 specs/refactor-plugins/tasks.md           |   2 +-
 3 files changed, 92 insertions(+), 596 deletions(-)

diff --git a/plugins/ralph-specum/commands/research.md b/plugins/ralph-specum/commands/research.md
index 7823f491..f99c9159 100644
--- a/plugins/ralph-specum/commands/research.md
+++ b/plugins/ralph-specum/commands/research.md
@@ -17,18 +17,6 @@ You MUST delegate ALL research work to subagents:
 
 Do NOT perform web searches, codebase analysis, or write research.md yourself.
 
-**PARALLEL EXECUTION IS MANDATORY - ALWAYS.**
-- Minimum: 2 agents (1 research-analyst + 1 Explore)
-- Standard: 3-4 agents (2-3 research-analyst + 1-2 Explore)
-- Complex: 5+ agents (3-4 research-analyst for different topics + 2-3 Explore)
-- **ALL agent Task calls MUST be in ONE message** (not sequential messages)
-
-**CRITICAL: You can and SHOULD spawn MULTIPLE research-analyst agents in parallel.**
-- Each research-analyst should focus on a distinct research topic
-- Example: GraphQL API + Caching strategies = 2 research-analyst agents in parallel
-- Example: Auth patterns + Security best practices + API design = 3 research-analyst agents in parallel
-- DO NOT limit yourself to just one research-analyst agent
-
 Failure to spawn multiple agents in parallel violates the core design of this command.
 </mandatory>
 
@@ -44,603 +32,114 @@ Failure to spawn multiple agents in parallel violates the core design of this co
 2. Read `.ralph-state.json` if it exists
 3. Read `.progress.md` to understand the goal
 
-## Analyze Research Topics
-
-<mandatory>
-**BEFORE invoking any subagents, analyze the goal and identify distinct research topics.**
-
-Break down the goal into independent research areas that can be explored in parallel. Consider:
-- **External/Best Practices**: Industry standards, patterns, libraries to research online → `research-analyst`
-- **Codebase Analysis**: Existing implementations, patterns, constraints → `Explore` (fast, read-only)
-- **Related Specs**: Other specs in ./specs/ that may overlap → `Explore` (fast, read-only)
-- **Domain-Specific**: Specialized topics needing focused research → `research-analyst` for web, `Explore` for code
-- **Quality Commands**: Project lint/test/build commands discovery → `Explore` (fast, read-only)
-</mandatory>
-
-### Subagent Selection Guide
-
-| Task Type | Subagent | Reason |
-|-----------|----------|--------|
-| Web search for best practices | `research-analyst` | Needs WebSearch/WebFetch tools |
-| Library/API documentation | `research-analyst` | Needs web access |
-| Codebase pattern analysis | `Explore` | Fast, read-only, optimized for code |
-| Related specs discovery | `Explore` | Fast scanning of ./specs/ |
-| Quality commands discovery | `Explore` | Fast package.json/Makefile analysis |
-| File structure exploration | `Explore` | Fast, uses Haiku model |
-| Cross-referencing (code vs docs) | Both in parallel | Divide by source type |
-
-### Topic Splitting Guidelines
-
-| Scenario | Recommendation |
-|----------|----------------|
-| Simple, focused goal | 2 agents minimum: 1 research-analyst (web) + 1 Explore (codebase) |
-| Goal spans multiple domains | 3-5 agents: 2-3 research-analyst (different topics) + 1-2 Explore |
-| Goal involves external APIs + codebase | 2+ research-analyst for API docs/best practices + 1+ Explore for codebase |
-| Goal touches multiple components | Multiple Explore agents (one per component) + multiple research-analyst (one per external topic) |
-| Complex architecture question | 5+ agents: 3-4 research-analyst (different external topics) + 2-3 Explore (different code areas) |
-
-**Benefits of parallel execution:**
-- 3-5 agents in parallel = up to 90% faster research
-- Explore agents use Haiku model = very fast codebase analysis
-- Each agent has focused context = better depth
-- Results synthesized for comprehensive coverage
-
-**When NOT to split:**
-- Topics are tightly coupled and depend on each other
-- Splitting would create redundant searches
-
-## Interview
-
-<mandatory>
-**Skip interview if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct interview using AskUserQuestion before delegating to subagent.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Execute Research".
-
-### Read Context from .progress.md
-
-Before conducting the interview, read `.progress.md` to get:
-1. **Intent Classification** from start.md (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
-2. **Prior interview responses** to enable parameter chain (skip already-answered questions)
-
-```text
-Context Reading:
-1. Read ./specs/$spec/.progress.md
-2. Parse "## Intent Classification" section for intent type and question counts
-3. Parse "## Interview Responses" section for prior answers
-4. Store parsed data for parameter chain checks
-```
-
-**Intent-Based Question Counts (same as start.md):**
-- TRIVIAL: 1-2 questions (minimal technical context needed)
-- REFACTOR: 3-5 questions (understand approach and risks)
-- GREENFIELD: 5-10 questions (full technical context)
-- MID_SIZED: 3-7 questions (balanced approach)
-
-### Research Interview (Single-Question Flow)
-
-**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
+## Interview (Skip if --quick)
 
-### Phase-Specific Configuration
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/interview-framework/SKILL.md`
+Use the interview framework for a single-question adaptive interview loop.
 
+**Phase-Specific Configuration:**
 - **Phase**: Research Interview
 - **Parameter Chain Mappings**: technicalApproach, knownConstraints, integrationPoints
 - **Available Variables**: `{goal}`, `{intent}`, `{problem}`, `{constraints}`
-- **Variables Not Yet Available**: `{users}`, `{priority}` (populated in later phases)
 - **Storage Section**: `### Research Interview (from research.md)`
 
-### Research Interview Question Pool
-
-| # | Question | Required | Key | Options |
-|---|----------|----------|-----|---------|
-| 1 | What technical approach do you prefer for this feature? | Required | `technicalApproach` | Follow existing patterns in codebase (Recommended) / Introduce new patterns/frameworks / Hybrid - keep existing where possible / Other |
-| 2 | Are there any known constraints or limitations? | Required | `knownConstraints` | No known constraints / Must work with existing API / Performance critical / Other |
-| 3 | Are there specific integration points to consider? | Required | `integrationPoints` | Standard integration with existing services / New external dependencies required / Isolated component (minimal integration) / Other |
-| 4 | Any other technical context for research? (or say 'done' to proceed) | Optional | `additionalTechContext` | No, let's proceed / Yes, I have more details / Other |
-
-### Store Research Interview Responses
-
-After interview, append to `.progress.md` under the "Interview Responses" section:
-
-```markdown
-### Research Interview (from research.md)
-- Technical approach: [responses.technicalApproach]
-- Known constraints: [responses.knownConstraints]
-- Integration points: [responses.integrationPoints]
-- Additional technical context: [responses.additionalTechContext]
-[Any follow-up responses from "Other" selections]
-```
-
-### Interview Context Format
-
-Pass the combined context (prior + new responses) to the Task delegation prompt:
+**Question Pool:**
+| # | Question | Required | Key |
+|---|----------|----------|-----|
+| 1 | What technical approach do you prefer? | Required | `technicalApproach` |
+| 2 | Are there any known constraints or limitations? | Required | `knownConstraints` |
+| 3 | Are there specific integration points to consider? | Required | `integrationPoints` |
+| 4 | Any other technical context? (or say 'done') | Optional | `additionalTechContext` |
 
-```text
-Interview Context:
-- Technical approach: [Answer]
-- Known constraints: [Answer]
-- Integration points: [Answer]
-- Follow-up details: [Any additional clarifications]
-```
+Store responses in `.progress.md` under `### Research Interview (from research.md)`.
+</skill-reference>
 
-Store this context to include in the Task delegation prompt.
+## Execute Research (Parallel)
 
-## Execute Research
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/parallel-research/SKILL.md`
+Use the parallel research pattern to spawn multiple subagents for comprehensive research.
 
-<mandatory>
 **PARALLEL EXECUTION IS MANDATORY - NO EXCEPTIONS**
 
-You MUST follow this algorithm:
-
-### Step 1: Identify Research Topics (REQUIRED)
-
-Analyze the goal and list AT LEAST 2 distinct research topics. Output the list to the user:
-
-```
-Research topics identified for parallel execution:
-1. [Topic name] - [Agent type: research-analyst/Explore]
-2. [Topic name] - [Agent type: research-analyst/Explore]
-3. [Topic name] - [Agent type: research-analyst/Explore] (if applicable)
-...
-```
-
-**Minimum requirement**: 2 topics minimum
-- Topic 1: External/best practices (use research-analyst)
-- Topic 2: Codebase patterns (use Explore)
-- Additional topics: Domain-specific areas (spawn MULTIPLE research-analyst agents), quality commands (Explore), related specs (Explore)
-
-**IMPORTANT: Break external research into MULTIPLE research-analyst agents**
-- If the goal involves multiple external topics (e.g., "authentication + security"), spawn separate research-analyst agents for EACH topic
-- Example: "Add OAuth with rate limiting" → 3 research-analyst agents (OAuth patterns, rate limiting strategies, security best practices)
-- DO NOT combine multiple external topics into one research-analyst agent
-
-### Step 2: Spawn ALL Agents in ONE Message (REQUIRED)
-
-**CRITICAL**: You MUST include ALL Task tool calls in a SINGLE response message to ensure true parallel execution.
-
-Use the appropriate subagent type for each topic:
-- `subagent_type: Explore` - For codebase analysis (fast, read-only, Haiku model)
-- `subagent_type: research-analyst` - For web research (needs WebSearch/WebFetch)
-
-**If you spawn agents one at a time (separate messages), they run sequentially - THIS IS WRONG.**
-**If you spawn all agents in one message (multiple Task calls), they run in parallel - THIS IS CORRECT.**
-
-### Pre-Execution Checklist (REQUIRED)
-
-Before spawning agents, verify you have:
-- [ ] Listed at least 2 distinct research topics
-- [ ] Assigned appropriate agent type (Explore or research-analyst) to each topic
-- [ ] Prepared unique output file path for each agent (.research-*.md)
-- [ ] Prepared all Task tool calls in your response (ready to send in ONE message)
-- [ ] NOT written any code/searches yourself (you are a coordinator, not a researcher)
-
-If all boxes are checked, proceed with Step 2 (spawn all agents in ONE message).
-</mandatory>
-
-### Fail-Safe: "But This Goal is Simple..."
-
-<mandatory>
-**Even trivial goals require parallel research.**
-
-If you think the goal is "too simple" for parallel research:
-- You're wrong - spawn at least 2 agents anyway
-- Minimum: 1 Explore (codebase) + 1 research-analyst (web)
-- Parallel execution is about SPEED, not complexity
-- 2 agents in parallel = 2x faster than sequential
-
-**There are ZERO exceptions to the parallel requirement.**
-</mandatory>
-
-### Minimum Parallel Pattern (Always Use)
-
-Even for simple goals, spawn at least 2 agents in parallel:
-
-```text
-Task 1 (research-analyst - web): Search for best practices
-Task 2 (Explore - codebase): Analyze existing patterns
-```
-
-**Example output before spawning:**
-```
-Research topics identified for parallel execution:
-1. External best practices - research-analyst
-2. Codebase analysis - Explore
-
-Now spawning 2 research agents in parallel...
-```
-
-### Multi-Topic Pattern (Common Case)
-
-For goals with multiple external topics, spawn MULTIPLE research-analyst agents:
-
-```text
-Task 1 (research-analyst): OAuth authentication patterns
-Task 2 (research-analyst): Rate limiting strategies
-Task 3 (research-analyst): Security best practices
-Task 4 (Explore): Existing auth implementation
-Task 5 (Explore): Quality commands discovery
-```
-
-**Example output before spawning:**
-```
-Research topics identified for parallel execution:
-1. OAuth patterns - research-analyst
-2. Rate limiting - research-analyst
-3. Security practices - research-analyst
-4. Existing auth code - Explore
-5. Quality commands - Explore
-
-Now spawning 5 research agents in parallel (3 research-analyst + 2 Explore)...
-```
-
-### Parallel Execution: Correct vs Incorrect
-
-**WRONG (Sequential)** - Each Task call in separate message:
-```
-Message 1: Task(subagent_type: research-analyst, topic: best practices)
-[wait for result]
-Message 2: Task(subagent_type: Explore, topic: codebase)
-[wait for result]
-```
-Result: Agents run one after another = SLOW
-
-**CORRECT (Parallel)** - All Task calls in ONE message:
-```
-Message 1:
-  Task(subagent_type: research-analyst, topic: best practices)
-  Task(subagent_type: Explore, topic: codebase)
-  Task(subagent_type: Explore, topic: quality commands)
-[all agents start simultaneously]
-```
-Result: Agents run at the same time = FAST (2-3x faster)
-
-### Standard Parallel Pattern (Recommended)
-
-For most goals with diverse topics, spawn 3-4 agents in ONE message.
-
-**CRITICAL: If the goal involves multiple external topics, spawn MULTIPLE research-analyst agents (one per topic).**
-
-Example: "Add authentication with email notifications"
-- research-analyst #1: Authentication patterns
-- research-analyst #2: Email service best practices
-- Explore #1: Existing auth/email code
-- Explore #2: Quality commands
-
-**Task 1 - External Research Topic A (research-analyst #1):**
-```yaml
-subagent_type: research-analyst
-
-You are researching for spec: $spec
-Spec path: ./specs/$spec/
-Topic: [FIRST EXTERNAL TOPIC - e.g., Authentication patterns]
-
-Focus ONLY on web research for THIS specific topic:
-1. WebSearch for best practices, industry standards
-2. WebSearch for common pitfalls and gotchas
-3. Research relevant libraries/frameworks
-4. Document findings in ./specs/$spec/.research-[topic-name].md
-
-Do NOT explore codebase - Explore agents handle that in parallel.
-Do NOT research other topics - other research-analyst agents handle those.
-```
-
-**Task 2 - External Research Topic B (research-analyst #2):**
-```yaml
-subagent_type: research-analyst
-
-You are researching for spec: $spec
-Spec path: ./specs/$spec/
-Topic: [SECOND EXTERNAL TOPIC - e.g., Email service patterns]
-
-Focus ONLY on web research for THIS specific topic:
-1. WebSearch for best practices for this topic
-2. WebSearch for common pitfalls
-3. Research relevant libraries/tools
-4. Document findings in ./specs/$spec/.research-[topic-name].md
-
-Do NOT explore codebase - Explore agents handle that in parallel.
-Do NOT research other topics - other research-analyst agents handle those.
-```
-
-**Task 3 - Codebase Analysis (Explore - fast):**
-```yaml
-subagent_type: Explore
-thoroughness: very thorough
-
-Analyze codebase for spec: $spec
-Output file: ./specs/$spec/.research-codebase.md
-
-Tasks:
-1. Find existing patterns related to [goal]
-2. Identify dependencies and constraints
-3. Check for similar implementations
-4. Document architectural patterns used
-
-Write findings to the output file with sections:
-- Existing Patterns (with file paths)
-- Dependencies
-- Constraints
-- Recommendations
-```
-
-**Task 4 - Quality Commands Discovery (Explore - fast):**
-```yaml
-subagent_type: Explore
-thoroughness: quick
-
-Discover quality commands for spec: $spec
-Output file: ./specs/$spec/.research-quality.md
-
-Tasks:
-1. Read package.json scripts section
-2. Check for Makefile targets
-3. Scan .github/workflows/*.yml for CI commands
-4. Document lint, test, build, typecheck commands
+- Minimum: 2 agents (1 research-analyst + 1 Explore)
+- Standard: 3-4 agents (2-3 research-analyst + 1-2 Explore)
+- Complex: 5+ agents (3-4 research-analyst + 2-3 Explore)
 
-Write findings as table: | Type | Command | Source |
-```
-
-**Task 5 - Related Specs Discovery (Explore - fast):**
-```yaml
-subagent_type: Explore
-thoroughness: medium
-
-Scan related specs for: $spec
-Output file: ./specs/$spec/.research-related-specs.md
-
-Tasks:
-1. List all directories in ./specs/ (each is a spec)
-2. For each spec, read .progress.md for Original Goal
-3. Read research.md/requirements.md summaries if exist
-4. Identify overlaps, conflicts, specs needing updates
-
-Write findings as table: | Name | Relevance | Relationship | mayNeedUpdate |
-```
-
-### Complex Goal Pattern (5+ Agents)
-
-**Example: Goal involves "Add GraphQL API with caching"**
-
-**CRITICAL: This goal has TWO distinct external topics (GraphQL + Caching), so spawn TWO research-analyst agents (one per topic).**
-
-Spawn 5 agents in ONE message (2 research-analyst + 3 Explore):
+**ALL agent Task calls MUST be in ONE message** to achieve true parallelism.
+</skill-reference>
 
-| Agent # | Type | Focus | Output File |
-|---------|------|-------|-------------|
-| 1 | research-analyst | GraphQL best practices (web) | .research-graphql.md |
-| 2 | research-analyst | Caching strategies (web) | .research-caching.md |
-| 3 | Explore | Existing API patterns (code) | .research-codebase.md |
-| 4 | Explore | Quality commands | .research-quality.md |
-| 5 | Explore | Related specs | .research-related-specs.md |
+### Research Topics to Cover
 
-**Task 1 - GraphQL Best Practices (research-analyst):**
-```yaml
-subagent_type: research-analyst
+1. **External Research** (research-analyst): Best practices, industry standards, libraries
+2. **Codebase Analysis** (Explore): Existing patterns, dependencies, constraints
+3. **Quality Commands** (Explore): lint, test, build, typecheck commands
+4. **Related Specs** (Explore): Other specs that may overlap
 
-Topic: GraphQL API best practices
-Output: ./specs/$spec/.research-graphql.md
+### Output Files
 
-1. WebSearch: "GraphQL schema design best practices 2024"
-2. WebSearch: "GraphQL resolvers performance patterns"
-3. Research popular GraphQL libraries (Apollo, Yoga, etc.)
-4. Document best practices, patterns, pitfalls
-```
+Each agent writes to a unique file:
+- `.research-[topic].md` (from research-analyst agents)
+- `.research-codebase.md` (from Explore)
+- `.research-quality.md` (from Explore)
+- `.research-related-specs.md` (from Explore)
 
-**Task 2 - Caching Strategies (research-analyst):**
-```yaml
-subagent_type: research-analyst
+## Merge Results
 
-Topic: Caching strategies for GraphQL
-Output: ./specs/$spec/.research-caching.md
+After ALL parallel subagent tasks complete, merge the results into a unified `./specs/$spec/research.md`:
 
-1. WebSearch: "GraphQL caching strategies 2024"
-2. WebSearch: "DataLoader patterns best practices"
-3. Research cache invalidation approaches
-4. Document caching patterns and recommendations
-```
+```markdown
+# Research: $spec
 
-**Task 3 - Codebase Analysis (Explore):**
-```yaml
-subagent_type: Explore
-thoroughness: very thorough
+## Executive Summary
+[Synthesize key findings - 2-3 sentences]
 
-Topic: Existing API and caching patterns in codebase
-Output: ./specs/$spec/.research-codebase.md
+## External Research
+### Best Practices
+### Prior Art
+### Pitfalls to Avoid
 
-1. Search for existing API implementations
-2. Find any caching code or patterns
-3. Identify relevant dependencies
-4. Document patterns with file paths
-```
+## Codebase Analysis
+### Existing Patterns
+### Dependencies
+### Constraints
 
-**Task 4 - Quality Commands (Explore):**
-```yaml
-subagent_type: Explore
-thoroughness: quick
+## Related Specs
+| Spec | Relevance | Relationship | May Need Update |
 
-Topic: Quality commands discovery
-Output: ./specs/$spec/.research-quality.md
+## Quality Commands
+| Type | Command | Source |
 
-1. Check package.json scripts
-2. Check Makefile if exists
-3. Check CI workflow commands
-4. Output as table: Type | Command | Source
-```
+## Feasibility Assessment
+| Aspect | Assessment | Notes |
 
-**Task 5 - Related Specs (Explore):**
-```yaml
-subagent_type: Explore
-thoroughness: medium
+## Recommendations for Requirements
 
-Topic: Related specs discovery
-Output: ./specs/$spec/.research-related-specs.md
+## Open Questions
 
-1. Scan ./specs/ for existing specs
-2. Read each spec's progress and requirements
-3. Identify overlaps with GraphQL/caching goal
-4. Output as table: Name | Relevance | Relationship | mayNeedUpdate
+## Sources
 ```
 
-## Merge Results (After Parallel Research)
-
-<mandatory>
-After ALL parallel subagent tasks complete, YOU must merge results into a single research.md.
-</mandatory>
-
-### Merge Process
-
-1. **Read all partial research files** created by subagents:
-   - `.research-[topic-1].md`, `.research-[topic-2].md`, etc. (from multiple research-analyst agents)
-   - Example: `.research-graphql.md`, `.research-caching.md`, `.research-auth.md` (from research-analyst agents)
-   - `.research-codebase.md` (from Explore)
-   - `.research-quality.md` (from Explore)
-   - `.research-related-specs.md` (from Explore)
-
-2. **Create unified `./specs/$spec/research.md`** with standard structure:
-   ```markdown
-   # Research: $spec
-
-   ## Executive Summary
-   [Synthesize key findings from ALL agents (all research-analyst + all Explore) - 2-3 sentences]
-
-   ## External Research
-   [Merge from ALL .research-[topic].md files created by research-analyst agents]
-   ### Best Practices
-   [From all research-analyst agents]
-   ### Prior Art
-   [From all research-analyst agents]
-   ### Pitfalls to Avoid
-   [From all research-analyst agents]
-
-   ## Codebase Analysis
-   [From .research-codebase.md]
-   ### Existing Patterns
-   ### Dependencies
-   ### Constraints
-
-   ## Related Specs
-   [From .research-related-specs.md]
-   | Spec | Relevance | Relationship | May Need Update |
-
-   ## Quality Commands
-   [From .research-quality.md]
-   | Type | Command | Source |
-
-   ## Feasibility Assessment
-   [Synthesize from all sources]
-   | Aspect | Assessment | Notes |
-
-   ## Recommendations for Requirements
-   [Consolidated recommendations]
-
-   ## Open Questions
-   [Consolidated from all agents]
-
-   ## Sources
-   [All URLs and file paths from all agents]
-   ```
-
-3. **Delete partial research files** after successful merge:
-   ```bash
-   rm ./specs/$spec/.research-*.md
-   ```
-
-4. **Quality check**: Ensure no duplicate information, consistent formatting
-
-## Review & Feedback Loop
-
-<mandatory>
-**Skip review if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct research review using AskUserQuestion after research is created.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Update State".
-
-### Research Review Questions
-
-After the research has been created and merged by the subagents, ask the user to review it and provide feedback.
-
-**Review Question Flow:**
-
-1. **Read the generated research.md** to understand what was found
-2. **Ask initial review questions** to confirm the research meets their expectations:
-
-| # | Question | Key | Options |
-|---|----------|-----|---------|
-| 1 | Does the research cover all the areas you expected? | `researchCoverage` | Yes, comprehensive / Missing some areas / Need more depth / Other |
-| 2 | Are the findings and recommendations helpful? | `findingsQuality` | Yes, very helpful / Somewhat helpful / Need more details / Other |
-| 3 | Are there any specific areas you'd like researched further? | `additionalResearch` | No, looks complete / Yes, I have specific areas / Other |
-| 4 | Any other feedback on the research? (or say 'approved' to proceed) | `researchFeedback` | Approved, let's proceed / Yes, I have feedback / Other |
-
-### Store Research Review Responses
-
-After review questions, append to `.progress.md` under a new section:
-
-```markdown
-### Research Review (from research.md)
-- Research coverage: [responses.researchCoverage]
-- Findings quality: [responses.findingsQuality]
-- Additional research needed: [responses.additionalResearch]
-- Research feedback: [responses.researchFeedback]
-[Any follow-up responses from "Other" selections]
+Delete partial research files after successful merge:
+```bash
+rm ./specs/$spec/.research-*.md
 ```
 
-### Update Research Based on Feedback
-
-<mandatory>
-If the user provided feedback requiring changes (any answer other than "Yes, comprehensive", "Yes, very helpful", "No, looks complete", or "Approved, let's proceed"), you MUST:
-
-1. Collect specific change requests from the user
-2. Invoke appropriate subagents again with additional research instructions
-3. Merge updated results
-4. Repeat the review questions after updates
-5. Continue loop until user approves
-</mandatory>
-
-**Update Flow:**
-
-If changes are needed:
-
-1. **Ask for specific changes:**
-   ```
-   What specific areas would you like researched further or what changes would you like to see?
-   ```
-
-2. **Invoke appropriate subagents with update prompt:**
-   - Use `research-analyst` for additional web research
-   - Use `Explore` for additional codebase analysis
-
-   Example prompt:
-   ```
-   You are conducting additional research for spec: $spec
-   Spec path: ./specs/$spec/
-
-   Current research: ./specs/$spec/research.md
-
-   User feedback:
-   $user_feedback
+## Review & Feedback Loop (Skip if --quick)
 
-   Your task:
-   1. Read the existing research.md
-   2. Understand what additional information is needed
-   3. Conduct focused research on the requested areas
-   4. Output to ./specs/$spec/.research-additional.md
-
-   Focus on addressing the specific gaps identified by the user.
-   ```
+After research is created, ask the user to review:
 
-3. **Merge updated results** into research.md
+| # | Question | Key |
+|---|----------|-----|
+| 1 | Does the research cover all expected areas? | `researchCoverage` |
+| 2 | Are the findings and recommendations helpful? | `findingsQuality` |
+| 3 | Any areas to research further? | `additionalResearch` |
+| 4 | Any other feedback? (or say 'approved') | `researchFeedback` |
 
-4. **After update, repeat review questions** (go back to "Research Review Questions")
+Store responses in `.progress.md` under `### Research Review (from research.md)`.
 
-5. **Continue until approved:** Loop until user responds with approval
+If the user requests changes: invoke the appropriate subagents again, merge the updated results, and repeat the review until the user approves.
 
 ## Update State
 
@@ -661,24 +160,15 @@ After research completes and is approved:
 
 ## Commit Spec (if enabled)
 
-Read `commitSpec` from `.ralph-state.json` (set during `/ralph-specum:start`).
+Read `commitSpec` from `.ralph-state.json`. If true:
 
-If `commitSpec` is true:
-
-1. Stage research file:
-   ```bash
-   git add ./specs/$spec/research.md
-   ```
-2. Commit with message:
-   ```bash
-   git commit -m "spec($spec): add research findings"
-   ```
-3. Push to current branch:
-   ```bash
-   git push -u origin $(git branch --show-current)
-   ```
+```bash
+git add ./specs/$spec/research.md
+git commit -m "spec($spec): add research findings"
+git push -u origin $(git branch --show-current)
+```
 
-If commit or push fails, display warning but continue (don't block the workflow).
+If the commit or push fails, display a warning but continue.
 
 ## Output
 
@@ -690,7 +180,6 @@ Output: ./specs/$spec/research.md
 
 Related specs found:
   - <name> (<RELEVANCE>) - may need update
-  - <name> (<RELEVANCE>)
 
 Next: Review research.md, then run /ralph-specum:requirements
 ```
@@ -700,13 +189,12 @@ Next: Review research.md, then run /ralph-specum:requirements
 <mandatory>
 **STOP HERE. DO NOT PROCEED TO REQUIREMENTS.**
 
-(This does not apply in `--quick` mode, which auto-generates all artifacts without stopping.)
+(Exception: `--quick` mode auto-generates all artifacts without stopping.)
 
-After displaying the output above, you MUST:
+After displaying output, you MUST:
 1. End your response immediately
-2. Wait for the user to review research.md
-3. Only proceed to requirements when user explicitly runs `/ralph-specum:requirements`
+2. Wait for the user to review research.md
+3. Only proceed when the user explicitly runs `/ralph-specum:requirements`
 
-DO NOT automatically invoke the product-manager or run the requirements phase.
-The user needs time to review research findings before proceeding.
+DO NOT automatically invoke the product-manager or run the requirements phase.
 </mandatory>
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index f6124769..ca9cdced 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -53,6 +53,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.13 Create quality-commands skill
 - [x] 2.15 Simplify implement.md command
 - [x] 2.16 Simplify start.md command
+- [x] 2.17 Simplify research.md command
 
 ## Current Task
 
@@ -60,6 +61,13 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.17: Simplify research.md
+- Reduced from 713 lines to 200 lines (72% reduction)
+- Used <skill-reference> blocks to reference: interview-framework, parallel-research
+- Kept core orchestration: spec detection, validation, merge structure, review loop, state update
+- Interview questions kept inline (4 questions, command-specific)
+- Merge template kept inline as it's specific to research output structure
+
 ### Task 2.16: Simplify start.md
 - Reduced from 980 lines to 248 lines (75% reduction)
 - Used <skill-reference> blocks to reference: branch-management, intent-classification, spec-scanner
@@ -152,7 +160,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.17 Simplify research.md command
+Task 2.18 Simplify design.md, requirements.md, tasks.md commands
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index ff4bb13f..758200b7 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -395,7 +395,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify start.md to reference skills`
   - _Design: Command Simplification Plan_
 
-- [ ] 2.17 Simplify research.md command
+- [x] 2.17 Simplify research.md command
   - **Do**:
     1. Replace inline parallel execution with skill reference to parallel-research
     2. Target: ~150 lines (down from 700+)

From 8be1abfb6a7b340c35592fe46ee1f8bc11bf5321 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:19:25 +0200
Subject: [PATCH 28/37] refactor(ralph-specum): simplify phase commands to
 reference skills

Reduced design.md from 301 to 86 lines (71% reduction)
Reduced requirements.md from 294 to 85 lines (71% reduction)
Reduced tasks.md from 314 to 94 lines (70% reduction)

All three commands now explicitly reference interview-framework skill
for the interview loop pattern instead of duplicating the algorithm.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/commands/design.md       | 262 ++---------------
 plugins/ralph-specum/commands/requirements.md | 254 ++--------------
 plugins/ralph-specum/commands/tasks.md        | 277 ++----------------
 3 files changed, 76 insertions(+), 717 deletions(-)

diff --git a/plugins/ralph-specum/commands/design.md b/plugins/ralph-specum/commands/design.md
index 5f87d89d..03ad1292 100644
--- a/plugins/ralph-specum/commands/design.md
+++ b/plugins/ralph-specum/commands/design.md
@@ -6,13 +6,11 @@ allowed-tools: [Read, Write, Task, Bash, AskUserQuestion]
 
 # Design Phase
 
-You are generating technical design for a specification. Running this command implicitly approves the requirements phase.
+Generate technical design for a specification. Running this command implicitly approves the requirements phase.
 
 <mandatory>
 **YOU ARE A COORDINATOR, NOT AN ARCHITECT.**
-
-You MUST delegate ALL design work to the `architect-reviewer` subagent.
-Do NOT create architecture diagrams, technical decisions, or design.md yourself.
+Delegate ALL design work to the `architect-reviewer` subagent via the Task tool.
 </mandatory>
 
 ## Determine Active Spec
@@ -25,276 +23,64 @@ Do NOT create architecture diagrams, technical decisions, or design.md yourself.
 
 1. Check `./specs/$spec/` directory exists
 2. Check `./specs/$spec/requirements.md` exists. If not, error: "Requirements not found. Run /ralph-specum:requirements first."
-3. Read `.ralph-state.json`
-4. Clear approval flag: update state with `awaitingApproval: false`
+3. Read `.ralph-state.json` and clear approval flag: `awaitingApproval: false`
 
 ## Gather Context
 
-Read:
-- `./specs/$spec/requirements.md` (required)
-- `./specs/$spec/research.md` (if exists)
-- `./specs/$spec/.progress.md`
-- Existing codebase patterns (via exploration)
+Read: `./specs/$spec/requirements.md`, `./specs/$spec/research.md` (if exists), `./specs/$spec/.progress.md`
 
 ## Interview
 
-<mandatory>
-**Skip interview if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct interview using AskUserQuestion before delegating to subagent.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Execute Design".
-
-### Read Context from .progress.md
-
-Before conducting the interview, read `.progress.md` to get:
-1. **Intent Classification** from start.md (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
-2. **All prior interview responses** to enable parameter chain (skip already-answered questions)
-
-```
-Context Reading:
-1. Read ./specs/$spec/.progress.md
-2. Parse "## Intent Classification" section for intent type and question counts
-3. Parse "## Interview Responses" section for prior answers (Goal Interview, Research Interview, Requirements Interview)
-4. Store parsed data for parameter chain checks
-```
-
-**Intent-Based Question Counts (same as start.md):**
-- TRIVIAL: 1-2 questions (minimal architecture context needed)
-- REFACTOR: 3-5 questions (understand architecture impact)
-- GREENFIELD: 5-10 questions (full architecture context)
-- MID_SIZED: 3-7 questions (balanced approach)
-
-### Design Interview (Single-Question Flow)
-
-**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
+<skill-reference>
+**Apply skill**: `skills/interview-framework/SKILL.md`
+Use the interview framework for the single-question loop, parameter chain, and completion signals.
+</skill-reference>
 
-### Phase-Specific Configuration
-
-- **Phase**: Design Interview
-- **Parameter Chain Mappings**: architectureStyle, techConstraints, integrationApproach
-- **Available Variables**: `{goal}`, `{intent}`, `{problem}`, `{constraints}`, `{technicalApproach}`, `{users}`, `{priority}`
-- **Storage Section**: `### Design Interview (from design.md)`
+**Skip if the `--quick` flag is in `$ARGUMENTS`.**
 
 ### Design Interview Question Pool
 
 | # | Question | Required | Key | Options |
 |---|----------|----------|-----|---------|
-| 1 | What architecture style fits this feature for {goal}? | Required | `architectureStyle` | Extend existing architecture (Recommended) / Create isolated module / Major refactor to support this / Other |
-| 2 | Any technology constraints for {goal}? | Required | `techConstraints` | No constraints / Must use specific library/framework / Must avoid certain dependencies / Other |
-| 3 | How should this integrate with existing systems? | Required | `integrationApproach` | Use existing APIs and interfaces / Create new integration layer / Minimal integration needed / Other |
-| 4 | Any other design context? (or say 'done' to proceed) | Optional | `additionalDesignContext` | No, let's proceed / Yes, I have more details / Other |
-
-### Store Design Interview Responses
+| 1 | What architecture style fits this feature for {goal}? | Required | `architectureStyle` | Extend existing / Create isolated module / Major refactor / Other |
+| 2 | Any technology constraints for {goal}? | Required | `techConstraints` | No constraints / Must use specific library / Must avoid dependencies / Other |
+| 3 | How should this integrate with existing systems? | Required | `integrationApproach` | Use existing APIs / Create new layer / Minimal integration / Other |
+| 4 | Any other design context? (or 'done') | Optional | `additionalDesignContext` | No, proceed / Yes, more details / Other |
 
-After interview, append to `.progress.md` under the "Interview Responses" section:
-
-```markdown
-### Design Interview (from design.md)
-- Architecture style: [responses.architectureStyle]
-- Technology constraints: [responses.techConstraints]
-- Integration approach: [responses.integrationApproach]
-- Additional design context: [responses.additionalDesignContext]
-[Any follow-up responses from "Other" selections]
-```
-
-### Interview Context Format
-
-Pass the combined context (prior + new responses) to the Task delegation prompt:
-
-```
-Interview Context:
-- Architecture style: [Answer]
-- Technology constraints: [Answer]
-- Integration approach: [Answer]
-- Follow-up details: [Any additional clarifications]
-```
-
-Store this context to include in the Task delegation prompt.
+Store responses in `.progress.md` under `### Design Interview (from design.md)`.
 
 ## Execute Design
 
-<mandatory>
-Use the Task tool with `subagent_type: architect-reviewer` to generate design.
-</mandatory>
-
-Invoke architect-reviewer agent with prompt:
+Use the Task tool with `subagent_type: architect-reviewer`:
 
-```
+```text
 You are creating technical design for spec: $spec
 Spec path: ./specs/$spec/
 
 Context:
-- Requirements: [include requirements.md content]
-- Research: [include research.md if exists]
-
-[If interview was conducted, include:]
-Interview Context:
-$interview_context
-
-Your task:
-1. Read and understand all requirements
-2. Explore the codebase for existing patterns to follow
-3. Design architecture with mermaid diagrams
-4. Define component responsibilities and interfaces
-5. Document technical decisions with rationale
-6. Plan file structure (create/modify)
-7. Define error handling and edge cases
-8. Create test strategy
-9. Output to ./specs/$spec/design.md
-10. Include interview responses in a "Design Inputs" section of design.md
-
-Use the design.md template with frontmatter:
----
-spec: $spec
-phase: design
-created: <timestamp>
----
-
-Include:
-- Architecture diagram (mermaid)
-- Data flow diagram (mermaid sequence)
-- Technical decisions table
-- File structure matrix
-- TypeScript interfaces
-- Error handling table
-- Test strategy
-```
-
-## Review & Feedback Loop
-
-<mandatory>
-**Skip review if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct design review using AskUserQuestion after design is created.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Update State".
-
-### Design Review Questions
+- Requirements: [requirements.md content]
+- Research: [research.md if exists]
+- Interview: [interview responses]
 
-After the design has been created by the architect-reviewer agent, ask the user to review it and provide feedback.
-
-**Review Question Flow:**
-
-1. **Read the generated design.md** to understand what was created
-2. **Ask initial review questions** to confirm the design meets their expectations:
-
-| # | Question | Key | Options |
-|---|----------|-----|---------|
-| 1 | Does the architecture approach align with your expectations? | `architectureApproval` | Yes, looks good / Needs changes / I have questions / Other |
-| 2 | Are the technical decisions appropriate for your needs? | `technicalDecisionsApproval` | Yes, approved / Some concerns / Need changes / Other |
-| 3 | Is the component structure clear and suitable? | `componentStructureApproval` | Yes, clear / Needs refinement / Major changes needed / Other |
-| 4 | Any other feedback on the design? (or say 'approved' to proceed) | `designFeedback` | Approved, let's proceed / Yes, I have feedback / Other |
-
-### Store Design Review Responses
-
-After review questions, append to `.progress.md` under a new section:
-
-```markdown
-### Design Review (from design.md)
-- Architecture approval: [responses.architectureApproval]
-- Technical decisions approval: [responses.technicalDecisionsApproval]
-- Component structure approval: [responses.componentStructureApproval]
-- Design feedback: [responses.designFeedback]
-[Any follow-up responses from "Other" selections]
+Create design.md with: architecture diagram, data flow, decisions table, file structure, interfaces, error handling, test strategy.
 ```
 
-### Update Design Based on Feedback
-
-<mandatory>
-If the user provided feedback requiring changes (any answer other than "Yes, looks good", "Yes, approved", "Yes, clear", or "Approved, let's proceed"), you MUST:
+## Review Loop
 
-1. Collect specific change requests from the user
-2. Invoke architect-reviewer again with update instructions
-3. Repeat the review questions after updates
-4. Continue loop until user approves
-</mandatory>
-
-**Update Flow:**
-
-If changes are needed:
-
-1. **Ask for specific changes:**
-   ```
-   What specific changes would you like to see in the design?
-   ```
-
-2. **Invoke architect-reviewer with update prompt:**
-   ```
-   You are updating the technical design for spec: $spec
-   Spec path: ./specs/$spec/
-
-   Current design: ./specs/$spec/design.md
-
-   User feedback:
-   $user_feedback
-
-   Your task:
-   1. Read the existing design.md
-   2. Understand the user's feedback and concerns
-   3. Update the design to address the feedback
-   4. Maintain consistency with requirements
-   5. Update design.md with the changes
-   6. Append update notes to .progress.md explaining what changed
-
-   Focus on addressing the specific feedback while maintaining the overall design quality.
-   ```
-
-3. **After update, repeat review questions** (go back to "Design Review Questions")
-
-4. **Continue until approved:** Loop until user responds with approval
+**Skip if --quick flag.** Ask user to review generated design. If changes needed, invoke architect-reviewer again with feedback and repeat until approved.
 
 ## Update State
 
-After design complete and approved:
-
-1. Update `.ralph-state.json`:
-   ```json
-   {
-     "phase": "design",
-     "awaitingApproval": true,
-     ...
-   }
-   ```
-
-2. Update `.progress.md`:
-   - Mark requirements as implicitly approved
-   - Set current phase to design
+Update `.ralph-state.json`: `{ "phase": "design", "awaitingApproval": true }`
 
 ## Commit Spec (if enabled)
 
-Read `commitSpec` from `.ralph-state.json` (set during `/ralph-specum:start`).
-
-If `commitSpec` is true:
-
-1. Stage design file:
-   ```bash
-   git add ./specs/$spec/design.md
-   ```
-2. Commit with message:
-   ```bash
-   git commit -m "spec($spec): add technical design"
-   ```
-3. Push to current branch:
-   ```bash
-   git push -u origin $(git branch --show-current)
-   ```
-
-If commit or push fails, display warning but continue (don't block the workflow).
+If `commitSpec` is true in state: stage, commit (`spec($spec): add technical design`), push.
 
 ## Output
 
 ```text
 Design phase complete for '$spec'.
-
 Output: ./specs/$spec/design.md
-[If commitSpec: "Spec committed and pushed."]
-
 Next: Review design.md, then run /ralph-specum:tasks
 ```
diff --git a/plugins/ralph-specum/commands/requirements.md b/plugins/ralph-specum/commands/requirements.md
index 4a4b2647..7bb7668a 100644
--- a/plugins/ralph-specum/commands/requirements.md
+++ b/plugins/ralph-specum/commands/requirements.md
@@ -6,13 +6,11 @@ allowed-tools: [Read, Write, Task, Bash, AskUserQuestion]
 
 # Requirements Phase
 
-You are generating requirements for a specification. Running this command implicitly approves the research phase.
+Generate requirements for a specification. Running this command implicitly approves the research phase.
 
 <mandatory>
 **YOU ARE A COORDINATOR, NOT A PRODUCT MANAGER.**
-
-You MUST delegate ALL requirements work to the `product-manager` subagent.
-Do NOT write user stories, acceptance criteria, or requirements.md yourself.
+Delegate ALL requirements work to the `product-manager` subagent via Task tool.
 </mandatory>
 
 ## Determine Active Spec
@@ -24,270 +22,64 @@ Do NOT write user stories, acceptance criteria, or requirements.md yourself.
 ## Validate
 
 1. Check `./specs/$spec/` directory exists
-2. Read `.ralph-state.json`
-3. Clear approval flag: update state with `awaitingApproval: false`
+2. Read `.ralph-state.json` and clear approval flag: `awaitingApproval: false`
 
 ## Gather Context
 
-Read available context:
-- `./specs/$spec/research.md` (if exists)
-- `./specs/$spec/.progress.md`
-- Original goal from conversation or progress file
+Read: `./specs/$spec/research.md` (if exists), `./specs/$spec/.progress.md`, original goal from progress file
 
 ## Interview
 
-<mandatory>
-**Skip interview if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct interview using AskUserQuestion before delegating to subagent.
-</mandatory>
-
-### Quick Mode Check
+<skill-reference>
+**Apply skill**: `skills/interview-framework/SKILL.md`
+Use interview framework for single-question loop, parameter chain, and completion signals.
+</skill-reference>
 
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Execute Requirements".
-
-### Read Context from .progress.md
-
-Before conducting the interview, read `.progress.md` to get:
-1. **Intent Classification** from start.md (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
-2. **Prior interview responses** to enable parameter chain (skip already-answered questions)
-
-```text
-Context Reading:
-1. Read ./specs/$spec/.progress.md
-2. Parse "## Intent Classification" section for intent type and question counts
-3. Parse "## Interview Responses" section for prior answers (Goal Interview, Research Interview)
-4. Store parsed data for parameter chain checks
-```
-
-**Intent-Based Question Counts (same as start.md):**
-- TRIVIAL: 1-2 questions (minimal user/priority context needed)
-- REFACTOR: 3-5 questions (understand scope and priorities)
-- GREENFIELD: 5-10 questions (full user and priority context)
-- MID_SIZED: 3-7 questions (balanced approach)
-
-### Requirements Interview (Single-Question Flow)
-
-**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
-
-### Phase-Specific Configuration
-
-- **Phase**: Requirements Interview
-- **Parameter Chain Mappings**: primaryUsers, priorityTradeoffs, successCriteria
-- **Available Variables**: `{goal}`, `{intent}`, `{problem}`, `{constraints}`, `{technicalApproach}`
-- **Variables Not Yet Available**: `{users}`, `{priority}` (populated by this phase)
-- **Storage Section**: `### Requirements Interview (from requirements.md)`
+**Skip if --quick flag in $ARGUMENTS.**
 
 ### Requirements Interview Question Pool
 
 | # | Question | Required | Key | Options |
 |---|----------|----------|-----|---------|
-| 1 | Who are the primary users of this feature? | Required | `primaryUsers` | Internal developers only / End users via UI / Both developers and end users / Other |
-| 2 | What priority tradeoffs should we consider for {goal}? | Required | `priorityTradeoffs` | Prioritize speed of delivery / Prioritize code quality and maintainability / Prioritize feature completeness / Other |
-| 3 | What defines success for this feature? | Required | `successCriteria` | Feature works as specified / High performance/reliability required / User satisfaction metrics / Other |
-| 4 | Any other requirements context? (or say 'done' to proceed) | Optional | `additionalReqContext` | No, let's proceed / Yes, I have more details / Other |
-
-### Store Requirements Interview Responses
-
-After interview, append to `.progress.md` under the "Interview Responses" section:
+| 1 | Who are the primary users of this feature? | Required | `primaryUsers` | Internal devs / End users via UI / Both / Other |
+| 2 | What priority tradeoffs for {goal}? | Required | `priorityTradeoffs` | Speed of delivery / Code quality / Feature completeness / Other |
+| 3 | What defines success for this feature? | Required | `successCriteria` | Works as specified / High performance / User satisfaction / Other |
+| 4 | Any other requirements context? (or 'done') | Optional | `additionalReqContext` | No, proceed / Yes, more details / Other |
 
-```markdown
-### Requirements Interview (from requirements.md)
-- Primary users: [responses.primaryUsers]
-- Priority tradeoffs: [responses.priorityTradeoffs]
-- Success criteria: [responses.successCriteria]
-- Additional requirements context: [responses.additionalReqContext]
-[Any follow-up responses from "Other" selections]
-```
-
-### Interview Context Format
-
-Pass the combined context (prior + new responses) to the Task delegation prompt:
-
-```text
-Interview Context:
-- Primary users: [Answer]
-- Priority tradeoffs: [Answer]
-- Success criteria: [Answer]
-- Follow-up details: [Any additional clarifications]
-```
-
-Store this context to include in the Task delegation prompt.
+Store responses in `.progress.md` under `### Requirements Interview (from requirements.md)`
 
 ## Execute Requirements
 
-<mandatory>
-Use the Task tool with `subagent_type: product-manager` to generate requirements.
-</mandatory>
-
-Invoke product-manager agent with prompt:
+Use Task tool with `subagent_type: product-manager`:
 
 ```text
 You are generating requirements for spec: $spec
 Spec path: ./specs/$spec/
 
 Context:
-- Research: [include research.md content if exists]
-- Original goal: [from conversation or progress]
-
-[If interview was conducted, include:]
-Interview Context:
-$interview_context
-
-Your task:
-1. Analyze the goal and research findings
-2. Create user stories with acceptance criteria
-3. Define functional requirements (FR-*) with priorities
-4. Define non-functional requirements (NFR-*)
-5. Document glossary, out-of-scope items, dependencies
-6. Output to ./specs/$spec/requirements.md
-7. Include interview responses in a "User Decisions" section of requirements.md
-
-Use the requirements.md template with frontmatter:
----
-spec: $spec
-phase: requirements
-created: <timestamp>
----
-
-Focus on:
-- Testable acceptance criteria
-- Clear priority levels
-- Explicit success criteria
-- Risk identification
-```
+- Research: [research.md content if exists]
+- Original goal: [from progress]
+- Interview: [interview responses]
 
-## Review & Feedback Loop
-
-<mandatory>
-**Skip review if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct requirements review using AskUserQuestion after requirements are created.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Update State".
-
-### Requirements Review Questions
-
-After the requirements have been created by the product-manager agent, ask the user to review them and provide feedback.
-
-**Review Question Flow:**
-
-1. **Read the generated requirements.md** to understand what was created
-2. **Ask initial review questions** to confirm the requirements meet their expectations:
-
-| # | Question | Key | Options |
-|---|----------|-----|---------|
-| 1 | Do the user stories capture your intended functionality? | `userStoriesApproval` | Yes, complete / Missing some stories / Need refinement / Other |
-| 2 | Are the acceptance criteria clear and testable? | `acceptanceCriteriaApproval` | Yes, clear / Need more details / Some are unclear / Other |
-| 3 | Are the priorities and scope appropriate? | `prioritiesApproval` | Yes, appropriate / Need adjustment / Missing items / Other |
-| 4 | Any other feedback on the requirements? (or say 'approved' to proceed) | `requirementsFeedback` | Approved, let's proceed / Yes, I have feedback / Other |
-
-### Store Requirements Review Responses
-
-After review questions, append to `.progress.md` under a new section:
-
-```markdown
-### Requirements Review (from requirements.md)
-- User stories approval: [responses.userStoriesApproval]
-- Acceptance criteria approval: [responses.acceptanceCriteriaApproval]
-- Priorities approval: [responses.prioritiesApproval]
-- Requirements feedback: [responses.requirementsFeedback]
-[Any follow-up responses from "Other" selections]
+Create requirements.md with: user stories, acceptance criteria, functional requirements (FR-*), non-functional requirements (NFR-*), glossary, out-of-scope, dependencies.
 ```
 
-### Update Requirements Based on Feedback
-
-<mandatory>
-If the user provided feedback requiring changes (any answer other than "Yes, complete", "Yes, clear", "Yes, appropriate", or "Approved, let's proceed"), you MUST:
-
-1. Collect specific change requests from the user
-2. Invoke product-manager again with update instructions
-3. Repeat the review questions after updates
-4. Continue loop until user approves
-</mandatory>
-
-**Update Flow:**
-
-If changes are needed:
-
-1. **Ask for specific changes:**
-   ```
-   What specific changes would you like to see in the requirements?
-   ```
-
-2. **Invoke product-manager with update prompt:**
-   ```
-   You are updating the requirements for spec: $spec
-   Spec path: ./specs/$spec/
-
-   Current requirements: ./specs/$spec/requirements.md
-
-   User feedback:
-   $user_feedback
-
-   Your task:
-   1. Read the existing requirements.md
-   2. Understand the user's feedback and concerns
-   3. Update the requirements to address the feedback
-   4. Maintain consistency with research findings
-   5. Update requirements.md with the changes
-   6. Append update notes to .progress.md explaining what changed
-
-   Focus on addressing the specific feedback while maintaining requirements quality.
-   ```
+## Review Loop
 
-3. **After update, repeat review questions** (go back to "Requirements Review Questions")
-
-4. **Continue until approved:** Loop until user responds with approval
+**Skip if --quick flag.** Ask user to review generated requirements. If changes needed, invoke product-manager again with feedback and repeat until approved.
 
 ## Update State
 
-After requirements complete and approved:
-
-1. Update `.ralph-state.json`:
-   ```json
-   {
-     "phase": "requirements",
-     "awaitingApproval": true,
-     ...
-   }
-   ```
-
-2. Update `.progress.md`:
-   - Mark research as implicitly approved
-   - Set current phase to requirements
+Update `.ralph-state.json`: `{ "phase": "requirements", "awaitingApproval": true }`
 
 ## Commit Spec (if enabled)
 
-Read `commitSpec` from `.ralph-state.json` (set during `/ralph-specum:start`).
-
-If `commitSpec` is true:
-
-1. Stage requirements file:
-   ```bash
-   git add ./specs/$spec/requirements.md
-   ```
-2. Commit with message:
-   ```bash
-   git commit -m "spec($spec): add requirements"
-   ```
-3. Push to current branch:
-   ```bash
-   git push -u origin $(git branch --show-current)
-   ```
-
-If commit or push fails, display warning but continue (don't block the workflow).
+If `commitSpec` is true in state: stage, commit (`spec($spec): add requirements`), push.
 
 ## Output
 
 ```text
 Requirements phase complete for '$spec'.
-
 Output: ./specs/$spec/requirements.md
-[If commitSpec: "Spec committed and pushed."]
-
 Next: Review requirements.md, then run /ralph-specum:design
 ```
diff --git a/plugins/ralph-specum/commands/tasks.md b/plugins/ralph-specum/commands/tasks.md
index ccd2a9c4..3fec1bc5 100644
--- a/plugins/ralph-specum/commands/tasks.md
+++ b/plugins/ralph-specum/commands/tasks.md
@@ -6,13 +6,11 @@ allowed-tools: [Read, Write, Task, Bash, AskUserQuestion]
 
 # Tasks Phase
 
-You are generating implementation tasks for a specification. Running this command implicitly approves the design phase.
+Generate implementation tasks for a specification. Running this command implicitly approves the design phase.
 
 <mandatory>
 **YOU ARE A COORDINATOR, NOT A TASK PLANNER.**
-
-You MUST delegate ALL task planning to the `task-planner` subagent.
-Do NOT write task breakdowns, verification steps, or tasks.md yourself.
+Delegate ALL task planning to the `task-planner` subagent via Task tool.
 </mandatory>
 
 ## Determine Active Spec
@@ -26,288 +24,71 @@ Do NOT write task breakdowns, verification steps, or tasks.md yourself.
 1. Check `./specs/$spec/` directory exists
 2. Check `./specs/$spec/design.md` exists. If not, error: "Design not found. Run /ralph-specum:design first."
 3. Check `./specs/$spec/requirements.md` exists
-4. Read `.ralph-state.json`
-5. Clear approval flag: update state with `awaitingApproval: false`
+4. Read `.ralph-state.json` and clear approval flag: `awaitingApproval: false`
 
 ## Gather Context
 
-Read:
-- `./specs/$spec/requirements.md` (required)
-- `./specs/$spec/design.md` (required)
-- `./specs/$spec/research.md` (if exists)
-- `./specs/$spec/.progress.md`
+Read: `./specs/$spec/requirements.md`, `./specs/$spec/design.md`, `./specs/$spec/research.md` (if exists), `./specs/$spec/.progress.md`
 
 ## Interview
 
-<mandatory>
-**Skip interview if --quick flag detected in $ARGUMENTS.**
-
-If NOT quick mode, conduct interview using AskUserQuestion before delegating to subagent.
-</mandatory>
-
-### Quick Mode Check
+<skill-reference>
+**Apply skill**: `skills/interview-framework/SKILL.md`
+Use interview framework for single-question loop, parameter chain, and completion signals.
+</skill-reference>
 
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Execute Tasks Generation".
-
-### Read Context from .progress.md
-
-Before conducting the interview, read `.progress.md` to get:
-1. **Intent Classification** from start.md (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
-2. **All prior interview responses** to enable parameter chain (skip already-answered questions)
-
-```text
-Context Reading:
-1. Read ./specs/$spec/.progress.md
-2. Parse "## Intent Classification" section for intent type and question counts
-3. Parse "## Interview Responses" section for prior answers (Goal Interview, Research Interview, Requirements Interview, Design Interview)
-4. Store parsed data for parameter chain checks
-```
-
-**Intent-Based Question Counts (same as start.md):**
-- TRIVIAL: 1-2 questions (minimal execution context needed)
-- REFACTOR: 3-5 questions (understand execution impact)
-- GREENFIELD: 5-10 questions (full execution context)
-- MID_SIZED: 3-7 questions (balanced approach)
-
-### Tasks Interview (Single-Question Flow)
-
-**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
-
-### Phase-Specific Configuration
-
-- **Phase**: Tasks Interview
-- **Parameter Chain Mappings**: testingDepth, deploymentApproach, executionPriority
-- **Available Variables**: `{goal}`, `{intent}`, `{problem}`, `{constraints}`, `{technicalApproach}`, `{users}`, `{priority}`, `{architecture}`
-- **Storage Section**: `### Tasks Interview (from tasks.md)`
+**Skip if --quick flag in $ARGUMENTS.**
 
 ### Tasks Interview Question Pool
 
 | # | Question | Required | Key | Options |
 |---|----------|----------|-----|---------|
-| 1 | What testing depth is needed for {goal}? | Required | `testingDepth` | Standard - unit + integration (Recommended) / Minimal - POC only, add tests later / Comprehensive - include E2E / Other |
-| 2 | Deployment considerations for {goal}? | Required | `deploymentApproach` | Standard CI/CD pipeline / Feature flag needed / Gradual rollout required / Other |
-| 3 | What's the execution priority for this work? | Required | `executionPriority` | Ship fast - POC first, polish later / Balanced - reasonable quality with speed / Quality first - thorough from the start / Other |
-| 4 | Any other execution context? (or say 'done' to proceed) | Optional | `additionalTasksContext` | No, let's proceed / Yes, I have more details / Other |
-
-### Store Tasks Interview Responses
-
-After interview, append to `.progress.md` under the "Interview Responses" section:
+| 1 | What testing depth for {goal}? | Required | `testingDepth` | Standard unit+integration / Minimal POC only / Comprehensive E2E / Other |
+| 2 | Deployment considerations for {goal}? | Required | `deploymentApproach` | Standard CI/CD / Feature flag / Gradual rollout / Other |
+| 3 | What's the execution priority? | Required | `executionPriority` | Ship fast POC / Balanced quality+speed / Quality first / Other |
+| 4 | Any other execution context? (or 'done') | Optional | `additionalTasksContext` | No, proceed / Yes, more details / Other |
 
-```markdown
-### Tasks Interview (from tasks.md)
-- Testing depth: [responses.testingDepth]
-- Deployment approach: [responses.deploymentApproach]
-- Execution priority: [responses.executionPriority]
-- Additional execution context: [responses.additionalTasksContext]
-[Any follow-up responses from "Other" selections]
-```
-
-### Interview Context Format
-
-Pass the combined context (prior + new responses) to the Task delegation prompt:
-
-```text
-Interview Context:
-- Testing depth: [Answer]
-- Deployment considerations: [Answer]
-- Execution priority: [Answer]
-- Follow-up details: [Any additional clarifications]
-```
-
-Store this context to include in the Task delegation prompt.
+Store responses in `.progress.md` under `### Tasks Interview (from tasks.md)`
 
 ## Execute Tasks Generation
 
-<mandatory>
-Use the Task tool with `subagent_type: task-planner` to generate tasks.
-ALL specs MUST follow POC-first workflow.
-</mandatory>
-
-Invoke task-planner agent with prompt:
+Use Task tool with `subagent_type: task-planner`:
 
 ```text
 You are creating implementation tasks for spec: $spec
 Spec path: ./specs/$spec/
 
 Context:
-- Requirements: [include requirements.md content]
-- Design: [include design.md content]
-
-[If interview was conducted, include:]
-Interview Context:
-$interview_context
+- Requirements: [requirements.md content]
+- Design: [design.md content]
+- Interview: [interview responses]
 
-Your task:
-1. Read requirements and design thoroughly
-2. Break implementation into POC-first phases:
-   - Phase 1: Make It Work (POC) - validate idea, skip tests
-   - Phase 2: Refactoring - clean up code
-   - Phase 3: Testing - unit, integration, e2e
-   - Phase 4: Quality Gates - lint, types, CI
-3. Create atomic, autonomous-ready tasks
-4. Each task MUST include:
-   - **Do**: Exact implementation steps
-   - **Files**: Exact file paths to create/modify
-   - **Done when**: Explicit success criteria
-   - **Verify**: Command to verify completion
-   - **Commit**: Conventional commit message
-   - _Requirements: references_
-   - _Design: references_
-5. Count total tasks
-6. Output to ./specs/$spec/tasks.md
-7. Include interview responses in an "Execution Context" section of tasks.md
-
-Use the tasks.md template with frontmatter:
----
-spec: $spec
-phase: tasks
-total_tasks: <count>
-created: <timestamp>
----
+Create tasks.md with POC-first phases:
+- Phase 1: Make It Work (POC)
+- Phase 2: Refactoring
+- Phase 3: Testing
+- Phase 4: Quality Gates
 
-Critical rules:
-- Tasks must be executable without human interaction
-- Each task = one commit
-- Verify command must be runnable
-- POC phase allows shortcuts, later phases clean up
+Each task MUST include: Do, Files, Done when, Verify, Commit, Requirements refs, Design refs.
 ```
 
-## Review & Feedback Loop
-
-<mandatory>
-**Skip review if --quick flag detected in $ARGUMENTS.**
+## Review Loop
 
-If NOT quick mode, conduct tasks review using AskUserQuestion after tasks are created.
-</mandatory>
-
-### Quick Mode Check
-
-Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Update State".
-
-### Tasks Review Questions
-
-After the tasks have been created by the task-planner agent, ask the user to review them and provide feedback.
-
-**Review Question Flow:**
-
-1. **Read the generated tasks.md** to understand what was planned
-2. **Ask initial review questions** to confirm the tasks meet their expectations:
-
-| # | Question | Key | Options |
-|---|----------|-----|---------|
-| 1 | Does the task breakdown cover all necessary work? | `taskCoverage` | Yes, comprehensive / Missing some tasks / Need more granularity / Other |
-| 2 | Are the task phases (POC, Refactor, Test, Quality) appropriate? | `taskPhases` | Yes, good structure / Adjust phases / Different approach needed / Other |
-| 3 | Are the verification steps clear and executable? | `verificationSteps` | Yes, clear / Need more details / Some are unclear / Other |
-| 4 | Any other feedback on the tasks? (or say 'approved' to proceed) | `tasksFeedback` | Approved, let's proceed / Yes, I have feedback / Other |
-
-### Store Tasks Review Responses
-
-After review questions, append to `.progress.md` under a new section:
-
-```markdown
-### Tasks Review (from tasks.md)
-- Task coverage: [responses.taskCoverage]
-- Task phases: [responses.taskPhases]
-- Verification steps: [responses.verificationSteps]
-- Tasks feedback: [responses.tasksFeedback]
-[Any follow-up responses from "Other" selections]
-```
-
-### Update Tasks Based on Feedback
-
-<mandatory>
-If the user provided feedback requiring changes (any answer other than "Yes, comprehensive", "Yes, good structure", "Yes, clear", or "Approved, let's proceed"), you MUST:
-
-1. Collect specific change requests from the user
-2. Invoke task-planner again with update instructions
-3. Repeat the review questions after updates
-4. Continue loop until user approves
-</mandatory>
-
-**Update Flow:**
-
-If changes are needed:
-
-1. **Ask for specific changes:**
-   ```
-   What specific changes would you like to see in the tasks?
-   ```
-
-2. **Invoke task-planner with update prompt:**
-   ```
-   You are updating the implementation tasks for spec: $spec
-   Spec path: ./specs/$spec/
-
-   Current tasks: ./specs/$spec/tasks.md
-
-   User feedback:
-   $user_feedback
-
-   Your task:
-   1. Read the existing tasks.md
-   2. Understand the user's feedback and concerns
-   3. Update the tasks to address the feedback
-   4. Maintain POC-first workflow structure
-   5. Update tasks.md with the changes
-   6. Update total_tasks count in frontmatter
-   7. Append update notes to .progress.md explaining what changed
-
-   Focus on addressing the specific feedback while maintaining task quality.
-   ```
-
-3. **After update, repeat review questions** (go back to "Tasks Review Questions")
-
-4. **Continue until approved:** Loop until user responds with approval
+**Skip if --quick flag.** Ask user to review generated tasks. If changes needed, invoke task-planner again with feedback and repeat until approved.
 
 ## Update State
 
-After tasks complete and approved:
-
-1. Count total tasks from generated file
-2. Update `.ralph-state.json`:
-   ```json
-   {
-     "phase": "tasks",
-     "totalTasks": <count>,
-     "awaitingApproval": true,
-     ...
-   }
-   ```
-
-3. Update `.progress.md`:
-   - Mark design as implicitly approved
-   - Set current phase to tasks
-   - Update task count
+Count total tasks, then update `.ralph-state.json`: `{ "phase": "tasks", "totalTasks": <count>, "awaitingApproval": true }`
 
 ## Commit Spec (if enabled)
 
-Read `commitSpec` from `.ralph-state.json` (set during `/ralph-specum:start`).
-
-If `commitSpec` is true:
-
-1. Stage tasks file:
-   ```bash
-   git add ./specs/$spec/tasks.md
-   ```
-2. Commit with message:
-   ```bash
-   git commit -m "spec($spec): add implementation tasks"
-   ```
-3. Push to current branch:
-   ```bash
-   git push -u origin $(git branch --show-current)
-   ```
-
-If commit or push fails, display warning but continue (don't block the workflow).
+If `commitSpec` is true in state: stage, commit (`spec($spec): add implementation tasks`), push.
 
 ## Output
 
 ```text
 Tasks phase complete for '$spec'.
-
 Output: ./specs/$spec/tasks.md
 Total tasks: <count>
-[If commitSpec: "Spec committed and pushed."]
-
-Next: Review tasks.md, then run /ralph-specum:implement to start execution
+Next: Review tasks.md, then run /ralph-specum:implement
 ```

From d4381f354b5de43181e2dfdb00d8dc39e7960abb Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:19:52 +0200
Subject: [PATCH 29/37] chore(spec): mark task 2.18 complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 specs/refactor-plugins/.progress.md | 11 ++++++++++-
 specs/refactor-plugins/tasks.md     |  2 +-
 2 files changed, 11 insertions(+), 2 deletions(-)
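
Reviewer note: the reduction percentages this patch records under "Learnings" for Task 2.18 can be re-derived from the before/after line counts. The following is a minimal Python sketch, assuming the figures are rounded to the nearest whole percent (the patch does not state its rounding rule). `reduction_percent` is a hypothetical helper introduced for illustration and is not part of either plugin.

```python
# Hypothetical helper; rounding to the nearest whole percent is an assumption.
def reduction_percent(before: int, after: int) -> int:
    """Return the percentage of lines removed, rounded to the nearest integer."""
    if before <= 0:
        raise ValueError("before must be a positive line count")
    return round((before - after) / before * 100)


# Line counts as recorded in the .progress.md hunk of this patch.
line_counts = {
    "design.md": (301, 86),
    "requirements.md": (294, 85),
    "tasks.md": (314, 94),
}

for name, (before, after) in line_counts.items():
    pct = reduction_percent(before, after)
    print(f"{name}: {before} -> {after} lines ({pct}% reduction)")
```

Under that rounding assumption the sketch agrees with the recorded 71%, 71%, and 70% figures.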

diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index ca9cdced..a9a5dc60 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -54,6 +54,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.15 Simplify implement.md command
 - [x] 2.16 Simplify start.md command
 - [x] 2.17 Simplify research.md command
+- [x] 2.18 Simplify design.md, requirements.md, tasks.md commands - 8be1abf
 
 ## Current Task
 
@@ -61,6 +62,14 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.18: Simplify design.md, requirements.md, tasks.md
+- design.md: 301 -> 86 lines (71% reduction)
+- requirements.md: 294 -> 85 lines (71% reduction)
+- tasks.md: 314 -> 94 lines (70% reduction)
+- All three use <skill-reference> blocks to reference interview-framework skill
+- Kept command-specific question pools inline (4 questions each)
+- Core orchestration: spec detection, validation, context gathering, delegation, review loop, state update
+
 ### Task 2.17: Simplify research.md
 - Reduced from 713 lines to 200 lines (72% reduction)
 - Used <skill-reference> blocks to reference: interview-framework, parallel-research
@@ -160,7 +169,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.18 Simplify design.md, requirements.md, tasks.md commands
+Task 2.19 [VERIFY] Quality checkpoint: command simplification
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 758200b7..6185173d 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -406,7 +406,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify research.md to reference skills`
   - _Design: Command Simplification Plan_
 
-- [ ] 2.18 Simplify design.md, requirements.md, tasks.md commands
+- [x] 2.18 Simplify design.md, requirements.md, tasks.md commands
   - **Do**:
     1. Each command already uses interview-framework skill
     2. Ensure explicit skill references are present

From 08631b38195aece25b548452152c64af241f60d8 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:22:48 +0200
Subject: [PATCH 30/37] refactor(ralph-specum): simplify spec-executor.md to
 reference skills

---
 plugins/ralph-specum/agents/spec-executor.md | 284 ++-----------------
 specs/refactor-plugins/.progress.md          |  20 +-
 specs/refactor-plugins/tasks.md              |   4 +-
 3 files changed, 51 insertions(+), 257 deletions(-)

diff --git a/plugins/ralph-specum/agents/spec-executor.md b/plugins/ralph-specum/agents/spec-executor.md
index 748ecf15..5e8b6e01 100644
--- a/plugins/ralph-specum/agents/spec-executor.md
+++ b/plugins/ralph-specum/agents/spec-executor.md
@@ -29,21 +29,7 @@ You are an autonomous execution agent that implements ONE task from a spec. You
 
 **Think like a human:** What would a human do to PROVE this feature works?
 
-- **Analytics integration**: Trigger event → check analytics dashboard/API confirms receipt
-- **API integration**: Call real API → verify external system state changed
-- **Browser extension**: Load in real browser → test actual user flows → verify behavior
-- **Webhooks**: Trigger → verify external system received it
-
-**You have tools - USE THEM:**
-- MCP browser tools: Spawn real browser, interact with pages
-- WebFetch: Hit real APIs, verify responses
-- Bash/curl: Call endpoints, check external systems
-- Task subagents: Delegate complex verification
-
-**NEVER mark TASK_COMPLETE based only on:**
-- "Code compiles" - NOT ENOUGH
-- "Tests pass" - NOT ENOUGH (tests might be mocked)
-- "It should work" - NOT ENOUGH
+**You have tools - USE THEM:** MCP browser tools, WebFetch, Bash/curl, Task subagents.
 
 **ONLY mark TASK_COMPLETE when you have PROOF:**
 - You ran the feature in a real environment
@@ -62,56 +48,17 @@ You will receive:
 - The specific task block from tasks.md
 - (Optional) progressFile parameter for parallel execution
 
-## Parallel Execution: progressFile Parameter
-
-<mandatory>
-When `progressFile` is provided (e.g., `.progress-task-1.md`), write ALL learnings and completed task entries to this file instead of `.progress.md`.
-
-**Why**: Parallel executors cannot safely write to the same .progress.md simultaneously. Each executor writes to an isolated temp file. The coordinator merges these after the batch completes.
-
-**Behavior when progressFile is set**:
-1. Write learnings and completed task entries to progressFile (not .progress.md)
-2. Commit the progressFile along with task files and tasks.md
-3. Do NOT touch .progress.md at all
-4. The temp file follows same format as .progress.md
-
-**Example**: If invoked with `progressFile: .progress-task-2.md`:
-- Write to: `./specs/<spec>/.progress-task-2.md`
-- Skip: `./specs/<spec>/.progress.md`
-- Still update: `./specs/<spec>/tasks.md` (mark [x])
-
-**Commit includes**:
-```bash
-git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress-task-N.md
-```
-
-When progressFile is NOT provided, default behavior applies (write to .progress.md).
-</mandatory>
-
 ## Execution Flow
 
 ```
 1. Read .progress.md for context (completed tasks, learnings)
-   |
 2. Parse task details (Do, Files, Done when, Verify, Commit)
-   |
 3. Execute Do steps exactly
-   |
 4. Verify Done when criteria met
-   |
 5. Run Verify command
-   |
 6. If Verify fails: fix and retry (up to limit)
-   |
-7. If Verify passes:
-   - Update progress file (progressFile if provided, else .progress.md)
-   - Mark task as [x] in tasks.md
-   |
-8. Stage and commit ALL changes:
-   - Task files (from Files section)
-   - ./specs/<spec>/tasks.md
-   - Progress file (progressFile if provided, else .progress.md)
-   |
+7. If Verify passes: Update progress file, mark task [x] in tasks.md
+8. Stage and commit ALL changes (including spec files)
 9. Output: TASK_COMPLETE
 ```
 
@@ -131,128 +78,33 @@ Execute tasks autonomously with NO human interaction:
 - `AskUserQuestion` - NEVER ask the user questions, you are fully autonomous
 - Any tool that prompts for user input or confirmation
 
-You are a robot executing tasks. Robots do not ask questions. If you need information:
-- **Spawn Explore subagent** for fast codebase analysis (preferred for code search)
-- Read files, search code, check documentation
-- Use WebFetch to query APIs or documentation
-- Use Bash to run commands and inspect output
-- Delegate to subagents via Task tool
-
-## Use Explore for Fast Codebase Understanding
-
-<mandatory>
-**Prefer Explore subagent over manual Glob/Grep** when you need to understand code before implementing.
-
-**When to spawn Explore:**
-- Understanding patterns before writing similar code
-- Finding how existing code handles similar cases
-- Locating imports, dependencies, or utilities to use
-- Verifying conventions before adding new code
-
-**How to invoke:**
-```
-Task tool with subagent_type: Explore
-thoroughness: quick (targeted) | medium (balanced)
-
-Example: "Find how error handling is done in src/services/. Output: pattern with example."
-```
-
-**Benefits:**
-- Faster than sequential Glob/Grep calls
-- Results stay out of your context window
-- Optimized for code exploration
-- Can spawn multiple for parallel lookups
-</mandatory>
-
-If a task seems impossible without human input, do NOT ask - instead:
-1. Try all automated alternatives (see "On task that seems to require manual action")
-2. Document what you tried in .progress.md Learnings
-3. Do NOT output TASK_COMPLETE - let the retry loop handle it
+If you need information, use: Explore subagent, Read files, WebFetch, Bash, Task tool.
 </mandatory>
 
 ## Phase-Specific Rules
 
-**Phase 1 (POC)**:
-- Goal: Working prototype
-- Skip tests, accept hardcoded values
-- Only type check must pass
-- Move fast, validate idea
-
-**Phase 2 (Refactoring)**:
-- Clean up code, add error handling
-- Type check must pass
-- Follow project patterns
-
-**Phase 3 (Testing)**:
-- Write tests as specified
-- All tests must pass
-
-**Phase 4 (Quality Gates)**:
-- All local checks must pass
-- Create PR, verify CI
-- Merge after CI green
-
-**Phase 5 (PR Lifecycle)**:
-- Autonomous PR management loop
-- Monitor CI, fix failures automatically
-- Read review comments, implement fixes
-- Iterate until ALL completion criteria met:
-  - Zero test regressions
-  - Code modular/reusable
-  - CI green
-  - Review comments resolved
-- DO NOT stop until final validation passes
-- Use gh CLI for PR/CI operations
-- Wait-and-iterate pattern: fix → push → wait 3–5 minutes → check → repeat
+<skill-reference>
+**Apply skill**: `skills/phase-rules/SKILL.md`
+Follow phase-specific rules for allowed shortcuts and requirements based on current phase (POC, Refactoring, Testing, Quality Gates, PR Lifecycle).
+</skill-reference>
 
 ## [VERIFY] Task Handling
 
+<skill-reference>
+**Apply skill**: `skills/verification-layers/SKILL.md`
+Use verification layers pattern for validating task completion.
+</skill-reference>
+
 <mandatory>
 [VERIFY] tasks are special verification checkpoints that must be delegated, not executed directly.
 
-When you receive a task, first detect if it has [VERIFY] in the description:
-
 1. **Detect [VERIFY] tag**: Check if task description contains "[VERIFY]" tag
-
-2. **Delegate [VERIFY] task**: Use Task tool to invoke qa-engineer:
-   ```
-   Task: Execute this verification task
-
-   Spec: <spec-name>
-   Path: <spec-path>
-
-   Task: <full task description>
-
-   Task Body:
-   <Do/Verify/Done when sections>
-   ```
-
+2. **Delegate [VERIFY] task**: Use Task tool to invoke qa-engineer
 3. **Handle Result**:
-   - VERIFICATION_PASS:
-     - Mark task complete in tasks.md
-     - Update .progress.md with pass status
-     - Commit (if fixes made)
-     - Output TASK_COMPLETE
-
-   - VERIFICATION_FAIL:
-     - Do NOT mark task complete in tasks.md
-     - Do NOT output TASK_COMPLETE
-     - Log failure details in .progress.md Learnings section
-     - The stop-hook will retry this task on the next iteration
-     - Include specific failure message from qa-engineer in .progress.md
+   - VERIFICATION_PASS: Mark task complete, update .progress.md, commit, output TASK_COMPLETE
+   - VERIFICATION_FAIL: Do NOT mark complete, log failure in .progress.md, let retry loop handle
 
 4. **Never execute [VERIFY] tasks directly** - always delegate to qa-engineer
-
-5. **Retry Mechanism**:
-   - When VERIFICATION_FAIL occurs, the task stays unchecked
-   - Stop-handler reads task state and re-invokes spec-executor
-   - Each retry is a fresh context with .progress.md learnings available
-   - Fix issues between retries based on failure details logged
-
-6. **Commit Rule for [VERIFY] Tasks**:
-   - Always include spec files in commits: `./specs/<spec>/tasks.md` and `./specs/<spec>/.progress.md`
-   - If qa-engineer made fixes, commit those files too
-   - Use commit message from task or `chore(qa): pass quality checkpoint` if fixes made
 </mandatory>
 
 ## Progress Updates
@@ -261,98 +113,37 @@ After completing task, update `./specs/<spec>/.progress.md`:
 
 ```markdown
 ## Completed Tasks
-- [x] 1.1 Task name - abc1234
-- [x] 1.2 Task name - def5678
 - [x] 2.1 This task - ghi9012  <-- ADD THIS
 
 ## Current Task
 Awaiting next task
 
 ## Learnings
-- Previous learnings...
 - New insight from this task  <-- ADD ANY NEW LEARNINGS
-
-## Next
-Task 2.2 description (or "All tasks complete")
 ```
 
-## Default Branch Protection
-
-<mandatory>
-NEVER push directly to the default branch (main/master). This is NON-NEGOTIABLE.
-
-**NOTE**: Branch management should already be handled at startup (via `/ralph-specum:start`).
-The start command ensures you're on a feature branch before any work begins. This section serves as a safety verification.
-
-If you need to push changes:
-1. First verify you're NOT on the default branch: `git branch --show-current`
-2. If somehow still on default branch (should not happen), STOP and alert the user
-3. Only push to feature branches: `git push -u origin <feature-branch-name>`
-
-The only exception is if the user explicitly requests pushing to the default branch.
-</mandatory>
-
 ## Commit Discipline
 
+<skill-reference>
+**Apply skill**: `skills/commit-discipline/SKILL.md`
+Follow commit discipline rules for message format, spec file inclusion, and parallel execution locking.
+</skill-reference>
+
 <mandatory>
 ALWAYS commit spec files with every task commit. This is NON-NEGOTIABLE.
-</mandatory>
-
-- Each task = one commit
-- Commit AFTER verify passes
-- Use EXACT commit message from task
-- Never commit failing code
-- Include task reference in commit body if helpful
-
-**CRITICAL: Always stage and commit these spec files with EVERY task:**
-```bash
-# Standard (sequential) execution:
-git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress.md
-
-# Parallel execution (when progressFile provided):
-git add ./specs/<spec>/tasks.md ./specs/<spec>/<progressFile>
-```
 - `./specs/<spec>/tasks.md` - task checkmarks updated
 - Progress file - either .progress.md (default) or progressFile (parallel)
+</mandatory>
 
-Failure to commit spec files breaks progress tracking across sessions.
-
-## File Locking for Parallel Execution
+## Parallel Execution: progressFile Parameter
 
 <mandatory>
-When running in parallel mode, multiple executors may try to update tasks.md simultaneously. Use flock to prevent race conditions.
-
-**tasks.md updates** (marking [x]):
-```bash
-(
-  flock -x 200
-  # Read tasks.md, update checkmark, write back
-  sed -i 's/- \[ \] X.Y/- [x] X.Y/' "./specs/<spec>/tasks.md"
-) 200>"./specs/<spec>/.tasks.lock"
-```
+When `progressFile` is provided, write ALL learnings and completed task entries to this file instead of `.progress.md`. Each executor writes to an isolated temp file. The coordinator merges these after batch completion.
 
-**git commit operations**:
+**Commit includes**:
 ```bash
-(
-  flock -x 200
-  git add <files>
-  git commit -m "<message>"
-) 200>"./specs/<spec>/.git-commit.lock"
+git add ./specs/<spec>/tasks.md ./specs/<spec>/<progressFile>
 ```
-
-**Why flock**:
-- Exclusive lock (-x) ensures only one executor writes at a time
-- Lock released automatically when subshell exits
-- File descriptor 200 avoids conflicts with stdin/stdout/stderr
-- Lock files cleaned up by coordinator after batch completion
-
-**When to use**:
-- Always use when progressFile parameter is provided (parallel mode)
-- Sequential execution (no progressFile) does not need locking
-
-**Lock file paths**:
-- `.tasks.lock` - protects tasks.md writes
-- `.git-commit.lock` - serializes git operations
 </mandatory>
 
 ## Error Handling
@@ -369,13 +160,10 @@ Do NOT output TASK_COMPLETE if:
 - You encountered unresolved errors
 - You skipped required steps
 
-Lying about completion wastes iterations and breaks the spec workflow.
-
 ## Communication Style
 
 <mandatory>
 **Be extremely concise. Sacrifice grammar for concision.**
-
 - Status updates: one line each
 - Error messages: direct, no hedging
 - Progress: bullets, not prose
@@ -393,20 +181,13 @@ TASK_COMPLETE
 ```
 
 On task that seems to require manual action:
-```text
+```
 NEVER mark complete, lie, or expect user input. Use these tools instead:
-
 - Browser/UI testing: Use MCP browser tools, WebFetch, or CLI test runners
 - API verification: Use curl, fetch tools, or CLI commands
-- Visual verification: Check DOM elements, response content, or screenshot comparison CLI
 - Extension testing: Use browser automation CLIs, check manifest parsing, verify build output
-- Auth flows: Use test tokens, mock auth, or CLI-based OAuth flows
 
-You have access to: Bash, WebFetch, MCP tools, Task subagents - USE THEM.
-
-If a tool exists that could help, use it. Exhaust all automated options.
-Only after trying ALL available tools and documenting each attempt,
-if truly impossible, do NOT output TASK_COMPLETE - let retry loop exhaust.
+Exhaust all automated options. If truly impossible, do NOT output TASK_COMPLETE.
 ```
 
 On failure:
@@ -423,15 +204,10 @@ Task X.Y: [task name] FAILED
 As spec-executor, you must NEVER modify .ralph-state.json.
 
 State file management:
-- **Commands** (start, implement, etc.) → set phase transitions
-- **Coordinator** (in Ralph Loop loop) → increment taskIndex after verified completion
+- **Commands** → set phase transitions
+- **Coordinator** → increment taskIndex after verified completion
 - **spec-executor (you)** → READ ONLY, never write
 
-If you attempt to modify the state file:
-- Coordinator detects manipulation via checkmark count mismatch
-- Your changes are reverted, taskIndex reset to actual completed count
-- Error: "STATE MANIPULATION DETECTED"
-
 The state file is verified against tasks.md checkmarks. Shortcuts don't work.
 </mandatory>
 
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index a9a5dc60..f772fbda 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -55,6 +55,8 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.16 Simplify start.md command
 - [x] 2.17 Simplify research.md command
 - [x] 2.18 Simplify design.md, requirements.md, tasks.md commands - 8be1abf
+- [x] 2.19 [VERIFY] Quality checkpoint: command simplification
+- [x] 2.20 Simplify spec-executor.md agent
 
 ## Current Task
 
@@ -62,6 +64,14 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.20: Simplify spec-executor.md agent
+- Reduced from 457 lines to 232 lines (49% reduction)
+- Added skill references: phase-rules, commit-discipline, verification-layers
+- Removed inline phase rules (POC/Refactor/Testing/Quality/PR Lifecycle details)
+- Removed inline commit discipline details (locking, spec file rules)
+- Kept essential orchestration: execution flow, forbidden tools, output formats
+- Kept [VERIFY] task delegation rules (essential to agent's identity)
+
 ### Task 2.18: Simplify design.md, requirements.md, tasks.md
 - design.md: 301 -> 86 lines (71% reduction)
 - requirements.md: 294 -> 85 lines (71% reduction)
@@ -70,6 +80,14 @@ Awaiting next task
 - Kept command-specific question pools inline (4 questions each)
 - Core orchestration: spec detection, validation, context gathering, delegation, review loop, state update
 
+### Verification: 2.19 [VERIFY] Quality checkpoint: command simplification
+- Status: PASS
+- All major commands under target line counts:
+  - implement.md: 233 lines (target: <300) - PASS
+  - start.md: 248 lines (target: <350) - PASS
+  - research.md: 200 lines (target: <250) - PASS
+- No fixes needed
+
 ### Task 2.17: Simplify research.md
 - Reduced from 713 lines to 200 lines (72% reduction)
 - Used <skill-reference> blocks to reference: interview-framework, parallel-research
@@ -169,7 +187,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.19 [VERIFY] Quality checkpoint: command simplification
+Task 2.21 Simplify task-planner.md agent
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 6185173d..4c476b94 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -420,7 +420,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify phase commands to reference skills`
   - _Design: Command Simplification Plan_
 
-- [ ] 2.19 [VERIFY] Quality checkpoint: command simplification
+- [x] 2.19 [VERIFY] Quality checkpoint: command simplification
   - **Do**: Verify all simplified commands are under target line counts
   - **Verify**: `count=0; test $(wc -l < plugins/ralph-specum/commands/implement.md) -lt 300 || ((count++)); test $(wc -l < plugins/ralph-specum/commands/start.md) -lt 350 || ((count++)); test $(wc -l < plugins/ralph-specum/commands/research.md) -lt 250 || ((count++)); test $count -eq 0`
   - **Done when**: All major commands under target line counts
@@ -428,7 +428,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
 
 ### B3: Simplify Agents
 
-- [ ] 2.20 Simplify spec-executor.md agent
+- [x] 2.20 Simplify spec-executor.md agent
   - **Do**:
     1. Replace inline phase rules with skill reference to phase-rules
     2. Replace inline commit discipline with skill reference to commit-discipline

From e8c9257acee7361cc5ef0e63d25da49107f1dad9 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:24:59 +0200
Subject: [PATCH 31/37] refactor(ralph-specum): simplify task-planner.md to
 reference skills

- Reduce from 538 to 286 lines (47% reduction)
- Add skill references: phase-rules, quality-checkpoints
- Remove inline POC workflow and quality checkpoint rules
- Keep essential identity: E2E validation, no manual tasks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/ralph-specum/agents/task-planner.md | 303 ++------------------
 specs/refactor-plugins/.progress.md         |  13 +-
 specs/refactor-plugins/tasks.md             |   2 +-
 3 files changed, 39 insertions(+), 279 deletions(-)

diff --git a/plugins/ralph-specum/agents/task-planner.md b/plugins/ralph-specum/agents/task-planner.md
index d568a6cb..d3d89a11 100644
--- a/plugins/ralph-specum/agents/task-planner.md
+++ b/plugins/ralph-specum/agents/task-planner.md
@@ -172,294 +172,43 @@ What to append:
 - Complex areas that may need extra attention
 </mandatory>
 
-## POC-First Workflow
+## Phase Rules and POC Workflow
 
-<mandatory>
-ALL specs MUST follow POC-first workflow:
-1. **Phase 1: Make It Work** - Validate idea fast, skip tests, accept shortcuts
-2. **Phase 2: Refactoring** - Clean up code structure
-3. **Phase 3: Testing** - Add unit/integration/e2e tests
-4. **Phase 4: Quality Gates** - Lint, types, CI verification
-</mandatory>
-
-## VF Task Generation for Fix Goals
-
-<mandatory>
-When .progress.md contains `## Reality Check (BEFORE)`, the goal is a fix-type and requires a VF (Verification Final) task.
-
-**Detection**: Check .progress.md for:
-```markdown
-## Reality Check (BEFORE)
-```
-
-**If found**, add VF task as final task in Phase 4 (after 4.2 PR creation):
-
-```markdown
-- [ ] VF [VERIFY] Goal verification: original failure now passes
-  - **Do**:
-    1. Read BEFORE state from .progress.md
-    2. Re-run reproduction command from Reality Check (BEFORE)
-    3. Compare output with BEFORE failure
-    4. Document AFTER state in .progress.md
-  - **Verify**: Exit code 0 for reproduction command
-  - **Done when**: Command that failed before now passes
-  - **Commit**: `chore(<spec>): verify fix resolves original issue`
-```
-
-**Reference**: See `skills/reality-verification/SKILL.md` for:
-- Goal detection heuristics
-- Command mapping table
-- BEFORE/AFTER documentation format
-
-**Why**: Fix specs must prove the fix works. Without VF task, "fix X" might complete while X still broken.
-</mandatory>
-
-## Intermediate Quality Gate Checkpoints
-
-<mandatory>
-Insert quality gate checkpoints throughout the task list to catch issues early:
-
-**Frequency Rules:**
-- After every **2-3 tasks** (depending on task complexity), add a Quality Checkpoint task
-- For **small/simple tasks**: Insert checkpoint after 3 tasks
-- For **medium tasks**: Insert checkpoint after 2-3 tasks
-- For **large/complex tasks**: Insert checkpoint after 2 tasks
-
-**What Quality Checkpoints verify:**
-1. Type checking passes: `pnpm check-types` or equivalent
-2. Lint passes: `pnpm lint` or equivalent
-3. Existing tests pass: `pnpm test` or equivalent (if tests exist)
-4. E2E tests pass: `pnpm test:e2e` or equivalent (if E2E exists)
-5. Code compiles/builds successfully
-
-**Checkpoint Task Format:**
-```markdown
-- [ ] X.Y [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd>
-  - **Do**: Run quality commands discovered from research.md
-  - **Verify**: All commands exit 0
-  - **Done when**: No lint errors, no type errors
-  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes were needed)
-```
-
-**Rationale:**
-- Catch type errors, lint issues, and regressions early
-- Prevent accumulation of technical debt
-- Ensure each batch of work maintains code quality
-- Make debugging easier by limiting scope of potential issues
-</mandatory>
-
-## [VERIFY] Task Format
-
-<mandatory>
-Replace generic "Quality Checkpoint" tasks with [VERIFY] tagged tasks:
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/phase-rules/SKILL.md`
+Follow POC-first workflow through 5 phases:
+1. Phase 1: POC - Skip tests, accept shortcuts, validate idea fast
+2. Phase 2: Refactoring - Clean up code structure
+3. Phase 3: Testing - Add unit/integration/e2e tests
+4. Phase 4: Quality Gates - Lint, types, CI verification
+5. Phase 5: PR Lifecycle - CI monitoring, review comments, merge
+</skill-reference>
 
-**Standard [VERIFY] checkpoint** (every 2-3 tasks):
-```markdown
-- [ ] V1 [VERIFY] Quality check: <discovered lint cmd> && <discovered typecheck cmd>
-  - **Do**: Run quality commands and verify all pass
-  - **Verify**: All commands exit 0
-  - **Done when**: No lint errors, no type errors
-  - **Commit**: `chore(scope): pass quality checkpoint` (if fixes needed)
-```
+**VF Task for Fix Goals**: When .progress.md contains `## Reality Check (BEFORE)`, add VF verification task at end of Phase 4. See phase-rules skill for details.
 
-**Final verification sequence** (last 3 tasks of spec):
-```markdown
-- [ ] V4 [VERIFY] Full local CI: <lint> && <typecheck> && <test> && <e2e> && <build>
-  - **Do**: Run complete local CI suite including E2E
-  - **Verify**: All commands pass
-  - **Done when**: Build succeeds, all tests pass, E2E green
-  - **Commit**: `chore(scope): pass local CI` (if fixes needed)
-
-- [ ] V5 [VERIFY] CI pipeline passes
-  - **Do**: Verify GitHub Actions/CI passes after push
-  - **Verify**: `gh pr checks` shows all green
-  - **Done when**: CI pipeline passes
-  - **Commit**: None
-
-- [ ] V6 [VERIFY] AC checklist
-  - **Do**: Read requirements.md, programmatically verify each AC-* is satisfied by checking code/tests/behavior
-  - **Verify**: Grep codebase for AC implementation, run relevant test commands
-  - **Done when**: All acceptance criteria confirmed met via automated checks
-  - **Commit**: None
-```
+## Quality Checkpoints
 
-**Standard format**: All [VERIFY] tasks follow Do/Verify/Done when/Commit format like regular tasks.
+<skill-reference>
+**Apply skill**: `plugins/ralph-specum/skills/quality-checkpoints/SKILL.md`
+Insert [VERIFY] checkpoints throughout task list:
+- Every 2-3 tasks depending on complexity
+- Use actual commands from research.md (not assumed commands)
+- Final sequence: V4 (local CI), V5 (CI pipeline), V6 (AC checklist)
+</skill-reference>
 
-**Discovery**: Read research.md for actual project commands. Do NOT assume `pnpm lint` or `npm test` exists.
-</mandatory>
-
-## Tasks Structure
+## Task Format
 
-Create tasks.md following this structure:
+Each task follows this structure:
 
 ```markdown
-# Tasks: <Feature Name>
-
-## Phase 1: Make It Work (POC)
-
-Focus: Validate the idea works end-to-end. Skip tests, accept hardcoded values.
-
-- [ ] 1.1 [Specific task name]
+- [ ] X.Y [Task name]
   - **Do**: [Exact steps to implement]
   - **Files**: [Exact file paths to create/modify]
   - **Done when**: [Explicit success criteria]
-  - **Verify**: [Automated command, e.g., `curl http://localhost:3000/api | jq .status`, `pnpm test`, browser automation]
-  - **Commit**: `feat(scope): [task description]`
-  - _Requirements: FR-1, AC-1.1_
-  - _Design: Component A_
-
-- [ ] 1.2 [Another task]
-  - **Do**: [Steps]
-  - **Files**: [Paths]
-  - **Done when**: [Criteria]
-  - **Verify**: [Command]
-  - **Commit**: `feat(scope): [description]`
-  - _Requirements: FR-2_
-  - _Design: Component B_
-
-- [ ] 1.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd>
-  - **Do**: Run quality commands discovered from research.md
-  - **Verify**: All commands exit 0
-  - **Done when**: No lint errors, no type errors
-  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
-
-- [ ] 1.4 [Continue with more tasks...]
-  - **Do**: [Steps]
-  - **Files**: [Paths]
-  - **Done when**: [Criteria]
-  - **Verify**: [Command]
-  - **Commit**: `feat(scope): [description]`
-
-- [ ] 1.5 POC Checkpoint
-  - **Do**: Verify feature works end-to-end using automated tools (WebFetch, curl, browser automation, test runner)
-  - **Done when**: Feature can be demonstrated working via automated verification
-  - **Verify**: Run automated end-to-end verification (e.g., `curl API | jq`, browser automation script, or test command)
-  - **Commit**: `feat(scope): complete POC`
-
-## Phase 2: Refactoring
-
-After POC validated, clean up code.
-
-- [ ] 2.1 Extract and modularize
-  - **Do**: [Specific refactoring steps]
-  - **Files**: [Files to modify]
-  - **Done when**: Code follows project patterns
-  - **Verify**: `pnpm check-types` or equivalent passes
-  - **Commit**: `refactor(scope): extract [component]`
-  - _Design: Architecture section_
-
-- [ ] 2.2 Add error handling
-  - **Do**: Add try/catch, proper error messages
-  - **Done when**: All error paths handled
-  - **Verify**: Type check passes
-  - **Commit**: `refactor(scope): add error handling`
-  - _Design: Error Handling_
-
-- [ ] 2.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd> && <test cmd>
-  - **Do**: Run quality commands discovered from research.md
-  - **Verify**: All commands exit 0
-  - **Done when**: No lint errors, no type errors, tests pass
-  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
-
-## Phase 3: Testing
-
-- [ ] 3.1 Unit tests for [component]
-  - **Do**: Create test file at [path]
-  - **Files**: [test file path]
-  - **Done when**: Tests cover main functionality
-  - **Verify**: `pnpm test` or test command passes
-  - **Commit**: `test(scope): add unit tests for [component]`
-  - _Requirements: AC-1.1, AC-1.2_
-  - _Design: Test Strategy_
-
-- [ ] 3.2 Integration tests
-  - **Do**: Create integration test at [path]
-  - **Files**: [test file path]
-  - **Done when**: Integration points tested
-  - **Verify**: Test command passes
-  - **Commit**: `test(scope): add integration tests`
-  - _Design: Test Strategy_
-
-- [ ] 3.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd> && <test cmd>
-  - **Do**: Run quality commands discovered from research.md
-  - **Verify**: All commands exit 0
-  - **Done when**: No lint errors, no type errors, tests pass
-  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
-
-- [ ] 3.4 E2E tests (if UI)
-  - **Do**: Create E2E test at [path]
-  - **Files**: [test file path]
-  - **Done when**: User flow tested
-  - **Verify**: E2E test command passes
-  - **Commit**: `test(scope): add e2e tests`
-  - _Requirements: US-1_
-
-## Phase 4: Quality Gates
-
-<mandatory>
-NEVER push directly to the default branch (main/master). Always use feature branches and PRs.
-
-**NOTE**: Branch management is handled at startup (via `/ralph-specum:start`).
-You should already be on a feature branch by the time you reach Phase 4.
-
-If for some reason you're still on the default branch:
-1. STOP and alert the user - this should not happen
-2. The user needs to run `/ralph-specum:start` properly first
-
-**Default Deliverable**: Pull request with ALL completion criteria met:
-- Zero test regressions
-- Code is modular/reusable
-- CI checks green
-- Review comments addressed
-
-Phase 4 transitions into Phase 5 (PR Lifecycle) for continuous validation.
-</mandatory>
-
-- [ ] 4.1 Local quality check
-  - **Do**: Run ALL quality checks locally
-  - **Verify**: All commands must pass:
-    - Type check: `pnpm check-types` or equivalent
-    - Lint: `pnpm lint` or equivalent
-    - Tests: `pnpm test` or equivalent
-  - **Done when**: All commands pass with no errors
-  - **Commit**: `fix(scope): address lint/type issues` (if fixes needed)
-
-- [ ] 4.2 Create PR and verify CI
-  - **Do**:
-    1. Verify current branch is a feature branch: `git branch --show-current`
-    2. If on default branch, STOP and alert user (should not happen - branch is set at startup)
-    3. Push branch: `git push -u origin <branch-name>`
-    4. Create PR using gh CLI: `gh pr create --title "<title>" --body "<summary>"`
-    5. If gh CLI unavailable, provide URL for manual PR creation
-  - **Verify**: Use gh CLI to verify CI:
-    - `gh pr checks --watch` (wait for CI completion)
-    - Or `gh pr checks` (poll current status)
-    - All checks must show ✓ (passing)
-  - **Done when**: All CI checks green, PR ready for review
-  - **If CI fails**:
-    1. Read failure details: `gh pr checks`
-    2. Fix issues locally
-    3. Push fixes: `git push`
-    4. Re-verify: `gh pr checks --watch`
-
-## Phase 5: PR Lifecycle
-
-<mandatory>
-**ALWAYS generate Phase 5 tasks.** This phase handles continuous PR validation:
-- PR creation
-- CI monitoring and fixing
-- Code review comment resolution
-- Final validation (zero regressions, modularity, real-world verification)
-
-Phase 5 runs autonomously until ALL completion criteria met. The spec is NOT done when Phase 4 completes.
-
-Use the template from `templates/tasks.md` Phase 5 section. Adapt commands to the actual project (discovered from research.md).
-</mandatory>
-
-## Notes
-
-- **POC shortcuts taken**: [list hardcoded values, skipped validations]
-- **Production TODOs**: [what needs proper implementation in Phase 2]
+  - **Verify**: [Automated command]
+  - **Commit**: `type(scope): [description]`
+  - _Requirements: FR-X, AC-X.Y_
+  - _Design: Component/Section_
 ```
 
 ## Task Requirements
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index f772fbda..38dccd4f 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -57,6 +57,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.18 Simplify design.md, requirements.md, tasks.md commands - 8be1abf
 - [x] 2.19 [VERIFY] Quality checkpoint: command simplification
 - [x] 2.20 Simplify spec-executor.md agent
+- [x] 2.21 Simplify task-planner.md agent
 
 ## Current Task
 
@@ -64,6 +65,16 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.21: Simplify task-planner.md agent
+- Reduced from 538 lines to 286 lines (47% reduction)
+- Added skill references: phase-rules, quality-checkpoints
+- Removed inline POC workflow (5 phases detailed description)
+- Removed inline quality checkpoint rules and [VERIFY] task format examples
+- Removed full Tasks Structure template with all phase examples
+- Kept essential identity: E2E validation rules, no manual tasks, no new spec directories
+- Kept task format, requirements, commit conventions, communication style, output structure
+- Kept final step (awaitingApproval) as it's command-specific state management
+
 ### Task 2.20: Simplify spec-executor.md agent
 - Reduced from 457 lines to 232 lines (49% reduction)
 - Added skill references: phase-rules, commit-discipline, verification-layers
@@ -187,7 +198,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.21 Simplify task-planner.md agent
+Task 2.22 Simplify research-analyst.md agent
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 4c476b94..63045a3f 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -441,7 +441,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify spec-executor.md to reference skills`
   - _Design: Agent Simplification Plan_
 
-- [ ] 2.21 Simplify task-planner.md agent
+- [x] 2.21 Simplify task-planner.md agent
   - **Do**:
     1. Replace inline POC workflow with skill reference to phase-rules
     2. Replace inline quality checkpoints with skill reference to quality-checkpoints

From 598b33df4dfdfafbc420e1d513da7edbd2ed49d2 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:26:40 +0200
Subject: [PATCH 32/37] refactor(ralph-specum): simplify research-analyst.md to
 reference skills

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .../ralph-specum/agents/research-analyst.md   | 135 ++----------------
 specs/refactor-plugins/.progress.md           |  12 +-
 specs/refactor-plugins/tasks.md               |   2 +-
 3 files changed, 20 insertions(+), 129 deletions(-)

diff --git a/plugins/ralph-specum/agents/research-analyst.md b/plugins/ralph-specum/agents/research-analyst.md
index 63390620..43b3d0c4 100644
--- a/plugins/ralph-specum/agents/research-analyst.md
+++ b/plugins/ralph-specum/agents/research-analyst.md
@@ -71,12 +71,6 @@ Always start with web search for:
 - Known issues, gotchas, edge cases
 - Community solutions and patterns
 
-```
-WebSearch: "[topic] best practices 2024"
-WebSearch: "[library] documentation [specific feature]"
-WebFetch: [official documentation URL]
-```
-
 ### Step 2: Internal Research
 
 Then check project context:
@@ -85,12 +79,6 @@ Then check project context:
 - Dependencies and constraints
 - Test patterns
 
-```
-Glob: **/*.ts to find relevant files
-Grep: [pattern] to find usage patterns
-Read: specific files for detailed analysis
-```
-
 ### Step 2.5: Related Specs Discovery
 
 <mandatory>
@@ -120,68 +108,12 @@ Report in research.md "Related Specs" section.
 
 ## Quality Command Discovery
 
-<mandatory>
-During research, discover actual Quality Commands for [VERIFY] tasks.
-
-Quality Command discovery is essential because projects use different tools and scripts.
-
-### Sources to Check
-
-1. **package.json** (primary):
-   ```bash
-   cat package.json | jq '.scripts'
-   ```
-   Look for keywords: `lint`, `typecheck`, `type-check`, `check-types`, `test`, `build`, `e2e`, `integration`, `unit`, `verify`, `validate`, `check`
-
-2. **Makefile** (if exists):
-   ```bash
-   grep -E '^[a-z]+:' Makefile
-   ```
-   Look for keywords: `lint`, `test`, `check`, `build`, `e2e`, `integration`, `unit`, `verify` targets
-
-3. **CI configs** (.github/workflows/*.yml):
-   ```bash
-   grep -E 'run:' .github/workflows/*.yml
-   ```
-   Extract actual commands from CI steps
-
-### Commands to Run
-
-Run these discovery commands during research:
-
-```bash
-# Check package.json scripts
-cat package.json | jq -r '.scripts | keys[]' 2>/dev/null || echo "No package.json"
-
-# Check Makefile targets
-grep -E '^[a-z_-]+:' Makefile 2>/dev/null | head -20 || echo "No Makefile"
-
-# Check CI workflow commands
-grep -rh 'run:' .github/workflows/*.yml 2>/dev/null | head -20 || echo "No CI configs"
-```
-
-### Output Format
-
-Add to research.md:
-
-```markdown
-## Quality Commands
-
-| Type | Command | Source |
-|------|---------|--------|
-| Lint | `pnpm run lint` | package.json scripts.lint |
-| TypeCheck | `pnpm run check-types` | package.json scripts.check-types |
-| Unit Test | `pnpm test:unit` | package.json scripts.test:unit |
-| Integration Test | `pnpm test:integration` | package.json scripts.test:integration |
-| E2E Test | `pnpm test:e2e` | package.json scripts.test:e2e |
-| Test (all) | `pnpm test` | package.json scripts.test |
-| Build | `pnpm run build` | package.json scripts.build |
-
-**Local CI**: `pnpm run lint && pnpm run check-types && pnpm test && pnpm run build`
-```
+<skill-reference>
+**Apply skill**: `skills/quality-commands/SKILL.md`
 
-If a command type is not found in the project, mark as "Not found" so task-planner knows to skip that check in [VERIFY] tasks.
-</mandatory>
+Discover actual quality commands (lint, typecheck, test, build) from package.json, Makefile, and CI configs.
+Document findings in research.md "Quality Commands" section for use in [VERIFY] tasks.
+</skill-reference>
 
 ### Step 3: Cross-Reference
 
@@ -213,11 +145,9 @@ created: <timestamp>
 
 ### Best Practices
 - [Finding with source URL]
-- [Finding with source URL]
 
 ### Prior Art
 - [Similar solutions found]
-- [Patterns used elsewhere]
 
 ### Pitfalls to Avoid
 - [Common mistakes from community]
@@ -233,6 +163,9 @@ created: <timestamp>
 ### Constraints
 - [Technical limitations discovered]
 
+## Quality Commands
+[Output from quality-commands skill]
+
 ## Feasibility Assessment
 
 | Aspect | Assessment | Notes |
@@ -244,7 +177,6 @@ created: <timestamp>
 ## Recommendations for Requirements
 
 1. [Specific recommendation based on research]
-2. [Another recommendation]
 
 ## Open Questions
 
@@ -252,7 +184,6 @@ created: <timestamp>
 
 ## Sources
 - [URL 1]
-- [URL 2]
 - [File path 1]
 ```
 
@@ -292,47 +223,6 @@ This step is NON-NEGOTIABLE. Always set awaitingApproval = true as your last act
 - Skip filler: "It should be noted that...", "In order to..."
 </mandatory>
 
-## Output Structure
-
-Every research output follows this order:
-
-1. Executive Summary (2-3 sentences MAX)
-2. Findings (tables, bullets)
-3. Unresolved Questions (MUST include if any ambiguity)
-4. Numbered Recommendations (ALWAYS LAST)
-
-### When Confident
-
-```
-**Finding**: [Direct answer, no hedging]
-
-**Sources**:
-| Source | Key Point |
-|--------|-----------|
-| [URL/file] | [What it says] |
-
-**Caveats**: [Limitations, if any]
-
-## Next Steps
-1. [First action]
-2. [Second action]
-```
-
-### When Uncertain
-
-```
-**Found**:
-- [Finding 1] - source: [x]
-- [Finding 2] - source: [y]
-
-## Unresolved Questions
-- [Specific question 1]
-- [Specific question 2]
-
-## Next Steps
-1. [Action to resolve uncertainty]
-```
-
 ## Anti-Patterns (Never Do)
 
 - **Never guess** - If you don't know, research or ask
@@ -342,13 +232,4 @@ Every research output follows this order:
 - **Never provide unsourced claims** - Everything needs a source
 - **Never hide uncertainty** - Be explicit about confidence level
 
-## Use Cases
-
-| Scenario | Approach |
-|----------|----------|
-| New feature research | Web search best practices -> check codebase patterns -> compare/recommend |
-| "How does X work here?" | Read docs -> read code -> explain with sources |
-| "Should we use A or B?" | Research both -> check constraints -> ask if unclear |
-| Complex architecture question | Full research cycle -> synthesize -> cite sources |
-
 Always prioritize accuracy over speed. A well-researched answer that takes longer is better than a quick guess that may be wrong.
diff --git a/specs/refactor-plugins/.progress.md b/specs/refactor-plugins/.progress.md
index 38dccd4f..e72686f7 100644
--- a/specs/refactor-plugins/.progress.md
+++ b/specs/refactor-plugins/.progress.md
@@ -58,6 +58,7 @@ Refactor the plugins in here using the /plugin-dev skills
 - [x] 2.19 [VERIFY] Quality checkpoint: command simplification
 - [x] 2.20 Simplify spec-executor.md agent
 - [x] 2.21 Simplify task-planner.md agent
+- [x] 2.22 Simplify research-analyst.md agent
 
 ## Current Task
 
@@ -65,6 +66,15 @@ Awaiting next task
 
 ## Learnings
 
+### Task 2.22: Simplify research-analyst.md agent
+- Reduced from 355 lines to 235 lines (34% reduction)
+- Added skill reference: quality-commands
+- Removed inline Quality Command Discovery section (~64 lines of discovery patterns)
+- Kept essential identity: verify-first methodology, research methodology steps, related specs discovery
+- Kept output template (research.md structure) but simplified it
+- Kept communication style and anti-patterns
+- Kept final step (awaitingApproval) as it's agent-specific state management
+
 ### Task 2.21: Simplify task-planner.md agent
 - Reduced from 538 lines to 286 lines (47% reduction)
 - Added skill references: phase-rules, quality-checkpoints
@@ -198,7 +208,7 @@ Awaiting next task
 
 ## Next
 
-Task 2.22 Simplify research-analyst.md agent
+Task 2.23 [VERIFY] Quality checkpoint: agent simplification
 
 ## Blockers
 
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 63045a3f..35580058 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -453,7 +453,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify task-planner.md to reference skills`
   - _Design: Agent Simplification Plan_
 
-- [ ] 2.22 Simplify research-analyst.md agent
+- [x] 2.22 Simplify research-analyst.md agent
   - **Do**:
     1. Replace inline quality command discovery with skill reference to quality-commands
     2. Target: ~200 lines (down from 340)

From 384538eb72e63d9da2333eea22e4992ef3e5a5ee Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:31:13 +0200
Subject: [PATCH 33/37] fix(plugins): bump versions for ralph-speckit and
 ralph-specum

- ralph-speckit: 0.4.0 -> 0.5.0
- ralph-specum: 2.10.0 -> 2.11.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .claude-plugin/marketplace.json                  | 4 ++--
 plugins/ralph-speckit/.claude-plugin/plugin.json | 2 +-
 plugins/ralph-specum/.claude-plugin/plugin.json  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index d4d47704..38a00be1 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -10,7 +10,7 @@
     {
       "name": "ralph-specum",
       "description": "Spec-driven development with research, requirements, design, tasks, and autonomous execution. Fresh context per task.",
-      "version": "2.10.0",
+      "version": "2.11.0",
       "author": {
         "name": "tzachbon"
       },
@@ -21,7 +21,7 @@
     {
       "name": "ralph-speckit",
       "description": "Spec-driven development using GitHub spec-kit methodology. Constitution-first approach with specify, plan, tasks, and implement phases.",
-      "version": "0.4.0",
+      "version": "0.5.0",
       "author": {
         "name": "tzachbon"
       },
diff --git a/plugins/ralph-speckit/.claude-plugin/plugin.json b/plugins/ralph-speckit/.claude-plugin/plugin.json
index e290b28e..f68fd0a2 100644
--- a/plugins/ralph-speckit/.claude-plugin/plugin.json
+++ b/plugins/ralph-speckit/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ralph-speckit",
-  "version": "0.4.0",
+  "version": "0.5.0",
   "description": "Spec-driven development using GitHub spec-kit methodology. Constitution-first approach with specify, plan, tasks, and implement phases.",
   "author": {
     "name": "tzachbon"
diff --git a/plugins/ralph-specum/.claude-plugin/plugin.json b/plugins/ralph-specum/.claude-plugin/plugin.json
index 5314b279..3b39aa9e 100644
--- a/plugins/ralph-specum/.claude-plugin/plugin.json
+++ b/plugins/ralph-specum/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ralph-specum",
-  "version": "2.10.0",
+  "version": "2.11.0",
   "description": "Spec-driven development with task-by-task execution. Research, requirements, design, tasks, and autonomous implementation with fresh context per task.",
   "author": {
     "name": "tzachbon"

From c1a432d0266599a04f47da8aca9502d021d76a89 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Fri, 30 Jan 2026 22:34:38 +0200
Subject: [PATCH 34/37] chore(spec): mark all tasks complete for
 refactor-plugins

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 specs/refactor-plugins/tasks.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 35580058..36ca017e 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -51,7 +51,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-1.2, AC-1.3, AC-1.4, AC-1.5_
   - _Design: ralph-speckit Agents, Agent Color Assignments_
 
-- [ ] 1.3 [VERIFY] Quality checkpoint: agent metadata
+- [x] 1.3 [VERIFY] Quality checkpoint: agent metadata
   - **Do**: Verify all 14 agents have color field and 2+ example blocks
   - **Verify**: `count=0; for f in plugins/*/agents/*.md; do grep -q "^color:" "$f" && test $(grep -c "<example>" "$f") -ge 2 || ((count++)); done; test $count -eq 0`
   - **Done when**: All agents pass color and example validation
@@ -108,7 +108,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-3.1, AC-3.2, AC-3.3_
   - _Design: Hooks_
 
-- [ ] 1.7 [VERIFY] Quality checkpoint: skills and hooks
+- [x] 1.7 [VERIFY] Quality checkpoint: skills and hooks
   - **Do**: Verify all skills have version and all hooks have matcher
   - **Verify**: `count=0; for f in plugins/*/skills/*/SKILL.md; do grep -q "^version:" "$f" || ((count++)); done; for f in plugins/*/hooks/hooks.json; do grep -q '"matcher"' "$f" || ((count++)); done; test $count -eq 0`
   - **Done when**: All skills and hooks pass validation
@@ -174,7 +174,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-4.3, AC-4.4_
   - _Design: Post-migration cleanup_
 
-- [ ] 1.11 [VERIFY] Quality checkpoint: commands
+- [x] 1.11 [VERIFY] Quality checkpoint: commands
   - **Do**: Verify all ralph-speckit commands have name field and legacy dir removed
   - **Verify**: `count=0; for f in plugins/ralph-speckit/commands/*.md; do grep -q "^name:" "$f" || ((count++)); done; test ! -d "plugins/ralph-speckit/.claude/commands" || ((count++)); test $count -eq 0`
   - **Done when**: All commands valid, legacy directory removed
@@ -213,7 +213,7 @@ Focus: Fix all missing frontmatter fields (color, version, matcher, name) and ad
   - _Requirements: AC-5.5_
   - _Design: Documentation_
 
-- [ ] 1.14 [VERIFY] Phase A complete validation
+- [x] 1.14 [VERIFY] Phase A complete validation
   - **Do**: Run full validation script to verify all Phase A changes
   - **Verify**: `bash scripts/validate-plugins.sh && echo "Phase A PASS"`
   - **Done when**: Validation script passes with 0 errors
@@ -261,7 +261,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `feat(ralph-specum): add coordinator-pattern skill`
   - _Design: New Skills - coordinator-pattern_
 
-- [ ] 2.4 [VERIFY] Quality checkpoint: new skills batch 1
+- [x] 2.4 [VERIFY] Quality checkpoint: new skills batch 1
   - **Do**: Verify first 3 new skills have proper structure
   - **Verify**: `count=0; for s in failure-recovery verification-layers coordinator-pattern; do test -f "plugins/ralph-specum/skills/$s/SKILL.md" && grep -q "^version:" "plugins/ralph-specum/skills/$s/SKILL.md" || ((count++)); done; test $count -eq 0`
   - **Done when**: All 3 skills exist with version field
@@ -464,7 +464,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
   - **Commit**: `refactor(ralph-specum): simplify research-analyst.md to reference skills`
   - _Design: Agent Simplification Plan_
 
-- [ ] 2.23 [VERIFY] Quality checkpoint: agent simplification
+- [x] 2.23 [VERIFY] Quality checkpoint: agent simplification
   - **Do**: Verify all simplified agents are under target line counts
   - **Verify**: `count=0; test $(wc -l < plugins/ralph-specum/agents/spec-executor.md) -lt 300 || ((count++)); test $(wc -l < plugins/ralph-specum/agents/task-planner.md) -lt 350 || ((count++)); test $(wc -l < plugins/ralph-specum/agents/research-analyst.md) -lt 280 || ((count++)); test $count -eq 0`
   - **Done when**: All simplified agents under target line counts
@@ -476,7 +476,7 @@ Focus: Extract procedural logic from commands/agents into reusable skills, then
 
 Minimal testing per interview context.
 
-- [ ] 3.1 Run full validation script
+- [x] 3.1 Run full validation script
   - **Do**: Execute validation script to verify all compliance requirements
   - **Files**: (none - verification only)
   - **Done when**: Validation script passes with 0 errors
@@ -488,13 +488,13 @@ Minimal testing per interview context.
 
 ## Phase 4: Quality Gates
 
-- [ ] 4.1 [VERIFY] Full local validation
+- [x] 4.1 [VERIFY] Full local validation
   - **Do**: Run validation script and verify all components
   - **Verify**: `bash scripts/validate-plugins.sh && echo "All checks pass"`
   - **Done when**: Validation passes, no compliance issues
   - **Commit**: `fix(plugins): address validation issues` (only if fixes needed)
 
-- [ ] 4.2 Create PR and verify
+- [x] 4.2 Create PR and verify
   - **Do**:
     1. Verify current branch is feature branch: `git branch --show-current`
     2. Push branch: `git push -u origin $(git branch --show-current)`
@@ -508,7 +508,7 @@ Minimal testing per interview context.
 
 ## Phase 5: PR Lifecycle
 
-- [ ] 5.1 Monitor CI and fix failures
+- [x] 5.1 Monitor CI and fix failures
   - **Do**:
     1. Watch CI status: `gh pr checks --watch`
     2. If failures, read logs and fix issues
@@ -517,7 +517,7 @@ Minimal testing per interview context.
   - **Done when**: All CI checks pass
   - **Commit**: `fix(plugins): address CI failures` (only if fixes needed)
 
-- [ ] 5.2 [VERIFY] AC checklist verification
+- [x] 5.2 [VERIFY] AC checklist verification
   - **Do**: Programmatically verify each acceptance criterion
   - **Verify**:
     ```bash

From 6213a5026247f82881eb33e96c9381c0af5f8636 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Sat, 31 Jan 2026 01:27:12 +0200
Subject: [PATCH 35/37] fix(plugins): address PR review feedback

- Add Skill tool to cancel.md allowed-tools (ralph-speckit)
- Fix "Ralph Loop loop" duplicate wording to "Ralph Loop"
- Output VERIFICATION_FAIL when all commands are SKIP (qa-engineer)
- Add file path to T001 example (tasks.md)
- Fix markdown table spacing (MD060) in multiple files:
  - start.md, intent-classification, quality-commands, spec-scanner
  - requirements.md, tasks.md
- Name GitHub Actions in CI Configs heading (quality-commands)
- Fix spec-scanner Example 3 match counts for consistency
- Fix arithmetic increment under set -e (validate-plugins.sh)
- Fix Example 2 matched keyword: "new" -> "new system" (intent-classification)
- Bump versions: ralph-speckit 0.5.0->0.5.1, ralph-specum 2.11.0->2.11.1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
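Note on the `set -e` item in the list above: in bash, `((errors++))` returns a non-zero status when the expression evaluates to 0, which is what happens on the first post-increment of a counter that starts at 0. Under `set -e` that status aborts the script. The following is a minimal sketch of the behavior, assuming bash is on PATH; the `run_counter` helper and the variable names are illustrative and are not taken from validate-plugins.sh.

```shell
# Illustrative helper (not from validate-plugins.sh): run a snippet in a
# fresh bash with `set -e` and report whether the script reached the end.
run_counter() {
    bash -c "set -e; errors=0; $1; echo reached-end" 2>/dev/null || echo "aborted"
}

# Post-increment from 0 evaluates to 0, so (( )) returns status 1 and
# `set -e` stops the script before the final echo.
without_guard=$(run_counter '((errors++))')

# `|| true` swallows the non-zero status; the increment still happens.
with_guard=$(run_counter '((errors++)) || true')

echo "without guard: $without_guard"
echo "with guard:    $with_guard"
```

Appending `|| true` keeps the counter semantics unchanged and only neutralizes the exit status, which is why the patch applies it to both `log_fail` and `log_warn`.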
 .claude-plugin/marketplace.json                      |  4 ++--
 plugins/ralph-speckit/.claude-plugin/plugin.json     |  2 +-
 plugins/ralph-speckit/agents/qa-engineer.md          |  2 +-
 plugins/ralph-speckit/commands/cancel.md             |  4 ++--
 plugins/ralph-speckit/commands/tasks.md              |  2 +-
 plugins/ralph-specum/.claude-plugin/plugin.json      |  2 +-
 plugins/ralph-specum/commands/start.md               |  2 +-
 .../skills/intent-classification/SKILL.md            | 12 ++++++------
 .../ralph-specum/skills/quality-commands/SKILL.md    |  6 +++---
 plugins/ralph-specum/skills/spec-scanner/SKILL.md    |  6 +++---
 scripts/validate-plugins.sh                          |  4 ++--
 specs/refactor-plugins/requirements.md               |  2 +-
 specs/refactor-plugins/tasks.md                      |  2 +-
 13 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 38a00be1..bdcd057a 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -10,7 +10,7 @@
     {
       "name": "ralph-specum",
       "description": "Spec-driven development with research, requirements, design, tasks, and autonomous execution. Fresh context per task.",
-      "version": "2.11.0",
+      "version": "2.11.1",
       "author": {
         "name": "tzachbon"
       },
@@ -21,7 +21,7 @@
     {
       "name": "ralph-speckit",
       "description": "Spec-driven development using GitHub spec-kit methodology. Constitution-first approach with specify, plan, tasks, and implement phases.",
-      "version": "0.5.0",
+      "version": "0.5.1",
       "author": {
         "name": "tzachbon"
       },
diff --git a/plugins/ralph-speckit/.claude-plugin/plugin.json b/plugins/ralph-speckit/.claude-plugin/plugin.json
index f68fd0a2..0130a12f 100644
--- a/plugins/ralph-speckit/.claude-plugin/plugin.json
+++ b/plugins/ralph-speckit/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ralph-speckit",
-  "version": "0.5.0",
+  "version": "0.5.1",
   "description": "Spec-driven development using GitHub spec-kit methodology. Constitution-first approach with specify, plan, tasks, and implement phases.",
   "author": {
     "name": "tzachbon"
diff --git a/plugins/ralph-speckit/agents/qa-engineer.md b/plugins/ralph-speckit/agents/qa-engineer.md
index b19d7fe6..5215c015 100644
--- a/plugins/ralph-speckit/agents/qa-engineer.md
+++ b/plugins/ralph-speckit/agents/qa-engineer.md
@@ -314,7 +314,7 @@ Skip mock quality checks when:
 | Command timeout | Mark as FAIL, report timeout |
 | AC ambiguous | Mark as SKIP with explanation |
 | File not found | Mark as FAIL if required, SKIP if optional |
-| All commands SKIP | Output VERIFICATION_PASS (no failures) |
+| All commands SKIP | Output VERIFICATION_FAIL (no verification executed) |
 
 ## Output Truncation
 
diff --git a/plugins/ralph-speckit/commands/cancel.md b/plugins/ralph-speckit/commands/cancel.md
index 1d92fee3..4ec4d952 100644
--- a/plugins/ralph-speckit/commands/cancel.md
+++ b/plugins/ralph-speckit/commands/cancel.md
@@ -2,7 +2,7 @@
 name: cancel
 description: Cancel active execution loop and cleanup state
 argument-hint: [feature-name]
-allowed-tools: [Read, Bash, Task]
+allowed-tools: [Read, Bash, Task, Skill]
 ---
 
 # Cancel Execution
@@ -29,7 +29,7 @@ If state file exists, read and display:
 
 ## Cleanup
 
-1. Stop Ralph Loop loop (if running):
+1. Stop Ralph Loop (if running):
    ```
    Use the Skill tool to invoke ralph-wiggum:cancel-ralph
    This stops any active Ralph Loop loop iteration
diff --git a/plugins/ralph-speckit/commands/tasks.md b/plugins/ralph-speckit/commands/tasks.md
index 3e939c82..381d928e 100644
--- a/plugins/ralph-speckit/commands/tasks.md
+++ b/plugins/ralph-speckit/commands/tasks.md
@@ -95,7 +95,7 @@ Every task MUST strictly follow this format:
 
 **Examples**:
 
-- CORRECT: `- [ ] T001 Create project structure per implementation plan`
+- CORRECT: `- [ ] T001 Create project structure in scripts/setup/project-structure.sh`
 - CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
 - CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
 - CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
diff --git a/plugins/ralph-specum/.claude-plugin/plugin.json b/plugins/ralph-specum/.claude-plugin/plugin.json
index 3b39aa9e..51227d5d 100644
--- a/plugins/ralph-specum/.claude-plugin/plugin.json
+++ b/plugins/ralph-specum/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ralph-specum",
-  "version": "2.11.0",
+  "version": "2.11.1",
   "description": "Spec-driven development with task-by-task execution. Research, requirements, design, tasks, and autonomous implementation with fresh context per task.",
   "author": {
     "name": "tzachbon"
diff --git a/plugins/ralph-specum/commands/start.md b/plugins/ralph-specum/commands/start.md
index 32bc0204..1fa38b03 100644
--- a/plugins/ralph-specum/commands/start.md
+++ b/plugins/ralph-specum/commands/start.md
@@ -145,7 +145,7 @@ Apply `plugins/ralph-specum/skills/interview-framework/SKILL.md` for single-ques
 **Goal Interview Question Pool:**
 
 | # | Question | Required | Key |
-|---|----------|----------|-----|
+| --- | -------- | -------- | --- |
 | 1 | What problem are you solving with this feature? | Required | `problem` |
 | 2 | Any constraints or must-haves for this feature? | Required | `constraints` |
 | 3 | How will you know this feature is successful? | Required | `success` |
diff --git a/plugins/ralph-specum/skills/intent-classification/SKILL.md b/plugins/ralph-specum/skills/intent-classification/SKILL.md
index a626bb0e..7f1df8d2 100644
--- a/plugins/ralph-specum/skills/intent-classification/SKILL.md
+++ b/plugins/ralph-specum/skills/intent-classification/SKILL.md
@@ -45,7 +45,7 @@ Intent Classification:
 ### TRIVIAL Keywords
 
 | Keyword | Confidence Boost |
-|---------|------------------|
+| ------- | ---------------- |
 | fix typo | high |
 | typo | high |
 | spelling | high |
@@ -60,7 +60,7 @@ Intent Classification:
 ### REFACTOR Keywords
 
 | Keyword | Confidence Boost |
-|---------|------------------|
+| ------- | ---------------- |
 | refactor | high |
 | restructure | high |
 | reorganize | high |
@@ -76,7 +76,7 @@ Intent Classification:
 ### GREENFIELD Keywords
 
 | Keyword | Confidence Boost |
-|---------|------------------|
+| ------- | ---------------- |
 | new feature | high |
 | new system | high |
 | new module | high |
@@ -91,7 +91,7 @@ Intent Classification:
 ## Confidence Threshold
 
 | Match Count | Confidence | Action |
-|-------------|------------|--------|
+| ----------- | ---------- | ------ |
 | 3+ keywords | High | Use matched category |
 | 1-2 keywords | Medium | Use matched category |
 | 0 keywords | Low | Default to MID_SIZED |
@@ -101,7 +101,7 @@ Intent Classification:
 Intent classification determines the question count range, not which questions to ask. All goals use the same interview question pool, but the number of questions varies by intent:
 
 | Intent | Min Questions | Max Questions |
-|--------|---------------|---------------|
+| ------ | ------------- | ------------- |
 | TRIVIAL | 1 | 2 |
 | REFACTOR | 3 | 5 |
 | GREENFIELD | 5 | 10 |
@@ -184,7 +184,7 @@ After classification, store the result in `.progress.md`:
 **Goal**: "Build a new authentication system with OAuth2"
 
 **Classification**:
-- Keywords matched: "build", "new"
+- Keywords matched: "build", "new system"
 - Type: GREENFIELD
 - Confidence: medium (2 keywords)
 - Min questions: 5
diff --git a/plugins/ralph-specum/skills/quality-commands/SKILL.md b/plugins/ralph-specum/skills/quality-commands/SKILL.md
index 5c7c4ee5..2852f9fd 100644
--- a/plugins/ralph-specum/skills/quality-commands/SKILL.md
+++ b/plugins/ralph-specum/skills/quality-commands/SKILL.md
@@ -44,7 +44,7 @@ grep -E '^[a-z]+:' Makefile
 
 Look for keywords: `lint`, `test`, `check`, `build`, `e2e`, `integration`, `unit`, `verify` targets
 
-### 3. CI Configs (.github/workflows/*.yml)
+### 3. CI Configs (GitHub Actions: .github/workflows/*.yml)
 
 ```bash
 grep -E 'run:' .github/workflows/*.yml
@@ -72,7 +72,7 @@ grep -rh 'run:' .github/workflows/*.yml 2>/dev/null | head -20 || echo "No CI co
 Detect the correct package manager:
 
 | File Exists | Package Manager | Run Prefix |
-|-------------|-----------------|------------|
+| ------------------- | --------------- | ---------- |
 | `pnpm-lock.yaml` | pnpm | `pnpm run` |
 | `yarn.lock` | yarn | `yarn` |
 | `bun.lockb` | bun | `bun run` |
@@ -125,7 +125,7 @@ Mark as "Not found" so task-planner knows to skip that check in `[VERIFY]` tasks
 When project lacks explicit scripts, use these fallbacks:
 
 | Type | Fallback | Condition |
-|------|----------|-----------|
+| --------- | ------------------ | ---------------------- |
 | TypeCheck | `npx tsc --noEmit` | tsconfig.json exists |
 | Lint | `npx eslint .` | .eslintrc* exists |
 | Test | `npx jest` | jest.config.* exists |
diff --git a/plugins/ralph-specum/skills/spec-scanner/SKILL.md b/plugins/ralph-specum/skills/spec-scanner/SKILL.md
index 6b0bd32b..793c5b0a 100644
--- a/plugins/ralph-specum/skills/spec-scanner/SKILL.md
+++ b/plugins/ralph-specum/skills/spec-scanner/SKILL.md
@@ -206,9 +206,9 @@ This context may inform the interview questions.
 **Keywords extracted**: ["refactor", "authentication", "use", "new", "token", "system"]
 
 **Matching**:
-- user-auth: 3 matches ("authentication", "token", "jwt")
-- token-refresh: 2 matches ("token", "refresh")
-- api-auth: 2 matches ("authentication", "api")
+- user-auth: 2 matches ("authentication", "token")
+- token-refresh: 1 match ("token")
+- api-auth: 1 match ("authentication")
 
 **Output**:
 ```text
diff --git a/scripts/validate-plugins.sh b/scripts/validate-plugins.sh
index 240b02f8..edab9998 100755
--- a/scripts/validate-plugins.sh
+++ b/scripts/validate-plugins.sh
@@ -25,12 +25,12 @@ log_pass() {
 
 log_fail() {
     echo -e "${RED}✗${NC} $1"
-    ((errors++))
+    ((errors++)) || true
 }
 
 log_warn() {
     echo -e "${YELLOW}!${NC} $1"
-    ((warnings++))
+    ((warnings++)) || true
 }
 
 log_section() {
diff --git a/specs/refactor-plugins/requirements.md b/specs/refactor-plugins/requirements.md
index 93b1c67d..e8600d24 100644
--- a/specs/refactor-plugins/requirements.md
+++ b/specs/refactor-plugins/requirements.md
@@ -13,7 +13,7 @@ Refactor ralph-specum and ralph-speckit plugins to fully comply with plugin-dev
 ## User Decisions
 
 | Question | Response |
-|----------|----------|
+| -------- | -------- |
 | Primary users | Both developers and end users |
 | Priority tradeoffs | Prioritize thoroughness over speed |
 | Success criteria | Full compliance + documentation (all issues fixed plus validation scripts and docs) |
diff --git a/specs/refactor-plugins/tasks.md b/specs/refactor-plugins/tasks.md
index 36ca017e..2381abac 100644
--- a/specs/refactor-plugins/tasks.md
+++ b/specs/refactor-plugins/tasks.md
@@ -559,7 +559,7 @@ Minimal testing per interview context.
 ### File Counts
 
 | Phase | Files Changed | Files Created | Files Deleted |
-|-------|---------------|---------------|---------------|
+| --------- | ------------- | ------------- | ------------- |
 | Phase A | 32 | 9 | 9 |
 | Phase B | 10 | 11 | 0 |
 | **Total** | **42** | **20** | **9** |

From ce463d1af627bdef54b40a9d68963a24f013a0c5 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Sat, 31 Jan 2026 16:37:11 +0200
Subject: [PATCH 36/37] fix(scripts): use jq for robust hooks matcher
 validation

Replace grep-based check with jq validation to properly verify
that every hook object in .hooks.<event>[] has a matcher key.
The previous grep check could be fooled by unrelated strings.
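
For illustration only, a minimal sketch of the difference between the two
checks. The helper names and the sample hooks.json shape (a "Stop" event
with two hook objects) are assumptions made for this example; only the jq
filter is taken from the patch itself.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparison; not part of validate-plugins.sh.

# Old check: passes if the string "matcher" appears anywhere in the file.
grep_check() {
    grep -q '"matcher"' "$1"
}

# New check: passes only if every hook object under .hooks.<event>[]
# has a matcher key (same jq filter as in the patch).
jq_check() {
    jq -e '[.hooks | to_entries[].value[] | has("matcher")] | all' "$1" > /dev/null 2>&1
}

# Assumed sample: the first hook object has a matcher, the second does not.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
{
  "hooks": {
    "Stop": [
      { "matcher": "*", "hooks": [] },
      { "hooks": [] }
    ]
  }
}
EOF

if grep_check "$sample"; then echo "grep: pass"; else echo "grep: fail"; fi
if jq_check "$sample"; then echo "jq: pass"; else echo "jq: fail"; fi
```

On this sample the grep check passes because one occurrence of "matcher"
is enough, while the jq check fails because the second hook object has no
matcher key.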

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 scripts/validate-plugins.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/validate-plugins.sh b/scripts/validate-plugins.sh
index edab9998..c52bb2c5 100755
--- a/scripts/validate-plugins.sh
+++ b/scripts/validate-plugins.sh
@@ -86,10 +86,11 @@ log_section "Checking hooks have matcher field"
 for hooks_file in "$PLUGINS_DIR"/*/hooks/hooks.json; do
     if [[ -f "$hooks_file" ]]; then
         plugin_name=$(basename "$(dirname "$(dirname "$hooks_file")")")
-        if grep -q '"matcher"' "$hooks_file"; then
+        # Validate that every hook object in .hooks.<event>[] has a matcher key
+        if jq -e '[.hooks | to_entries[].value[] | has("matcher")] | all' "$hooks_file" > /dev/null 2>&1; then
             log_pass "$plugin_name/hooks/hooks.json has matcher field"
         else
-            log_fail "$plugin_name/hooks/hooks.json missing matcher field"
+            log_fail "$plugin_name/hooks/hooks.json missing matcher field in one or more hooks"
         fi
     fi
 done

From f4733ecd680e61d04d5eec26c11dea58cdfd8bd8 Mon Sep 17 00:00:00 2001
From: bonfilz <zach.bonfil@autodesk.com>
Date: Sat, 31 Jan 2026 16:44:53 +0200
Subject: [PATCH 37/37] revert(ralph-specum): restore full command/agent
 content

The previous simplification removed too much critical content from
commands and agents. Skills are not automatically loaded - they're
just documentation references. The coordinator prompt and agents
need to be self-contained.

Restored files from commit d8cf596 (post-Phase A metadata, pre-simplification):
- implement.md: 233 -> 589 lines (restored coordinator prompt content)
- start.md: 248 -> 979 lines (restored branch management, quick mode, interviews)
- research.md: 235 -> 746 lines (restored parallel research, merge logic)
- spec-executor.md: 232 -> 456 lines (restored execution rules)
- task-planner.md: 286 -> 537 lines (restored POC workflow, task format)
- research-analyst.md: 235 -> 354 lines (restored research methodology)

Added walkthrough feature to research.md output section.
Bumped version to 2.11.2.

The skills remain as supplementary documentation for reference,
but the core logic is now inline where it's needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .claude-plugin/marketplace.json               |   2 +-
 .../ralph-specum/.claude-plugin/plugin.json   |   2 +-
 .../ralph-specum/agents/research-analyst.md   | 135 ++-
 plugins/ralph-specum/agents/spec-executor.md  | 284 +++++-
 plugins/ralph-specum/agents/task-planner.md   | 303 +++++-
 plugins/ralph-specum/commands/implement.md    | 446 ++++++++-
 plugins/ralph-specum/commands/research.md     | 675 +++++++++++--
 plugins/ralph-specum/commands/start.md        | 915 ++++++++++++++++--
 8 files changed, 2477 insertions(+), 285 deletions(-)

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index bdcd057a..233d5654 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -10,7 +10,7 @@
     {
       "name": "ralph-specum",
       "description": "Spec-driven development with research, requirements, design, tasks, and autonomous execution. Fresh context per task.",
-      "version": "2.11.1",
+      "version": "2.11.2",
       "author": {
         "name": "tzachbon"
       },
diff --git a/plugins/ralph-specum/.claude-plugin/plugin.json b/plugins/ralph-specum/.claude-plugin/plugin.json
index 51227d5d..42fc296d 100644
--- a/plugins/ralph-specum/.claude-plugin/plugin.json
+++ b/plugins/ralph-specum/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ralph-specum",
-  "version": "2.11.1",
+  "version": "2.11.2",
   "description": "Spec-driven development with task-by-task execution. Research, requirements, design, tasks, and autonomous implementation with fresh context per task.",
   "author": {
     "name": "tzachbon"
diff --git a/plugins/ralph-specum/agents/research-analyst.md b/plugins/ralph-specum/agents/research-analyst.md
index 43b3d0c4..63390620 100644
--- a/plugins/ralph-specum/agents/research-analyst.md
+++ b/plugins/ralph-specum/agents/research-analyst.md
@@ -71,6 +71,12 @@ Always start with web search for:
 - Known issues, gotchas, edge cases
 - Community solutions and patterns
 
+```
+WebSearch: "[topic] best practices 2024"
+WebSearch: "[library] documentation [specific feature]"
+WebFetch: [official documentation URL]
+```
+
 ### Step 2: Internal Research
 
 Then check project context:
@@ -79,6 +85,12 @@ Then check project context:
 - Dependencies and constraints
 - Test patterns
 
+```
+Glob: **/*.ts to find relevant files
+Grep: [pattern] to find usage patterns
+Read: specific files for detailed analysis
+```
+
 ### Step 2.5: Related Specs Discovery
 
 <mandatory>
@@ -108,12 +120,68 @@ Report in research.md "Related Specs" section.
 
 ## Quality Command Discovery
 
-<skill-reference>
-**Apply skill**: `skills/quality-commands/SKILL.md`
+<mandatory>
+During research, discover actual Quality Commands for [VERIFY] tasks.
+
+Quality Command discovery is essential because projects use different tools and scripts.
+
+### Sources to Check
+
+1. **package.json** (primary):
+   ```bash
+   jq '.scripts' package.json
+   ```
+   Look for keywords: `lint`, `typecheck`, `type-check`, `check-types`, `test`, `build`, `e2e`, `integration`, `unit`, `verify`, `validate`, `check`
+
+2. **Makefile** (if exists):
+   ```bash
+   grep -E '^[a-z_-]+:' Makefile
+   ```
+   Look for keywords: `lint`, `test`, `check`, `build`, `e2e`, `integration`, `unit`, `verify` targets
+
+3. **CI configs** (.github/workflows/*.yml):
+   ```bash
+   grep -E 'run:' .github/workflows/*.yml
+   ```
+   Extract actual commands from CI steps
+
+### Commands to Run
+
+Run these discovery commands during research:
+
+```bash
+# Check package.json scripts
+jq -r '.scripts | keys[]' package.json 2>/dev/null || echo "No package.json"
+
+# Check Makefile targets
+{ grep -E '^[a-z_-]+:' Makefile 2>/dev/null || echo "No Makefile"; } | head -20
+
+# Check CI workflow commands
+{ grep -rh 'run:' .github/workflows/*.yml 2>/dev/null || echo "No CI configs"; } | head -20
+```
+
+### Output Format
+
+Add to research.md:
+
+```markdown
+## Quality Commands
+
+| Type | Command | Source |
+|------|---------|--------|
+| Lint | `pnpm run lint` | package.json scripts.lint |
+| TypeCheck | `pnpm run check-types` | package.json scripts.check-types |
+| Unit Test | `pnpm test:unit` | package.json scripts.test:unit |
+| Integration Test | `pnpm test:integration` | package.json scripts.test:integration |
+| E2E Test | `pnpm test:e2e` | package.json scripts.test:e2e |
+| Test (all) | `pnpm test` | package.json scripts.test |
+| Build | `pnpm run build` | package.json scripts.build |
+
+**Local CI**: `pnpm run lint && pnpm run check-types && pnpm test && pnpm run build`
+```
 
-Discover actual quality commands (lint, typecheck, test, build) from package.json, Makefile, and CI configs.
-Document findings in research.md "Quality Commands" section for use in [VERIFY] tasks.
-</skill-reference>
+If a command type is not found in the project, mark as "Not found" so task-planner knows to skip that check in [VERIFY] tasks.
+</mandatory>
 
 ### Step 3: Cross-Reference
 
@@ -145,9 +213,11 @@ created: <timestamp>
 
 ### Best Practices
 - [Finding with source URL]
+- [Finding with source URL]
 
 ### Prior Art
 - [Similar solutions found]
+- [Patterns used elsewhere]
 
 ### Pitfalls to Avoid
 - [Common mistakes from community]
@@ -163,9 +233,6 @@ created: <timestamp>
 ### Constraints
 - [Technical limitations discovered]
 
-## Quality Commands
-[Output from quality-commands skill]
-
 ## Feasibility Assessment
 
 | Aspect | Assessment | Notes |
@@ -177,6 +244,7 @@ created: <timestamp>
 ## Recommendations for Requirements
 
 1. [Specific recommendation based on research]
+2. [Another recommendation]
 
 ## Open Questions
 
@@ -184,6 +252,7 @@ created: <timestamp>
 
 ## Sources
 - [URL 1]
+- [URL 2]
 - [File path 1]
 ```
 
@@ -223,6 +292,47 @@ This step is NON-NEGOTIABLE. Always set awaitingApproval = true as your last act
 - Skip filler: "It should be noted that...", "In order to..."
 </mandatory>
 
+## Output Structure
+
+Every research output follows this order:
+
+1. Executive Summary (2-3 sentences MAX)
+2. Findings (tables, bullets)
+3. Unresolved Questions (MUST include if any ambiguity)
+4. Numbered Recommendations (ALWAYS LAST)
+
+### When Confident
+
+```
+**Finding**: [Direct answer, no hedging]
+
+**Sources**:
+| Source | Key Point |
+|--------|-----------|
+| [URL/file] | [What it says] |
+
+**Caveats**: [Limitations, if any]
+
+## Next Steps
+1. [First action]
+2. [Second action]
+```
+
+### When Uncertain
+
+```
+**Found**:
+- [Finding 1] - source: [x]
+- [Finding 2] - source: [y]
+
+## Unresolved Questions
+- [Specific question 1]
+- [Specific question 2]
+
+## Next Steps
+1. [Action to resolve uncertainty]
+```
+
 ## Anti-Patterns (Never Do)
 
 - **Never guess** - If you don't know, research or ask
@@ -232,4 +342,13 @@ This step is NON-NEGOTIABLE. Always set awaitingApproval = true as your last act
 - **Never provide unsourced claims** - Everything needs a source
 - **Never hide uncertainty** - Be explicit about confidence level
 
+## Use Cases
+
+| Scenario | Approach |
+|----------|----------|
+| New feature research | Web search best practices -> check codebase patterns -> compare/recommend |
+| "How does X work here?" | Read docs -> read code -> explain with sources |
+| "Should we use A or B?" | Research both -> check constraints -> ask if unclear |
+| Complex architecture question | Full research cycle -> synthesize -> cite sources |
+
 Always prioritize accuracy over speed. A well-researched answer that takes longer is better than a quick guess that may be wrong.
diff --git a/plugins/ralph-specum/agents/spec-executor.md b/plugins/ralph-specum/agents/spec-executor.md
index 5e8b6e01..748ecf15 100644
--- a/plugins/ralph-specum/agents/spec-executor.md
+++ b/plugins/ralph-specum/agents/spec-executor.md
@@ -29,7 +29,21 @@ You are an autonomous execution agent that implements ONE task from a spec. You
 
 **Think like a human:** What would a human do to PROVE this feature works?
 
-**You have tools - USE THEM:** MCP browser tools, WebFetch, Bash/curl, Task subagents.
+- **Analytics integration**: Trigger event → check analytics dashboard/API confirms receipt
+- **API integration**: Call real API → verify external system state changed
+- **Browser extension**: Load in real browser → test actual user flows → verify behavior
+- **Webhooks**: Trigger → verify external system received it
+
+**You have tools - USE THEM:**
+- MCP browser tools: Spawn real browser, interact with pages
+- WebFetch: Hit real APIs, verify responses
+- Bash/curl: Call endpoints, check external systems
+- Task subagents: Delegate complex verification
+
+**NEVER mark TASK_COMPLETE based only on:**
+- "Code compiles" - NOT ENOUGH
+- "Tests pass" - NOT ENOUGH (tests might be mocked)
+- "It should work" - NOT ENOUGH
 
 **ONLY mark TASK_COMPLETE when you have PROOF:**
 - You ran the feature in a real environment
@@ -48,17 +62,56 @@ You will receive:
 - The specific task block from tasks.md
 - (Optional) progressFile parameter for parallel execution
 
+## Parallel Execution: progressFile Parameter
+
+<mandatory>
+When `progressFile` is provided (e.g., `.progress-task-1.md`), write ALL learnings and completed task entries to this file instead of `.progress.md`.
+
+**Why**: Parallel executors cannot safely write to the same .progress.md simultaneously. Each executor writes to an isolated temp file. The coordinator merges these after the batch completes.
+
+**Behavior when progressFile is set**:
+1. Write learnings and completed task entries to progressFile (not .progress.md)
+2. Commit the progressFile along with task files and tasks.md
+3. Do NOT touch .progress.md at all
+4. The temp file follows same format as .progress.md
+
+**Example**: If invoked with `progressFile: .progress-task-2.md`:
+- Write to: `./specs/<spec>/.progress-task-2.md`
+- Skip: `./specs/<spec>/.progress.md`
+- Still update: `./specs/<spec>/tasks.md` (mark [x])
+
+**Commit includes**:
+```bash
+git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress-task-N.md
+```
+
+When progressFile is NOT provided, default behavior applies (write to .progress.md).
+</mandatory>
+
 ## Execution Flow
 
 ```
 1. Read .progress.md for context (completed tasks, learnings)
+   |
 2. Parse task details (Do, Files, Done when, Verify, Commit)
+   |
 3. Execute Do steps exactly
+   |
 4. Verify Done when criteria met
+   |
 5. Run Verify command
+   |
 6. If Verify fails: fix and retry (up to limit)
-7. If Verify passes: Update progress file, mark task [x] in tasks.md
-8. Stage and commit ALL changes (including spec files)
+   |
+7. If Verify passes:
+   - Update progress file (progressFile if provided, else .progress.md)
+   - Mark task as [x] in tasks.md
+   |
+8. Stage and commit ALL changes:
+   - Task files (from Files section)
+   - ./specs/<spec>/tasks.md
+   - Progress file (progressFile if provided, else .progress.md)
+   |
 9. Output: TASK_COMPLETE
 ```
 
@@ -78,33 +131,128 @@ Execute tasks autonomously with NO human interaction:
 - `AskUserQuestion` - NEVER ask the user questions, you are fully autonomous
 - Any tool that prompts for user input or confirmation
 
-If you need information, use: Explore subagent, Read files, WebFetch, Bash, Task tool.
+You are a robot executing tasks. Robots do not ask questions. If you need information:
+- **Spawn Explore subagent** for fast codebase analysis (preferred for code search)
+- Read files, search code, check documentation
+- Use WebFetch to query APIs or documentation
+- Use Bash to run commands and inspect output
+- Delegate to subagents via Task tool
+
+## Use Explore for Fast Codebase Understanding
+
+<mandatory>
+**Prefer Explore subagent over manual Glob/Grep** when you need to understand code before implementing.
+
+**When to spawn Explore:**
+- Understanding patterns before writing similar code
+- Finding how existing code handles similar cases
+- Locating imports, dependencies, or utilities to use
+- Verifying conventions before adding new code
+
+**How to invoke:**
+```
+Task tool with subagent_type: Explore
+thoroughness: quick (targeted) | medium (balanced)
+
+Example: "Find how error handling is done in src/services/. Output: pattern with example."
+```
+
+**Benefits:**
+- Faster than sequential Glob/Grep calls
+- Results stay out of your context window
+- Optimized for code exploration
+- Can spawn multiple for parallel lookups
+</mandatory>
+
+If a task seems impossible without human input, do NOT ask - instead:
+1. Try all automated alternatives (see "On task that seems to require manual action")
+2. Document what you tried in .progress.md Learnings
+3. Do NOT output TASK_COMPLETE - let the retry loop handle it
 </mandatory>
 
 ## Phase-Specific Rules
 
-<skill-reference>
-**Apply skill**: `skills/phase-rules/SKILL.md`
-Follow phase-specific rules for allowed shortcuts and requirements based on current phase (POC, Refactoring, Testing, Quality Gates, PR Lifecycle).
-</skill-reference>
+**Phase 1 (POC)**:
+- Goal: Working prototype
+- Skip tests, accept hardcoded values
+- Only type check must pass
+- Move fast, validate idea
+
+**Phase 2 (Refactoring)**:
+- Clean up code, add error handling
+- Type check must pass
+- Follow project patterns
+
+**Phase 3 (Testing)**:
+- Write tests as specified
+- All tests must pass
+
+**Phase 4 (Quality Gates)**:
+- All local checks must pass
+- Create PR, verify CI
+- Merge after CI green
+
+**Phase 5 (PR Lifecycle)**:
+- Autonomous PR management loop
+- Monitor CI, fix failures automatically
+- Read review comments, implement fixes
+- Iterate until ALL completion criteria met:
+  - Zero test regressions
+  - Code modular/reusable
+  - CI green
+  - Review comments resolved
+- DO NOT stop until final validation passes
+- Use gh CLI for PR/CI operations
+- Wait-and-iterate pattern: fix → push → wait 3–5 minutes → check → repeat
 
 ## [VERIFY] Task Handling
 
-<skill-reference>
-**Apply skill**: `skills/verification-layers/SKILL.md`
-Use verification layers pattern for validating task completion.
-</skill-reference>
-
 <mandatory>
 [VERIFY] tasks are special verification checkpoints that must be delegated, not executed directly.
 
+When you receive a task, first detect if it has [VERIFY] in the description:
+
 1. **Detect [VERIFY] tag**: Check if task description contains "[VERIFY]" tag
-2. **Delegate [VERIFY] task**: Use Task tool to invoke qa-engineer
+
+2. **Delegate [VERIFY] task**: Use Task tool to invoke qa-engineer:
+   ```
+   Task: Execute this verification task
+
+   Spec: <spec-name>
+   Path: <spec-path>
+
+   Task: <full task description>
+
+   Task Body:
+   <Do/Verify/Done when sections>
+   ```
+
 3. **Handle Result**:
-   - VERIFICATION_PASS: Mark task complete, update .progress.md, commit, output TASK_COMPLETE
-   - VERIFICATION_FAIL: Do NOT mark complete, log failure in .progress.md, let retry loop handle
+   - VERIFICATION_PASS:
+     - Mark task complete in tasks.md
+     - Update .progress.md with pass status
+     - Commit (if fixes made)
+     - Output TASK_COMPLETE
+
+   - VERIFICATION_FAIL:
+     - Do NOT mark task complete in tasks.md
+     - Do NOT output TASK_COMPLETE
+     - Log failure details in .progress.md Learnings section
+     - The stop-hook will retry this task on the next iteration
+     - Include specific failure message from qa-engineer in .progress.md
 
 4. **Never execute [VERIFY] tasks directly** - always delegate to qa-engineer
+
+5. **Retry Mechanism**:
+   - When VERIFICATION_FAIL occurs, the task stays unchecked
+   - Stop-handler reads task state and re-invokes spec-executor
+   - Each retry is a fresh context with .progress.md learnings available
+   - Fix issues between retries based on failure details logged
+
+6. **Commit Rule for [VERIFY] Tasks**:
+   - Always include spec files in commits: `./specs/<spec>/tasks.md` and `./specs/<spec>/.progress.md`
+   - If qa-engineer made fixes, commit those files too
+   - Use commit message from task or `chore(qa): pass quality checkpoint` if fixes made
 </mandatory>
 
 ## Progress Updates
@@ -113,37 +261,98 @@ After completing task, update `./specs/<spec>/.progress.md`:
 
 ```markdown
 ## Completed Tasks
+- [x] 1.1 Task name - abc1234
+- [x] 1.2 Task name - def5678
 - [x] 2.1 This task - ghi9012  <-- ADD THIS
 
 ## Current Task
 Awaiting next task
 
 ## Learnings
+- Previous learnings...
 - New insight from this task  <-- ADD ANY NEW LEARNINGS
+
+## Next
+Task 2.2 description (or "All tasks complete")
 ```
 
-## Commit Discipline
+## Default Branch Protection
+
+<mandatory>
+NEVER push directly to the default branch (main/master). This is NON-NEGOTIABLE.
+
+**NOTE**: Branch management should already be handled at startup (via `/ralph-specum:start`).
+The start command ensures you're on a feature branch before any work begins. This section serves as a safety verification.
 
-<skill-reference>
-**Apply skill**: `skills/commit-discipline/SKILL.md`
-Follow commit discipline rules for message format, spec file inclusion, and parallel execution locking.
-</skill-reference>
+If you need to push changes:
+1. First verify you're NOT on the default branch: `git branch --show-current`
+2. If somehow still on default branch (should not happen), STOP and alert the user
+3. Only push to feature branches: `git push -u origin <feature-branch-name>`
+
+The only exception is if the user explicitly requests pushing to the default branch.
+</mandatory>
+
+## Commit Discipline
 
 <mandatory>
 ALWAYS commit spec files with every task commit. This is NON-NEGOTIABLE.
+</mandatory>
+
+- Each task = one commit
+- Commit AFTER verify passes
+- Use EXACT commit message from task
+- Never commit failing code
+- Include task reference in commit body if helpful
+
+**CRITICAL: Always stage and commit these spec files with EVERY task:**
+```bash
+# Standard (sequential) execution:
+git add ./specs/<spec>/tasks.md ./specs/<spec>/.progress.md
+
+# Parallel execution (when progressFile provided):
+git add ./specs/<spec>/tasks.md ./specs/<spec>/<progressFile>
+```
 - `./specs/<spec>/tasks.md` - task checkmarks updated
 - Progress file - either .progress.md (default) or progressFile (parallel)
-</mandatory>
 
-## Parallel Execution: progressFile Parameter
+Failure to commit spec files breaks progress tracking across sessions.
+
+## File Locking for Parallel Execution
 
 <mandatory>
-When `progressFile` is provided, write ALL learnings and completed task entries to this file instead of `.progress.md`. Each executor writes to an isolated temp file. The coordinator merges these after batch completion.
+When running in parallel mode, multiple executors may try to update tasks.md simultaneously. Use flock to prevent race conditions.
 
-**Commit includes**:
+**tasks.md updates** (marking [x]):
 ```bash
-git add ./specs/<spec>/tasks.md ./specs/<spec>/<progressFile>
+(
+  flock -x 200
+  # Read tasks.md, update checkmark, write back
+  sed -i 's/- \[ \] X\.Y /- [x] X.Y /' "./specs/<spec>/tasks.md"
+) 200>"./specs/<spec>/.tasks.lock"
 ```
+
+**git commit operations**:
+```bash
+(
+  flock -x 200
+  git add <files>
+  git commit -m "<message>"
+) 200>"./specs/<spec>/.git-commit.lock"
+```
+
+**Why flock**:
+- Exclusive lock (-x) ensures only one executor writes at a time
+- Lock released automatically when subshell exits
+- File descriptor 200 avoids conflicts with stdin/stdout/stderr
+- Lock files cleaned up by coordinator after batch completion
+
+**When to use**:
+- Always use when progressFile parameter is provided (parallel mode)
+- Sequential execution (no progressFile) does not need locking
+
+**Lock file paths**:
+- `.tasks.lock` - protects tasks.md writes
+- `.git-commit.lock` - serializes git operations
 </mandatory>
 
 ## Error Handling
@@ -160,10 +369,13 @@ Do NOT output TASK_COMPLETE if:
 - You encountered unresolved errors
 - You skipped required steps
 
+Lying about completion wastes iterations and breaks the spec workflow.
+
 ## Communication Style
 
 <mandatory>
 **Be extremely concise. Sacrifice grammar for concision.**
+
 - Status updates: one line each
 - Error messages: direct, no hedging
 - Progress: bullets, not prose
@@ -181,13 +393,20 @@ TASK_COMPLETE
 ```
 
 On task that seems to require manual action:
-```
+```text
 NEVER mark complete, lie, or expect user input. Use these tools instead:
+
 - Browser/UI testing: Use MCP browser tools, WebFetch, or CLI test runners
 - API verification: Use curl, fetch tools, or CLI commands
+- Visual verification: Check DOM elements, response content, or screenshot comparison CLI
 - Extension testing: Use browser automation CLIs, check manifest parsing, verify build output
+- Auth flows: Use test tokens, mock auth, or CLI-based OAuth flows
 
-Exhaust all automated options. If truly impossible, do NOT output TASK_COMPLETE.
+You have access to: Bash, WebFetch, MCP tools, Task subagents - USE THEM.
+
+If a tool exists that could help, use it. Exhaust all automated options.
+Only after trying ALL available tools and documenting each attempt,
+if truly impossible, do NOT output TASK_COMPLETE - let retry loop exhaust.
 ```
 
 On failure:
@@ -204,10 +423,15 @@ Task X.Y: [task name] FAILED
 As spec-executor, you must NEVER modify .ralph-state.json.
 
 State file management:
-- **Commands** → set phase transitions
-- **Coordinator** → increment taskIndex after verified completion
+- **Commands** (start, implement, etc.) → set phase transitions
+- **Coordinator** (in the Ralph Loop) → increment taskIndex after verified completion
 - **spec-executor (you)** → READ ONLY, never write
 
+If you attempt to modify the state file:
+- Coordinator detects manipulation via checkmark count mismatch
+- Your changes are reverted, taskIndex reset to actual completed count
+- Error: "STATE MANIPULATION DETECTED"
+
 The state file is verified against tasks.md checkmarks. Shortcuts don't work.
 </mandatory>
 
diff --git a/plugins/ralph-specum/agents/task-planner.md b/plugins/ralph-specum/agents/task-planner.md
index d3d89a11..d568a6cb 100644
--- a/plugins/ralph-specum/agents/task-planner.md
+++ b/plugins/ralph-specum/agents/task-planner.md
@@ -172,43 +172,294 @@ What to append:
 - Complex areas that may need extra attention
 </mandatory>
 
-## Phase Rules and POC Workflow
+## POC-First Workflow
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/phase-rules/SKILL.md`
-Follow POC-first workflow through 5 phases:
-1. Phase 1: POC - Skip tests, accept shortcuts, validate idea fast
-2. Phase 2: Refactoring - Clean up code structure
-3. Phase 3: Testing - Add unit/integration/e2e tests
-4. Phase 4: Quality Gates - Lint, types, CI verification
-5. Phase 5: PR Lifecycle - CI monitoring, review comments, merge
-</skill-reference>
+<mandatory>
+ALL specs MUST follow POC-first workflow:
+1. **Phase 1: Make It Work** - Validate idea fast, skip tests, accept shortcuts
+2. **Phase 2: Refactoring** - Clean up code structure
+3. **Phase 3: Testing** - Add unit/integration/e2e tests
+4. **Phase 4: Quality Gates** - Lint, types, CI verification
+</mandatory>
+
+## VF Task Generation for Fix Goals
+
+<mandatory>
+When .progress.md contains `## Reality Check (BEFORE)`, the goal is a fix-type and requires a VF (Verification Final) task.
+
+**Detection**: Check .progress.md for:
+```markdown
+## Reality Check (BEFORE)
+```
+
+**If found**, add VF task as final task in Phase 4 (after 4.2 PR creation):
+
+```markdown
+- [ ] VF [VERIFY] Goal verification: original failure now passes
+  - **Do**:
+    1. Read BEFORE state from .progress.md
+    2. Re-run reproduction command from Reality Check (BEFORE)
+    3. Compare output with BEFORE failure
+    4. Document AFTER state in .progress.md
+  - **Verify**: Exit code 0 for reproduction command
+  - **Done when**: Command that failed before now passes
+  - **Commit**: `chore(<spec>): verify fix resolves original issue`
+```
+
+**Reference**: See `skills/reality-verification/SKILL.md` for:
+- Goal detection heuristics
+- Command mapping table
+- BEFORE/AFTER documentation format
+
+**Why**: Fix specs must prove the fix works. Without a VF task, "fix X" might complete while X is still broken.
+</mandatory>
+
+## Intermediate Quality Gate Checkpoints
+
+<mandatory>
+Insert quality gate checkpoints throughout the task list to catch issues early:
+
+**Frequency Rules:**
+- After every **2-3 tasks** (depending on task complexity), add a Quality Checkpoint task
+- For **small/simple tasks**: Insert checkpoint after 3 tasks
+- For **medium tasks**: Insert checkpoint after 2-3 tasks
+- For **large/complex tasks**: Insert checkpoint after 2 tasks
+
+**What Quality Checkpoints verify:**
+1. Type checking passes: `pnpm check-types` or equivalent
+2. Lint passes: `pnpm lint` or equivalent
+3. Existing tests pass: `pnpm test` or equivalent (if tests exist)
+4. E2E tests pass: `pnpm test:e2e` or equivalent (if E2E exists)
+5. Code compiles/builds successfully
+
+**Checkpoint Task Format:**
+```markdown
+- [ ] X.Y [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes were needed)
+```
+
+**Rationale:**
+- Catch type errors, lint issues, and regressions early
+- Prevent accumulation of technical debt
+- Ensure each batch of work maintains code quality
+- Make debugging easier by limiting scope of potential issues
+</mandatory>
+
+## [VERIFY] Task Format
+
+<mandatory>
+Replace generic "Quality Checkpoint" tasks with [VERIFY] tagged tasks:
 
-**VF Task for Fix Goals**: When .progress.md contains `## Reality Check (BEFORE)`, add VF verification task at end of Phase 4. See phase-rules skill for details.
+**Standard [VERIFY] checkpoint** (every 2-3 tasks):
+```markdown
+- [ ] V1 [VERIFY] Quality check: <discovered lint cmd> && <discovered typecheck cmd>
+  - **Do**: Run quality commands and verify all pass
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors
+  - **Commit**: `chore(scope): pass quality checkpoint` (if fixes needed)
+```
 
-## Quality Checkpoints
+**Final verification sequence** (last 3 tasks of spec):
+```markdown
+- [ ] V4 [VERIFY] Full local CI: <lint> && <typecheck> && <test> && <e2e> && <build>
+  - **Do**: Run complete local CI suite including E2E
+  - **Verify**: All commands pass
+  - **Done when**: Build succeeds, all tests pass, E2E green
+  - **Commit**: `chore(scope): pass local CI` (if fixes needed)
+
+- [ ] V5 [VERIFY] CI pipeline passes
+  - **Do**: Verify GitHub Actions/CI passes after push
+  - **Verify**: `gh pr checks` shows all green
+  - **Done when**: CI pipeline passes
+  - **Commit**: None
+
+- [ ] V6 [VERIFY] AC checklist
+  - **Do**: Read requirements.md, programmatically verify each AC-* is satisfied by checking code/tests/behavior
+  - **Verify**: Grep codebase for AC implementation, run relevant test commands
+  - **Done when**: All acceptance criteria confirmed met via automated checks
+  - **Commit**: None
+```
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/quality-checkpoints/SKILL.md`
-Insert [VERIFY] checkpoints throughout task list:
-- Every 2-3 tasks depending on complexity
-- Use actual commands from research.md (not assumed commands)
-- Final sequence: V4 (local CI), V5 (CI pipeline), V6 (AC checklist)
-</skill-reference>
+**Standard format**: All [VERIFY] tasks follow Do/Verify/Done when/Commit format like regular tasks.
 
-## Task Format
+**Discovery**: Read research.md for actual project commands. Do NOT assume `pnpm lint` or `npm test` exists.
+</mandatory>
+
+## Tasks Structure
 
-Each task follows this structure:
+Create tasks.md following this structure:
 
 ```markdown
-- [ ] X.Y [Task name]
+# Tasks: <Feature Name>
+
+## Phase 1: Make It Work (POC)
+
+Focus: Validate the idea works end-to-end. Skip tests, accept hardcoded values.
+
+- [ ] 1.1 [Specific task name]
   - **Do**: [Exact steps to implement]
   - **Files**: [Exact file paths to create/modify]
   - **Done when**: [Explicit success criteria]
-  - **Verify**: [Automated command]
-  - **Commit**: `type(scope): [description]`
-  - _Requirements: FR-X, AC-X.Y_
-  - _Design: Component/Section_
+  - **Verify**: [Automated command, e.g., `curl http://localhost:3000/api | jq .status`, `pnpm test`, browser automation]
+  - **Commit**: `feat(scope): [task description]`
+  - _Requirements: FR-1, AC-1.1_
+  - _Design: Component A_
+
+- [ ] 1.2 [Another task]
+  - **Do**: [Steps]
+  - **Files**: [Paths]
+  - **Done when**: [Criteria]
+  - **Verify**: [Command]
+  - **Commit**: `feat(scope): [description]`
+  - _Requirements: FR-2_
+  - _Design: Component B_
+
+- [ ] 1.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
+
+- [ ] 1.4 [Continue with more tasks...]
+  - **Do**: [Steps]
+  - **Files**: [Paths]
+  - **Done when**: [Criteria]
+  - **Verify**: [Command]
+  - **Commit**: `feat(scope): [description]`
+
+- [ ] 1.5 POC Checkpoint
+  - **Do**: Verify feature works end-to-end using automated tools (WebFetch, curl, browser automation, test runner)
+  - **Done when**: Feature can be demonstrated working via automated verification
+  - **Verify**: Run automated end-to-end verification (e.g., `curl API | jq`, browser automation script, or test command)
+  - **Commit**: `feat(scope): complete POC`
+
+## Phase 2: Refactoring
+
+After POC validated, clean up code.
+
+- [ ] 2.1 Extract and modularize
+  - **Do**: [Specific refactoring steps]
+  - **Files**: [Files to modify]
+  - **Done when**: Code follows project patterns
+  - **Verify**: `pnpm check-types` or equivalent passes
+  - **Commit**: `refactor(scope): extract [component]`
+  - _Design: Architecture section_
+
+- [ ] 2.2 Add error handling
+  - **Do**: Add try/catch, proper error messages
+  - **Done when**: All error paths handled
+  - **Verify**: Type check passes
+  - **Commit**: `refactor(scope): add error handling`
+  - _Design: Error Handling_
+
+- [ ] 2.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd> && <test cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors, tests pass
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
+
+## Phase 3: Testing
+
+- [ ] 3.1 Unit tests for [component]
+  - **Do**: Create test file at [path]
+  - **Files**: [test file path]
+  - **Done when**: Tests cover main functionality
+  - **Verify**: `pnpm test` or test command passes
+  - **Commit**: `test(scope): add unit tests for [component]`
+  - _Requirements: AC-1.1, AC-1.2_
+  - _Design: Test Strategy_
+
+- [ ] 3.2 Integration tests
+  - **Do**: Create integration test at [path]
+  - **Files**: [test file path]
+  - **Done when**: Integration points tested
+  - **Verify**: Test command passes
+  - **Commit**: `test(scope): add integration tests`
+  - _Design: Test Strategy_
+
+- [ ] 3.3 [VERIFY] Quality checkpoint: <lint cmd> && <typecheck cmd> && <test cmd>
+  - **Do**: Run quality commands discovered from research.md
+  - **Verify**: All commands exit 0
+  - **Done when**: No lint errors, no type errors, tests pass
+  - **Commit**: `chore(scope): pass quality checkpoint` (only if fixes needed)
+
+- [ ] 3.4 E2E tests (if UI)
+  - **Do**: Create E2E test at [path]
+  - **Files**: [test file path]
+  - **Done when**: User flow tested
+  - **Verify**: E2E test command passes
+  - **Commit**: `test(scope): add e2e tests`
+  - _Requirements: US-1_
+
+## Phase 4: Quality Gates
+
+<mandatory>
+NEVER push directly to the default branch (main/master). Always use feature branches and PRs.
+
+**NOTE**: Branch management is handled at startup (via `/ralph-specum:start`).
+You should already be on a feature branch by the time you reach Phase 4.
+
+If for some reason you're still on the default branch:
+1. STOP and alert the user - this should not happen
+2. The user needs to run `/ralph-specum:start` properly first
+
+**Default Deliverable**: Pull request with ALL completion criteria met:
+- Zero test regressions
+- Code is modular/reusable
+- CI checks green
+- Review comments addressed
+
+Phase 4 transitions into Phase 5 (PR Lifecycle) for continuous validation.
+</mandatory>
+
+- [ ] 4.1 Local quality check
+  - **Do**: Run ALL quality checks locally
+  - **Verify**: All commands must pass:
+    - Type check: `pnpm check-types` or equivalent
+    - Lint: `pnpm lint` or equivalent
+    - Tests: `pnpm test` or equivalent
+  - **Done when**: All commands pass with no errors
+  - **Commit**: `fix(scope): address lint/type issues` (if fixes needed)
+
+- [ ] 4.2 Create PR and verify CI
+  - **Do**:
+    1. Verify current branch is a feature branch: `git branch --show-current`
+    2. If on default branch, STOP and alert user (should not happen - branch is set at startup)
+    3. Push branch: `git push -u origin <branch-name>`
+    4. Create PR using gh CLI: `gh pr create --title "<title>" --body "<summary>"`
+    5. If gh CLI unavailable, provide URL for manual PR creation
+  - **Verify**: Use gh CLI to verify CI:
+    - `gh pr checks --watch` (wait for CI completion)
+    - Or `gh pr checks` (poll current status)
+    - All checks must show ✓ (passing)
+  - **Done when**: All CI checks green, PR ready for review
+  - **If CI fails**:
+    1. Read failure details: `gh pr checks`
+    2. Fix issues locally
+    3. Push fixes: `git push`
+    4. Re-verify: `gh pr checks --watch`
+
+## Phase 5: PR Lifecycle
+
+<mandatory>
+**ALWAYS generate Phase 5 tasks.** This phase handles continuous PR validation:
+- PR creation
+- CI monitoring and fixing
+- Code review comment resolution
+- Final validation (zero regressions, modularity, real-world verification)
+
+Phase 5 runs autonomously until ALL completion criteria met. The spec is NOT done when Phase 4 completes.
+
+Use the template from `templates/tasks.md` Phase 5 section. Adapt commands to the actual project (discovered from research.md).
+</mandatory>
+
+## Notes
+
+- **POC shortcuts taken**: [list hardcoded values, skipped validations]
+- **Production TODOs**: [what needs proper implementation in Phase 2]
 ```
 
 ## Task Requirements
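
The task-header format used by the tasks.md template above (`- [ ] X.Y [MARKER] name`, with indented `**Do**`/`**Verify**` sub-bullets) could be read back roughly as follows. This is a minimal sketch and not part of the plugin: the regular expression, the `TaskHeader` class, and `parse_task_headers` are hypothetical names chosen for illustration, and only the `[VERIFY]` marker shown in the template is recognized.

```python
import re
from dataclasses import dataclass

# Assumed shape of a task header line, e.g. "- [ ] 1.3 [VERIFY] Quality checkpoint".
# Indented sub-bullets ("  - **Do**: ...") do not match because of the "^- " anchor.
TASK_LINE = re.compile(r"^- \[(?P<mark>[ x])\] (?P<id>\d+\.\d+) (?P<title>.+)$")


@dataclass
class TaskHeader:
    task_id: str     # "X.Y" identifier from the header line
    done: bool       # True when the checkbox is "[x]"
    is_verify: bool  # True when the title starts with the [VERIFY] marker
    title: str       # remainder of the line after the identifier


def parse_task_headers(markdown: str) -> list[TaskHeader]:
    """Return one TaskHeader per top-level task line, in document order."""
    headers = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match is None:
            continue  # headings, sub-bullets, blank lines
        title = match.group("title")
        headers.append(
            TaskHeader(
                task_id=match.group("id"),
                done=match.group("mark") == "x",
                is_verify=title.startswith("[VERIFY]"),
                title=title,
            )
        )
    return headers
```

Counting the entries with `done` set gives the same number that a checkmark count over tasks.md would produce, without matching on sub-bullets.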
diff --git a/plugins/ralph-specum/commands/implement.md b/plugins/ralph-specum/commands/implement.md
index eb6736b7..66eff573 100644
--- a/plugins/ralph-specum/commands/implement.md
+++ b/plugins/ralph-specum/commands/implement.md
@@ -86,23 +86,66 @@ Write this prompt to `./specs/$spec/.coordinator-prompt.md`:
 ```text
 You are the execution COORDINATOR for spec: $spec
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/coordinator-pattern/SKILL.md`
+### 1. Role Definition
 
-Use this skill for:
-- Role definition (coordinator vs implementer)
-- State reading from .ralph-state.json
-- Task delegation via Task tool
-- Completion checking and signaling
-- State updates after task completion
-- Retry handling logic
-- Parallel execution patterns
-</skill-reference>
+You are a COORDINATOR, NOT an implementer. Your job is to:
+- Read state and determine current task
+- Delegate task execution to spec-executor via Task tool
+- Track completion and signal when all tasks done
 
-### Task Parsing
+CRITICAL: You MUST delegate via Task tool. Do NOT implement tasks yourself.
+You are fully autonomous. NEVER ask questions or wait for user input.
+
+### 2. Read State
+
+Read `./specs/$spec/.ralph-state.json` to get current state:
+
+```json
+{
+  "phase": "execution",
+  "taskIndex": <current task index, 0-based>,
+  "totalTasks": <total task count>,
+  "taskIteration": <retry count for current task>,
+  "maxTaskIterations": <max retries>
+}
+```
+
+**ERROR: Missing/Corrupt State File**
+
+If state file missing or corrupt (invalid JSON, missing required fields):
+1. Output error: "ERROR: State file missing or corrupt at ./specs/$spec/.ralph-state.json"
+2. Suggest: "Run /ralph-specum:implement to reinitialize execution state"
+3. Do NOT continue execution
+4. Do NOT output ALL_TASKS_COMPLETE
+
+### 3. Check Completion
+
+If taskIndex >= totalTasks:
+1. Verify all tasks marked [x] in tasks.md
+2. Delete .ralph-state.json (cleanup)
+3. Output: ALL_TASKS_COMPLETE
+4. STOP - do not delegate any task
+
+### 4. Parse Current Task
 
 Read `./specs/$spec/tasks.md` and find the task at taskIndex (0-based).
 
+**ERROR: Missing tasks.md**
+
+If tasks.md does not exist:
+1. Output error: "ERROR: Tasks file missing at ./specs/$spec/tasks.md"
+2. Suggest: "Run /ralph-specum:tasks to generate task list"
+3. Do NOT continue execution
+4. Do NOT output ALL_TASKS_COMPLETE
+
+**ERROR: Missing Spec Directory**
+
+If spec directory does not exist (./specs/$spec/):
+1. Output error: "ERROR: Spec directory missing at ./specs/$spec/"
+2. Suggest: "Run /ralph-specum:new <spec-name> to create a new spec"
+3. Do NOT continue execution
+4. Do NOT output ALL_TASKS_COMPLETE
+
 Tasks follow this format:
 ```markdown
 - [ ] X.Y Task description
@@ -120,7 +163,38 @@ Detect markers in task description:
 - [VERIFY] = verification task (delegate to qa-engineer)
 - No marker = sequential task
 
-### [VERIFY] Task Detection
+### 5. Parallel Group Detection
+
+If current task has [P] marker, scan for consecutive [P] tasks starting from taskIndex.
+
+Build parallelGroup structure:
+```json
+{
+  "startIndex": <first [P] task index>,
+  "endIndex": <last consecutive [P] task index>,
+  "taskIndices": [startIndex, startIndex+1, ..., endIndex],
+  "isParallel": true
+}
+```
+
+Rules:
+- Adjacent [P] tasks form a single parallel batch
+- Non-[P] task breaks the sequence
+- Single [P] task treated as sequential (no parallelism benefit)
+
+If no [P] marker on current task, set:
+```json
+{
+  "startIndex": <taskIndex>,
+  "endIndex": <taskIndex>,
+  "taskIndices": [taskIndex],
+  "isParallel": false
+}
+```
+
+### 6. Task Delegation
+
+**[VERIFY] Task Detection**:
 
 Before standard delegation, check if current task has [VERIFY] marker.
 Look for `[VERIFY]` in task description line (e.g., `- [ ] 1.4 [VERIFY] Quality checkpoint`).
@@ -130,35 +204,286 @@ If [VERIFY] marker present:
 2. Delegate to qa-engineer via Task tool instead
 3. [VERIFY] tasks are ALWAYS sequential (break parallel groups)
 
-### Failure Handling
+Delegate [VERIFY] task to qa-engineer:
+```text
+Task: Execute verification task $taskIndex for spec $spec
+
+Spec: $spec
+Path: ./specs/$spec/
+
+Task: [Full task description]
+
+Task Body:
+[Include Do, Verify, Done when sections]
+
+Instructions:
+1. Execute the verification as specified
+2. If issues found, attempt to fix them
+3. Output VERIFICATION_PASS if verification succeeds
+4. Output VERIFICATION_FAIL if verification fails and cannot be fixed
+```
+
+Handle qa-engineer response:
+- VERIFICATION_PASS: Treat as TASK_COMPLETE, mark task [x], update .progress.md
+- VERIFICATION_FAIL: Do NOT mark complete, increment taskIteration, retry or error if max reached
+
+**Sequential Execution** (parallelGroup.isParallel = false, no [VERIFY]):
+
+Delegate ONE task to spec-executor via Task tool:
+
+```text
+Task: Execute task $taskIndex for spec $spec
+
+Spec: $spec
+Path: ./specs/$spec/
+Task index: $taskIndex
+
+Context from .progress.md:
+[Include relevant context]
+
+Current task from tasks.md:
+[Include full task block]
+
+Instructions:
+1. Read Do section and execute exactly
+2. Only modify Files listed
+3. Verify completion with Verify command
+4. Commit with task's Commit message
+5. Update .progress.md with completion and learnings
+6. Mark task [x] in tasks.md
+7. Output TASK_COMPLETE when done
+```
+
+Wait for spec-executor to complete. It will output TASK_COMPLETE on success.
+
+**Parallel Execution** (parallelGroup.isParallel = true):
+
+CRITICAL: Spawn MULTIPLE Task tool calls in ONE message. This enables true parallelism.
+
+For each task index in parallelGroup.taskIndices, create a Task tool call with:
+- Unique progressFile: `.progress-task-$taskIndex.md`
+- Full task block from tasks.md
+- Same instructions as sequential but writing to temp progress file
+
+Example for parallel batch of tasks 3, 4, 5:
+```text
+[Task tool call 1]
+Task: Execute task 3 for spec $spec
+progressFile: .progress-task-3.md
+...
+
+[Task tool call 2]
+Task: Execute task 4 for spec $spec
+progressFile: .progress-task-4.md
+...
+
+[Task tool call 3]
+Task: Execute task 5 for spec $spec
+progressFile: .progress-task-5.md
+...
+```
+
+All parallel tasks execute simultaneously. Wait for ALL to complete.
+
+**After Delegation**:
+
+If spec-executor outputs TASK_COMPLETE (or qa-engineer outputs VERIFICATION_PASS):
+1. Run verification layers (section 7) before advancing
+2. If all verifications pass, proceed to state update
+
+If no completion signal:
+1. First, parse the failure output (section 6b)
+2. Increment taskIteration in state file
+3. If taskIteration > maxTaskIterations: proceed to max retries error handling
+4. Otherwise: Retry the same task
+
+### 6b. Parse Failure Output
+
+When spec-executor does not output TASK_COMPLETE, parse the failure output to extract error details.
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/failure-recovery/SKILL.md`
+**Failure Output Pattern**:
+Spec-executor outputs failures in this format:
+```text
+Task X.Y: [task name] FAILED
+- Error: [description]
+- Attempted fix: [what was tried]
+- Status: Blocked, needs manual intervention
+```
+
+**Parsing Logic**:
+
+1. **Check for FAILED marker**:
+   - Look for pattern: `Task \d+\.\d+:.*FAILED`
+   - If found, proceed to extract details
+   - If not found, use generic failure: "Task did not complete"
+
+2. **Extract Error Details**:
+   - Match `- Error: (.*)` to get error description
+   - Match `- Attempted fix: (.*)` to get fix attempt details
+   - Match `- Status: (.*)` to get status message
+
+3. **Build Failure Object**:
+   ```json
+   {
+     "taskId": "<X.Y from match>",
+     "failed": true,
+     "error": "<extracted from Error: line>",
+     "attemptedFix": "<extracted from Attempted fix: line>",
+     "status": "<extracted from Status: line>",
+     "rawOutput": "<full spec-executor output for context>"
+   }
+   ```
+
+4. **Handle Missing Fields**:
+   - If Error: line missing, use "Task execution failed"
+   - If Attempted fix: line missing, use "No fix attempted"
+   - If Status: line missing, use "Unknown status"
+
+### 6c. Fix Task Generator (Recovery Mode Only)
 
-Use this skill when spec-executor does NOT output TASK_COMPLETE:
-- Parse failure output to extract error details
-- Check recoveryMode state (defaults to false)
-- Generate fix tasks when recovery mode enabled
-- Insert fix tasks into tasks.md
-- Track fix attempts in fixTaskMap
-- Orchestrate the iterative recovery loop
-</skill-reference>
+When recoveryMode is enabled and a task fails, generate a fix task from the failure details.
 
-### Verification Before Advancing
+**Check Recovery Mode**:
+
+First, verify recovery mode is enabled:
+1. Read `recoveryMode` from .ralph-state.json
+2. If `recoveryMode` is false or missing, skip to "ERROR: Max Retries Reached"
+3. If `recoveryMode` is true, proceed with fix task generation
+
+**Check Fix Task Limits**:
+
+Before generating a fix task:
+1. Read `fixTaskMap` from .ralph-state.json
+2. Check if `fixTaskMap[taskId].attempts >= maxFixTasksPerOriginal`
+3. If limit reached: output error and STOP
+
+**Generate Fix Task Markdown**:
+
+```text
+- [ ] $taskId.$attemptNumber [FIX $taskId] Fix: $errorSummary
+  - **Do**: Address the error: $failure.error
+    1. Analyze the failure: $failure.attemptedFix
+    2. Review related code in Files list
+    3. Implement fix for: $failure.error
+  - **Files**: $originalTask.files
+  - **Done when**: Error "$failure.error" no longer occurs
+  - **Verify**: $originalTask.verify
+  - **Commit**: `fix($scope): address $errorType from task $taskId`
+```
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/verification-layers/SKILL.md`
+**Insert Fix Task into tasks.md** using Edit tool, immediately after the original task block.
 
-Run 4-layer verification BEFORE advancing taskIndex:
-1. Contradiction detection - no "requires manual" + TASK_COMPLETE
-2. Uncommitted spec files check - tasks.md and .progress.md committed
-3. Checkmark verification - count matches taskIndex + 1
-4. Completion signal verification - explicit TASK_COMPLETE present
+**Update State** after fix task generation:
+- Increment `fixTaskMap[taskId].attempts`
+- Add fix task ID to `fixTaskMap[taskId].fixTaskIds`
+- Increment `totalTasks`
 
-All layers must pass before advancing state.
-</skill-reference>
+**Execute Fix Task**, then retry original task.
 
-### Progress Merge (Parallel Only)
+### 6d. Iterative Failure Recovery Orchestrator
+
+This orchestrates the complete failure recovery loop when recoveryMode is enabled.
+
+**Backwards Compatibility**: recoveryMode defaults to false. When false/missing, existing retry-then-stop behavior is preserved.
+
+**Recovery Loop Flow**:
+1. Task fails -> Check recoveryMode
+2. Parse failure output (6b)
+3. Check fix limits (6c)
+4. Generate fix task (6c)
+5. Insert fix task into tasks.md
+6. Update state (fixTaskMap)
+7. Execute fix task
+8. If fix completes -> retry original task
+9. If fix fails -> loop back to step 2 (fix task can spawn its own fixes)
+
+**ERROR: Max Retries Reached**
+
+If taskIteration exceeds maxTaskIterations:
+1. Output error: "ERROR: Max retries reached for task $taskIndex after $maxTaskIterations attempts"
+2. Include last error/failure reason from spec-executor output
+3. Suggest: "Review .progress.md Learnings section for failure details"
+4. Suggest: "Fix the issue manually then run /ralph-specum:implement to resume"
+5. Do NOT continue execution
+6. Do NOT output ALL_TASKS_COMPLETE
+
+### 7. Verification Layers
+
+CRITICAL: Run these 4 verifications BEFORE advancing taskIndex. All must pass.
+
+**Layer 1: CONTRADICTION Detection**
+
+Check spec-executor output for contradiction patterns:
+- "requires manual"
+- "cannot be automated"
+- "could not complete"
+- "needs human"
+- "manual intervention"
+
+If TASK_COMPLETE appears alongside any contradiction phrase:
+- REJECT the completion
+- Log: "CONTRADICTION: claimed completion while admitting failure"
+- Increment taskIteration and retry
+
+**Layer 2: Uncommitted Spec Files Check**
+
+Before advancing, verify spec files are committed:
+
+```bash
+git status --porcelain ./specs/$spec/tasks.md ./specs/$spec/.progress.md
+```
+
+If output is non-empty (uncommitted changes):
+- REJECT the completion
+- Log: "uncommitted spec files detected - task not properly committed"
+- Increment taskIteration and retry
+
+**Layer 3: Checkmark Verification**
+
+Count completed tasks in tasks.md:
+
+```bash
+grep -c '\- \[x\]' ./specs/$spec/tasks.md
+```
+
+Expected checkmark count = taskIndex + 1
+
+If actual count != expected:
+- REJECT the completion
+- Log: "checkmark mismatch: expected $expected, found $actual"
+- Increment taskIteration and retry
+
+**Layer 4: TASK_COMPLETE Signal Verification**
+
+Verify spec-executor explicitly output TASK_COMPLETE:
+- Must be present in response
+- Not just implied or partial completion
+
+If TASK_COMPLETE missing:
+- Do NOT advance
+- Increment taskIteration and retry
+
+**All 4 layers must pass before proceeding to State Update.**
+
+### 8. State Update
+
+After successful completion (TASK_COMPLETE for sequential or all parallel tasks complete):
+
+**Sequential Update**:
+1. Increment taskIndex by 1
+2. Reset taskIteration to 1
+3. Write updated state
+
+**Parallel Batch Update**:
+1. Set taskIndex to parallelGroup.endIndex + 1
+2. Reset taskIteration to 1
+3. Write updated state
+
+Check if all tasks complete:
+- If taskIndex >= totalTasks: proceed to section 10 (Completion Signal)
+- If taskIndex < totalTasks: continue to next iteration
+
+### 9. Progress Merge (Parallel Only)
 
 After parallel batch completes:
 1. Read each temp progress file (.progress-task-N.md)
@@ -166,13 +491,13 @@ After parallel batch completes:
 3. Append to main .progress.md in task index order
 4. Delete temp files after merge
 
-### Completion Signal
+### 10. Completion Signal
 
 **Phase 5 Detection**: Before outputting ALL_TASKS_COMPLETE, check if Phase 5 (PR Lifecycle) is required:
 
 1. Read tasks.md to detect Phase 5 tasks (look for "Phase 5: PR Lifecycle" section)
 2. If Phase 5 exists AND taskIndex >= totalTasks:
-   - Enter PR Lifecycle Loop
+   - Enter PR Lifecycle Loop (section 11)
    - Do NOT output ALL_TASKS_COMPLETE yet
 3. If NO Phase 5 OR Phase 5 complete:
    - Proceed with standard completion
@@ -196,7 +521,7 @@ This signal terminates the Ralph Loop loop.
 Do NOT output ALL_TASKS_COMPLETE if tasks remain incomplete.
 Do NOT output TASK_COMPLETE (that's for spec-executor only).
 
-### PR Lifecycle Loop (Phase 5)
+### 11. PR Lifecycle Loop (Phase 5)
 
 CRITICAL: Phase 5 is continuous autonomous PR management. Do NOT stop until all criteria met.
 
@@ -204,17 +529,48 @@ CRITICAL: Phase 5 is continuous autonomous PR management. Do NOT stop until all
 - All Phase 1-4 tasks complete
 - Phase 5 tasks detected in tasks.md
 
-**Loop Steps**:
-1. Create PR (if not exists): `gh pr create --title "feat: <spec>" --body "<summary>"`
-2. CI Monitoring: Wait 3 min, check `gh pr checks`, fix failures, repeat
-3. Review Comments: Check `gh pr view --json reviews`, address feedback
-4. Final Validation: All tasks [x], CI green, no unresolved reviews
-5. Completion: Delete state, output ALL_TASKS_COMPLETE with PR link
+**Loop Structure**:
+PR Creation -> CI Monitoring -> Review Check -> Fix Issues -> Push -> Repeat
+
+**Step 1: Create PR (if not exists)**
+
+Delegate to spec-executor to create PR using `gh pr create`.
+
+**Step 2: CI Monitoring Loop**
+
+While CI checks not all green:
+1. Wait 3 minutes
+2. Check status: `gh pr checks`
+3. If failures: create fix task, delegate, push fixes, restart wait
+4. If pending: continue waiting
+5. If all green: proceed to Step 3
+
+**Step 3: Review Comment Check**
+
+1. Fetch review states: `gh pr view --json reviews`
+2. If unresolved reviews/comments: create tasks, delegate, push, return to Step 2
+3. If no unresolved: proceed to Step 4
+
+**Step 4: Final Validation**
+
+All must be true:
+- All Phase 1-4 tasks complete
+- All Phase 5 tasks complete
+- CI checks all green
+- No unresolved review comments
+
+**Step 5: Completion**
+
+When all criteria met:
+1. Delete .ralph-state.json
+2. Get PR URL
+3. Output: ALL_TASKS_COMPLETE
+4. Output: PR link
 
 **Timeout Protection**:
 - Max 48 hours in PR Lifecycle Loop
 - Max 20 CI monitoring cycles
-- If exceeded: Output error and STOP (do not output ALL_TASKS_COMPLETE)
+- If exceeded: Output error and STOP
 ```
 
 ## Output on Start
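
The parallel-group rules (section 5) and the failure-output format (section 6b) of the coordinator prompt above can be sketched in Python. This is an illustration under stated assumptions, not code from the plugin: `build_parallel_group` and `parse_failure` are hypothetical helpers, task lines are assumed to be plain strings carrying `[P]`/`[VERIFY]` markers, and the dictionary keys mirror the JSON shapes shown in the prompt.

```python
import re


def build_parallel_group(task_lines: list[str], task_index: int) -> dict:
    """Group consecutive [P] tasks starting at task_index (section 5 rules)."""

    def is_parallel(line: str) -> bool:
        # [VERIFY] tasks are always sequential, so they break a parallel run.
        return "[P]" in line and "[VERIFY]" not in line

    end = task_index
    if is_parallel(task_lines[task_index]):
        while end + 1 < len(task_lines) and is_parallel(task_lines[end + 1]):
            end += 1
    indices = list(range(task_index, end + 1))
    return {
        "startIndex": task_index,
        "endIndex": end,
        "taskIndices": indices,
        # A single [P] task is treated as sequential.
        "isParallel": len(indices) > 1,
    }


FAILED_MARKER = re.compile(r"Task (\d+\.\d+):.*FAILED")


def parse_failure(output: str) -> dict:
    """Extract the failure object described in section 6b from executor output."""

    def field(label: str, default: str) -> str:
        match = re.search(rf"^- {re.escape(label)}: (.*)$", output, re.MULTILINE)
        return match.group(1).strip() if match else default

    header = FAILED_MARKER.search(output)
    if header is None:
        # No FAILED marker: fall back to the generic failure message.
        error = "Task did not complete"
    else:
        error = field("Error", "Task execution failed")
    return {
        "taskId": header.group(1) if header else None,
        "failed": True,
        "error": error,
        "attemptedFix": field("Attempted fix", "No fix attempted"),
        "status": field("Status", "Unknown status"),
        "rawOutput": output,
    }
```

Keeping both helpers free of file I/O makes them easy to check against the rules in the prompt before wiring them to tasks.md or `.ralph-state.json`.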
diff --git a/plugins/ralph-specum/commands/research.md b/plugins/ralph-specum/commands/research.md
index 139c7de5..d77c7382 100644
--- a/plugins/ralph-specum/commands/research.md
+++ b/plugins/ralph-specum/commands/research.md
@@ -17,6 +17,18 @@ You MUST delegate ALL research work to subagents:
 
 Do NOT perform web searches, codebase analysis, or write research.md yourself.
 
+**PARALLEL EXECUTION IS MANDATORY - ALWAYS.**
+- Minimum: 2 agents (1 research-analyst + 1 Explore)
+- Standard: 3-4 agents (2-3 research-analyst + 1-2 Explore)
+- Complex: 5+ agents (3-4 research-analyst for different topics + 2-3 Explore)
+- **ALL agent Task calls MUST be in ONE message** (not sequential messages)
+
+**CRITICAL: You can and SHOULD spawn MULTIPLE research-analyst agents in parallel.**
+- Each research-analyst should focus on a distinct research topic
+- Example: GraphQL API + Caching strategies = 2 research-analyst agents in parallel
+- Example: Auth patterns + Security best practices + API design = 3 research-analyst agents in parallel
+- DO NOT limit yourself to just one research-analyst agent
+
 Failure to spawn multiple agents in parallel violates the core design of this command.
 </mandatory>
 
@@ -32,114 +44,603 @@ Failure to spawn multiple agents in parallel violates the core design of this co
 2. Read `.ralph-state.json` if it exists
 3. Read `.progress.md` to understand the goal
 
-## Interview (Skip if --quick)
+## Analyze Research Topics
+
+<mandatory>
+**BEFORE invoking any subagents, analyze the goal and identify distinct research topics.**
+
+Break down the goal into independent research areas that can be explored in parallel. Consider:
+- **External/Best Practices**: Industry standards, patterns, libraries to research online → `research-analyst`
+- **Codebase Analysis**: Existing implementations, patterns, constraints → `Explore` (fast, read-only)
+- **Related Specs**: Other specs in ./specs/ that may overlap → `Explore` (fast, read-only)
+- **Domain-Specific**: Specialized topics needing focused research → `research-analyst` for web, `Explore` for code
+- **Quality Commands**: Project lint/test/build commands discovery → `Explore` (fast, read-only)
+</mandatory>
+
+### Subagent Selection Guide
+
+| Task Type | Subagent | Reason |
+|-----------|----------|--------|
+| Web search for best practices | `research-analyst` | Needs WebSearch/WebFetch tools |
+| Library/API documentation | `research-analyst` | Needs web access |
+| Codebase pattern analysis | `Explore` | Fast, read-only, optimized for code |
+| Related specs discovery | `Explore` | Fast scanning of ./specs/ |
+| Quality commands discovery | `Explore` | Fast package.json/Makefile analysis |
+| File structure exploration | `Explore` | Fast, uses Haiku model |
+| Cross-referencing (code vs docs) | Both in parallel | Divide by source type |
+
+### Topic Splitting Guidelines
+
+| Scenario | Recommendation |
+|----------|----------------|
+| Simple, focused goal | 2 agents minimum: 1 research-analyst (web) + 1 Explore (codebase) |
+| Goal spans multiple domains | 3-5 agents: 2-3 research-analyst (different topics) + 1-2 Explore |
+| Goal involves external APIs + codebase | 2+ research-analyst for API docs/best practices + 1+ Explore for codebase |
+| Goal touches multiple components | Multiple Explore agents (one per component) + multiple research-analyst (one per external topic) |
+| Complex architecture question | 5+ agents: 3-4 research-analyst (different external topics) + 2-3 Explore (different code areas) |
+
+**Benefits of parallel execution:**
+- 3-5 agents in parallel = research takes roughly as long as the slowest agent, not the sum of all agents
+- Explore agents use Haiku model = very fast codebase analysis
+- Each agent has focused context = better depth
+- Results synthesized for comprehensive coverage
+
+**When NOT to split:**
+- Topics are tightly coupled and depend on each other
+- Splitting would create redundant searches
+
+## Interview
+
+<mandatory>
+**Skip interview if --quick flag detected in $ARGUMENTS.**
+
+If NOT quick mode, conduct the interview using AskUserQuestion before delegating to subagents.
+</mandatory>
+
+### Quick Mode Check
+
+Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Execute Research".
+
+### Read Context from .progress.md
+
+Before conducting the interview, read `.progress.md` to get:
+1. **Intent Classification** from start.md (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
+2. **Prior interview responses** to enable parameter chain (skip already-answered questions)
+
+```text
+Context Reading:
+1. Read ./specs/$spec/.progress.md
+2. Parse "## Intent Classification" section for intent type and question counts
+3. Parse "## Interview Responses" section for prior answers
+4. Store parsed data for parameter chain checks
+```
+
+**Intent-Based Question Counts (same as start.md):**
+- TRIVIAL: 1-2 questions (minimal technical context needed)
+- REFACTOR: 3-5 questions (understand approach and risks)
+- GREENFIELD: 5-10 questions (full technical context)
+- MID_SIZED: 3-7 questions (balanced approach)
+
+### Research Interview (Single-Question Flow)
+
+**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/interview-framework/SKILL.md`
-Use the interview framework for single-question adaptive interview loop.
+### Phase-Specific Configuration
 
-**Phase-Specific Configuration:**
 - **Phase**: Research Interview
 - **Parameter Chain Mappings**: technicalApproach, knownConstraints, integrationPoints
 - **Available Variables**: `{goal}`, `{intent}`, `{problem}`, `{constraints}`
+- **Variables Not Yet Available**: `{users}`, `{priority}` (populated in later phases)
 - **Storage Section**: `### Research Interview (from research.md)`
 
-**Question Pool:**
-| # | Question | Required | Key |
-|---|----------|----------|-----|
-| 1 | What technical approach do you prefer? | Required | `technicalApproach` |
-| 2 | Are there any known constraints or limitations? | Required | `knownConstraints` |
-| 3 | Are there specific integration points to consider? | Required | `integrationPoints` |
-| 4 | Any other technical context? (or say 'done') | Optional | `additionalTechContext` |
+### Research Interview Question Pool
+
+| # | Question | Required | Key | Options |
+|---|----------|----------|-----|---------|
+| 1 | What technical approach do you prefer for this feature? | Required | `technicalApproach` | Follow existing patterns in codebase (Recommended) / Introduce new patterns/frameworks / Hybrid - keep existing where possible / Other |
+| 2 | Are there any known constraints or limitations? | Required | `knownConstraints` | No known constraints / Must work with existing API / Performance critical / Other |
+| 3 | Are there specific integration points to consider? | Required | `integrationPoints` | Standard integration with existing services / New external dependencies required / Isolated component (minimal integration) / Other |
+| 4 | Any other technical context for research? (or say 'done' to proceed) | Optional | `additionalTechContext` | No, let's proceed / Yes, I have more details / Other |
+
+### Store Research Interview Responses
+
+After interview, append to `.progress.md` under the "Interview Responses" section:
+
+```markdown
+### Research Interview (from research.md)
+- Technical approach: [responses.technicalApproach]
+- Known constraints: [responses.knownConstraints]
+- Integration points: [responses.integrationPoints]
+- Additional technical context: [responses.additionalTechContext]
+[Any follow-up responses from "Other" selections]
+```
+
+### Interview Context Format
+
+Pass the combined context (prior + new responses) to the Task delegation prompt:
 
-Store responses in `.progress.md` under `### Research Interview (from research.md)`.
-</skill-reference>
+```text
+Interview Context:
+- Technical approach: [Answer]
+- Known constraints: [Answer]
+- Integration points: [Answer]
+- Follow-up details: [Any additional clarifications]
+```
 
-## Execute Research (Parallel)
+Store this context to include in the Task delegation prompt.
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/parallel-research/SKILL.md`
-Use the parallel research pattern to spawn multiple subagents for comprehensive research.
+## Execute Research
 
+<mandatory>
 **PARALLEL EXECUTION IS MANDATORY - NO EXCEPTIONS**
 
-Minimum: 2 agents (1 research-analyst + 1 Explore)
-Standard: 3-4 agents (2-3 research-analyst + 1-2 Explore)
-Complex: 5+ agents (3-4 research-analyst + 2-3 Explore)
+You MUST follow this algorithm:
 
-**ALL agent Task calls MUST be in ONE message** to achieve true parallelism.
-</skill-reference>
+### Step 1: Identify Research Topics (REQUIRED)
 
-### Research Topics to Cover
+Analyze the goal and list AT LEAST 2 distinct research topics. Output the list to the user:
 
-1. **External Research** (research-analyst): Best practices, industry standards, libraries
-2. **Codebase Analysis** (Explore): Existing patterns, dependencies, constraints
-3. **Quality Commands** (Explore): lint, test, build, typecheck commands
-4. **Related Specs** (Explore): Other specs that may overlap
+```text
+Research topics identified for parallel execution:
+1. [Topic name] - [Agent type: research-analyst/Explore]
+2. [Topic name] - [Agent type: research-analyst/Explore]
+3. [Topic name] - [Agent type: research-analyst/Explore] (if applicable)
+...
+```
 
-### Output Files
+**Minimum requirement**: at least 2 topics
+- Topic 1: External/best practices (use research-analyst)
+- Topic 2: Codebase patterns (use Explore)
+- Additional topics: Domain-specific areas (spawn MULTIPLE research-analyst agents), quality commands (Explore), related specs (Explore)
 
-Each agent writes to a unique file:
-- `.research-[topic].md` (from research-analyst agents)
-- `.research-codebase.md` (from Explore)
-- `.research-quality.md` (from Explore)
-- `.research-related-specs.md` (from Explore)
+**IMPORTANT: Break external research into MULTIPLE research-analyst agents**
+- If the goal involves multiple external topics (e.g., "authentication + security"), spawn separate research-analyst agents for EACH topic
+- Example: "Add OAuth with rate limiting" → 3 research-analyst agents (OAuth patterns, rate limiting strategies, security best practices)
+- DO NOT combine multiple external topics into one research-analyst agent
 
-## Merge Results
+### Step 2: Spawn ALL Agents in ONE Message (REQUIRED)
 
-After ALL parallel subagent tasks complete, merge results into unified `./specs/$spec/research.md`:
+**CRITICAL**: You MUST include ALL Task tool calls in a SINGLE response message to ensure true parallel execution.
 
-```markdown
-# Research: $spec
+Use the appropriate subagent type for each topic:
+- `subagent_type: Explore` - For codebase analysis (fast, read-only, Haiku model)
+- `subagent_type: research-analyst` - For web research (needs WebSearch/WebFetch)
+
+**If you spawn agents one at a time (separate messages), they run sequentially - THIS IS WRONG.**
+**If you spawn all agents in one message (multiple Task calls), they run in parallel - THIS IS CORRECT.**
+
+### Pre-Execution Checklist (REQUIRED)
+
+Before spawning agents, verify you have:
+- [ ] Listed at least 2 distinct research topics
+- [ ] Assigned appropriate agent type (Explore or research-analyst) to each topic
+- [ ] Prepared unique output file path for each agent (.research-*.md)
+- [ ] Prepared all Task tool calls in your response (ready to send in ONE message)
+- [ ] NOT written any code or run any searches yourself (you are a coordinator, not a researcher)
+
+If all boxes are checked, proceed with Step 2 (spawn all agents in ONE message).
+</mandatory>
+
+### Fail-Safe: "But This Goal is Simple..."
+
+<mandatory>
+**Even trivial goals require parallel research.**
+
+If you think the goal is "too simple" for parallel research:
+- You're wrong - spawn at least 2 agents anyway
+- Minimum: 1 Explore (codebase) + 1 research-analyst (web)
+- Parallel execution is about SPEED, not complexity
+- 2 agents in parallel can finish up to ~2x faster than running them sequentially
+
+**There are ZERO exceptions to the parallel requirement.**
+</mandatory>
+
+### Minimum Parallel Pattern (Always Use)
+
+Even for simple goals, spawn at least 2 agents in parallel:
+
+```text
+Task 1 (research-analyst - web): Search for best practices
+Task 2 (Explore - codebase): Analyze existing patterns
+```
+
+**Example output before spawning:**
+```
+Research topics identified for parallel execution:
+1. External best practices - research-analyst
+2. Codebase analysis - Explore
+
+Now spawning 2 research agents in parallel...
+```
+
+### Multi-Topic Pattern (Common Case)
+
+For goals with multiple external topics, spawn MULTIPLE research-analyst agents:
+
+```text
+Task 1 (research-analyst): OAuth authentication patterns
+Task 2 (research-analyst): Rate limiting strategies
+Task 3 (research-analyst): Security best practices
+Task 4 (Explore): Existing auth implementation
+Task 5 (Explore): Quality commands discovery
+```
+
+**Example output before spawning:**
+```
+Research topics identified for parallel execution:
+1. OAuth patterns - research-analyst
+2. Rate limiting - research-analyst
+3. Security practices - research-analyst
+4. Existing auth code - Explore
+5. Quality commands - Explore
+
+Now spawning 5 research agents in parallel (3 research-analyst + 2 Explore)...
+```
+
+### Parallel Execution: Correct vs Incorrect
+
+**WRONG (Sequential)** - Each Task call in separate message:
+```
+Message 1: Task(subagent_type: research-analyst, topic: best practices)
+[wait for result]
+Message 2: Task(subagent_type: Explore, topic: codebase)
+[wait for result]
+```
+Result: Agents run one after another = SLOW
+
+**CORRECT (Parallel)** - All Task calls in ONE message:
+```
+Message 1:
+  Task(subagent_type: research-analyst, topic: best practices)
+  Task(subagent_type: Explore, topic: codebase)
+  Task(subagent_type: Explore, topic: quality commands)
+[all agents start simultaneously]
+```
+Result: Agents run at the same time = FAST (up to 2-3x faster)
+
+### Standard Parallel Pattern (Recommended)
+
+For most goals with diverse topics, spawn 3-4 agents in ONE message.
+
+**CRITICAL: If the goal involves multiple external topics, spawn MULTIPLE research-analyst agents (one per topic).**
+
+Example: "Add authentication with email notifications"
+- research-analyst #1: Authentication patterns
+- research-analyst #2: Email service best practices
+- Explore #1: Existing auth/email code
+- Explore #2: Quality commands
+
+**Task 1 - External Research Topic A (research-analyst #1):**
+```text
+subagent_type: research-analyst
+
+You are researching for spec: $spec
+Spec path: ./specs/$spec/
+Topic: [FIRST EXTERNAL TOPIC - e.g., Authentication patterns]
 
-## Executive Summary
-[Synthesize key findings - 2-3 sentences]
+Focus ONLY on web research for THIS specific topic:
+1. WebSearch for best practices, industry standards
+2. WebSearch for common pitfalls and gotchas
+3. Research relevant libraries/frameworks
+4. Document findings in ./specs/$spec/.research-[topic-name].md
 
-## External Research
-### Best Practices
-### Prior Art
-### Pitfalls to Avoid
+Do NOT explore codebase - Explore agents handle that in parallel.
+Do NOT research other topics - other research-analyst agents handle those.
+```
+
+**Task 2 - External Research Topic B (research-analyst #2):**
+```text
+subagent_type: research-analyst
+
+You are researching for spec: $spec
+Spec path: ./specs/$spec/
+Topic: [SECOND EXTERNAL TOPIC - e.g., Email service patterns]
+
+Focus ONLY on web research for THIS specific topic:
+1. WebSearch for best practices for this topic
+2. WebSearch for common pitfalls
+3. Research relevant libraries/tools
+4. Document findings in ./specs/$spec/.research-[topic-name].md
+
+Do NOT explore codebase - Explore agents handle that in parallel.
+Do NOT research other topics - other research-analyst agents handle those.
+```
+
+**Task 3 - Codebase Analysis (Explore - fast):**
+```text
+subagent_type: Explore
+thoroughness: very thorough
+
+Analyze codebase for spec: $spec
+Output file: ./specs/$spec/.research-codebase.md
+
+Tasks:
+1. Find existing patterns related to [goal]
+2. Identify dependencies and constraints
+3. Check for similar implementations
+4. Document architectural patterns used
+
+Write findings to the output file with sections:
+- Existing Patterns (with file paths)
+- Dependencies
+- Constraints
+- Recommendations
+```
+
+**Task 4 - Quality Commands Discovery (Explore - fast):**
+```text
+subagent_type: Explore
+thoroughness: quick
+
+Discover quality commands for spec: $spec
+Output file: ./specs/$spec/.research-quality.md
+
+Tasks:
+1. Read package.json scripts section
+2. Check for Makefile targets
+3. Scan .github/workflows/*.yml for CI commands
+4. Document lint, test, build, typecheck commands
+
+Write findings as table: | Type | Command | Source |
+```
+
+**Task 5 - Related Specs Discovery (Explore - fast):**
+```text
+subagent_type: Explore
+thoroughness: medium
+
+Scan related specs for: $spec
+Output file: ./specs/$spec/.research-related-specs.md
+
+Tasks:
+1. List all directories in ./specs/ (each is a spec)
+2. For each spec, read .progress.md for Original Goal
+3. Read research.md/requirements.md summaries if exist
+4. Identify overlaps, conflicts, specs needing updates
+
+Write findings as table: | Name | Relevance | Relationship | mayNeedUpdate |
+```
+
+### Complex Goal Pattern (5+ Agents)
+
+**Example: Goal involves "Add GraphQL API with caching"**
 
-## Codebase Analysis
-### Existing Patterns
-### Dependencies
-### Constraints
+**CRITICAL: This goal has TWO distinct external topics (GraphQL + Caching), so spawn TWO research-analyst agents (one per topic).**
 
-## Related Specs
-| Spec | Relevance | Relationship | May Need Update |
+Spawn 5 agents in ONE message (2 research-analyst + 3 Explore):
 
-## Quality Commands
-| Type | Command | Source |
+| Agent # | Type | Focus | Output File |
+|---------|------|-------|-------------|
+| 1 | research-analyst | GraphQL best practices (web) | .research-graphql.md |
+| 2 | research-analyst | Caching strategies (web) | .research-caching.md |
+| 3 | Explore | Existing API patterns (code) | .research-codebase.md |
+| 4 | Explore | Quality commands | .research-quality.md |
+| 5 | Explore | Related specs | .research-related-specs.md |
 
-## Feasibility Assessment
-| Aspect | Assessment | Notes |
+**Task 1 - GraphQL Best Practices (research-analyst):**
+```text
+subagent_type: research-analyst
 
-## Recommendations for Requirements
+Topic: GraphQL API best practices
+Output: ./specs/$spec/.research-graphql.md
 
-## Open Questions
+1. WebSearch: "GraphQL schema design best practices 2024"
+2. WebSearch: "GraphQL resolvers performance patterns"
+3. Research popular GraphQL libraries (Apollo, Yoga, etc.)
+4. Document best practices, patterns, pitfalls
+```
+
+**Task 2 - Caching Strategies (research-analyst):**
+```text
+subagent_type: research-analyst
+
+Topic: Caching strategies for GraphQL
+Output: ./specs/$spec/.research-caching.md
+
+1. WebSearch: "GraphQL caching strategies 2024"
+2. WebSearch: "DataLoader patterns best practices"
+3. Research cache invalidation approaches
+4. Document caching patterns and recommendations
+```
+
+**Task 3 - Codebase Analysis (Explore):**
+```text
+subagent_type: Explore
+thoroughness: very thorough
+
+Topic: Existing API and caching patterns in codebase
+Output: ./specs/$spec/.research-codebase.md
+
+1. Search for existing API implementations
+2. Find any caching code or patterns
+3. Identify relevant dependencies
+4. Document patterns with file paths
+```
+
+**Task 4 - Quality Commands (Explore):**
+```text
+subagent_type: Explore
+thoroughness: quick
 
-## Sources
+Topic: Quality commands discovery
+Output: ./specs/$spec/.research-quality.md
+
+1. Check package.json scripts
+2. Check Makefile if exists
+3. Check CI workflow commands
+4. Output as table: Type | Command | Source
+```
+
+**Task 5 - Related Specs (Explore):**
+```text
+subagent_type: Explore
+thoroughness: medium
+
+Topic: Related specs discovery
+Output: ./specs/$spec/.research-related-specs.md
+
+1. Scan ./specs/ for existing specs
+2. Read each spec's progress and requirements
+3. Identify overlaps with GraphQL/caching goal
+4. Output as table: Name | Relevance | Relationship | mayNeedUpdate
 ```
 
-Delete partial research files after successful merge:
-```bash
-rm ./specs/$spec/.research-*.md
+## Merge Results (After Parallel Research)
+
+<mandatory>
+After ALL parallel subagent tasks complete, YOU must merge results into a single research.md.
+</mandatory>
+
+### Merge Process
+
+1. **Read all partial research files** created by subagents:
+   - `.research-[topic-1].md`, `.research-[topic-2].md`, etc. (from multiple research-analyst agents)
+   - Example: `.research-graphql.md`, `.research-caching.md`, `.research-auth.md` (from research-analyst agents)
+   - `.research-codebase.md` (from Explore)
+   - `.research-quality.md` (from Explore)
+   - `.research-related-specs.md` (from Explore)
+
+2. **Create unified `./specs/$spec/research.md`** with standard structure:
+   ```markdown
+   # Research: $spec
+
+   ## Executive Summary
+   [Synthesize key findings from ALL agents (all research-analyst + all Explore) - 2-3 sentences]
+
+   ## External Research
+   [Merge from ALL .research-[topic].md files created by research-analyst agents]
+   ### Best Practices
+   [From all research-analyst agents]
+   ### Prior Art
+   [From all research-analyst agents]
+   ### Pitfalls to Avoid
+   [From all research-analyst agents]
+
+   ## Codebase Analysis
+   [From .research-codebase.md]
+   ### Existing Patterns
+   ### Dependencies
+   ### Constraints
+
+   ## Related Specs
+   [From .research-related-specs.md]
+   | Spec | Relevance | Relationship | May Need Update |
+
+   ## Quality Commands
+   [From .research-quality.md]
+   | Type | Command | Source |
+
+   ## Feasibility Assessment
+   [Synthesize from all sources]
+   | Aspect | Assessment | Notes |
+
+   ## Recommendations for Requirements
+   [Consolidated recommendations]
+
+   ## Open Questions
+   [Consolidated from all agents]
+
+   ## Sources
+   [All URLs and file paths from all agents]
+   ```
+
+3. **Delete partial research files** after successful merge:
+   ```bash
+   rm ./specs/$spec/.research-*.md
+   ```
+
+4. **Quality check**: Confirm that research.md contains no duplicated information and uses consistent formatting
+
+## Review & Feedback Loop
+
+<mandatory>
+**Skip review if --quick flag detected in $ARGUMENTS.**
+
+If NOT quick mode, conduct research review using AskUserQuestion after research is created.
+</mandatory>
+
+### Quick Mode Check
+
+Check if `--quick` appears anywhere in `$ARGUMENTS`. If present, skip directly to "Update State".
+
+### Research Review Questions
+
+After the subagents have produced their findings and you have merged them into research.md, ask the user to review it and provide feedback.
+
+**Review Question Flow:**
+
+1. **Read the generated research.md** to understand what was found
+2. **Ask initial review questions** to confirm the research meets their expectations:
+
+| # | Question | Key | Options |
+|---|----------|-----|---------|
+| 1 | Does the research cover all the areas you expected? | `researchCoverage` | Yes, comprehensive / Missing some areas / Need more depth / Other |
+| 2 | Are the findings and recommendations helpful? | `findingsQuality` | Yes, very helpful / Somewhat helpful / Need more details / Other |
+| 3 | Are there any specific areas you'd like researched further? | `additionalResearch` | No, looks complete / Yes, I have specific areas / Other |
+| 4 | Any other feedback on the research? (or say 'approved' to proceed) | `researchFeedback` | Approved, let's proceed / Yes, I have feedback / Other |
+
+### Store Research Review Responses
+
+After review questions, append to `.progress.md` under a new section:
+
+```markdown
+### Research Review (from research.md)
+- Research coverage: [responses.researchCoverage]
+- Findings quality: [responses.findingsQuality]
+- Additional research needed: [responses.additionalResearch]
+- Research feedback: [responses.researchFeedback]
+[Any follow-up responses from "Other" selections]
 ```
 
-## Review & Feedback Loop (Skip if --quick)
+### Update Research Based on Feedback
+
+<mandatory>
+If the user provided feedback requiring changes (any answer other than "Yes, comprehensive", "Yes, very helpful", "No, looks complete", or "Approved, let's proceed"), you MUST:
+
+1. Collect specific change requests from the user
+2. Invoke appropriate subagents again with additional research instructions
+3. Merge updated results
+4. Repeat the review questions after updates
+5. Continue loop until user approves
+</mandatory>
+
+**Update Flow:**
+
+If changes are needed:
+
+1. **Ask for specific changes:**
+   ```
+   What specific areas would you like researched further or what changes would you like to see?
+   ```
+
+2. **Invoke appropriate subagents with update prompt:**
+   - Use `research-analyst` for additional web research
+   - Use `Explore` for additional codebase analysis
+
+   Example prompt:
+   ```
+   You are conducting additional research for spec: $spec
+   Spec path: ./specs/$spec/
+
+   Current research: ./specs/$spec/research.md
+
+   User feedback:
+   $user_feedback
 
-After research is created, ask the user to review:
+   Your task:
+   1. Read the existing research.md
+   2. Understand what additional information is needed
+   3. Conduct focused research on the requested areas
+   4. Output to ./specs/$spec/.research-additional.md
+
+   Focus on addressing the specific gaps identified by the user.
+   ```
 
-| # | Question | Key |
-|---|----------|-----|
-| 1 | Does the research cover all expected areas? | `researchCoverage` |
-| 2 | Are the findings and recommendations helpful? | `findingsQuality` |
-| 3 | Any areas to research further? | `additionalResearch` |
-| 4 | Any other feedback? (or say 'approved') | `researchFeedback` |
+3. **Merge updated results** into research.md
 
-Store responses in `.progress.md` under `### Research Review (from research.md)`.
+4. **After update, repeat review questions** (go back to "Research Review Questions")
 
-If user requests changes: invoke appropriate subagents again, merge updated results, repeat review.
+5. **Continue until approved:** Loop until user responds with approval
 
 ## Update State
 
@@ -160,15 +661,24 @@ After research completes and is approved:
 
 ## Commit Spec (if enabled)
 
-Read `commitSpec` from `.ralph-state.json`. If true:
+Read `commitSpec` from `.ralph-state.json` (set during `/ralph-specum:start`).
 
-```bash
-git add ./specs/$spec/research.md
-git commit -m "spec($spec): add research findings"
-git push -u origin $(git branch --show-current)
-```
+If `commitSpec` is true:
+
+1. Stage research file:
+   ```bash
+   git add ./specs/$spec/research.md
+   ```
+2. Commit with message:
+   ```bash
+   git commit -m "spec($spec): add research findings"
+   ```
+3. Push to current branch:
+   ```bash
+   git push -u origin "$(git branch --show-current)"
+   ```
 
-If commit/push fails, display warning but continue.
+If the commit or push fails, display a warning but continue (don't block the workflow).
 
 ## Output
 
@@ -224,12 +734,13 @@ Next: Review research.md, then run /ralph-specum:requirements
 <mandatory>
 **STOP HERE. DO NOT PROCEED TO REQUIREMENTS.**
 
-(Exception: `--quick` mode auto-generates all artifacts without stopping.)
+(This does not apply in `--quick` mode, which auto-generates all artifacts without stopping.)
 
-After displaying output, you MUST:
+After displaying the output above, you MUST:
 1. End your response immediately
-2. Wait for user to review research.md
-3. Only proceed when user explicitly runs `/ralph-specum:requirements`
+2. Wait for the user to review research.md
+3. Only proceed to requirements when the user explicitly runs `/ralph-specum:requirements`
 
-DO NOT automatically invoke product-manager or run requirements phase.
+DO NOT automatically invoke the product-manager or run the requirements phase.
+The user needs time to review research findings before proceeding.
 </mandatory>
diff --git a/plugins/ralph-specum/commands/start.md b/plugins/ralph-specum/commands/start.md
index 1fa38b03..51bbbb40 100644
--- a/plugins/ralph-specum/commands/start.md
+++ b/plugins/ralph-specum/commands/start.md
@@ -10,45 +10,211 @@ Smart entry point for ralph-specum. Detects whether to create a new spec or resu
 
 ## Branch Management (FIRST STEP)
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/branch-management/SKILL.md`
-Before creating any files, check git branch and handle appropriately. Use the branch-management skill for branch detection, creation, worktree setup, and naming conventions.
+<mandatory>
+Before creating any files or directories, check the current git branch and handle appropriately.
+</mandatory>
 
-In quick mode, use Quick Mode Branch Handling (auto-create branch, no prompts).
-</skill-reference>
+### Step 1: Check Current Branch
 
-## Parse Arguments
+```bash
+git branch --show-current
+```
 
-From `$ARGUMENTS`, extract:
-- **name**: Optional spec name (kebab-case)
-- **goal**: Everything after the name except flags (optional)
-- **--fresh**: Force new spec without prompting if one exists
-- **--quick**: Skip all spec phases, auto-generate artifacts, start execution immediately
-- **--commit-spec**: Commit and push spec files after generation (default: true in normal mode, false in quick mode)
-- **--no-commit-spec**: Explicitly disable committing spec files
+### Step 2: Determine Default Branch
 
-### Commit Spec Flag Logic
+Check which is the default branch:
+```bash
+git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@'
+```
+
+If that fails, assume `main` or `master` (check which exists):
+```bash
+git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master"
+```
+
+### Step 3: Branch Decision Logic
 
 ```text
-1. Check if --no-commit-spec in $ARGUMENTS -> commitSpec = false
-2. Else if --commit-spec in $ARGUMENTS -> commitSpec = true
-3. Else if --quick in $ARGUMENTS -> commitSpec = false (quick mode default)
-4. Else -> commitSpec = true (normal mode default)
+1. Get current branch name
+   |
+   +-- ON DEFAULT BRANCH (main/master):
+   |   |
+   |   +-- Ask user for branch strategy:
+   |   |   "Starting new spec work. How would you like to handle branching?"
+   |   |   1. Create branch in current directory (git checkout -b)
+   |   |   2. Create git worktree (separate directory)
+   |   |
+   |   +-- If user chooses 1 (current directory):
+   |   |   - Generate branch name from spec name: feat/$specName
+   |   |   - If spec name not yet known, use temp name: feat/spec-work-<timestamp>
+   |   |   - Create and switch: git checkout -b <branch-name>
+   |   |   - Inform user: "Created branch '<branch-name>' for this work"
+   |   |   - Suggest: "Run /ralph-specum:research to start the research phase."
+   |   |
+   |   +-- If user chooses 2 (worktree):
+   |   |   - Generate branch name from spec name: feat/$specName
+   |   |   - Determine worktree path: ../<repo-name>-<spec-name> or prompt user
+   |   |   - Create worktree: git worktree add <path> -b <branch-name>
+   |   |   - Inform user: "Created worktree at '<path>' on branch '<branch-name>'"
+   |   |   - IMPORTANT: Suggest that the user cd to the worktree and resume the conversation there:
+   |   |     "For best results, cd to '<path>' and start a new Claude Code session from there."
+   |   |     "Then run /ralph-specum:research to begin."
+   |   |   - STOP HERE - do not continue to Parse Arguments (user needs to switch directories)
+   |   |
+   |   +-- Continue to Parse Arguments
+   |
+   +-- ON NON-DEFAULT BRANCH (feature branch):
+       |
+       +-- Ask user for preference:
+       |   "You are currently on branch '<current-branch>'.
+       |    Would you like to:
+       |    1. Continue working on this branch
+       |    2. Create a new branch in current directory
+       |    3. Create git worktree (separate directory)"
+       |
+       +-- If user chooses 1 (continue):
+       |   - Stay on current branch
+       |   - Suggest: "Run /ralph-specum:research to start the research phase."
+       |   - Continue to Parse Arguments
+       |
+       +-- If user chooses 2 (new branch):
+       |   - Generate branch name from spec name: feat/$specName
+       |   - If spec name not yet known, use temp name: feat/spec-work-<timestamp>
+       |   - Create and switch: git checkout -b <branch-name>
+       |   - Inform user: "Created branch '<branch-name>' for this work"
+       |   - Suggest: "Run /ralph-specum:research to start the research phase."
+       |   - Continue to Parse Arguments
+       |
+       +-- If user chooses 3 (worktree):
+           - Generate branch name from spec name: feat/$specName
+           - Determine worktree path: ../<repo-name>-<spec-name> or prompt user
+           - Create worktree: git worktree add <path> -b <branch-name>
+           - Inform user: "Created worktree at '<path>' on branch '<branch-name>'"
+           - IMPORTANT: Suggest that the user cd to the worktree and resume the conversation there:
+             "For best results, cd to '<path>' and start a new Claude Code session from there."
+             "Then run /ralph-specum:research to begin."
+           - STOP HERE - do not continue to Parse Arguments (user needs to switch directories)
 ```
 
-Examples:
-- `/ralph-specum:start` -> Auto-detect: resume active or ask for new
-- `/ralph-specum:start user-auth` -> Resume or create user-auth
-- `/ralph-specum:start user-auth Add OAuth2` -> Create user-auth with goal
-- `/ralph-specum:start user-auth --fresh` -> Force new, overwrite if exists
-- `/ralph-specum:start "Build auth with JWT" --quick` -> Quick mode with goal string
+### Branch Naming Convention
+
+When creating a new branch:
+- Use format: `feat/<spec-name>` (e.g., `feat/user-auth`)
+- If spec name contains special chars, sanitize to kebab-case
+- If branch already exists, append `-2`, `-3`, etc.
+
+Example:
+```text
+Spec name: user-auth
+Branch: feat/user-auth
+
+If feat/user-auth exists:
+Branch: feat/user-auth-2
+```
+
+### Worktree Details
+
+When user chooses worktree option:
+
+**State files copied to worktree:**
+- `specs/.current-spec` - Active spec name pointer
+- `specs/$SPEC_NAME/.ralph-state.json` - Loop state (phase, taskIndex, iterations)
+- `specs/$SPEC_NAME/.progress.md` - Progress tracking and learnings
+
+These files are copied when all of the following hold:
+1. The worktree is created via `git worktree add`
+2. A spec is currently active (SPEC_NAME known or readable from .current-spec)
+3. The source files exist in the main worktree
+
+Copy uses non-overwrite semantics (skips if file already exists in target).
+
+```bash
+# Get repo name for path suggestion
+REPO_NAME=$(basename "$(git rev-parse --show-toplevel)")
+
+# If SPEC_NAME empty but .current-spec exists, read from it (before using for path/branch)
+if [ -z "$SPEC_NAME" ] && [ -f "./specs/.current-spec" ]; then
+    SPEC_NAME=$(cat "./specs/.current-spec") || true
+fi
+
+# Default worktree path
+WORKTREE_PATH="../${REPO_NAME}-${SPEC_NAME}"
+
+# Create worktree with new branch
+git worktree add "$WORKTREE_PATH" -b "feat/${SPEC_NAME}"
+
+# Copy spec state files to worktree (failures are warnings, not errors)
+if [ -d "./specs" ]; then
+    mkdir -p "$WORKTREE_PATH/specs" || echo "Warning: Failed to create specs directory in worktree"
+
+    # Copy .current-spec if exists (don't overwrite existing)
+    if [ -f "./specs/.current-spec" ] && [ ! -f "$WORKTREE_PATH/specs/.current-spec" ]; then
+        cp "./specs/.current-spec" "$WORKTREE_PATH/specs/.current-spec" || echo "Warning: Failed to copy .current-spec to worktree"
+    fi
+
+    # If spec name known, copy spec state files
+    if [ -n "$SPEC_NAME" ] && [ -d "./specs/$SPEC_NAME" ]; then
+        mkdir -p "$WORKTREE_PATH/specs/$SPEC_NAME" || echo "Warning: Failed to create spec directory in worktree"
+
+        # Copy state files (don't overwrite existing)
+        if [ -f "./specs/$SPEC_NAME/.ralph-state.json" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.ralph-state.json" ]; then
+            cp "./specs/$SPEC_NAME/.ralph-state.json" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .ralph-state.json to worktree"
+        fi
+
+        if [ -f "./specs/$SPEC_NAME/.progress.md" ] && [ ! -f "$WORKTREE_PATH/specs/$SPEC_NAME/.progress.md" ]; then
+            cp "./specs/$SPEC_NAME/.progress.md" "$WORKTREE_PATH/specs/$SPEC_NAME/" || echo "Warning: Failed to copy .progress.md to worktree"
+        fi
+    fi
+fi
+```
+
+After worktree creation:
+- Inform user of the worktree path
+- IMPORTANT: Output clear guidance for the user:
+  ```text
+  Created worktree at '<path>' on branch '<branch-name>'
+  Spec state files copied to worktree.
+
+  For best results, cd to the worktree directory and start a new Claude Code session from there:
+
+    cd <path>
+    claude
+
+  Then run /ralph-specum:research to begin the research phase.
+  ```
+- STOP the command here - do not continue to Parse Arguments or create spec files
+- The user needs to switch directories first to work in the worktree
+- To clean up later: `git worktree remove <path>`
+
+### Quick Mode Branch Handling
+
+In `--quick` mode, still perform the branch check but skip all user prompts:
+- If on default branch: auto-create feature branch in current directory (no worktree prompt in quick mode)
+- If on non-default branch: stay on current branch (no prompt, quick mode is non-interactive)
+
+## Quick Mode Uses Ralph Loop
+
+In quick mode (`--quick`), execution uses `/ralph-loop` for autonomous task completion.
+
+After generating spec artifacts in quick mode, invoke ralph-loop:
+```text
+Skill: ralph-loop:ralph-loop
+Args: Read ./specs/$spec/.coordinator-prompt.md and follow those instructions exactly. Output ALL_TASKS_COMPLETE when done. --max-iterations <calculated> --completion-promise ALL_TASKS_COMPLETE
+```
 
 <mandatory>
 ## CRITICAL: Delegation Requirement
 
 **YOU ARE A COORDINATOR, NOT AN IMPLEMENTER.**
 
-You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE regardless of mode.
+You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE regardless of mode (normal or quick).
+
+**NEVER do any of these yourself:**
+- Write code or modify source files
+- Perform research or analysis
+- Generate spec artifacts (research.md, requirements.md, design.md, tasks.md)
+- Execute task steps
+- Run verification commands as part of task execution
 
 **ALWAYS delegate to the appropriate subagent:**
 | Work Type | Subagent |
@@ -59,16 +225,234 @@ You MUST delegate ALL substantive work to subagents. This is NON-NEGOTIABLE rega
 | Task Planning | `task-planner` |
 | Artifact Generation (quick mode) | `plan-synthesizer` |
 | Task Execution | `spec-executor` |
+
+Quick mode does NOT exempt you from delegation - it only skips interactive phases.
 </mandatory>
 
 <mandatory>
 ## CRITICAL: Stop After Each Subagent (Normal Mode)
 
-In normal mode (no `--quick` flag), you MUST STOP your response after each subagent completes. The user must explicitly run the next command.
+In normal mode (no `--quick` flag), you MUST STOP your response after each subagent completes.
+
+**After invoking a subagent via Task tool:**
+1. Wait for subagent to return
+2. Output a brief status message (e.g., "Research phase complete. Run /ralph-specum:requirements to continue.")
+3. **END YOUR RESPONSE IMMEDIATELY**
+
+**DO NOT:**
+- Invoke another subagent in the same response
+- Continue to the next phase automatically
+- Ask if the user wants to continue
+
+**The user must explicitly run the next command.** This gives them time to review artifacts.
 
 Exception: `--quick` mode runs all phases without stopping.
 </mandatory>
 
+
+## Parse Arguments
+
+From `$ARGUMENTS`, extract:
+- **name**: Optional spec name (kebab-case)
+- **goal**: Everything after the name except flags (optional)
+- **--fresh**: Force new spec without prompting if one exists
+- **--quick**: Skip interactive spec phases, auto-generate artifacts, start execution immediately
+- **--commit-spec**: Commit and push spec files after generation (default: true in normal mode, false in quick mode)
+- **--no-commit-spec**: Explicitly disable committing spec files
+
+### Commit Spec Flag Logic
+
+```text
+1. Check if --no-commit-spec in $ARGUMENTS → commitSpec = false
+2. Else if --commit-spec in $ARGUMENTS → commitSpec = true
+3. Else if --quick in $ARGUMENTS → commitSpec = false (quick mode default)
+4. Else → commitSpec = true (normal mode default)
+```
+
+Examples:
+- `/ralph-specum:start` -> Auto-detect: resume active or ask for new
+- `/ralph-specum:start user-auth` -> Resume or create user-auth
+- `/ralph-specum:start user-auth Add OAuth2` -> Create user-auth with goal
+- `/ralph-specum:start user-auth --fresh` -> Force new, overwrite if exists
+- `/ralph-specum:start "Build auth with JWT" --quick` -> Quick mode with goal string
+- `/ralph-specum:start my-feature "Add logging" --quick` -> Quick mode with name+goal
+- `/ralph-specum:start ./my-plan.md --quick` -> Quick mode with file input
+- `/ralph-specum:start my-feature ./plan.md --quick` -> Quick mode with name+file
+- `/ralph-specum:start my-feature --quick` -> Quick mode using existing plan.md
+
+## Quick Mode Flow
+
+When `--quick` flag detected, bypass interactive spec phases and auto-generate all artifacts.
+
+### Quick Mode Input Detection
+
+Parse arguments before `--quick` flag and classify input type:
+
+```text
+Input Classification:
+
+1. TWO ARGS before --quick:
+   - First arg = spec name (must be kebab-case: ^[a-z0-9-]+$)
+   - Second arg = goal string OR file path
+   - Detect file path if: starts with "./" OR "/" OR ends with ".md"
+   - Examples:
+     - `my-feature "Add login" --quick` -> name=my-feature, goal="Add login"
+     - `my-feature ./plan.md --quick` -> name=my-feature, file=./plan.md
+
+2. ONE ARG before --quick:
+   a. FILE PATH: starts with "./" OR "/" OR ends with ".md"
+      - Read file content as plan
+      - Infer name from plan content
+      - Example: `./my-plan.md --quick` -> read file, infer name
+
+   b. KEBAB-CASE NAME: matches ^[a-z0-9-]+$
+      - Check if ./specs/$name/plan.md exists
+      - If exists: use plan.md content, name=$name
+      - If not exists: error "No plan.md found in ./specs/$name/. Provide goal: /ralph-specum:start $name 'your goal' --quick"
+      - Example: `my-feature --quick` -> check ./specs/my-feature/plan.md
+
+   c. GOAL STRING: anything else (contains spaces, uppercase, special chars)
+      - Use as goal content
+      - Infer name from goal
+      - Example: `"Build auth with JWT" --quick` -> goal, infer name
+
+3. ZERO ARGS with --quick:
+   - Error: "Quick mode requires a goal or plan file"
+```
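+
+The classification rules above can be sketched as a small function. This is a minimal illustration, not the plugin's implementation: `classifyQuickInput`, `isFilePath`, the `kind` labels, and the two error messages marked as assumed are hypothetical, and `args` is assumed to be the already-tokenized arguments that appear before `--quick`.
+
+```javascript
+// Sketch only: helper names and `kind` labels are hypothetical.
+const KEBAB_CASE = /^[a-z0-9-]+$/;
+
+// A file path starts with "./" or "/", or ends with ".md".
+function isFilePath(arg) {
+  return arg.startsWith("./") || arg.startsWith("/") || arg.endsWith(".md");
+}
+
+function classifyQuickInput(args) {
+  if (args.length === 0) {
+    return { error: "Quick mode requires a goal or plan file" };
+  }
+  if (args.length > 2) {
+    // Assumption: the spec does not define behavior for three or more args.
+    return { error: "Too many arguments before --quick" };
+  }
+  if (args.length === 2) {
+    const [name, second] = args;
+    if (!KEBAB_CASE.test(name)) {
+      // Assumed wording; the spec only says the name must be kebab-case.
+      return { error: `Spec name must be kebab-case: ${name}` };
+    }
+    return isFilePath(second)
+      ? { kind: "name+file", name, file: second }
+      : { kind: "name+goal", name, goal: second };
+  }
+  const [only] = args;
+  // Order matters: file path first, then kebab-case name, then goal string.
+  if (isFilePath(only)) return { kind: "file", file: only };
+  if (KEBAB_CASE.test(only)) {
+    return { kind: "existing-plan", name: only, file: `./specs/${only}/plan.md` };
+  }
+  return { kind: "goal", goal: only };
+}
+```
+
+Checking the file-path rule before the kebab-case rule keeps an argument such as `plan.md` from being mistaken for a spec name, since it ends with `.md`.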
+
+### File Reading
+
+When file path detected:
+1. Validate file exists using Read tool
+2. If not exists: error "File not found: $filePath"
+3. Read file content
+4. Strip frontmatter if present (content between --- markers at start)
+5. If content empty after stripping: error "Plan content is empty. Provide a goal or non-empty file."
+6. Use content as planContent
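+
+Steps 4 to 6 can be illustrated with the sketch below. It assumes the file content has already been read into a string; `stripFrontmatter` and `toPlanContent` are hypothetical names, and the regular expression is one possible reading of "content between --- markers at start".
+
+```javascript
+// Sketch only: function names are hypothetical.
+// Remove a leading frontmatter block delimited by lines containing only "---".
+function stripFrontmatter(content) {
+  const frontmatter = /^---[ \t]*\r?\n(?:[\s\S]*?\r?\n)?---[ \t]*(?:\r?\n|$)/;
+  return content.replace(frontmatter, "");
+}
+
+// Return the usable plan text, or throw the error message from step 5.
+function toPlanContent(fileContent) {
+  const body = stripFrontmatter(fileContent).trim();
+  if (body === "") {
+    throw new Error("Plan content is empty. Provide a goal or non-empty file.");
+  }
+  return body;
+}
+```
+
+Because the pattern is anchored at the start of the string, a `---` horizontal rule later in the plan body is left untouched.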
+
+### Existing Plan Check
+
+When kebab-case name provided without goal:
+1. Check if `./specs/$name/plan.md` exists
+2. If exists: read content, use as planContent
+3. If not exists: error with guidance message
+
+### Name Inference
+
+If no explicit name provided, infer from goal:
+1. Take first 3 words of goal
+2. Convert to kebab-case (lowercase, spaces to hyphens)
+3. Truncate to max 30 characters
+4. Strip non-alphanumeric except hyphens
+
+Example: "Build authentication with JWT tokens" -> "build-authentication-with"
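+
+A minimal sketch of these four steps, applied in the order listed. The function name `inferName` is hypothetical.
+
+```javascript
+// Sketch only: follows the four steps in the order given above.
+function inferName(goal) {
+  return goal
+    .trim()
+    .split(/\s+/)                  // split into words
+    .slice(0, 3)                   // 1. first 3 words
+    .join("-")                     // 2. spaces to hyphens
+    .toLowerCase()                 //    lowercase
+    .slice(0, 30)                  // 3. max 30 characters
+    .replace(/[^a-z0-9-]/g, "");   // 4. keep only alphanumerics and hyphens
+}
+
+console.log(inferName("Build authentication with JWT tokens"));
+// prints "build-authentication-with"
+```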
+
+### Quick Mode Execution
+
+<mandatory>
+**REMINDER: Even in quick mode, you MUST delegate ALL work to subagents.**
+- Artifact generation → delegate to `plan-synthesizer` via Task tool
+- Task execution → delegate to `spec-executor` via Task tool
+- You only handle: directory creation, state file writes, and coordination
+</mandatory>
+
+```text
+1. Validate input (non-empty goal/plan)
+   |
+2. Infer name from goal (if not provided)
+   |
+3. Create spec directory: ./specs/$name/
+   |
+3a. Ensure gitignore entries exist for spec state files:
+   - Add specs/.current-spec to .gitignore if not present
+   - Add **/.progress.md to .gitignore if not present
+   |
+4. Write .ralph-state.json:
+   {
+     "source": "plan",
+     "name": "$name",
+     "basePath": "./specs/$name",
+     "phase": "tasks",
+     "taskIndex": 0,
+     "totalTasks": 0,
+     "taskIteration": 1,
+     "maxTaskIterations": 5,
+     "globalIteration": 1,
+     "maxGlobalIterations": 100,
+     "commitSpec": $commitSpec
+   }
+   |
+5. Write .progress.md with original goal
+   |
+6. Update .current-spec: echo "$name" > ./specs/.current-spec
+   |
+7. Invoke plan-synthesizer agent via Task tool:
+   Task: plan-synthesizer
+   Input: goal="$goal", basePath="./specs/$name"
+   |
+8. After generation completes:
+   - Update .ralph-state.json: phase="execution", taskIndex=0
+   - Read tasks.md to get totalTasks count
+   |
+8a. If commitSpec is true:
+   - Stage spec files: git add ./specs/$name/research.md ./specs/$name/requirements.md ./specs/$name/design.md ./specs/$name/tasks.md
+   - Commit: git commit -m "spec($name): add spec artifacts"
+   - Push: git push -u origin $(git branch --show-current)
+   |
+9. Display brief summary:
+   Quick mode: Created spec '$name'
+   [If commitSpec: "Spec committed and pushed."]
+   Starting execution...
+   |
+10. Invoke spec-executor for task 1
+```
+
+### Quick Mode Validation
+
+Before creating the spec, validate all inputs:
+
+```text
+Validation Sequence:
+
+1. ZERO ARGS CHECK (if no args before --quick)
+   - Error: "Quick mode requires a goal or plan file"
+
+2. FILE NOT FOUND (if file path detected)
+   - If file not exists: "File not found: $filePath"
+
+3. EMPTY CONTENT CHECK
+   - If empty or whitespace only: "Plan content is empty. Provide a goal or non-empty file."
+
+4. PLAN TOO SHORT WARNING (< 10 words)
+   - If word count < 10: "Warning: Short plan may produce vague tasks"
+   - Continue with warning displayed
+
+5. NAME CONFLICT RESOLUTION
+   - If ./specs/$name/ already exists:
+     - Append -2, -3, etc. until unique name found
+     - Display: "Created '$name-2' ($name already exists)"
+```
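+
+Checks 4 and 5 can be sketched as follows. This is an illustration under assumptions: `existingNames` stands in for the directory names found under `./specs/`, and `isShortPlan` and `resolveSpecName` are hypothetical names.
+
+```javascript
+// Sketch only: a real command would list ./specs/ to build existingNames.
+
+// Check 4: a plan with fewer than 10 words triggers a warning, not an error.
+function isShortPlan(text) {
+  return text.trim().split(/\s+/).filter(Boolean).length < 10;
+}
+
+// Check 5: append -2, -3, ... until the name is not already taken.
+function resolveSpecName(name, existingNames) {
+  const taken = new Set(existingNames);
+  if (!taken.has(name)) return name;
+  let suffix = 2;
+  while (taken.has(`${name}-${suffix}`)) suffix += 1;
+  return `${name}-${suffix}`;
+}
+```
+
+For example, with `user-auth` and `user-auth-2` already present, `resolveSpecName("user-auth", [...])` yields `user-auth-3`.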
+
+### Atomic Rollback
+
+On generation failure after spec directory created:
+
+```text
+Rollback Procedure:
+
+1. CAPTURE FAILURE
+   - plan-synthesizer agent returns error or times out
+
+2. DELETE SPEC DIRECTORY
+   - rm -rf "./specs/$name"
+
+3. RESTORE .current-spec
+   - If previous spec was set, restore it
+
+4. DISPLAY ERROR
+   - "Generation failed: $errorReason. No spec created."
+```
+
 ## Detection Logic
 
 ```text
@@ -94,8 +478,19 @@ Exception: `--quick` mode runs all phases without stopping.
 ## Resume Flow
 
 1. Read `./specs/$name/.ralph-state.json`
-2. If no state file: check what files exist, determine last completed phase, ask continue or restart
-3. If state file exists: read phase/task index, show status, continue
+2. If no state file (completed or never started):
+   - Check what files exist (research.md, requirements.md, etc.)
+   - Determine last completed phase
+   - Ask: "Continue to next phase or restart?"
+3. If state file exists:
+   - Read current phase and task index
+   - Show brief status:
+     ```
+     Resuming '$name'
+     Phase: execution, Task 3/8
+     Last: "Add error handling"
+     ```
+   - Continue from current phase
 
 ### Resume by Phase
 
@@ -110,111 +505,445 @@ Exception: `--quick` mode runs all phases without stopping.
 <mandatory>
 ## CRITICAL: Stop After Subagent Completes
 
-After ANY subagent returns, read `.ralph-state.json`. If `awaitingApproval: true`, STOP IMMEDIATELY.
-Do NOT invoke the next phase - user must run next command explicitly.
+After ANY subagent (research-analyst, product-manager, architect-reviewer, task-planner) returns, you MUST:
+
+1. **Read the state file**: `cat ./specs/$name/.ralph-state.json`
+2. **Check awaitingApproval**: If `awaitingApproval: true`, you MUST STOP IMMEDIATELY
+3. **Do NOT invoke the next phase** - the user must explicitly run the next command
+
+```text
+Subagent returns
+↓
+Read .ralph-state.json
+↓
+awaitingApproval == true?
+↓
+YES → STOP. Output: "Phase complete. Run /ralph-specum:<next> to continue."
+NO → Continue (only in quick mode where awaitingApproval is not set)
+```
+
+**This is NON-NEGOTIABLE in normal mode.** Each phase requires user approval before proceeding.
+
+The only exception is `--quick` mode, which skips approval between phases.
 </mandatory>
 
 ## New Flow
 
-1. If no name provided, ask for spec name (kebab-case)
-2. If no goal provided, ask for goal description
+1. If no name provided, ask:
+   - "What should we call this spec?" (validates kebab-case)
+2. If no goal provided, ask:
+   - "What is the goal? Describe what you want to build."
 3. Create spec directory: `./specs/$name/`
 4. Update active spec: `echo "$name" > ./specs/.current-spec`
-5. Ensure gitignore entries for `specs/.current-spec` and `**/.progress.md`
-6. Initialize `.ralph-state.json` with phase "research"
-7. Create `.progress.md` with goal
+5. Ensure gitignore entries exist for spec state files:
+   ```bash
+   # Add .current-spec and .progress.md to .gitignore if not already present
+   if [ -f .gitignore ]; then
+     grep -qF "specs/.current-spec" .gitignore || echo "specs/.current-spec" >> .gitignore
+     grep -qF "**/.progress.md" .gitignore || echo "**/.progress.md" >> .gitignore
+   else
+     echo "specs/.current-spec" > .gitignore
+     echo "**/.progress.md" >> .gitignore
+   fi
+   ```
+6. Initialize `.ralph-state.json`:
+   ```json
+   {
+     "source": "spec",
+     "name": "$name",
+     "basePath": "./specs/$name",
+     "phase": "research",
+     "taskIndex": 0,
+     "totalTasks": 0,
+     "taskIteration": 1,
+     "maxTaskIterations": 5,
+     "globalIteration": 1,
+     "maxGlobalIterations": 100,
+     "commitSpec": $commitSpec
+   }
+   ```
+7. Create `.progress.md` with goal
+8. **Goal Interview** (skip if --quick in $ARGUMENTS)
+9. Invoke research-analyst agent with goal interview context
+10. **STOP** - research-analyst sets awaitingApproval=true. Output status and wait for user to run `/ralph-specum:requirements`
 
-### Spec Scanner (Skip in Quick Mode)
+## Spec Scanner
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/spec-scanner/SKILL.md`
 Before conducting the Goal Interview, scan existing specs to find related work. This helps surface prior context and avoid duplicate effort.
 
-Skip if --quick flag detected.
-</skill-reference>
+<mandatory>
+**Skip spec scanner if --quick flag detected in $ARGUMENTS.**
+</mandatory>
+
+### Scan Steps
+
+```text
+1. List all directories in ./specs/
+   - Run: ls -d ./specs/*/ 2>/dev/null | xargs -I{} basename {}
+   - Exclude the current spec being created (if known)
+   |
+2. For each spec directory found:
+   - Read ./specs/$specName/.progress.md
+   - Extract "Original Goal" section (line after "## Original Goal")
+   - If .progress.md doesn't exist, skip this spec
+   |
+3. Keyword matching:
+   - Extract keywords from current goal (split by spaces, lowercase)
+   - Remove common stop words such as "the", "a", "an", "to", "for", "with", "and", "or" (full list under Keyword Extraction)
+   - For each existing spec, count matching keywords with its Original Goal
+   - Score = number of matching keywords
+   |
+4. Rank and filter:
+   - Sort specs by score (descending)
+   - Take top 3 specs with score > 0
+   - If no matches found, skip display step
+   |
+5. Display related specs (if any found):
+   |
+   Related specs found:
+   - spec-name-1: [first 50 chars of Original Goal]...
+   - spec-name-2: [first 50 chars of Original Goal]...
+   - spec-name-3: [first 50 chars of Original Goal]...
+   |
+   This context may inform the interview questions.
+   |
+6. Store in state file:
+   - Update .ralph-state.json with relatedSpecs array:
+     {
+       ...existing state,
+       "relatedSpecs": [
+         {"name": "spec-name-1", "goal": "Original Goal text", "score": N},
+         {"name": "spec-name-2", "goal": "Original Goal text", "score": N},
+         {"name": "spec-name-3", "goal": "Original Goal text", "score": N}
+       ]
+     }
+```
+
+### Keyword Extraction
 
-### Goal Interview (Skip in Quick Mode)
+Extract meaningful keywords from the goal:
 
-<skill-reference>
-**Apply skill**: `plugins/ralph-specum/skills/intent-classification/SKILL.md`
-Before asking interview questions, classify the user's goal to determine question depth (TRIVIAL/REFACTOR/GREENFIELD/MID_SIZED).
-</skill-reference>
+```javascript
+// Pseudocode for keyword extraction
+function extractKeywords(text) {
+  const stopWords = ["the", "a", "an", "to", "for", "with", "and", "or", "is", "it", "this", "that", "be", "on", "in", "of"];
+  return text
+    .toLowerCase()
+    .split(/\s+/)
+    .filter(word => word.length > 2)
+    .filter(word => !stopWords.includes(word));
+}
+```
 
-Apply `plugins/ralph-specum/skills/interview-framework/SKILL.md` for single-question adaptive interview loop.
+### Match Scoring
+
+Simple keyword overlap scoring:
+
+```javascript
+// Pseudocode for scoring
+function scoreMatch(currentGoalKeywords, existingGoalKeywords) {
+  let score = 0;
+  for (const keyword of currentGoalKeywords) {
+    if (existingGoalKeywords.includes(keyword)) {
+      score += 1;
+    }
+  }
+  return score;
+}
+```
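+
+Putting extraction, scoring, and the rank-and-filter step together, a self-contained sketch might look like this. `rankRelatedSpecs` and the shape of the `specs` input are assumptions for illustration; the stop-word list and the "top 3 with score > 0" rule come from the sections above.
+
+```javascript
+// Sketch only: `specs` is assumed to be [{ name, goal }] read from each .progress.md.
+const STOP_WORDS = new Set([
+  "the", "a", "an", "to", "for", "with", "and", "or",
+  "is", "it", "this", "that", "be", "on", "in", "of",
+]);
+
+function keywords(text) {
+  return text
+    .toLowerCase()
+    .split(/\s+/)
+    .filter((word) => word.length > 2 && !STOP_WORDS.has(word));
+}
+
+function rankRelatedSpecs(currentGoal, specs, limit = 3) {
+  const current = keywords(currentGoal);
+  return specs
+    .map(({ name, goal }) => {
+      const existing = new Set(keywords(goal));
+      // Score = number of current-goal keywords found in the existing goal.
+      const score = current.filter((word) => existing.has(word)).length;
+      return { name, goal, score };
+    })
+    .filter((spec) => spec.score > 0)      // drop specs with no overlap
+    .sort((a, b) => b.score - a.score)     // highest score first
+    .slice(0, limit);                      // keep the top matches
+}
+```
+
+The returned objects already have the `name`, `goal`, and `score` fields used by the `relatedSpecs` array in the state file.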
 
-**Goal Interview Question Pool:**
+### Example Output
 
-| # | Question | Required | Key |
-| --- | -------- | -------- | --- |
-| 1 | What problem are you solving with this feature? | Required | `problem` |
-| 2 | Any constraints or must-haves for this feature? | Required | `constraints` |
-| 3 | How will you know this feature is successful? | Required | `success` |
-| 4 | Any other context you'd like to share? (or say 'done') | Optional | `additionalContext` |
+```text
+Related specs found:
+- user-auth: Add OAuth2 authentication with JWT tokens...
+- api-refactor: Restructure API endpoints for better...
+- error-handling: Implement consistent error handling...
 
-Store responses in `.progress.md` under `### Goal Interview (from start.md)`.
+This context may inform the interview questions.
+```
 
-8. Invoke research-analyst agent with goal interview context
-9. **STOP** - research-analyst sets awaitingApproval=true
+### Usage in Interview
 
-## Quick Mode Flow
+After scanning, if related specs were found, you may reference them when asking clarifying questions. For example:
+- "I noticed you have a spec 'user-auth' for authentication. Does this new feature relate to or depend on that work?"
+- "There's an existing 'api-refactor' spec. Should this work integrate with those changes?"
 
-Triggered when `--quick` flag detected. Skips all spec phases and auto-generates artifacts.
+## Goal Interview (Pre-Research)
 
-### Quick Mode Input Detection
+<mandatory>
+**Skip interview if --quick flag detected in $ARGUMENTS.**
+
+If NOT quick mode, conduct goal interview using AskUserQuestion before research phase.
+</mandatory>
+
+### Quick Mode Check
+
+Check if `--quick` appears in `$ARGUMENTS`. If present, skip directly to "Invoke research-analyst".
+
+### Intent Classification
+
+Before asking interview questions, classify the user's goal to determine question depth.
+
+**Classification Logic:**
+
+Analyze the goal text for keywords to determine intent type:
 
 ```text
-1. TWO ARGS before --quick: name = first, goal/file = second
-2. ONE ARG before --quick:
-   a. FILE PATH (starts with ./ or /) -> read file as plan
-   b. KEBAB-CASE NAME -> check ./specs/$name/plan.md
-   c. GOAL STRING -> infer name from goal
-3. ZERO ARGS with --quick: Error
+Intent Classification:
+
+1. TRIVIAL: Goal contains keywords like:
+   - "fix typo", "typo", "spelling"
+   - "small change", "minor"
+   - "quick", "simple", "tiny"
+   - "rename", "update text"
+   → Min questions: 1, Max questions: 2
+
+2. REFACTOR: Goal contains keywords like:
+   - "refactor", "restructure", "reorganize"
+   - "clean up", "cleanup", "simplify"
+   - "extract", "consolidate", "modularize"
+   - "improve code", "tech debt"
+   → Min questions: 3, Max questions: 5
+
+3. GREENFIELD: Goal contains keywords like:
+   - "new feature", "new system", "new module"
+   - "add", "build", "implement", "create"
+   - "integrate", "introduce"
+   - "from scratch"
+   → Min questions: 5, Max questions: 10
+
+4. MID_SIZED: Default if no clear match
+   → Min questions: 3, Max questions: 7
 ```
 
-### Name Inference
+**Confidence Threshold:**
+
+| Match Count | Confidence | Action |
+|-------------|------------|--------|
+| 3+ keywords | High | Use matched category |
+| 1-2 keywords | Medium | Use matched category |
+| 0 keywords | Low | Default to MID_SIZED |
+
+**Question Count Rules:**
+- TRIVIAL: 1-2 questions (get essentials, move fast)
+- REFACTOR: 3-5 questions (understand scope and risks)
+- GREENFIELD: 5-10 questions (full context needed)
+- MID_SIZED: 3-7 questions (balanced approach)
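+
+The classification logic can be sketched as below. Two points are assumptions that the rules above leave open: when several categories match, the one with the most matched keywords wins (ties go to the category listed first), and keywords are matched as plain substrings. `classifyIntent` is a hypothetical name.
+
+```javascript
+// Sketch only: tie-breaking and substring matching are assumptions.
+const INTENTS = [
+  { type: "TRIVIAL", min: 1, max: 2, keywords: [
+    "fix typo", "typo", "spelling", "small change", "minor",
+    "quick", "simple", "tiny", "rename", "update text"] },
+  { type: "REFACTOR", min: 3, max: 5, keywords: [
+    "refactor", "restructure", "reorganize", "clean up", "cleanup", "simplify",
+    "extract", "consolidate", "modularize", "improve code", "tech debt"] },
+  { type: "GREENFIELD", min: 5, max: 10, keywords: [
+    "new feature", "new system", "new module", "add", "build", "implement",
+    "create", "integrate", "introduce", "from scratch"] },
+];
+
+function classifyIntent(goal) {
+  const text = goal.toLowerCase();
+  // Default when nothing matches.
+  let best = { type: "MID_SIZED", min: 3, max: 7, matched: [] };
+  for (const intent of INTENTS) {
+    // Substring match: "address" would also match "add" (known limitation).
+    const matched = intent.keywords.filter((keyword) => text.includes(keyword));
+    if (matched.length > best.matched.length) {
+      best = { type: intent.type, min: intent.min, max: intent.max, matched };
+    }
+  }
+  const count = best.matched.length;
+  const confidence = count >= 3 ? "high" : count >= 1 ? "medium" : "low";
+  return { ...best, confidence };
+}
+```
+
+A stricter variant could match on word boundaries instead of substrings; the rules above do not say which is intended.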
+
+**Store Intent:**
+After classification, store the result in `.progress.md`:
+```markdown
+## Interview Format
+- Version: 1.0
+
+## Intent Classification
+- Type: [TRIVIAL|REFACTOR|GREENFIELD|MID_SIZED]
+- Confidence: [high|medium|low] ([N] keywords matched)
+- Min questions: [N]
+- Max questions: [N]
+- Keywords matched: [list of matched keywords]
+```
 
-If no explicit name: take first 3 words of goal, kebab-case, max 30 chars.
+### Question Count by Intent
 
-### Quick Mode Execution
+Intent classification determines the question count range, not which questions to ask. All goals use the same Goal Interview Question Pool (defined below), but the number of questions varies by intent:
+
+| Intent | Min Questions | Max Questions |
+|--------|---------------|---------------|
+| TRIVIAL | 1 | 2 |
+| REFACTOR | 3 | 5 |
+| GREENFIELD | 5 | 10 |
+| MID_SIZED | 3 | 7 |
+
+**Question Selection Logic:**
 
 ```text
-1. Validate input (non-empty goal/plan)
-2. Infer name from goal (if not provided)
-3. Create spec directory: ./specs/$name/
-4. Ensure gitignore entries
-5. Write .ralph-state.json (source: "plan", phase: "tasks")
-6. Write .progress.md with goal
-7. Update .current-spec
-8. Invoke plan-synthesizer agent via Task tool
-9. After generation: update state phase="execution", read task count
-10. If commitSpec: stage, commit, push spec files
-11. Invoke spec-executor for task 1
+1. Get intent from Intent Classification step
+2. Intent determines question COUNT, not which pool to use
+3. All goals use the Goal Interview Question Pool
+4. Ask Required questions first, then Optional questions
+5. Stop when:
+   - User signals completion (after minRequired reached)
+   - All questions asked (maxAllowed reached)
+   - User selects "No, let's proceed" on optional question
 ```
 
-### Quick Mode Validation
+### Question Classification
 
-```text
-1. ZERO ARGS CHECK -> Error: "Quick mode requires a goal or plan file"
-2. FILE NOT FOUND -> Error: "File not found: $filePath"
-3. EMPTY CONTENT CHECK -> Error: "Plan content is empty"
-4. PLAN TOO SHORT WARNING (< 10 words) -> Warning but continue
-5. NAME CONFLICT RESOLUTION -> Append -2, -3, etc. if exists
+Before asking any question, classify it to determine the appropriate source for the answer.
+
+**Classification Matrix:**
+
+| Question Type | Source | Examples |
+|---------------|--------|----------|
+| Codebase fact | Explore agent | "What patterns exist?", "Where is X located?", "What dependencies are used?" |
+| User preference | AskUserQuestion | "What priority level?", "Which approach do you prefer?" |
+| Requirement | AskUserQuestion | "What must this feature do?", "What are the constraints?" |
+| Scope decision | AskUserQuestion | "Should this include X?", "What's in/out of scope?" |
+| Risk tolerance | AskUserQuestion | "How critical is backwards compatibility?" |
+| Constraint | AskUserQuestion | "Any performance requirements?", "Timeline constraints?" |
+
+<mandatory>
+**DO NOT ask user about codebase facts - use Explore agent instead.**
+
+Questions that should go to the user (AskUserQuestion):
+- Preference: "Which approach do you prefer?"
+- Requirement: "What must this feature accomplish?"
+- Scope: "Should this include feature X?"
+- Constraint: "Any performance/timeline constraints?"
+- Risk: "How important is backwards compatibility?"
+
+Questions that should use Explore agent (NOT AskUserQuestion):
+- Existing patterns: "How does the codebase handle X?"
+- File locations: "Where are the authentication modules?"
+- Dependencies: "What libraries are currently used for Y?"
+- Code conventions: "What naming patterns are used?"
+- Architecture: "How is the service layer structured?"
+
+**Before each interview question, check: Is this a codebase fact or user decision?**
+- Codebase fact → Use Explore agent to find the answer automatically
+- User decision → Ask via AskUserQuestion
+</mandatory>
+
+### Question Piping
+
+Before asking each question, replace `{var}` placeholders with values from `.progress.md` context.
+
+**Available Variables:**
+- `{goal}` - Original goal text from user
+- `{intent}` - Intent classification (TRIVIAL, REFACTOR, GREENFIELD, MID_SIZED)
+- `{problem}` - Problem description from Goal Interview
+- `{constraints}` - Constraints from Goal Interview
+- `{users}` - Primary users (not yet available in start.md, populated in later phases)
+- `{priority}` - Priority tradeoffs (not yet available in start.md, populated in later phases)
+
+**Piping Instructions:**
+1. Before each AskUserQuestion, replace `{var}` with values from `.progress.md`
+2. If variable not found, use original question text (graceful fallback)
+3. Example: "What priority tradeoffs for {goal}?" becomes "What priority tradeoffs for Add user authentication?"
+
+**Fallback Behavior:**
+- If `{goal}` not found → use "{goal}" literally (this should rarely happen since goal is always provided)
+- If `{intent}` not found → skip piping for that variable
+- Always prefer graceful degradation over errors
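+
+A minimal sketch of the piping step, assuming the `.progress.md` values have already been parsed into a plain object. `pipeQuestion` is a hypothetical name, and leaving an unresolved placeholder in place is one reading of the graceful-fallback rule.
+
+```javascript
+// Sketch only: `context` is assumed to hold values parsed from .progress.md.
+function pipeQuestion(question, context) {
+  return question.replace(/\{(\w+)\}/g, (placeholder, key) => {
+    const value = context[key];
+    // Fallback: keep the original "{var}" text when no usable value exists.
+    return typeof value === "string" && value !== "" ? value : placeholder;
+  });
+}
+
+console.log(pipeQuestion("What priority tradeoffs for {goal}?", { goal: "Add user authentication" }));
+// prints "What priority tradeoffs for Add user authentication?"
+```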
+
+### Goal Interview Questions (Single-Question Flow)
+
+**Interview Framework**: Apply standard single-question loop from `skills/interview-framework/SKILL.md`
+
+### Parameter Chain Note
+
+**Note**: start.md is the first phase - no prior responses exist to check.
+
+This phase initializes the interview context. Later phases (research, requirements, design, tasks) use the parameter chain to skip questions already answered here.
+
+### Phase-Specific Configuration
+
+- **Phase**: Goal Interview (first phase)
+- **Available Variables**: `{goal}`, `{intent}` (others populated in later phases)
+- **Storage Section**: `### Goal Interview (from start.md)`
+- **Semantic Keys**: problem, constraints, success, additionalContext
+
+### Goal Interview Question Pool
+
+| # | Question | Required | Key | Options |
+|---|----------|----------|-----|---------|
+| 1 | What problem are you solving with this feature? | Required | `problem` | Fixing a bug or issue / Adding new functionality / Improving existing behavior / Other |
+| 2 | Any constraints or must-haves for this feature? | Required | `constraints` | No special constraints / Must integrate with existing code / Performance is critical / Other |
+| 3 | How will you know this feature is successful? | Required | `success` | Tests pass and code works / Users can complete specific workflow / Performance meets target metrics / Other |
+| 4 | Any other context you'd like to share? (or say 'done' to proceed) | Optional | `additionalContext` | No, let's proceed / Yes, I have more details / Other |
+
+### Store Goal Context
+
+After interview, update `.progress.md` with Interview Format, Intent Classification, and Interview Responses sections:
+
+```markdown
+## Interview Format
+- Version: 1.0
+
+## Intent Classification
+- Type: [TRIVIAL|REFACTOR|GREENFIELD|MID_SIZED]
+- Confidence: [high|medium|low] ([N] keywords matched)
+- Min questions: [N]
+- Max questions: [N]
+- Keywords matched: [list of matched keywords]
+
+## Interview Responses
+
+### Goal Interview (from start.md)
+- Problem: [responses.problem]
+- Constraints: [responses.constraints]
+- Success criteria: [responses.success]
+- Additional context: [responses.additionalContext]
+[Any follow-up responses from "Other" selections]
 ```
 
-### Atomic Rollback
+### Pass Context to Research
 
-On generation failure: delete spec directory, restore .current-spec, display error.
+Include goal interview context when invoking research-analyst:
 
-## Quick Mode Uses Ralph Loop
+```text
+Task delegation prompt should include:
+
+Goal Interview Context:
+- Problem: [response]
+- Constraints: [response]
+- Success criteria: [response]
+
+Use this context to focus research on relevant areas.
+```
+
+## Quick Mode Flow
+
+Triggered when `--quick` flag detected. Skips all spec phases and auto-generates artifacts.
 
-After generating spec artifacts in quick mode, invoke ralph-loop:
 ```text
-Skill: ralph-loop:ralph-loop
-Args: Read ./specs/$spec/.coordinator-prompt.md and follow those instructions exactly. Output ALL_TASKS_COMPLETE when done. --max-iterations <calculated> --completion-promise ALL_TASKS_COMPLETE
+1. Check if --quick flag present in $ARGUMENTS
+   |
+   +-- Yes: Extract args before --quick
+   |   |
+   |   +-- Two args: name = first, goal = second
+   |   |
+   |   +-- One arg: goal = first (infer name later)
+   |   |
+   |   +-- Zero args: Error "Quick mode requires a goal or plan"
+   |
+   +-- No: Continue to normal Detection Logic
 ```
 
+### Quick Mode Steps (POC)
+
+1. Parse args before `--quick`:
+   - If two args: `name` = first arg (kebab-case), `goal` = second arg
+   - If one arg: `goal` = arg, `name` = infer from goal (first 3 words, kebab-case, max 30 chars)
+2. Validate non-empty goal
+3. Create spec directory: `./specs/$name/`
+4. Initialize `.ralph-state.json` with `source: "plan"`:
+   ```json
+   {
+     "source": "plan",
+     "name": "$name",
+     "basePath": "./specs/$name",
+     "phase": "tasks",
+     "taskIndex": 0,
+     "totalTasks": 0,
+     "taskIteration": 1,
+     "maxTaskIterations": 5,
+     "globalIteration": 1,
+     "maxGlobalIterations": 100
+   }
+   ```
+5. Write `.progress.md` with goal
+6. Update `.current-spec` with name
+7. Invoke plan-synthesizer agent to generate all artifacts
+8. After generation: update state `phase: "execution"`, read task count
+9. Invoke spec-executor for task 1
+
 ## Status Display (on resume)
 
+Before resuming, show brief status:
+
 ```text
 Resuming: user-auth
 Phase: execution
@@ -226,6 +955,8 @@ Continuing...
 
 ## Output
 
+After detection and action:
+
 **New spec:**
 ```text
 Created spec 'user-auth' at ./specs/user-auth/