Problem
The README makes claims that need verification through the docs-as-specs pipeline.
Current State (Jan 2026)
After running /validate-docs, the coverage is better than expected:
| Category | Coverage |
|---|---|
| README features with specs | 6/7 (86%) |
| Spec assertions with tests | 34/35 (97%) |
Critical Gap: Agent Workflow
The README prominently features this workflow:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Spawn │────►│ Work │────►│ PR │────►│ Close │
│ (main) │ │ (k8s ns) │ │ (GitHub) │ │ (summary) │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
This has NO spec and NO tests. It's the headline feature but completely unverified.
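As a possible starting point for that spec, the four stages in the diagram could be treated as a linear sequence whose ordering a test asserts. This is only a sketch, assuming the stages are strictly sequential as drawn; `Stage`, `NEXT_STAGE`, and `is_valid_run` are hypothetical names, not part of the project:

```python
from enum import Enum


class Stage(Enum):
    SPAWN = "spawn"  # labelled "(main)" in the diagram
    WORK = "work"    # labelled "(k8s ns)"
    PR = "pr"        # labelled "(GitHub)"
    CLOSE = "close"  # labelled "(summary)"


# Assumption: each stage has exactly one successor, as the diagram suggests.
NEXT_STAGE = {
    Stage.SPAWN: Stage.WORK,
    Stage.WORK: Stage.PR,
    Stage.PR: Stage.CLOSE,
}


def is_valid_run(stages):
    """Return True only for the full Spawn -> Work -> PR -> Close sequence."""
    stages = list(stages)
    if not stages or stages[0] is not Stage.SPAWN or stages[-1] is not Stage.CLOSE:
        return False
    # Every adjacent pair must follow the single allowed transition.
    return all(NEXT_STAGE.get(a) is b for a, b in zip(stages, stages[1:]))
```

A spec assertion such as "a task never reaches PR without a Work stage" would then map to a check that `is_valid_run` rejects `[Stage.SPAWN, Stage.PR, Stage.CLOSE]`.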
Other Gaps
- chat.md: "Messages persist across page reloads" - no test
- Terminology: Specs say "Sessions", some UI/tests say "INBOX"
- Skipped tests: Several spec behaviors have skipped tests due to flakiness
Infrastructure Claims (Out of Scope for E2E)
These README claims are about infrastructure, not user behavior:
- "Workers continue after browser closes" - DBOS guarantee
- "Task state persists across restarts" - DBOS guarantee
- "Workers in isolated K8s namespaces" - Deployment architecture
These should be verified via integration tests, not E2E.
Action Items
- Create docs/specs/agent-workflow.md
- Add message persistence test to chat spec coverage
- Fix skipped tests or remove spec assertions they cover
- Standardize terminology (Sessions vs INBOX)
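For the terminology item, a small scan can list where one of the two terms still appears so the remaining occurrences can be reviewed by hand. A minimal sketch; `find_term_lines` is a hypothetical helper, and which term to keep is left to the maintainers:

```python
import re


def find_term_lines(text, term="INBOX"):
    """Return (line_number, stripped_line) pairs for lines containing `term`.

    The match is case-sensitive and whole-word, so identifiers such as
    `inbox_id` are not reported when searching for "INBOX".
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

Running it over the contents of each spec, UI, and test file with `term="INBOX"` and again with `term="Sessions"` gives a rough picture of how far the two usages have diverged.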
What IS Working Well
The docs-as-specs approach is solid:
- sessions.md → 21/21 assertions tested (100%)
- layout.md → 7/7 assertions tested (100%)
- chat.md → 6/7 assertions tested (86%)
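The per-spec figures add up to the overall number reported under Current State. A short sketch of the arithmetic, using only the counts listed in this issue (`SPEC_COVERAGE`, `percent`, and `overall` are illustrative names):

```python
# (tested, total) assertion counts per spec, taken from the figures above.
SPEC_COVERAGE = {
    "sessions.md": (21, 21),
    "layout.md": (7, 7),
    "chat.md": (6, 7),
}


def percent(tested, total):
    """Coverage as a whole-number percentage."""
    return round(100 * tested / total)


def overall(coverage):
    """Sum tested and total assertions across all specs."""
    tested = sum(t for t, _ in coverage.values())
    total = sum(n for _, n in coverage.values())
    return tested, total, percent(tested, total)
```

Here `overall(SPEC_COVERAGE)` gives 34 of 35 assertions, which rounds to 97%, and the single untested assertion is the one in chat.md.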
The pipeline works; we just need to extend it to the agent workflow.