Test coverage gap: README claims vs actual verification #26

@oldsj

Problem

The README makes claims that need verification through the docs-as-specs pipeline.

Current State (Jan 2026)

After running /validate-docs, the coverage is better than expected:

Category                      Coverage
----------------------------  -----------
README features with specs    6/7 (86%)
Spec assertions with tests    34/35 (97%)

Critical Gap: Agent Workflow

The README prominently features this workflow:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Spawn     │────►│    Work     │────►│     PR      │────►│    Close    │
│   (main)    │     │  (k8s ns)   │     │  (GitHub)   │     │  (summary)  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘

This has NO spec and NO tests. It's the headline feature, yet it is completely unverified.
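
As a starting point for the missing spec, the four stages in the diagram can be written down as an ordered sequence that spec assertions could be derived from. This is only a sketch: the stage names come from the README diagram, but the `Stage` enum, the helper names, and the assumption that the workflow is strictly linear (no retries, no skipped stages) are illustrative and not taken from the codebase.

```python
from enum import Enum


class Stage(Enum):
    SPAWN = "Spawn"   # started from main
    WORK = "Work"     # runs in a k8s namespace
    PR = "PR"         # pull request on GitHub
    CLOSE = "Close"   # closes with a summary


# Assumption: the diagram is strictly linear, so each stage has one successor.
ORDER = list(Stage)  # Enum members iterate in definition order


def next_stage(stage):
    """Return the stage that follows `stage`, or None after Close."""
    i = ORDER.index(stage)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None


def is_valid_run(stages):
    """True if `stages` walks the full workflow in order, with no skips."""
    return list(stages) == ORDER
```

Each arrow in the diagram then maps to one candidate assertion (for example, "after Work completes, a PR exists"), which is the granularity the other specs already use.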

Other Gaps

  • chat.md: "Messages persist across page reloads" - no test
  • Terminology: Specs say "Sessions", some UI/tests say "INBOX"
  • Skipped tests: Several spec behaviors have skipped tests due to flakiness
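
The terminology split could be located with a small scan before deciding which term to standardize on. The sketch below is hypothetical: it takes file contents as an in-memory mapping rather than assuming any repository layout, and it matches the two terms case-sensitively exactly as they are written in this issue.

```python
import re

TERMS = ("Sessions", "INBOX")


def term_usage(files):
    """Map each term to the sorted list of file names that contain it.

    `files` is a mapping of file name -> file text (a hypothetical input
    shape; how the text is loaded is left to the caller).
    """
    usage = {term: [] for term in TERMS}
    for name, text in sorted(files.items()):
        for term in TERMS:
            # Whole-word, case-sensitive match so "INBOX" and "inbox" differ.
            if re.search(rf"\b{re.escape(term)}\b", text):
                usage[term].append(name)
    return usage
```

A file that shows up under both terms is the most likely place where a spec and its UI or test wording have drifted apart.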

Infrastructure Claims (Out of Scope for E2E)

These README claims are about infrastructure, not user behavior:

  • "Workers continue after browser closes" - DBOS guarantee
  • "Task state persists across restarts" - DBOS guarantee
  • "Workers in isolated K8s namespaces" - Deployment architecture

These should be verified via integration tests, not E2E.

Action Items

  1. Create docs/specs/agent-workflow.md
  2. Add message persistence test to chat spec coverage
  3. Fix the skipped tests or remove the spec assertions they cover
  4. Standardize terminology (Sessions vs INBOX)

What IS Working Well

The docs-as-specs approach is solid:

  • sessions.md → 21/21 assertions tested (100%)
  • layout.md → 7/7 assertions tested (100%)
  • chat.md → 6/7 assertions tested (86%)

The pipeline works; we just need to extend it to the agent workflow.
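
The per-spec figures listed here account for the overall 34/35 reported under Current State. A minimal sketch of that arithmetic follows; the counts are copied from this issue, while the dictionary and function names are illustrative and are not the output format of /validate-docs.

```python
# (tested assertions, total assertions) per spec, as reported in this issue.
SPEC_COVERAGE = {
    "sessions.md": (21, 21),
    "layout.md": (7, 7),
    "chat.md": (6, 7),
}


def percent(tested, total):
    """Coverage as a whole-number percentage."""
    return round(100 * tested / total)


def totals(coverage):
    """Sum tested and total assertion counts across all specs."""
    tested = sum(t for t, _ in coverage.values())
    total = sum(n for _, n in coverage.values())
    return tested, total


tested, total = totals(SPEC_COVERAGE)
print(f"{tested}/{total} ({percent(tested, total)}%)")  # prints "34/35 (97%)"
```

The single untested assertion is the chat.md persistence claim, so closing that gap alone would bring the existing specs to 35/35.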
