feat: v2.4.0 Headless CLI Testing milestone #21
Merged
RichardHightower merged 24 commits into main on Mar 6, 2026
…apper
- 6 compact single-line Copilot-native fixtures (ms timestamps, no hook_event_name/session_id)
- run_copilot wrapper in cli_wrappers.bash with timeout guard
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- smoke.bats: 8 tests (binary checks, daemon health, ingest, copilot CLI skip)
- hooks.bats: 10 tests (all 5 event types, session synthesis, Bug #991, cleanup)
- Fix jq -n to jq -nc in memory-capture.sh (multi-line JSON broke memory-ingest read_line)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
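The jq -n to jq -nc fix matters because a reader that consumes one line at a time only sees the opening brace of pretty-printed JSON. The sketch below illustrates that effect in Python; it is not the project's code. The event payload and `read_first_line_event` are hypothetical stand-ins for the hook output and for the read_line step in memory-ingest.

```python
import io
import json

# Hypothetical event payload, for illustration only.
event = {"event": "session_start", "agent": "copilot"}


def read_first_line_event(stream):
    """Hypothetical stand-in for a reader that consumes exactly one line."""
    line = stream.readline()
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None  # the single line was not a complete JSON document


# Multi-line output, comparable to what `jq -n` prints.
pretty = json.dumps(event, indent=2) + "\n"
# Single-line output, comparable to what `jq -nc` prints.
compact = json.dumps(event, separators=(",", ":")) + "\n"

pretty_result = read_first_line_event(io.StringIO(pretty))
compact_result = read_first_line_event(io.StringIO(compact))
```

With the multi-line form the reader receives only `{` and the parse fails, while the compact form arrives as one complete document.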
- SUMMARY.md with 18 tests across 2 bats files
- STATE.md updated with position and decisions
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- 5 tests covering full session lifecycle, TOC browse, cwd metadata, agent field preservation, concurrent session isolation
- Uses direct CchEvent format with agent=copilot for deterministic testing
- Mirrors gemini/pipeline.bats pattern with 5-event Copilot session helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dling tests
- 7 tests covering memory-ingest and memory-capture.sh fail-open behavior
- memory-ingest tests: daemon down, malformed JSON, empty stdin, unknown event type
- memory-capture.sh tests: daemon down, malformed input, empty stdin (assert exit 0, no stdout)
- All 30 Copilot tests pass across 4 test files (smoke, hooks, pipeline, negative)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
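The fail-open behavior those negative tests assert (exit 0 whether the daemon is down or the input is malformed or empty) can be sketched as follows. This is a minimal Python illustration of the pattern, not the memory-ingest implementation; `ingest_fail_open`, `daemon_down`, and the `send` callback are hypothetical names.

```python
import json


def ingest_fail_open(raw, send):
    """Try to parse and forward one event; always return exit code 0."""
    try:
        event = json.loads(raw)
        send(event)
    except Exception:
        # Fail open: a capture problem must never block the calling CLI.
        pass
    return 0


def daemon_down(_event):
    """Hypothetical transport that behaves like an unreachable daemon."""
    raise ConnectionRefusedError("daemon not running")


delivered = []
codes = [
    ingest_fail_open('{"event": "session_start"}', delivered.append),  # healthy path
    ingest_fail_open('{"event": "session_start"}', daemon_down),       # daemon down
    ingest_fail_open("not json", delivered.append),                    # malformed input
    ingest_fail_open("", delivered.append),                            # empty stdin
]
```

Every call returns 0, and only the healthy path actually delivers an event.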
- SUMMARY.md documents 2 tasks, 2 files, 30 total Copilot tests
- STATE.md updated with position, decisions, metrics
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Python3-based JUnit XML parser for all 5 CLIs
- Produces markdown table with CLI x scenario pass/fail/skip
- Supports both local and CI artifact directory structures
- Handles missing/empty XML gracefully
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
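A rough idea of how such a parser can turn JUnit XML into a CLI x scenario table is sketched below. It is an illustrative approximation using only the standard library, not the contents of scripts/cli-matrix-report.sh; the function names, the `pass/fail/skip` cell format, and the sample XML are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical JUnit sample: one passing, one failing, one skipped test.
SAMPLE = (
    "<testsuite>"
    '<testcase name="a"/>'
    '<testcase name="b"><failure/></testcase>'
    '<testcase name="c"><skipped/></testcase>'
    "</testsuite>"
)


def summarize_junit(xml_text):
    """Count pass/fail/skip in one JUnit document; bad or empty input yields zeros."""
    counts = {"pass": 0, "fail": 0, "skip": 0}
    if not xml_text.strip():
        return counts
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return counts
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            counts["skip"] += 1
        elif case.find("failure") is not None or case.find("error") is not None:
            counts["fail"] += 1
        else:
            counts["pass"] += 1
    return counts


def render_matrix(results):
    """Render {cli: {scenario: counts}} as a markdown table string."""
    scenarios = sorted({s for per_cli in results.values() for s in per_cli})
    rows = [["CLI"] + scenarios, ["---"] * (len(scenarios) + 1)]
    for cli in sorted(results):
        cells = []
        for scenario in scenarios:
            c = results[cli].get(scenario)
            cells.append("-" if c is None else f"{c['pass']}/{c['fail']}/{c['skip']}")
        rows.append([cli] + cells)
    return "\n".join("| " + " | ".join(row) + " |" for row in rows)


table = render_matrix({"codex": {"smoke": summarize_junit(SAMPLE)}})
```

Returning zero counts for empty or unparsable XML is one way to get the graceful handling the commit describes; the real script may report such cases differently.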
- New job runs after all CLI matrix entries complete
- Downloads JUnit artifacts and generates cross-CLI summary
- Report output goes to GitHub Actions step summary
- Uses if: always() to run even when some CLIs fail
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SUMMARY.md with execution results
- STATE.md updated to phase 34 complete (100%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add adapters/codex-cli/ with .codex/skills/ for memory-query, retrieval-policy, topic-graph, bm25-search, vector-search
- Each skill has YAML frontmatter (name + description) and references/command-reference.md
- Add SANDBOX-WORKAROUND.md documenting macOS Seatbelt and Linux Landlock issues
- Add README.md explaining no-hooks limitation (Discussion #2150)
- No hooks directory -- Codex CLI does not support lifecycle hooks
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ests
- Create 6 CchEvent fixtures in tests/cli/fixtures/codex/ with agent:"codex"
- Add run_codex() wrapper to cli_wrappers.bash using codex exec --full-auto --json
- Create smoke.bats with 8 tests (6 always-run + 2 codex-binary-dependent)
- Create hooks.bats with 6 all-skipped tests annotating no-hooks limitation
- Test 6 verifies adapter skills exist with valid YAML frontmatter
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
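The frontmatter check in Test 6 amounts to confirming that each skill file opens with a delimited block carrying name and description. A minimal Python sketch of that idea follows; it is not the bats test itself, it handles only flat `key: value` lines rather than full YAML, and the sample skill text and function names are hypothetical.

```python
# Hypothetical skill file content, for illustration only.
SKILL = """---
name: memory-query
description: Query stored memory
---
# Body
"""


def parse_frontmatter(text):
    """Return key/value pairs from a leading '---' block, or None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return None  # closing delimiter never found


def has_required_fields(text, required=("name", "description")):
    """True when the frontmatter exists and every required key is non-empty."""
    fields = parse_frontmatter(text)
    return fields is not None and all(fields.get(key) for key in required)


valid = has_required_fields(SKILL)
missing_description = has_required_fields("---\nname: x\n---\n")
no_frontmatter = has_required_fields("# Just a heading\n")
```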
- SUMMARY.md with 2 tasks, 22 files created, all verifications passed
- STATE.md updated: plan 1/3 complete, decisions, metrics
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Session lifecycle, TOC browse, cwd metadata, agent field, concurrent isolation
- Direct CchEvent format with agent=codex (no hooks)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hook tests
- memory-ingest fail-open: daemon-down, malformed, empty, unknown event
- Hook tests skipped with GitHub Discussion #2150 annotation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SUMMARY.md with 2 task commits documented
- STATE.md updated: plan 2/3, progress, decisions, session
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Archive milestone artifacts, evolve PROJECT.md, collapse ROADMAP.md.
5 phases (30-34), 15 plans, 144 bats tests across 5 CLIs.
Key: bats-core E2E harness, Codex adapter, cross-CLI matrix report.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bump workspace version from 2.3.0 to 2.4.0 for Headless CLI Testing milestone.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Update macos-x86_64 runner from macos-13 (deprecated) to macos-15
- Add Cross.toml for aarch64 cross-compilation with OpenSSL
- Make release job run with if: always() to handle partial build failures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- … (scripts/cli-matrix-report.sh) parsing JUnit XML from all 5 CLIs
- … e2e-cli.yml with GitHub step summary output
Milestone Stats
Test plan
- `bats tests/cli/codex/`: all 26 Codex tests pass (or skip gracefully)
- `bats tests/cli/claude-code/`: existing 30 tests unaffected
- `scripts/cli-matrix-report.sh /tmp/empty`: produces header with no data (graceful)
- `cargo check --workspace` passes at v2.4.0

🤖 Generated with Claude Code