
Add AI helper commands for task review #1

Merged
wende merged 8 commits into main from task-ai_helpers on Dec 28, 2025

Conversation

@wende (Owner) commented Dec 28, 2025

Summary

  • Add codebook ai command group with help and review subcommands
  • Support 5 AI agents: claude, codex, gemini, opencode, kimi
  • Customizable review prompt via codebook.yml (ai.review_prompt)

Changes

  • config.py: Added AIConfig dataclass with customizable review prompt
  • cli.py: Added ai command group, ai help, and ai review commands
  • test_cli.py: Added 17 tests for AI helper commands
  • CONFIGURATION.md: Documented AI configuration options
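
For orientation, a minimal sketch of the config shape added in config.py, based on the AIConfig/DEFAULT_REVIEW_PROMPT names and the [TASK_FILE] placeholder discussed in the review below; the default prompt text here is illustrative, not the PR's actual default:

```python
from dataclasses import dataclass

# Illustrative default -- only the DEFAULT_REVIEW_PROMPT name and the
# [TASK_FILE] placeholder are confirmed by this PR's review discussion.
DEFAULT_REVIEW_PROMPT = "Review the task described in [TASK_FILE] and report any issues."


@dataclass
class AIConfig:
    """AI helper settings loaded from codebook.yml (ai.review_prompt)."""

    review_prompt: str = DEFAULT_REVIEW_PROMPT
```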

Test plan

  • All 233 tests pass
  • Black formatting passes
  • Ruff linting passes
  • Manual testing with actual AI agents

Usage

# Show help
codebook ai help

# Review a task with Claude
codebook ai review claude ./codebook/tasks/202512281502-TITLE.md

# Pass additional arguments to agent
codebook ai review gemini ./task.md -- --model gemini-pro


@sourcery-ai (bot) left a comment


Hey - I've found 1 security issue, 5 other issues, and left some high level feedback:

Security issues:

  • Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)

General comments:

  • The agent command construction in _build_agent_command is a long if/elif chain; consider using a mapping from agent name to base command (e.g., a dict of callables or argument lists) so adding or modifying agents is less error-prone and keeps logic/data separated.
  • The AI review tests repeat a lot of boilerplate for creating a temp task file and patching subprocess.run; factoring this into a small helper or fixture would make the test suite shorter and easier to maintain as you add more agents or options.
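
A minimal sketch of the mapping approach suggested in the first point above, assuming each agent's executable matches its name and that the prompt is passed as the final argument; the real per-agent flags and the function's exact signature are not shown in this PR:

```python
from typing import Optional

# Hypothetical base commands keyed by agent name; the agent names come from
# the PR, but the executables and argument order are assumptions.
AGENT_BASE_COMMANDS: dict[str, list[str]] = {
    "claude": ["claude"],
    "codex": ["codex"],
    "gemini": ["gemini"],
    "opencode": ["opencode"],
    "kimi": ["kimi"],
}


def _build_agent_command(agent: str, prompt: str, extra_args: list[str]) -> Optional[list[str]]:
    base = AGENT_BASE_COMMANDS.get(agent)
    if base is None:
        return None  # unsupported agent; unreachable via click.Choice (see Comment 3)
    return [*base, *extra_args, prompt]
```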
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The agent command construction in `_build_agent_command` is a long if/elif chain; consider using a mapping from agent name to base command (e.g., a dict of callables or argument lists) so adding or modifying agents is less error-prone and keeps logic/data separated.
- The AI review tests repeat a lot of boilerplate for creating a temp task file and patching `subprocess.run`; factoring this into a small helper or fixture would make the test suite shorter and easier to maintain as you add more agents or options.

## Individual Comments

### Comment 1
<location> `tests/test_cli.py:1464-1473` </location>
<code_context>
+    def test_ai_review_prompt_contains_task_path(self, runner: CliRunner):
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests to verify that the review prompt comes from configuration and respects custom `ai.review_prompt` values.

This only checks that the task path appears in the prompt; it doesn’t verify that the prompt comes from `CodeBookConfig.ai.review_prompt` or that custom values override the default. Please add a test that mocks `CodeBookConfig.load()` to return a config with a non-default `ai.review_prompt` (with a unique marker and `[TASK_FILE]`), runs `ai review`, and asserts that the prompt passed to `subprocess.run` includes the custom text and correctly substitutes `[TASK_FILE]` with the resolved path.

Suggested implementation:

```python
from unittest.mock import MagicMock, patch
from types import SimpleNamespace

```

```python
    def test_ai_review_prompt_contains_task_path(self, runner: CliRunner):
        """Should include task path in prompt."""
        with runner.isolated_filesystem() as tmpdir:
            task_file = Path(tmpdir) / "task.md"
            task_file.write_text("Task content")

            with patch("codebook.cli.subprocess.run") as mock_run:
                mock_run.return_value = MagicMock(returncode=0)

                result = runner.invoke(
                    main,
                    ["ai", "review", "gemini", str(task_file)],
                    catch_exceptions=False,
                )

                assert result.exit_code == 0
                mock_run.assert_called_once()

                # Prompt should contain the task path (after [TASK_FILE] substitution)
                prompt = mock_run.call_args.kwargs.get("input") or ""
                assert str(task_file) in prompt

    def test_ai_review_uses_custom_review_prompt_from_config(self, runner: CliRunner):
        """Should use ai.review_prompt from config and substitute [TASK_FILE] with resolved path."""
        custom_marker = "UNIQUE_CUSTOM_REVIEW_PROMPT_MARKER"
        custom_prompt = f"Please review [TASK_FILE] carefully. {custom_marker}"

        with runner.isolated_filesystem() as tmpdir:
            task_file = Path(tmpdir) / "task.md"
            task_file.write_text("Task content")

            # Mock config so that ai.review_prompt is our custom value
            fake_config = SimpleNamespace(
                ai=SimpleNamespace(
                    review_prompt=custom_prompt,
                )
            )

            with (
                patch("codebook.cli.CodeBookConfig.load", return_value=fake_config),
                patch("codebook.cli.subprocess.run") as mock_run,
            ):
                mock_run.return_value = MagicMock(returncode=0)

                result = runner.invoke(
                    main,
                    ["ai", "review", "gemini", str(task_file)],
                    catch_exceptions=False,
                )

                assert result.exit_code == 0
                mock_run.assert_called_once()

                prompt = mock_run.call_args.kwargs.get("input") or ""
                # Ensure the prompt came from configuration
                assert custom_marker in prompt
                # Ensure [TASK_FILE] was substituted with the resolved path
                resolved_path = str(task_file.resolve())
                assert resolved_path in prompt
                # And the placeholder itself should not remain
                assert "[TASK_FILE]" not in prompt

```

If `tests/test_cli.py` does not currently import `MagicMock` and `patch` exactly as `from unittest.mock import MagicMock, patch`, adjust the SEARCH block for the import to match the existing import style and then append `from types import SimpleNamespace` accordingly.

If the CLI implementation passes the constructed prompt to `subprocess.run` differently (e.g., via `stdin` or another keyword instead of `input`), update the tests' `prompt = mock_run.call_args.kwargs.get("input")` line to match the actual keyword used (for example, `mock_run.call_args.kwargs["stdin"]`).

If `CodeBookConfig.load` lives under a different import path than `codebook.cli.CodeBookConfig.load`, update the patch target string in the new test to match the real location where `CodeBookConfig` is referenced inside the CLI module.
</issue_to_address>

### Comment 2
<location> `tests/test_cli.py:1503-1512` </location>
<code_context>
+
+                assert "Command:" in result.output
+
+    def test_ai_review_propagates_exit_code(self, runner: CliRunner):
+        """Should propagate agent exit code."""
+        with runner.isolated_filesystem() as tmpdir:
+            task_file = Path(tmpdir) / "task.md"
+            task_file.write_text("Task content")
+
+            with patch("codebook.cli.subprocess.run") as mock_run:
+                mock_run.return_value = MagicMock(returncode=42)
+
+                result = runner.invoke(
+                    main,
+                    ["ai", "review", "claude", str(task_file)],
+                )
+
+                assert result.exit_code == 42
</code_context>

<issue_to_address>
**suggestion (testing):** Cover error-path behavior when `subprocess.run` raises a generic exception, not just `FileNotFoundError`.

Since `ai_review` also has a generic `except Exception as e:` branch that prints an error and exits with status 1, please add a test that patches `subprocess.run` to raise a non-`FileNotFoundError` (e.g. `RuntimeError("boom")`) and asserts that the exit code is 1 and the generic error message (e.g. `"Error running agent:"`) appears in `result.output`.
</issue_to_address>
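
A minimal sketch of the requested test, following the pattern of the existing tests in this class (same `runner` fixture and imports); the error string "Error running agent:" is taken from the comment above and may differ in the actual CLI:

```python
    def test_ai_review_handles_generic_exception(self, runner: CliRunner):
        """Should exit 1 with a generic error when the agent call fails unexpectedly."""
        with runner.isolated_filesystem() as tmpdir:
            task_file = Path(tmpdir) / "task.md"
            task_file.write_text("Task content")

            with patch("codebook.cli.subprocess.run", side_effect=RuntimeError("boom")):
                result = runner.invoke(main, ["ai", "review", "claude", str(task_file)])

            assert result.exit_code == 1
            assert "Error running agent:" in result.output
```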

### Comment 3
<location> `tests/test_cli.py:1301-1310` </location>
<code_context>
+        assert result.exit_code != 0
+        assert "Missing argument" in result.output or "AGENT" in result.output
+
+    def test_ai_review_invalid_agent(self, runner: CliRunner):
+        """Should reject invalid agent."""
+        with runner.isolated_filesystem() as tmpdir:
+            task_file = Path(tmpdir) / "task.md"
+            task_file.write_text("Task content")
+
+            result = runner.invoke(main, ["ai", "review", "invalid_agent", str(task_file)])
+
+            assert result.exit_code != 0
+            assert "Invalid value" in result.output or "invalid_agent" in result.output
+
+    def test_ai_review_requires_path(self, runner: CliRunner):
</code_context>

<issue_to_address>
**question (testing):** Consider directly exercising `_build_agent_command` returning `None` (unsupported agent) if that branch is intended to be used.

Because `agent` is constrained by `click.Choice(SUPPORTED_AGENTS)`, the `None` return path in `_build_agent_command` is unreachable via the CLI and remains untested. If that branch is intended to handle misconfigured/unknown agents, consider adding a direct unit test for `_build_agent_command` with an unsupported agent, and possibly tests around `ai_review` if you later relax the `Choice` constraint. Otherwise, if it’s truly dead code, it may be worth removing for clarity.
</issue_to_address>
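
If the branch is kept, a direct unit test could look like the sketch below; it assumes `_build_agent_command` is importable from `codebook.cli` and takes `(agent, prompt, extra_args)`, neither of which is confirmed by the diff shown here:

```python
from codebook.cli import _build_agent_command  # assumed import path


def test_build_agent_command_returns_none_for_unsupported_agent():
    """The unsupported-agent branch should return None (assumed signature)."""
    assert _build_agent_command("not_a_real_agent", "prompt", []) is None
```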

### Comment 4
<location> `tests/test_cli.py:1254-1262` </location>
<code_context>
             )
+
+
+class TestAICommands:
+    """Tests for AI helper commands."""
+
+    @pytest.fixture
+    def runner(self) -> CliRunner:
+        """Create a CLI test runner."""
+        return CliRunner()
+
+    def test_ai_help_command(self, runner: CliRunner):
+        """Should show help for AI helpers."""
+        result = runner.invoke(main, ["ai", "help"])
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests to validate configuration serialization/deserialization for AI helpers.

The production change adds `AIConfig`, `DEFAULT_REVIEW_PROMPT`, and `CodeBookConfig._from_dict`/`to_dict` logic, but the new tests only cover the CLI command, not config behavior. Please add tests (here or in a dedicated config test module) that:

* Load a config dict with a custom `ai.review_prompt` and assert `CodeBookConfig._from_dict` sets `cfg.ai.review_prompt` correctly.
* Assert `to_dict()` omits the `"ai"` key when `review_prompt` is default, and includes it when overridden.

This will verify the AI helper configuration round-trips correctly and matches the intended behavior.

Suggested implementation:

```python
class TestAICommands:
    """Tests for AI helper commands."""

    @pytest.fixture
    def runner(self) -> CliRunner:
        """Create a CLI test runner."""
        return CliRunner()

    def test_ai_help_command(self, runner: CliRunner):
        """Should show help for AI helpers."""
        result = runner.invoke(main, ["ai", "help"])

        assert result.exit_code == 0
        assert "CodeBook AI Helpers" in result.output
        assert "Available commands:" in result.output
        assert "Supported agents:" in result.output
        assert "claude" in result.output
        assert "codex" in result.output
        assert "gemini" in result.output
        assert "opencode" in result.output
        assert "kimi" in result.output


class TestAIConfig:
    """Tests for AI helper configuration (serialization/deserialization)."""

    def test_from_dict_sets_custom_review_prompt(self):
        """_from_dict should apply a custom ai.review_prompt from config dict."""
        config_dict = {
            "ai": {
                "review_prompt": "Custom review prompt for PR reviews.",
            }
        }

        cfg = CodeBookConfig._from_dict(config_dict)

        assert cfg.ai.review_prompt == "Custom review prompt for PR reviews."

    def test_to_dict_omits_ai_when_review_prompt_is_default(self):
        """
        to_dict() should omit the 'ai' key when the review_prompt is the default.

        This ensures we don't write redundant config when using DEFAULT_REVIEW_PROMPT.
        """
        cfg = CodeBookConfig._from_dict({})

        # Sanity-check default wiring
        assert cfg.ai.review_prompt == DEFAULT_REVIEW_PROMPT

        data = cfg.to_dict()

        assert "ai" not in data

    def test_to_dict_includes_ai_when_review_prompt_overridden(self):
        """
        to_dict() should include 'ai.review_prompt' when it differs from default.
        """
        config_dict = {
            "ai": {
                "review_prompt": "Overridden review prompt.",
            }
        }
        cfg = CodeBookConfig._from_dict(config_dict)

        # Ensure we really have a non-default prompt
        assert cfg.ai.review_prompt != DEFAULT_REVIEW_PROMPT

        data = cfg.to_dict()

        assert "ai" in data
        assert data["ai"]["review_prompt"] == "Overridden review_prompt."

```

To make these tests compile and pass, you will also need to:

1. Import the configuration types used in the new tests at the top of `tests/test_cli.py` (or wherever imports are grouped), for example:

   ```python
   from codebook.config import CodeBookConfig, DEFAULT_REVIEW_PROMPT
   ```

   Adjust the module path (`codebook.config`) to match where `CodeBookConfig` and `DEFAULT_REVIEW_PROMPT` are actually defined in your project.

2. Ensure that `CodeBookConfig._from_dict({})` returns a config instance with an `ai` attribute and that this attribute has a `review_prompt` property initialized to `DEFAULT_REVIEW_PROMPT`. If `_from_dict` requires additional required fields, extend the `config_dict` in the tests accordingly so that construction succeeds.

3. Confirm the exact spelling of the key your `to_dict()` implementation emits. The tests above assume:

   ```python
   data = cfg.to_dict()
   # data is a dict shaped like {"ai": {"review_prompt": "..."}}
   ```

   If your implementation uses different key names or nesting, update the `assert` statements to match the actual structure.
</issue_to_address>

### Comment 5
<location> `codebook/AI_HELPERS.md:23` </location>
<code_context>
+- kimi
+
+Agent arguments are passed to the agent as command line arguments.
+Review starts an agent with a specific prompt, that can be customized in the [codebook.yml](./CONFIGURATION.md) config file.
+Default prompt is:
+```
</code_context>

<issue_to_address>
**issue (typo):** Tighten the grammar by removing the comma before 'that'.

This is a restrictive "that" clause, so it shouldn't have a preceding comma. Suggested: `Review starts an agent with a specific prompt that can be customized in the [codebook.yml](./CONFIGURATION.md) config file.`

```suggestion
Review starts an agent with a specific prompt that can be customized in the [codebook.yml](./CONFIGURATION.md) config file.
```
</issue_to_address>

### Comment 6
<location> `src/codebook/cli.py:1764` </location>
<code_context>
        result = subprocess.run(agent_cmd)
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>
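
For context, a minimal sketch of why the list-form call is typically safe here (and of the exit-code propagation the PR description mentions); the agent name is from the PR, the prompt text is illustrative:

```python
import subprocess
import sys

# A list of separate arguments, not a single shell string: subprocess.run
# defaults to shell=False, so the prompt/path are never parsed by a shell.
agent_cmd = ["claude", "Review the task in ./task.md"]

try:
    result = subprocess.run(agent_cmd)
    sys.exit(result.returncode)  # propagate the agent's exit code
except FileNotFoundError:
    sys.exit(1)  # agent binary not installed
```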



Add `codebook ai` command group with:
- `codebook ai help` - Show available agents and usage
- `codebook ai review [agent] [path]` - Review tasks with AI agents

Supported agents: claude, codex, gemini, opencode, kimi

Features:
- Customizable review prompt via codebook.yml (ai.review_prompt)
- Pass additional arguments to agents with -- separator
- Propagates agent exit codes

Includes 17 new tests and documentation updates.
Fetches unaddressed PR comments using GitHub CLI. Shows regular comments,
review summaries, and line-level code review comments. Supports specifying
PR number with PR=<number> or auto-detects from current branch.
- Add security comment for subprocess.run explaining list format safety
- Refactor _build_agent_command to use dict mapping instead of if/elif chain
- Fix typo: remove comma before 'that' in AI_HELPERS.md
- Add test for generic exception handling in ai_review
- Add TestBuildAgentCommand class with unit tests including unsupported agent
- Add TestAIConfig class for config serialization/deserialization tests
- Add ai_review_env fixture to reduce test boilerplate

addressed
@github-actions bot commented Dec 28, 2025

📚 CodeBook Coverage

Overall: 78.0% (📈 +78.0% from main)

Changed Files

| File | Main | PR | Change |
| --- | --- | --- | --- |
| .gitattributes | 0.0% | 100.0% | 📈 +100.0% |
| .gitignore | 0.0% | 100.0% | 📈 +100.0% |
| .mcp.json | 0.0% | 100.0% | 📈 +100.0% |
| codebook.yml | 0.0% | 100.0% | 📈 +100.0% |
| codebook/AI_HELPERS.md | 0.0% | 100.0% | 📈 +100.0% |
| examples/example.md | 0.0% | 100.0% | 📈 +100.0% |
| src/codebook/main.py | 0.0% | 100.0% | 📈 +100.0% |
| tests/__init__.py | 0.0% | 100.0% | 📈 +100.0% |
| tests/test_client.py | 0.0% | 99.7% | 📈 +99.7% |
| tests/test_watcher.py | 0.0% | 99.3% | 📈 +99.3% |
| src/codebook/config.py | 0.0% | 98.2% | 📈 +98.2% |
| codebook/edge-cases/CACHE_EXPIRATION.md | 0.0% | 97.5% | 📈 +97.5% |
| codebook/edge-cases/BATCH_RESOLUTION_FALLBACK.md | 0.0% | 97.4% | 📈 +97.4% |
| codebook/edge-cases/GIT_ROOT_RESOLUTION.md | 0.0% | 97.4% | 📈 +97.4% |
| codebook/CONFIGURATION.md | 0.0% | 96.8% | 📈 +96.8% |
| codebook/edge-cases/ANSI_ESCAPE_STRIPPING.md | 0.0% | 96.6% | 📈 +96.6% |
| codebook/edge-cases/BACKLINK_DEDUPLICATION.md | 0.0% | 96.6% | 📈 +96.6% |
| src/codebook/differ.py | 0.0% | 96.6% | 📈 +96.6% |
| codebook/FRONTMATTER.md | 0.0% | 96.4% | 📈 +96.4% |
| codebook/edge-cases/TASK_VERSION_DIFF_SKIPPING.md | 0.0% | 96.1% | 📈 +96.1% |

...and 29 more files with changes

- Add --json flag to task coverage command for machine-readable output
- Update CI workflow to compare coverage between PR and main branch
- Show per-file coverage changes in a markdown table
- Sort changes by biggest diff first, limit to top 20 files
@wende merged commit c019070 into main on Dec 28, 2025
3 of 4 checks passed
