feat(cli): add optional summarize preprocessing via local LLM #1

Merged
rbright merged 8 commits into main from feat/llm-summarize-filter on Feb 19, 2026

Conversation


@rbright rbright commented Feb 19, 2026

Summary

  • add optional --summarize preprocessing path using pydantic-ai over an OpenAI-compatible API
  • externalize summarization instructions to src/koko_cli/prompts/summarize_for_speech.txt (domain-agnostic prompt)
  • enforce strict no-op failure semantics: summarization errors abort before synthesis/playback/output
  • add --llm-* CLI/env configuration and just smoke-llm recipe
  • add test coverage for summarization module and CLI summarize/no-op behavior
  • document summarize mode and troubleshooting in README

Verification

  • just fmt
  • just fmt-nix
  • just lint
  • just test (29 passed)
  • just smoke-e2e /tmp/koko-local-assets /tmp/koko-smoke.wav "Koko local inference smoke test"
  • just smoke-llm /tmp/koko-local-assets http://127.0.0.1:11434/v1 mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q6_K /tmp/koko-llm-smoke.wav "### Build\\n- tests passed\\n- changed 14 files"
  • uv build --wheel (verified prompt resource is packaged)

Notes

  • Nix dependency parity for pydantic-ai is intentionally deferred for a follow-up packaging pass.

Summary by CodeRabbit

  • New Features

    • Optional --summarize pre-synthesis flow (default off) with LLM options (base URL, model, API key, timeout, max input chars). Offline use requires a local base URL; summarization failure exits non-zero and produces no audio. Includes speech-focused prompt for concise conversational summaries.
  • Documentation

    • README and CLI help updated with usage, defaults, examples, and troubleshooting for summarization.
  • Tests

    • Added unit and integration tests covering summarization behavior, validators, and failure modes.
  • Chores

    • Added an LLM smoke target, updated commit/pre-push tooling stages, and adjusted packaging metadata for the optional dependency.

Add --summarize and --llm-* flags/env defaults, integrate OpenAI-compatible summarization through pydantic-ai, and enforce no-op failure semantics (abort audio generation on summarization errors).

Also externalizes the summarization prompt to src/koko_cli/prompts/summarize_for_speech.txt and adds a smoke-llm just recipe.
coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

Adds an optional CLI summarization step that preprocesses input text via an OpenAI‑compatible LLM before synthesis. Adds CLI flags/env vars, settings/models fields, a new summarization module and prompt, service integration to run summarization before TTS, tests, docs, and tooling/build updates.

Changes

  • Configuration & Settings — src/koko_cli/constants.py, src/koko_cli/settings.py
    New DEFAULT_SUMMARIZE and DEFAULT_LLM_* constants; new KokoSettings and SettingsSnapshot fields for summarize and LLM options with validation/defaults.
  • CLI surface & validators — src/koko_cli/parsers.py, src/koko_cli/text.py
    Added CLI flags: --summarize, --llm-base-url, --llm-model, --llm-api-key, --llm-timeout-seconds, --llm-max-input-chars; added positive_int and min_256_int validators.
  • Models & validation — src/koko_cli/models.py
    SpeakCommand gains summarize and LLM-related fields plus split_pattern; validation enforces non-blank model/base URL when summarize=true.
  • Summarization implementation — src/koko_cli/summarization.py, src/koko_cli/prompts/summarize_for_speech.txt
    New summarization module: prompt loader, truncate/normalize helpers, local-URL detection, API key resolution, lazy pydantic-ai component loading, and the summarize_for_speech API; prompt resource added.
  • Service integration — src/koko_cli/service.py
    run_speak conditionally calls summarize_for_speech before synthesis; enforces the offline/local LLM constraint and surfaces SummarizationError to abort without audio.
  • Errors & types — src/koko_cli/errors.py
    Added SummarizationError (subclass of KokoError).
  • Tests — tests/test_cli.py, tests/test_summarization.py
    Updated and added tests for CLI flags, validators, the summarize flow, offline/local constraints, truncation/normalization, API key resolution, and error paths.
  • Docs & tooling — README.md, justfile, .pre-commit-config.yaml, pyproject.toml, flake.nix
    Documented new flags/env vars and behavior; added the smoke-llm just task; added the pydantic-ai dependency; flake excludes pydantic-ai from the wheel; adjusted pre-commit/pre-push hooks.
  • Build metadata — pyproject.toml, flake.nix
    Added the pydantic-ai dependency; flake configuration removes it from the runtime wheel via pythonRemoveDeps.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI Client
    participant Service as run_speak
    participant Summarizer as Summarization Module
    participant LLM as LLM Provider
    participant Synth as Speech Synthesizer

    CLI->>Service: invoke run_speak(text, summarize=true, llm_params)
    Service->>Summarizer: summarize_for_speech(text, base_url, model, api_key, timeout, max_chars)
    Summarizer->>Summarizer: truncate_summary_input()
    Summarizer->>LLM: build Agent/Model and call LLM
    LLM-->>Summarizer: model response
    Summarizer->>Summarizer: normalize_summary_output()
    alt summary produced
        Summarizer-->>Service: summarized_text
        Service->>Synth: synthesize(summarized_text)
        Synth-->>Service: audio_output
        Service-->>CLI: exit 0 (audio)
    else failure
        Summarizer-->>Service: raise SummarizationError
        Service-->>CLI: exit 1 (no audio)
    end
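The ordering in the diagram can be sketched in Python. This is an illustrative control-flow sketch, not the actual service code: function names and signatures are assumptions, with callables standing in for the summarization module and synthesizer.

```python
class SummarizationError(RuntimeError):
    """Illustrative stand-in for koko's SummarizationError."""


def run_speak(text: str, summarize: bool, summarizer, synthesizer) -> str:
    """Sketch of the summarize-then-synthesize ordering shown above.

    When summarize is enabled, any summarizer failure propagates before
    synthesis runs, so no audio is ever produced on error (the strict
    no-op failure semantics described in the PR summary).
    """
    if summarize:
        text = summarizer(text)  # may raise SummarizationError
    return synthesizer(text)
```

With fakes substituted for the summarizer and synthesizer, the success path synthesizes the summarized text, and a failing summarizer aborts before the synthesizer is ever called.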

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I nibbled long texts down to swift cheer,
Clip, clean, and normalise so each word is clear,
I pinged a tiny LLM, waited for its say,
Returned two short lines for your voice to play,
A rabbit’s small edit — now speak, don’t delay.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — docstring coverage is 30.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed — check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — the title accurately summarizes the main feature: adding optional summarization preprocessing via a local LLM.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 168eed7355



@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (3)
tests/test_cli.py (1)

272-292: Consider using tmp_path fixture instead of hardcoded /tmp path.

Static analysis flags the hardcoded /tmp/koko-offline-summary.wav path (S108). While acceptable in tests, using tmp_path would be more consistent with other tests in this file and avoids potential issues in restricted environments.

♻️ Proposed fix
 def test_offline_summarize_requires_local_llm_base_url(
     monkeypatch: pytest.MonkeyPatch,
     tmp_path: Path,
     capsys: pytest.CaptureFixture[str],
 ) -> None:
     monkeypatch.setattr(cli, "load_settings", lambda: make_settings(default_model_dir=tmp_path / "missing-model-dir"))
+    output_path = tmp_path / "koko-offline-summary.wav"
 
     exit_code = cli.main(
         [
             "--summarize",
             "--llm-base-url",
             "https://api.example.com/v1",
             "--no-play",
             "--output",
-            "/tmp/koko-offline-summary.wav",
+            str(output_path),
             "hello",
         ]
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_cli.py` around lines 272 - 292, In
test_offline_summarize_requires_local_llm_base_url replace the hardcoded
"/tmp/koko-offline-summary.wav" with a file inside the provided tmp_path fixture
(e.g. str(tmp_path / "koko-offline-summary.wav")) when calling cli.main so the
test uses an isolated temp file; update the argument passed in the cli.main
invocation and keep the rest of the test (monkeypatch, expected exit code and
error assertion) unchanged.
src/koko_cli/summarization.py (1)

95-101: Consider handling additional local address formats.

The function correctly identifies 127.0.0.1, localhost, and ::1, but may miss edge cases like:

  • 0.0.0.0 (binds to all interfaces but often used for local servers)
  • IPv6 localhost with brackets: [::1]

This is minor since the primary use case (Ollama at 127.0.0.1) is covered.

💡 Optional: Extend local address detection
 def is_local_llm_base_url(base_url: str) -> bool:
     """Return True when the model endpoint resolves to a local host."""

     parse_target = base_url if "://" in base_url else f"http://{base_url}"
     parsed = urlparse(parse_target)
     hostname = (parsed.hostname or "").lower()
-    return hostname in {"127.0.0.1", "localhost", "::1"}
+    return hostname in {"127.0.0.1", "localhost", "::1", "0.0.0.0"}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/koko_cli/summarization.py` around lines 95 - 101, The
is_local_llm_base_url function only checks for "127.0.0.1", "localhost", and
"::1"; update it to also treat "0.0.0.0" and bracketed IPv6 addresses as local
by normalizing the parsed hostname (strip surrounding brackets if present,
lower-case) and then checking against the expanded set
{"127.0.0.1","localhost","::1","0.0.0.0"}; apply this change inside
is_local_llm_base_url (use the existing parse_target/urlparse logic and hostname
variable) so bracketed IPv6 like "[::1]" and 0.0.0.0 are correctly recognized as
local.
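The broadened check described above can be sketched as follows. This is a hedged sketch assuming the module's existing urlparse-based approach; only the function name and the original host set come from the diff, and the bracket-stripping fallback is illustrative.

```python
from urllib.parse import urlparse


def is_local_llm_base_url(base_url: str) -> bool:
    """Return True when the model endpoint resolves to a local host."""
    # Bare hosts like "0.0.0.0:11434" need a scheme before urlparse
    # will populate .hostname.
    parse_target = base_url if "://" in base_url else f"http://{base_url}"
    # urlparse().hostname lower-cases and strips IPv6 brackets, so
    # "[::1]" normalizes to "::1"; strip("[]") is a defensive fallback.
    hostname = (urlparse(parse_target).hostname or "").lower().strip("[]")
    return hostname in {"127.0.0.1", "localhost", "::1", "0.0.0.0"}
```

This keeps the primary use case (Ollama at 127.0.0.1) intact while also accepting 0.0.0.0 and bracketed IPv6 loopback.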
src/koko_cli/parsers.py (1)

119-124: Minor validation gap: CLI allows values 1-255 that fail at model level.

The positive_int validator permits any value > 0, but SpeakCommand.llm_max_input_chars requires ge=256. Users passing values like --llm-max-input-chars 100 will get a less intuitive validation error from the model layer.

Consider adding a custom validator for >= 256 or documenting the minimum in the help text.

💡 Option: Add minimum to help text
     parser.add_argument(
         "--llm-max-input-chars",
         type=positive_int,
         default=settings.llm_max_input_chars,
-        help="Maximum input characters sent to summarization model (>0).",
+        help="Maximum input characters sent to summarization model (min: 256).",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/koko_cli/parsers.py` around lines 119 - 124, The CLI currently uses
positive_int for --llm-max-input-chars which allows values >0 but
SpeakCommand.llm_max_input_chars requires ge=256; update the parser.add_argument
for "--llm-max-input-chars" to enforce the model's minimum (>=256) by replacing
or wrapping positive_int with a validator that rejects values <256 (or
alternatively append "minimum 256" to the help string) so invalid inputs are
caught at the CLI level and reference the same constraint as
SpeakCommand.llm_max_input_chars.
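One way to enforce the floor at parse time is an argparse type callable. A sketch under stated assumptions: the real min_256_int validator lives in src/koko_cli/text.py and may differ in wording.

```python
import argparse


def min_256_int(value: str) -> int:
    """argparse type callable enforcing the model-level minimum (>= 256)."""
    number = int(value)  # a non-integer raises ValueError, which argparse
    # reports as a standard usage error
    if number < 256:
        raise argparse.ArgumentTypeError(f"must be >= 256, got {number}")
    return number
```

Wired in as `type=min_256_int` on `--llm-max-input-chars`, invalid values are rejected with argparse's usual usage message instead of a later model-layer validation error.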

Address review feedback by enforcing --llm-max-input-chars >= 256 at parse time, broadening local URL detection (0.0.0.0/[::1]), using tmp_path in tests, and lazy-loading pydantic-ai so base CLI remains usable when summarize deps are absent.
Remove unresolved pydantic-ai requirement from wheel metadata during Nix build until nixpkgs exposes the package. Summarize mode now lazy-loads the dependency and errors only when invoked without it.
Run fast format/lint/type hooks on pre-commit and move full lint/test/nix build gates to pre-push. Update just recipes and README examples for precommit-run/prepush-run.

rbright commented Feb 19, 2026

Addressed the review comments and CI failure in follow-up commits:

  • fixed nit comments:
    • tests/test_cli.py: replaced hardcoded /tmp test output path with tmp_path
    • src/koko_cli/summarization.py: expanded local endpoint detection to include 0.0.0.0 and bracketed IPv6
    • src/koko_cli/parsers.py: enforce --llm-max-input-chars minimum at parse time (>= 256)
  • fixed CI/Nix build failure (pydantic-ai not installed) by:
    • lazy-loading pydantic-ai in summarize path
    • adding pythonRemoveDeps = [ "pydantic-ai" ] for Nix runtime metadata check compatibility
  • split hooks by weight:
    • lightweight pre-commit hooks
    • heavier pre-push hooks (lint, test, nix build, nix run --help)
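The lazy-loading fix described above has roughly this shape. Names here are illustrative stand-ins, not the exact module code; the point is that the import happens only when the summarize path runs.

```python
import importlib


class SummarizationError(RuntimeError):
    """Illustrative stand-in for koko's SummarizationError."""


def load_pydantic_ai(module_name: str = "pydantic_ai"):
    """Import the optional dependency only when summarize mode is invoked.

    The base CLI never triggers this import, so a build without
    pydantic-ai (e.g. the Nix wheel with pythonRemoveDeps) stays usable.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise SummarizationError(
            "Summarize mode requires the optional 'pydantic-ai' dependency; "
            "install it or run without --summarize."
        ) from error
```

The error surfaces as a typed SummarizationError only when --summarize is actually used, matching the no-op failure semantics elsewhere in the PR.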

Local verification rerun:

  • just fmt
  • just fmt-nix
  • just lint
  • just test (30 passed)
  • just smoke-e2e /tmp/koko-local-assets /tmp/koko-smoke.wav "Koko local inference smoke test"
  • just smoke-llm /tmp/koko-local-assets http://127.0.0.1:11434/v1 mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q6_K /tmp/koko-llm-smoke.wav "### Build\n- tests passed\n- changed 14 files"
  • uv run prek validate-config .pre-commit-config.yaml
  • just precommit-run
  • just prepush-run
  • nix build .#koko
  • nix run .#koko -- --help


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
flake.nix (1)

49-55: Reasonable temporary workaround for missing nixpkgs dependency.

The use of pythonRemoveDeps to strip pydantic-ai from wheel metadata is appropriate given the dependency isn't yet packaged in nixpkgs. The lazy-loading strategy mentioned in PR objectives ensures the base CLI remains functional.

Consider adding a TODO or issue reference to track when pydantic-ai becomes available in nixpkgs, so this workaround can be removed:

-          # `pydantic-ai` is not yet packaged in nixpkgs; remove it from wheel
+          # TODO: Remove this workaround once pydantic-ai is packaged in nixpkgs.
+          # `pydantic-ai` is not yet packaged in nixpkgs; remove it from wheel
           # runtime metadata so Nix package checks stay green while base CLI
           # remains usable. Summarization path requires pydantic-ai at runtime.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@flake.nix` around lines 49 - 55, Add a short TODO comment and/or an issue
tracker reference near the pythonRemoveDeps array that removes "pydantic-ai"
(the pythonRemoveDeps entry) so future maintainers know this is a temporary
workaround and can revert it once pydantic-ai lands in nixpkgs; include the
expected condition (e.g., "remove when pydantic-ai is available in nixpkgs" or a
link/ID to the tracking issue) and attach it adjacent to the pythonRemoveDeps
block to make it easy to find.
src/koko_cli/summarization.py (1)

67-70: Remove unused noqa directive.

Static analysis indicates # noqa: BLE001 is unused. The broad Exception catch is appropriate here to handle any LLM errors gracefully, but the directive can be removed.

Proposed fix
     try:
         result = agent.run_sync(prepared_text, model_settings=ModelSettings(timeout=timeout_seconds))
-    except Exception as error:  # noqa: BLE001
+    except Exception as error:
         raise SummarizationError(f"Summarization request failed: {error}") from error
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/koko_cli/summarization.py` around lines 67 - 70, Remove the unused noqa
directive on the broad exception handler: in the try/except around
agent.run_sync(prepared_text,
model_settings=ModelSettings(timeout=timeout_seconds)) delete the "# noqa:
BLE001" from the line "except Exception as error:" so the except block remains
catching Exception and raising SummarizationError(f"Summarization request
failed: {error}") from error unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@justfile`:
- Around line 68-71: The justfile targets smoke-llm (and similarly smoke-e2e)
pass the output path unquoted which breaks on paths with spaces; edit the uv run
command in the smoke-llm recipe (and the smoke-e2e recipe on the adjacent line)
to wrap the {{ output }} variable in quotes in the command that invokes koko so
the --output argument becomes "--output '{{ output }}'"; this change ensures
path safety while keeping {{ text }} handling unchanged.

In `@README.md`:
- Around line 122-127: Update the README flag description for
--llm-max-input-chars to reflect the actual enforced constraint: replace "(> 0)"
with "(>= 256)" since the code uses the min_256_int validator; locate the list
entry for --llm-max-input-chars in the flags/options section and change the
constraint text to match min_256_int.


  • align README llm-max-input-chars docs with enforced >= 256 constraint
  • quote just smoke recipe output paths for space-safe handling
  • add TODO on temporary pythonRemoveDeps pydantic-ai workaround in Nix
  • remove stale noqa marker from summarize exception handler

rbright commented Feb 19, 2026

Follow-up bot comments are now addressed and pushed in 0504191:

  • README.md: updated --llm-max-input-chars docs to >= 256
  • justfile: quoted {{ output }} in smoke-e2e and smoke-llm
  • flake.nix: added TODO note clarifying pythonRemoveDeps = [ "pydantic-ai" ] is temporary
  • src/koko_cli/summarization.py: removed stale # noqa: BLE001

I also resolved the corresponding review threads.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/koko_cli/summarization.py`:
- Around line 79-88: The function truncate_summary_input currently appends a
truncation notice after slicing to max_input_chars which can make the final
string exceed the intended cap; update truncate_summary_input to reserve space
for the notice by computing note = "\n\n[Input truncated before summarization.]"
and using allowed = max_input_chars - len(note) to slice normalized[:allowed]
when allowed > 0, then append the note; if allowed <= 0, return
normalized[:max_input_chars].rstrip() (i.e., omit the notice when there isn’t
room) so the returned string never exceeds max_input_chars.

Reserve truncation-note budget in truncate_summary_input so returned text never exceeds max_input_chars, and add regression tests for both notice-fit and small-cap fallback paths.

rbright commented Feb 19, 2026

Addressed the new CodeRabbit feedback in 527eea2:

  • truncate_summary_input now reserves space for the truncation note so returned text never exceeds max_input_chars
  • if the cap is too small to include the note, it now returns a hard-capped truncated string without the note
  • added regression tests for both paths in tests/test_summarization.py
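The budget arithmetic behind that fix can be sketched as below. The helper name matches the one discussed; the notice wording and module-level constant are assumptions for illustration.

```python
TRUNCATION_NOTE = "\n\n[Input truncated before summarization.]"


def truncate_summary_input(normalized: str, max_input_chars: int) -> str:
    """Cap input length, reserving room for the truncation notice.

    The result never exceeds max_input_chars: the note's length is
    subtracted from the budget before slicing, and when the cap is too
    small to fit the note at all, the text is hard-capped without it.
    """
    if len(normalized) <= max_input_chars:
        return normalized
    allowed = max_input_chars - len(TRUNCATION_NOTE)
    if allowed > 0:
        return normalized[:allowed].rstrip() + TRUNCATION_NOTE
    return normalized[:max_input_chars].rstrip()
```

Reserving the note's length up front is what makes the "never exceeds max_input_chars" invariant hold on both paths, which is exactly what the two regression tests cover.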

Verification rerun:

  • just fmt
  • just fmt-nix
  • just lint
  • just test (31 passed)
  • just smoke-e2e /tmp/koko-local-assets /tmp/koko-smoke.wav "Koko local inference smoke test"
  • just smoke-llm /tmp/koko-local-assets http://127.0.0.1:11434/v1 mistralai_Mistral-Small-3.2-24B-Instruct-2506-Q6_K /tmp/koko-llm-smoke.wav "### Build\n- tests passed\n- changed 14 files"

Resolved the corresponding review thread.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_summarization.py`:
- Around line 62-63: The fake agent stub method run_sync currently declares
parameters as args and kwargs which triggers Ruff ARG002; rename them to _args
and _kwargs (or prefix each unused parameter with an underscore) in run_sync and
the other stub methods at the locations referenced (e.g., the similar stub
around lines 102-103) so unused parameters are prefixed with underscores and the
functions still return FakeResult() unchanged.

@rbright rbright merged commit 0a15220 into main on Feb 19, 2026
3 checks passed
@rbright rbright deleted the feat/llm-summarize-filter branch on February 19, 2026 at 16:36