tombee commented Jan 8, 2026

Summary

Implement a two-tier E2E testing strategy with mock-based tests and smoke tests to ensure workflow execution quality and validate executor functionality.

  • MockLLMProvider: Deterministic testing without external LLM dependencies
  • Test harness: Assertion helpers for flexible workflow validation
  • 25 mock E2E tests: Cover workflow execution, step sequencing, conditional logic, error handling, and tool integration
  • 10 smoke tests: Real integration with Ollama (free, local-first) and Anthropic APIs
  • 5 YAML workflow fixtures: Sample workflows for executor capability testing
  • New Makefile targets: test-e2e and test-smoke for easy test execution

Key Components

Mock E2E Tests

  • Simple workflow execution
  • Multi-step workflows with tool calling
  • Error handling and recovery
  • Conditional step execution
  • Integration with actions
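
To illustrate how a mock provider makes these tests deterministic, here is a minimal sketch. The `Provider` interface and its `Complete` method are assumptions standing in for the project's `llm.Provider`, whose actual method set is not shown in this PR; `runPrompts` and `ErrNoResponse` are likewise hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Provider is a hypothetical stand-in for the project's llm.Provider
// interface; the real method set is not shown in this PR description.
type Provider interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// ErrNoResponse is returned once the scripted responses are used up.
var ErrNoResponse = errors.New("mock: no scripted response left")

// MockLLMProvider returns scripted responses in order and records every
// prompt it receives, so a test can assert on both sides of the exchange.
type MockLLMProvider struct {
	responses []string
	Prompts   []string
}

// NewMockLLMProvider builds a mock that replies with the given responses.
func NewMockLLMProvider(responses ...string) *MockLLMProvider {
	return &MockLLMProvider{responses: responses}
}

// Complete records the prompt and pops the next scripted response.
func (m *MockLLMProvider) Complete(ctx context.Context, prompt string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	m.Prompts = append(m.Prompts, prompt)
	if len(m.responses) == 0 {
		return "", ErrNoResponse
	}
	next := m.responses[0]
	m.responses = m.responses[1:]
	return next, nil
}

// Compile-time check that the mock satisfies the interface.
var _ Provider = (*MockLLMProvider)(nil)

// runPrompts sends each prompt in turn and returns the outputs collected
// before the first error, together with that error (nil if none).
func runPrompts(p Provider, prompts ...string) ([]string, error) {
	var outs []string
	for _, prompt := range prompts {
		out, err := p.Complete(context.Background(), prompt)
		if err != nil {
			return outs, err
		}
		outs = append(outs, out)
	}
	return outs, nil
}

func main() {
	mock := NewMockLLMProvider("step one done", "step two done")
	outs, err := runPrompts(mock, "run step 1", "run step 2", "run step 3")
	fmt.Printf("outputs=%q err=%v prompts=%q\n", outs, err, mock.Prompts)
}
```

Because the responses are scripted, the same workflow produces the same outputs on every run, and exhausting the script surfaces as an explicit error rather than a hang or a network call.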

Smoke Tests

  • Ollama-based tests (free, locally-runnable)
  • Anthropic API integration tests
  • Common smoke test utilities and setup
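
One way the shared smoke test setup could decide whether a provider is usable is sketched below. The helper name `smokeSkipReason` is hypothetical, and reading the key from `ANTHROPIC_API_KEY` is an assumption about this project's configuration; Ollama is left ungated here because it needs no API key, although a real suite might also probe for a running local server.

```go
package main

import (
	"fmt"
	"os"
)

// smokeSkipReason reports why a smoke test for the given provider should
// be skipped, or "" if it can run. The getenv parameter is injected so the
// decision can be checked without touching the real environment.
func smokeSkipReason(provider string, getenv func(string) string) string {
	switch provider {
	case "anthropic":
		// Assumption: the API key is supplied via ANTHROPIC_API_KEY.
		if getenv("ANTHROPIC_API_KEY") == "" {
			return "ANTHROPIC_API_KEY is not set"
		}
		return ""
	case "ollama":
		// Ollama runs locally and needs no API key.
		return ""
	default:
		return "unknown provider " + provider
	}
}

func main() {
	for _, p := range []string{"ollama", "anthropic"} {
		if reason := smokeSkipReason(p, os.Getenv); reason != "" {
			fmt.Printf("%s: skip (%s)\n", p, reason)
			continue
		}
		fmt.Printf("%s: run\n", p)
	}
}
```

Inside a test function, the returned reason would typically be passed to `t.Skip` so that a missing key shows up as a skipped test instead of a failure.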

Testing Infrastructure

  • test/e2e/harness: Core testing framework with MockLLMProvider, test harness, assertions
  • test/e2e/testdata: YAML workflow fixtures
  • Extended testing/integration/config for mock provider setup
  • New LLMProvider options in the SDK for test configuration
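
The SDK change can be pictured with Go's functional options pattern. This is only a sketch: `Client`, `Option`, `WithLLMProvider`, `NewClient`, and the one-method `Provider` interface are hypothetical names, since the PR description does not show the SDK's actual API.

```go
package main

import "fmt"

// Provider is a hypothetical, minimal provider interface for this sketch.
type Provider interface {
	Name() string
}

// namedProvider is a trivial Provider used only for illustration.
type namedProvider string

func (n namedProvider) Name() string { return string(n) }

// Client is a hypothetical SDK entry point that holds the LLM provider.
type Client struct {
	provider Provider
}

// Option mutates a Client during construction.
type Option func(*Client)

// WithLLMProvider overrides the default provider, for example with a mock.
func WithLLMProvider(p Provider) Option {
	return func(c *Client) { c.provider = p }
}

// NewClient applies the options on top of a default configuration.
func NewClient(opts ...Option) *Client {
	c := &Client{provider: namedProvider("default")}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	prod := NewClient()
	test := NewClient(WithLLMProvider(namedProvider("mock")))
	fmt.Println(prod.provider.Name(), test.provider.Name())
}
```

An option of this shape lets tests swap in a mock provider at construction time while production callers keep the default.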

Test Plan

  • All mock E2E tests pass with MockLLMProvider
  • Test harness correctly executes workflows and captures results
  • Assertion helpers validate step outputs and tool executions
  • Smoke tests can run against Ollama and Anthropic (when API keys are available)
  • Makefile targets work correctly (make test-e2e, make test-smoke)
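
The assertion helpers named in this PR (AssertStepExecution, AssertToolExecution) might take a shape like the following. The signatures, the `Result` type, and the `TB` interface are assumptions for illustration; only the helper names come from the PR.

```go
package main

import "fmt"

// TB is the subset of *testing.T that the helpers need, so they can be
// exercised here without a real test run.
type TB interface {
	Helper()
	Errorf(format string, args ...any)
}

// Result is a hypothetical record of one workflow run.
type Result struct {
	StepOutputs map[string]string // step ID -> output
	ToolCalls   []string          // tool names in call order
}

// AssertStepExecution checks that a step ran and produced the wanted output.
func AssertStepExecution(t TB, r Result, stepID, want string) {
	t.Helper()
	got, ok := r.StepOutputs[stepID]
	if !ok {
		t.Errorf("step %q did not run", stepID)
		return
	}
	if got != want {
		t.Errorf("step %q output = %q, want %q", stepID, got, want)
	}
}

// AssertToolExecution checks that the named tool was called at least once.
func AssertToolExecution(t TB, r Result, tool string) {
	t.Helper()
	for _, name := range r.ToolCalls {
		if name == tool {
			return
		}
	}
	t.Errorf("tool %q was never called; calls: %v", tool, r.ToolCalls)
}

// recorder collects failures instead of failing a real test.
type recorder struct{ failures []string }

func (r *recorder) Helper() {}

func (r *recorder) Errorf(format string, args ...any) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

func main() {
	res := Result{
		StepOutputs: map[string]string{"fetch": "ok"},
		ToolCalls:   []string{"http"},
	}
	rec := &recorder{}
	AssertStepExecution(rec, res, "fetch", "ok") // passes
	AssertToolExecution(rec, res, "shell")       // records a failure
	fmt.Println(rec.failures)
}
```

Accepting a small interface instead of `*testing.T` directly keeps the helpers usable from real tests while letting the helpers themselves be tested.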

tombee added 2 commits January 8, 2026 18:54
Implement a two-tier E2E testing strategy to ensure workflow execution quality:

- MockLLMProvider: Implements llm.Provider interface for deterministic testing without external LLM dependencies
- Test harness: Provides assertion helpers (AssertWorkflowOutput, AssertToolExecution, AssertStepExecution) for flexible workflow validation
- 25 mock E2E tests: Cover workflow execution, step sequencing, conditional logic, error handling, and tool integration
- 10 smoke tests: Verify real integration with Ollama (free, local-first approach) and Anthropic APIs
- 5 YAML workflow fixtures: Sample workflows for testing various executor capabilities

New Makefile targets enable easy test execution:
- test-e2e: Run all mock-based E2E tests
- test-smoke: Run integration smoke tests with real LLM providers

Updates to testing infrastructure:
- Extended testing/integration/config to support mock provider setup
- Added LLMProvider options to SDK for test configuration
tombee merged commit 909e741 into main Jan 8, 2026
1 of 4 checks passed
tombee deleted the spec/SPEC-7 branch January 8, 2026 18:56