feat: add MCP contract testing for distributed AI systems by hidai25 · Pull Request #21 · hidai25/eval-view

hidai25 · 2026-02-07T20:00:04Z

Summary

Adds MCP (Model Context Protocol) contract testing to detect external server interface drift
Includes contract diffing engine, MCP adapter, and CLI integration
Adds comprehensive test suite and documentation

Test plan

Unit tests added in tests/test_mcp_contracts.py (495 lines)
Manual testing of evalview CLI with MCP contract commands
Verify action.yml changes work in CI

🤖 Generated with Claude Code

…tection

Adds the ability to snapshot external MCP server tool definitions and detect breaking changes (removed tools, new required params, type changes) before running tests. This addresses Scenario 2 of distributed AI evaluation: when you don't own the MCP server code.

New commands: evalview mcp snapshot/check/list/show/delete
New flag: evalview run --contracts --fail-on CONTRACT_DRIFT
New CI status: CONTRACT_DRIFT (joins REGRESSION, TOOLS_CHANGED, OUTPUT_CHANGED)

https://claude.ai/code/session_019CvwYcAoNoitWBdhEYUbxV
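To make the three breaking-change categories concrete, here is a minimal sketch of a contract diff. It is not the engine added in this PR: `Change`, `diff_contracts`, `has_contract_drift`, and the `{tool_name: input_schema}` snapshot shape are all hypothetical names and assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One difference between a saved snapshot and the live tool list (hypothetical type)."""
    tool: str
    kind: str       # "tool_removed", "required_param_added", "type_changed", "tool_added"
    detail: str
    breaking: bool


def diff_contracts(snapshot: dict, current: dict) -> list[Change]:
    """Compare two {tool_name: JSON-Schema-like input schema} mappings (assumed shape)."""
    changes: list[Change] = []
    for name, old in snapshot.items():
        if name not in current:
            # A tool the agent may call has disappeared.
            changes.append(Change(name, "tool_removed", "tool no longer listed", True))
            continue
        new = current[name]
        old_props = old.get("properties", {})
        new_props = new.get("properties", {})
        # Existing callers will not send a parameter that only became required later.
        added_required = set(new.get("required", [])) - set(old.get("required", []))
        for param in sorted(added_required):
            changes.append(Change(name, "required_param_added", param, True))
        # A parameter present in both versions whose declared type differs.
        for param in sorted(old_props.keys() & new_props.keys()):
            old_type = old_props[param].get("type")
            new_type = new_props[param].get("type")
            if old_type != new_type:
                detail = f"{param}: {old_type} -> {new_type}"
                changes.append(Change(name, "type_changed", detail, True))
    # New tools are reported but treated as informational in this sketch.
    for name in sorted(current.keys() - snapshot.keys()):
        changes.append(Change(name, "tool_added", "new tool", False))
    return changes


def has_contract_drift(changes: list[Change]) -> bool:
    """True when at least one change is breaking."""
    return any(change.breaking for change in changes)
```

In this sketch a run would report CONTRACT_DRIFT only when `has_contract_drift` returns true, so purely additive changes such as a new tool would not fail the build.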

1. BUG: asyncio.run() nested inside run_in_executor in _run_async (already in an async context). Fixed to just await directly.
2. BUG: Contract check ran before fail_on resolved from config.yaml defaults, so config-based fail_on: [CONTRACT_DRIFT] was ignored. Moved contract check after config loading.
3. BUG: _discover_tools_http missing notifications/initialized after init (inconsistent with stdio path, violates MCP protocol).
4. BUG: _discover_tools_http didn't check init response for errors.
5. BUG: DiffStatus docstring said "Four states" but now has five.

Also fixed: unused Optional import, duplicate datetime imports, variable shadowing (adapter/result in _run_async contract block).

https://claude.ai/code/session_019CvwYcAoNoitWBdhEYUbxV
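Bug 1 can be illustrated with a small sketch. This is not the code from _run_async; `runs_on`, `nested`, and `direct` are hypothetical names. The assumption shown here is that wrapping asyncio.run() in run_in_executor starts a second event loop in a worker thread, whereas awaiting the coroutine keeps the work on the loop that is already running.

```python
import asyncio


async def runs_on(expected_loop: asyncio.AbstractEventLoop) -> bool:
    """Report whether this coroutine executes on the loop the caller expected."""
    await asyncio.sleep(0)
    return asyncio.get_running_loop() is expected_loop


async def nested() -> bool:
    # Pattern described in bug 1: the worker thread creates its own event loop
    # via asyncio.run(), so the coroutine does not run on the caller's loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, lambda: asyncio.run(runs_on(loop)))


async def direct() -> bool:
    # The fix described in bug 1: already inside an async context, so just await.
    loop = asyncio.get_running_loop()
    return await runs_on(loop)


async def main() -> tuple:
    return await nested(), await direct()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The nested form does not raise, which is likely why it went unnoticed. Anything bound to the original loop, such as an open client session, would not be safely usable from the second loop.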

… test gaps

1. action.yml: When both diff=true and contracts=true, --fail-on was appended twice. Refactored to set it once after both flags.
2. mcp_adapter: _discover_tools_http incremented _request_id before the notification. JSON-RPC notifications don't carry an id, so incrementing is semantically wrong (wastes an id for the next call).
3. Tests: Removed unused Path import. Added test for summary() with mixed breaking + informational changes. Added test for duplicate tool names in current_tools (edge case).

https://claude.ai/code/session_019CvwYcAoNoitWBdhEYUbxV
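The HTTP discovery fixes (notifications/initialized after init, error checking on the init response, and no id for notifications) can be sketched as below. This is an illustration, not the mcp_adapter code: `JsonRpcBuilder`, `check_response`, and `discovery_messages` are hypothetical, and the `initialize` params are placeholder values.

```python
from __future__ import annotations


class JsonRpcBuilder:
    """Builds JSON-RPC 2.0 messages; only requests consume an id (hypothetical helper)."""

    def __init__(self) -> None:
        self._request_id = 0

    def request(self, method: str, params: dict | None = None) -> dict:
        # Requests expect a response, so each one gets a fresh id.
        self._request_id += 1
        message = {"jsonrpc": "2.0", "id": self._request_id, "method": method}
        if params is not None:
            message["params"] = params
        return message

    def notification(self, method: str, params: dict | None = None) -> dict:
        # Notifications carry no id and must not advance the counter.
        message = {"jsonrpc": "2.0", "method": method}
        if params is not None:
            message["params"] = params
        return message


def check_response(response: dict) -> dict:
    """Raise if the server answered with a JSON-RPC error object, else return the result."""
    if "error" in response:
        error = response["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return response.get("result", {})


def discovery_messages(builder: JsonRpcBuilder) -> list[dict]:
    """Messages a client would send, in order, to list a server's tools."""
    init_params = {
        "protocolVersion": "2024-11-05",   # placeholder value for illustration
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.0"},  # placeholder
    }
    return [
        builder.request("initialize", init_params),
        builder.notification("notifications/initialized"),
        builder.request("tools/list"),
    ]
```

With this split, the `tools/list` request receives id 2 rather than 3, and a failed `initialize` surfaces through `check_response` before any tool discovery is attempted.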

claude added 3 commits February 7, 2026 09:00

hidai25 merged commit 8c66d16 into main Feb 7, 2026
7 checks passed

hidai25 merged 3 commits into main from claude/eval-distributed-ai-systems-vV2v0

2 participants