knjiang commented Jan 15, 2026

TL;DR

Added a validation tool to compare LLM proxy responses against captured snapshots, ensuring API compatibility.

Testing - ran manually:

```
pnpm validate --proxy-url http://localhost:8080 --models openai,anthropic,google

> @braintrust/payload-capture@0.1.0 validate /Users/kenjiang/Development/braintrust/lingua/payloads
> tsx scripts/validate.ts --proxy-url http://localhost:8080 --models openai,anthropic,google


Validating proxy at http://localhost:8080...

chat-completions
  ✗ toolCallRequest [gemini-2.5-flash] (379ms)
    Error: Error: 400 Invalid JSON payload received. Unknown name "function" at 'tools[0]': Cannot find field.
Invalid JSON payload received. Unknown name "type" at 'tools[0]': Cannot find field.
  ✗ toolCallRequest [claude-sonnet-4-20250514] (388ms)
    Error: Error: 400 tool_choice: Input should be a valid dictionary or object to extract fields from
  ✓ simpleRequest [gpt-5-nano] (7024ms)
  ✓ simpleRequest [gemini-2.5-flash] (7442ms)
  ✓ simpleRequest [claude-sonnet-4-20250514] (8436ms)
  ✓ toolCallRequest [gpt-5-nano] (10404ms)
  ✓ reasoningRequest [claude-sonnet-4-20250514] (13769ms)
  ✓ reasoningRequest [gemini-2.5-flash] (13920ms)
  ✓ reasoningRequest [gpt-5-nano] (26758ms)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary: 7 passed, 2 failed
Total time: 88520ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ELIFECYCLE  Command failed with exit code 1.
```

Why make this change?

This validation tool enables systematic testing of LLM proxies and gateways against known-good responses. It helps ensure that proxies maintain API compatibility with original providers.

knjiang marked this pull request as ready for review January 15, 2026 22:09
knjiang changed the title from "add validation tool" to "Add proxy validation tool" Jan 15, 2026
knjiang force-pushed the add_validate_cli_tool_to_test_payloads_against_gateways branch from 52b1449 to e51e53f January 15, 2026 22:21
knjiang requested a review from remh January 15, 2026 22:32
knjiang force-pushed the add_validate_cli_tool_to_test_payloads_against_gateways branch from e51e53f to 7482ae1 January 16, 2026 04:02
remh left a comment

Nice stuff. I have a few tiny nitpicks, feel free to address the ones worth addressing and merge.


```bash
# Test with both OpenAI and Anthropic models through chat-completions format
pnpm validate --proxy-url http://localhost:8080 --models openai,anthropic
```

Tiny nitpick, but we should call the option `provider` instead of `models`.

| `--proxy-url <url>` | Proxy URL (required) |
| `--api-key <key>` | API key for gateway |
| `--format <formats>` | Formats to test: `chat-completions`, `responses`, `anthropic` |
| `--models <models>` | Model providers: `openai`, `anthropic` |

It supports bedrock and google as well, no? Let's list them.

})
.join("\\.");

const regex = new RegExp(`^${regexPattern}$`);

Likely not needed now, but we could cache the regex if things start getting too slow. I doubt that's the case at the moment, though.
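
For reference, a minimal sketch of what such a cache could look like. This is not the code from the PR: `compilePathPattern`, `escapeRegex`, `regexCache`, and the assumption that a `*` segment matches exactly one path segment are all hypothetical. Only the `.join("\\.")` and anchored `new RegExp` steps mirror the hunk above.

```typescript
// Hypothetical sketch: function names and wildcard semantics are assumptions,
// not the implementation in this PR.
const regexCache = new Map<string, RegExp>();

// Escape characters that have special meaning inside a regular expression.
function escapeRegex(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile a dot-separated path pattern (e.g. "choices.*.id") into an anchored
// RegExp, returning the previously compiled instance when one exists.
function compilePathPattern(pattern: string): RegExp {
  const cached = regexCache.get(pattern);
  if (cached !== undefined) return cached;

  const regexPattern = pattern
    .split(".")
    // Assumed semantics: "*" matches exactly one path segment.
    .map((segment) => (segment === "*" ? "[^.]+" : escapeRegex(segment)))
    .join("\\.");

  const regex = new RegExp(`^${regexPattern}$`);
  regexCache.set(pattern, regex);
  return regex;
}
```

Because the compiled regex carries no `g` or `y` flag, `test()` keeps no state between calls, so sharing one instance across comparisons should be safe.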

@@ -0,0 +1,202 @@
// JSON comparison utilities with field ignore support

let's generate a couple of unit tests for those functions?
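
A rough sketch of the shape such tests could take. The `diffJson` function below is a hypothetical stand-in written only so the example runs on its own; the real utilities in this file may have different names, signatures, and ignore semantics (here, ignored paths are matched exactly, with no wildcard support).

```typescript
import { deepStrictEqual } from "node:assert";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical stand-in for the comparison utility under test.
// Returns the dot-separated paths at which `expected` and `actual` differ,
// skipping any path listed in `ignoredPaths`.
function diffJson(
  expected: Json,
  actual: Json,
  ignoredPaths: string[] = [],
  path = "",
): string[] {
  if (ignoredPaths.includes(path)) return [];

  const bothContainers =
    typeof expected === "object" && expected !== null &&
    typeof actual === "object" && actual !== null &&
    Array.isArray(expected) === Array.isArray(actual);
  if (!bothContainers) return expected === actual ? [] : [path];

  const e = expected as Record<string, Json>;
  const a = actual as Record<string, Json>;
  const keys = Array.from(new Set([...Object.keys(e), ...Object.keys(a)]));
  const diffs: string[] = [];
  for (const key of keys) {
    const childPath = path === "" ? key : `${path}.${key}`;
    if (!(key in e) || !(key in a)) {
      // A key present on only one side is a difference unless ignored.
      if (!ignoredPaths.includes(childPath)) diffs.push(childPath);
      continue;
    }
    diffs.push(...diffJson(e[key], a[key], ignoredPaths, childPath));
  }
  return diffs;
}

// Identical payloads produce no differences.
deepStrictEqual(diffJson({ a: 1 }, { a: 1 }), []);
// A changed nested value is reported by its path.
deepStrictEqual(diffJson({ a: { b: 1 } }, { a: { b: 2 } }), ["a.b"]);
// Ignored fields are skipped even when their values differ.
deepStrictEqual(diffJson({ id: "x", n: 1 }, { id: "y", n: 1 }, ["id"]), []);
// A key missing from the actual payload is reported.
deepStrictEqual(diffJson({ a: 1, b: 2 }, { a: 1 }), ["b"]);
```

Cases worth covering against the real functions would likely include array elements, type mismatches, and ignore patterns with wildcards, if the utility supports them.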
