Summary

This PR implements a short‑term, fail‑fast guardrail for issue #7132 and adds a dedicated reproduction sample plus regression tests.

When AssistantAgent is used with an OpenRouter (OpenAI‑compatible) model and output_content_type is a Pydantic model, the OpenAIChatCompletionClient currently routes requests through beta.chat.completions.parse(response_format=...). In this structured‑output mode, the OpenAI API does not support tool calling, so requests that include both response_format and tools result in tool calls being silently ignored.

This PR makes that incompatibility explicit and prevents the “silent tool drop” behavior.


Changes

1. Fail‑fast guardrail in _process_create_args

File: python/packages/autogen-ext/src/autogen_ext/models/openai/_openai_client.py
Method: _process_create_args

Right after converted_tools = convert_tools(tools), we now check for the incompatible combination of structured output and tools:

# Guardrail: structured output (Pydantic model) cannot be combined with tool calling.
# TODO: long-term, this could be a dedicated configuration error type (e.g. IncompatibleModelConfigurationError).
if response_format_value is not None and len(converted_tools) > 0:
    raise ValueError(
        "Cannot use structured output (output_content_type) together with function tools. "
        "The OpenAI structured output API does not support tool calling in this mode. "
        "Either remove output_content_type or remove tools."
    )

Behavior:

  • Triggers only when:
    • response_format_value is set (Pydantic‑based structured output is enabled), and
    • converted_tools is non‑empty (at least one function tool is present).
  • In that case, _process_create_args raises ValueError with a descriptive message.
  • No other behavior in _process_create_args is changed:
    • Structured output without tools still uses the beta parse path.
    • Tools without structured output still use the regular chat completions path.
  • No fallbacks or retries are introduced. Invalid configurations fail fast instead of silently dropping tools (see the caller‑side sketch below).
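
To make the behavior concrete, here is a minimal caller‑side sketch of the three configurations. It is illustrative only and not part of this PR: the model name, API key, and the Weather/get_weather definitions are placeholders, and the create() calls are shown commented out because they need an event loop and real credentials.

# Illustrative sketch only (not part of this PR). Placeholder model name and tool.
from pydantic import BaseModel

from autogen_core.models import UserMessage
from autogen_core.tools import FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient


class Weather(BaseModel):
    city: str


def get_weather(city: str) -> str:
    return f"Weather in {city}"


client = OpenAIChatCompletionClient(model="gpt-4o-mini")  # placeholder model
weather_tool = FunctionTool(get_weather, description="Get the weather for a city.")
messages = [UserMessage(content="What is the weather in Paris?", source="user")]

# Structured output without tools: still uses the beta parse path.
#   await client.create(messages, json_output=Weather)

# Tools without structured output: still uses the regular chat completions path.
#   await client.create(messages, tools=[weather_tool])

# Structured output + tools: now raises ValueError instead of silently dropping the tool.
#   await client.create(messages, tools=[weather_tool], json_output=Weather)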

2. Regression tests for the guardrail

File: python/packages/autogen-ext/tests/models/test_openai_model_client.py

Added a minimal Pydantic model used only for these tests:

class Weather(BaseModel):
    """Minimal Pydantic model for structured-output guardrail tests (issue #7132)."""
    city: str = Field(description="City name.")

Added a minimal tool function:

def _dummy_tool_for_guardrail(city: str) -> str:
    """Minimal tool for testing structured-output vs tools guardrail."""
    return f"Weather in {city}"

Test 1: Pydantic json_output + tools → ValueError

  • test_structured_output_with_tools_raises_value_error
    • Creates a BaseOpenAIChatCompletionClient with a MagicMock underlying client and model_info.structured_output=True.
    • Passes:
      • messages=[UserMessage(...)]
      • tools=[FunctionTool.from_function(_dummy_tool_for_guardrail)]
      • json_output=Weather
    • Calls client._process_create_args(...).
    • Asserts that:
      • A ValueError is raised.
      • The error message contains "Cannot use structured output (output_content_type) together with function tools".

Test 2: Pydantic json_output + no tools → passes

  • test_structured_output_without_tools_passes
    • Same client setup and Weather model.
    • Passes tools=[].
    • Calls client._process_create_args(...).
    • Asserts that:
      • No exception is raised.
      • The returned create_params.response_format is Weather.
      • len(create_params.tools) == 0.

Both tests use a mocked underlying client and only exercise _process_create_args; they do not perform any real network calls.
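
For reference, a condensed variant of the failing‑combination test could also be written against the public create() API: because the guardrail fires inside _process_create_args before any request is issued, the error path needs no network access or response mocking. The sketch below is not the PR's test code; it assumes pytest‑asyncio and reuses the Weather and _dummy_tool_for_guardrail definitions shown above.

# Condensed sketch, not the PR's test code: exercises the guardrail through the public create() API.
# Assumes pytest-asyncio and the Weather / _dummy_tool_for_guardrail definitions shown above.
import pytest

from autogen_core.models import UserMessage
from autogen_core.tools import FunctionTool
from autogen_ext.models.openai import OpenAIChatCompletionClient


@pytest.mark.asyncio
async def test_structured_output_with_tools_raises_value_error_public_api() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o-mini", api_key="sk-placeholder")
    tool = FunctionTool(_dummy_tool_for_guardrail, description="Dummy weather tool.")
    with pytest.raises(ValueError, match="Cannot use structured output"):
        # The guardrail raises before any request is sent, so no network call occurs.
        await client.create(
            [UserMessage(content="What is the weather in Paris?", source="user")],
            tools=[tool],
            json_output=Weather,
        )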

Note: The existing integration tests
test_openai_structured_output_with_tool_calls and
test_openai_structured_output_with_streaming_tool_calls
now fail with this ValueError, which is the intended behavior under the new guardrail (they exercise the disallowed “tools + structured output in a single request” configuration).


3. Reproduction sample

File: python/samples/agentchat_openrouter/assistant_openrouter_output_content_type.py

Sample that reproduces issue #7132 using AssistantAgent with an OpenRouter model:

  • Configures a tool (e.g., get_weather).
  • Sets output_content_type to a Pydantic model.
  • Before this PR: tools were silently ignored and the agent returned a plain text response.
  • After this PR: the call now fails fast with the ValueError described above (see the condensed sketch below).
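
A condensed sketch of such a sample is shown below. It is not the sample file itself: the OpenRouter model id, the model_info values, and the OPENROUTER_API_KEY variable name are placeholders.

# Condensed sketch of the repro, not the sample file itself; model id, model_info
# values, and the OPENROUTER_API_KEY variable name are placeholders.
import asyncio
import os

from pydantic import BaseModel

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient


class WeatherReport(BaseModel):
    city: str
    summary: str


async def get_weather(city: str) -> str:
    return f"Sunny in {city}"


async def main() -> None:
    model_client = OpenAIChatCompletionClient(
        model="openai/gpt-4o-mini",  # placeholder OpenRouter model id
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
        model_info={  # capabilities must be declared explicitly for non-OpenAI endpoints
            "vision": False,
            "function_calling": True,
            "json_output": True,
            "structured_output": True,
            "family": "unknown",
        },
    )
    agent = AssistantAgent(
        "weather_agent",
        model_client=model_client,
        tools=[get_weather],
        output_content_type=WeatherReport,  # structured output + tools: raises ValueError after this PR
    )
    await agent.run(task="What is the weather in Paris?")


if __name__ == "__main__":
    asyncio.run(main())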

Structural / Long‑Term Note

This PR intentionally does not attempt to make “tools + structured output in one request” work, because it is not compatible with the current OpenAI API contract.

  • beta.chat.completions.parse(response_format=...) is designed for returning structured JSON that matches the schema in a single step.
  • Tool calling, on the other hand, is inherently a multi‑step protocol where the model first emits a tool_call, external code runs, and then the model is called again with tool results.

Trying to combine both in a single request is structurally incompatible with the current API behavior.

A more robust long‑term solution would be to change the agent workflow to use a two‑step protocol (sketched below):

  1. First call: regular chat completion with tools enabled (no response_format) to execute tools and collect their outputs.
  2. Second call: structured‑output completion with response_format set and tools disabled, to turn tool outputs into a typed response.
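
For illustration, here is a conceptual sketch of such a two‑step flow at the model‑client level. It is not proposed implementation code: tool execution and message bookkeeping are heavily simplified, and exact field names (e.g. on FunctionExecutionResult) may differ between autogen-core versions.

# Conceptual sketch only, not proposed implementation code. Tool execution and
# message bookkeeping are simplified; exact field names may differ between versions.
import json
from typing import Type

from pydantic import BaseModel

from autogen_core import CancellationToken, FunctionCall
from autogen_core.models import (
    AssistantMessage,
    ChatCompletionClient,
    FunctionExecutionResult,
    FunctionExecutionResultMessage,
    UserMessage,
)
from autogen_core.tools import BaseTool


async def two_step(
    model_client: ChatCompletionClient,
    tool: BaseTool,
    output_type: Type[BaseModel],
    task: str,
) -> BaseModel:
    messages = [UserMessage(content=task, source="user")]

    # Step 1: regular chat completion with tools enabled (no response_format).
    first = await model_client.create(messages, tools=[tool])
    if isinstance(first.content, list):  # the model emitted tool calls
        results = []
        for call in first.content:
            assert isinstance(call, FunctionCall)
            output = await tool.run_json(json.loads(call.arguments), CancellationToken())
            results.append(
                FunctionExecutionResult(
                    call_id=call.id,
                    name=call.name,
                    content=tool.return_value_as_string(output),
                    is_error=False,
                )
            )
        messages.append(AssistantMessage(content=first.content, source="assistant"))
        messages.append(FunctionExecutionResultMessage(content=results))

    # Step 2: structured-output completion with response_format set and tools disabled.
    second = await model_client.create(messages, json_output=output_type)
    return output_type.model_validate_json(second.content)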

This PR is a short‑term, fail‑fast guardrail that makes the incompatibility explicit; it does not change the higher‑level agent workflow.


Testing

Local runs:

  • uv run pytest packages/autogen-ext/tests/models/test_openai_model_client.py
    • 69 passed, 17 skipped.
    • The 2 remaining failures are the existing structured‑output + tools integration tests, which now hit the new ValueError as expected under the guardrail.

The new unit tests:

  • test_structured_output_with_tools_raises_value_error
  • test_structured_output_without_tools_passes

both pass.
