@Sean-Kenneth-Doherty

Why are these changes needed?

Fixes #7170 - UserMessage deserialization fails when content contains both string and Image data.

Problem

When UserMessage.content contains both string and Image in a list (e.g., [image, "describe this"]), JSON deserialization fails with:

Expected dict or Image instance, got <class 'str'>

Root Cause

The Image class's __get_pydantic_core_schema__ used core_schema.any_schema(), which accepts any input type. When Pydantic validates Union[str, Image], it would try the Image validator on strings before trying the str type, causing the validation to fail.

Solution

Changed the Image schema to use core_schema.union_schema with explicit types:

  1. core_schema.is_instance_schema(cls) - for Image instances (pass through)
  2. core_schema.dict_schema() with validator - for JSON dicts with 'data' key

This ensures the Image validator only processes dict/Image inputs, allowing strings to be handled by the str type in Union[str, Image].

Related issue number

Closes #7170

Checks

Testing

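Added a test file (test_image_mixed_content.py) with 5 test cases covering:

- String-only content
- Image-only content
- Mixed content (Image + string), the exact bug scenario from #7170
- String before Image
- Multiple strings and images interleaved

All existing serialization tests continue to pass: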

========================= 13 passed, 1 warning in 0.35s =========================

…ialization (microsoft#7170)

## Problem
When UserMessage.content contains both string and Image in a list (e.g.,
[image, "describe this"]), JSON deserialization fails with:
"Expected dict or Image instance, got <class 'str'>"

## Root Cause
The Image class's __get_pydantic_core_schema__ used core_schema.any_schema(),
which accepts any input type. When Pydantic validates Union[str, Image], it
would try the Image validator on strings before trying the str type, causing
the validation to fail.

## Solution
Changed the Image schema to use core_schema.union_schema with explicit types:
1. core_schema.is_instance_schema(cls) - for Image instances (pass through)
2. core_schema.dict_schema() with validator - for JSON dicts with 'data' key

This ensures the Image validator only processes dict/Image inputs, allowing
strings to be handled by the str type in Union[str, Image].

## Tests
Added comprehensive test file with 5 test cases covering:
- String-only content
- Image-only content
- Mixed content (Image + string) - the exact bug scenario from microsoft#7170
- String before Image
- Multiple strings and images interleaved

All existing serialization tests continue to pass.
@Sean-Kenneth-Doherty
Author

@microsoft-github-policy-service agree

Development

Successfully merging this pull request may close these issues.

when UserMessage have both string and Image data,JSON deserialization cause error
