
[quantization] Introduce wrapper for Qwen3VLVisionPatchEmbed #488

Merged
mhs4670go merged 1 commit into Samsung:main from dvsav:quant_vision_patch_embed
Feb 13, 2026
Conversation


@dvsav dvsav commented Feb 12, 2026

This change introduces the QuantQwen3VLVisionPatchEmbed wrapper to support post-training quantization of the Qwen3VLVisionPatchEmbed module.

Why?

The Qwen3VLVisionPatchEmbed module is used in the image encoder part of the Qwen model.
Attempting to quantize Qwen3VLVisionPatchEmbed via PTQ raises the exception PTQQuantizer: no quantization wrapper for Qwen3VLVisionPatchEmbed.

What

This change introduces:

  • Class QuantQwen3VLVisionPatchEmbed (tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_embed.py).
  • Unit tests: class TestQuantQwen3VLVisionPatchEmbed (test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py), skipped if the transformers package is not installed.
  • New entry quant_vision_patch_embed in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • Example of Qwen3VLVisionPatchEmbed quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_patch_embed.py).
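
To make the mechanics concrete, here is a minimal, self-contained sketch of the observer-based PTQ pattern that a wrapper like QuantQwen3VLVisionPatchEmbed typically relies on. The names below (MinMaxObserver, fake_quantize) are illustrative placeholders, not TICO's actual API:

```python
# Hypothetical sketch of observer-based post-training quantization.
# A quant wrapper records activation ranges during calibration, then
# simulates int8 error with quantize->dequantize round-trips.

class MinMaxObserver:
    """Tracks the running min/max of calibration activations."""

    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def qparams(self, n_bits=8):
        # Asymmetric affine quantization: real = scale * (q - zero_point)
        qmin, qmax = 0, 2 ** n_bits - 1
        scale = (self.max_val - self.min_val) / (qmax - qmin)
        zero_point = round(qmin - self.min_val / scale)
        return scale, zero_point


def fake_quantize(values, scale, zero_point, n_bits=8):
    """Quantize then immediately dequantize, to simulate int8 error."""
    qmin, qmax = 0, 2 ** n_bits - 1
    out = []
    for v in values:
        q = max(qmin, min(qmax, round(v / scale + zero_point)))
        out.append(scale * (q - zero_point))
    return out


obs = MinMaxObserver()
obs.observe([-1.0, 0.5, 2.0])  # one calibration pass
scale, zp = obs.qparams()
deq = fake_quantize([-1.0, 0.5, 2.0], scale, zp)
# Round-trip error is bounded by one quantization step.
assert all(abs(a - b) <= scale for a, b in zip([-1.0, 0.5, 2.0], deq))
```

The wrapper's mode transitions exercised by the unit tests (collect stats, then quantize) correspond to the observe/qparams split above.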

Unit Tests

Unit test results with coverage information:

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py -v
======================================================================================= test session starts ========================================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 9 items                                                                                                                                                                                  

test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_activation_stats_collected PASSED                                          [ 11%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_different_batch_sizes      PASSED                                          [ 22%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_forward_diff               PASSED                                          [ 33%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_mode_transitions           PASSED                                          [ 44%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_multiple_calibration_steps PASSED                                          [ 55%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_observer_count             PASSED                                          [ 66%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_output_shape               PASSED                                          [ 77%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_proj_override              PASSED                                          [ 88%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py::TestQuantQwen3VLVisionPatchEmbed::test_registration_in_registry   PASSED                                          [100%]

================================================================================== 9 passed, 2 warnings in 6.50s ===================================================================================

Coverage info (irrelevant files skipped):

$ coverage report -m
Name                                                                   Stmts   Miss  Cover   Missing
----------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/nn/quant_conv3d.py                       30      0   100%
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_embed.py      30      0   100%
...
----------------------------------------------------------------------------------------------------
TOTAL                                                                  10170   6520    36%

@dvsav dvsav force-pushed the quant_vision_patch_embed branch 4 times, most recently from 13abd1a to 0aac4e4 Compare February 12, 2026 12:51
@dvsav dvsav marked this pull request as ready for review February 12, 2026 13:00
@@ -0,0 +1,236 @@
# Copyright (c) 2025 Samsung Electronics Co., Ltd. All Rights Reserved

Suggested change:
- # Copyright (c) 2025 Samsung Electronics Co., Ltd. All Rights Reserved
+ # Copyright (c) 2026 Samsung Electronics Co., Ltd. All Rights Reserved

@dvsav (author) replied:

👍 done

Comment on lines 36 to 41
cfg = Qwen3VLVisionConfig(
    hidden_size=1024,  # Match Qwen3-VL's hidden size
    spatial_merge_size=2,
    temporal_merge_size=2,
)
model = Qwen3VLVisionPatchEmbed(cfg)
@dayo09 (Contributor):

(Just to note) Oh...? This model looks a bit different from my vision patch embed. Maybe because of spatial_merge_size...

@mhs4670go (Contributor):

@dayo09 How different is it?

@dayo09 (Contributor):

@mhs4670go I cannot attach image files here, see here

@dayo09 (Contributor) commented Feb 13, 2026:

@dvsav

Below is our target configuration for this layer — could you use it?

Qwen3VLVisionPatchEmbed(
  (proj): Conv3d(3, 1024, kernel_size=(2, 16, 16), stride=(2, 16, 16))
)

'args': ('Tensor(shape=[468, 1536], dtype=torch.float32)',)

The reason is that your current example leaves a float32 ADD operator remaining (see #489 for details).
We are planning to lower the specific Conv3d operator above into Conv2d+Reshape (@llFreetimell is working on it). The specifics above are derived from a use-case scenario (which is not 100% fixed for now, though).
Thus, it would be good to provide the quantization example with the version above.

(+ Do you have any specific reason for choosing your current configuration of Qwen3VLVisionPatchEmbed?)
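
As a sanity sketch of the target configuration quoted above: each row of the [468, 1536] input is one flattened patch of 3 channels x 2 temporal frames x 16 x 16 pixels, and because the Conv3d's kernel_size equals its stride, the convolution acts as a per-patch linear projection into hidden_size. The patch count 468 is taken from the example input, not derived here:

```python
# Shape arithmetic for the quoted target Conv3d(3, 1024, (2, 16, 16), (2, 16, 16)).
in_channels, hidden_size = 3, 1024
temporal_patch, patch_h, patch_w = 2, 16, 16  # kernel == stride: non-overlapping patches

# Each row of the [468, 1536] input is one flattened 3x2x16x16 patch.
features_per_patch = in_channels * temporal_patch * patch_h * patch_w
assert features_per_patch == 1536

# With kernel_size == stride, the Conv3d reduces to a per-patch linear
# projection: [num_patches, 1536] -> [num_patches, hidden_size].
num_patches = 468
out_shape = (num_patches, hidden_size)
assert out_shape == (468, 1024)
```

This per-patch-linear equivalence is also what makes the planned Conv3d-to-Conv2d+Reshape lowering possible.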

@dvsav (author) commented Feb 13, 2026:

As below are our target configuration for this layer, could you use this?

@dayo09 👍 Thanks for noticing this! I've changed the example code and added assertions checking that the Conv3d has the right configuration.

@dayo09 (Contributor):

@dvsav Well, after applying the config, the graph remains the same. (I am sorry that I cannot show you the image; I am not yet permitted to upload images, but I will sort that out soon.)

The convolution's weight is lifted up as a constant input and is not constant-folded. I believe constant folding after quantization is required in this case. 😅
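
For readers unfamiliar with the suggestion: constant folding evaluates subgraphs whose inputs are all constants (such as a weight wrapped in quantize/dequantize ops) offline, so those float ops disappear from the runtime graph. A generic toy illustration, not TICO's actual pass:

```python
# Toy constant folding over a tuple-encoded expression graph.
# A node is either a numeric constant leaf or (op_name, [children]).

def fold_constants(node):
    """Recursively evaluate subtrees whose inputs are all constants."""
    if isinstance(node, (int, float)):  # already a constant leaf
        return node
    op, args = node
    folded = [fold_constants(a) for a in args]
    if folded and all(isinstance(a, (int, float)) for a in folded):
        if op == "add":
            return folded[0] + folded[1]
        if op == "mul":
            return folded[0] * folded[1]
    return (op, folded)  # input-dependent: keep the node

# The constant ("add", [2.0, 3.0]) branch folds to 5.0; the
# input-dependent "mul" survives in the graph.
graph = ("mul", [("add", [2.0, 3.0]), ("input", [])])
assert fold_constants(graph) == ("mul", [5.0, ("input", [])])
```

Running such a pass after quantization would precompute the lifted weight constant instead of leaving its float ops in the Circle graph.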

This change introduces QuantQwen3VLVisionPatchEmbed wrapper to support post-training quantization of Qwen3VLVisionPatchEmbed module.

TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
@dvsav dvsav force-pushed the quant_vision_patch_embed branch from 0aac4e4 to 20d536b Compare February 13, 2026 07:18
@dayo09 dayo09 left a comment


LGTM

@mhs4670go mhs4670go left a comment


LGTM

@mhs4670go mhs4670go merged commit 6a244c2 into Samsung:main Feb 13, 2026
7 checks passed
@dvsav dvsav deleted the quant_vision_patch_embed branch February 13, 2026 08:23

3 participants