[quantization] Introduce wrapper for Qwen3VLVisionPatchEmbed #488
mhs4670go merged 1 commit into Samsung:main
Conversation
Force-pushed from 13abd1a to 0aac4e4
@@ -0,0 +1,236 @@
# Copyright (c) 2025 Samsung Electronics Co., Ltd. All Rights Reserved
Suggested change:
- # Copyright (c) 2025 Samsung Electronics Co., Ltd. All Rights Reserved
+ # Copyright (c) 2026 Samsung Electronics Co., Ltd. All Rights Reserved
cfg = Qwen3VLVisionConfig(
    hidden_size=1024,  # Match Qwen3-VL's hidden size
    spatial_merge_size=2,
    temporal_merge_size=2,
)
model = Qwen3VLVisionPatchEmbed(cfg)
(Just to note) Oh...? This model looks a bit different from my vision patch embed. Maybe because of spatial_merge_size...
@mhs4670go I cannot attach image files here; see here
Below is our target configuration for this layer; could you use this?
Qwen3VLVisionPatchEmbed(
(proj): Conv3d(3, 1024, kernel_size=(2, 16, 16), stride=(2, 16, 16))
)
'args': ('Tensor(shape=[468, 1536], dtype=torch.float32)',)
The reason is that your current example leaves some float32 ADD operators in the graph. (See #489 for details.)
We are planning to lower the above Conv3d operator into Conv2d+Reshape (@llFreetimell is working on it). The above specifics are derived from a use-case scenario (which is not 100% fixed for now, though).
Thus, it would be good to provide the quantization example with the above version.
(+ Do you have any specific reason for choosing this configuration of Qwen3VLVisionPatchEmbed?)
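For context, here is a minimal PyTorch sketch of a patch embed whose proj matches the target Conv3d above, fed with the [468, 1536] example input (1536 = 3 * 2 * 16 * 16). This is not the transformers implementation; the class name PatchEmbedSketch and its parameter names are invented for illustration.
import torch
import torch.nn as nn

class PatchEmbedSketch(nn.Module):
    def __init__(self, in_channels=3, hidden_size=1024, patch_size=16, temporal_patch_size=2):
        super().__init__()
        self.in_channels = in_channels
        self.patch_size = patch_size
        self.temporal_patch_size = temporal_patch_size
        # Matches the target: Conv3d(3, 1024, kernel_size=(2, 16, 16), stride=(2, 16, 16))
        self.proj = nn.Conv3d(
            in_channels,
            hidden_size,
            kernel_size=(temporal_patch_size, patch_size, patch_size),
            stride=(temporal_patch_size, patch_size, patch_size),
        )

    def forward(self, x):
        # x: [num_patches, C * T * P * P] = [468, 3 * 2 * 16 * 16] = [468, 1536]
        x = x.view(-1, self.in_channels, self.temporal_patch_size, self.patch_size, self.patch_size)
        return self.proj(x).view(-1, self.proj.out_channels)  # -> [468, 1024]

emb = PatchEmbedSketch()
print(emb(torch.randn(468, 1536)).shape)  # torch.Size([468, 1024])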
> Below is our target configuration for this layer; could you use this?
@dayo09 👍 Thanks for noticing this! I've changed the example code and added assertions checking that the Conv3d has the right configuration.
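A hypothetical sketch of the kind of Conv3d configuration checks mentioned here (the actual assertions live in the PR's example script; the helper name is invented):
import torch

def check_patch_embed_proj(model) -> None:
    # Assert the patch embed's proj matches the target Conv3d configuration.
    proj = model.proj
    assert isinstance(proj, torch.nn.Conv3d)
    assert proj.in_channels == 3 and proj.out_channels == 1024
    assert tuple(proj.kernel_size) == (2, 16, 16)
    assert tuple(proj.stride) == (2, 16, 16)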
@dvsav Well, after applying the config, the graph remains the same. (I am sorry that I cannot show you the image; I am not yet permitted to upload images. I will sort that out soon to alleviate your inconvenience.)
The convolution's weight is lifted up as a constant input and is not constant-folded. I believe constant folding after quantization is required in this case. 😅
This change introduces the QuantQwen3VLVisionPatchEmbed wrapper to support post-training quantization of the Qwen3VLVisionPatchEmbed module.
TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
Force-pushed from 0aac4e4 to 20d536b
This change introduces the QuantQwen3VLVisionPatchEmbed wrapper to support post-training quantization of the Qwen3VLVisionPatchEmbed module.

Why?

The Qwen3VLVisionPatchEmbed module is used in the image encoder part of the Qwen model. Trying to quantize Qwen3VLVisionPatchEmbed via PTQ generates the exception PTQQuantizer: no quantization wrapper for Qwen3VLVisionPatchEmbed.
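For readers unfamiliar with the wrapper lookup that produces this error, the following is a purely illustrative sketch of a module-to-wrapper registry dispatch; it is not tico's actual code, and every name below except the quoted exception message is invented for illustration.
from torch import nn

# Illustrative registry: maps a module class to its quantization wrapper class.
_WRAPPERS: dict = {}

def register_wrapper(module_cls, wrapper_cls):
    _WRAPPERS[module_cls] = wrapper_cls

def wrap_for_ptq(module: nn.Module) -> nn.Module:
    wrapper_cls = _WRAPPERS.get(type(module))
    if wrapper_cls is None:
        # Without a registered wrapper, quantization fails as described above.
        raise RuntimeError(
            f"PTQQuantizer: no quantization wrapper for {type(module).__name__}"
        )
    return wrapper_cls(module)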
What

This change introduces:
- QuantQwen3VLVisionPatchEmbed (tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_patch_embed.py).
- class TestQuantQwen3VLTextAttention (test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_patch_embed.py) - skipped if the transformers package is not installed.
- quant_vision_patch_embed in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
- Qwen3VLVisionPatchEmbed quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_qwen_vision_patch_embed.py).

Unit Tests
Unit test results with coverage information:
Coverage info (irrelevant files skipped):