
[quantization] Introduce a wrapper for nn.Embedding #455

Merged: mhs4670go merged 1 commit into Samsung:main from stamalakhov:Embedding_PR on Feb 2, 2026

Conversation

@stamalakhov (Contributor) commented Feb 2, 2026

This commit introduces a wrapper for nn.Embedding.

./ccex test -k "quantization.wrapq.wrappers.nn.test_quant_embedding"
RUN unit tests with -k quantization.wrapq.wrappers.nn.test_quant_embedding ...
test_dtype_override (quantization.wrapq.wrappers.nn.test_quant_embedding.TestQuantEmbedding) ... ok
test_mode_transitions (quantization.wrapq.wrappers.nn.test_quant_embedding.TestQuantEmbedding) ... ok
test_quantised_output_close (quantization.wrapq.wrappers.nn.test_quant_embedding.TestQuantEmbedding) ... ok
test_weight_stats_survive (quantization.wrapq.wrappers.nn.test_quant_embedding.TestQuantEmbedding) ... ok

----------------------------------------------------------------------
Ran 4 tests in 0.005s

OK

Draft: #436
TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>
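
For context, here is a minimal, self-contained sketch of the behaviour that test_quantised_output_close plausibly verifies, written in plain PyTorch rather than against the wrapq API (the sizes, seed, and tolerance are illustrative assumptions): per-channel asymmetric uint8 quantization of the embedding weight, followed by a lookup comparison against the float weight.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, inner_dim = 128, 16
weight = torch.randn(vocab_size, inner_dim)

# Asymmetric uint8 quantization with channel_axis=0:
# one (scale, zero_point) pair per vocabulary row.
w_min = weight.amin(dim=1, keepdim=True)           # (vocab_size, 1)
w_max = weight.amax(dim=1, keepdim=True)           # (vocab_size, 1)
scale = ((w_max - w_min) / 255.0).clamp_min(1e-8)  # (vocab_size, 1)
zero_point = torch.round(-w_min / scale)           # (vocab_size, 1)

q = torch.clamp(torch.round(weight / scale + zero_point), 0, 255)
deq = (q - zero_point) * scale                     # fake-quantized weight

ids = torch.randint(0, vocab_size, (4, 7))
print(torch.allclose(F.embedding(ids, weight),
                     F.embedding(ids, deq), atol=3e-2))  # True: close, not exact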

Review thread on the following snippet:

self.weight_obs = self._make_obs(
    "weight",
    qscheme=QScheme.PER_CHANNEL_ASYMM,  # tensorwise quantization breaks the model
    channel_axis=0,  # weight ~ (vocab_size, inner_dim) so we quantize by inner dimension so that scales ~ (1, vocab_size)
)
@mhs4670go (Contributor)
"inner dimension" is vocab_size here. Right?

@stamalakhov (Contributor, Author)
No. It's hidden_dim from the Llama config.

@stamalakhov (Contributor, Author)
@mhs4670go
I mean inner_dim is the dimension of the internal float representation.

@stamalakhov (Contributor, Author)
@mhs4670go
Do you mean the comment is wrong? The scales have shape ~ (1, vocab_size) for channel_axis = 0 (I've tested it).

@stamalakhov (Contributor, Author)
@mhs4670go
Finally I got it: you meant the "inner dimension" in "so we quantize by inner dimension so that ...". I was trying to underscore the fact that the scales will have shape (1, vocab_size), so the "by inner dimension" wording may be confusing. It can be changed to:

# weight ~ (vocab_size, inner_dim) so that scales ~ (1, vocab_size)

@stamalakhov (Contributor, Author)
@mhs4670go
Changed to a clearer comment.

@mhs4670go (Contributor)
Thank you! It's much clearer!
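
To pin down the point this thread settled on, a minimal sketch (plain PyTorch, not the wrapq observer API; sizes are arbitrary) of why channel_axis=0 on a (vocab_size, inner_dim) weight produces vocab_size scales:

import torch

vocab_size, inner_dim = 1000, 64
weight = torch.randn(vocab_size, inner_dim)

# channel_axis=0: min/max are reduced over every axis except axis 0,
# leaving one scale per vocabulary row.
scale = (weight.amax(dim=1) - weight.amin(dim=1)) / 255.0
print(scale.shape)  # torch.Size([1000]); storing these as (vocab_size,)
                    # or (1, vocab_size) is only a layout choice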

@stamalakhov requested a review from @mhs4670go on February 2, 2026 06:57
This commit introduces a wrapper for nn.Embedding.

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

@mhs4670go (Contributor) left a review:
LGTM

@mhs4670go merged commit bd9c9b5 into Samsung:main on Feb 2, 2026
7 checks passed
@stamalakhov deleted the Embedding_PR branch on February 2, 2026 07:57
@dvsav (Contributor) commented Feb 13, 2026

@stamalakhov Stas, it looks like tico.quantization.wrapq.wrappers.nn.quant_embedding is not registered in tico/quantization/wrapq/wrappers/registry.py, so trying to quantize torch.nn.Embedding still raises the exception "PTQQuantizer: no quantization wrapper for Embedding".

@stamalakhov (Contributor, Author)

Quoting: "trying to quantize torch.nn.Embedding still raises the exception 'PTQQuantizer: no quantization wrapper for Embedding'"

@dvsav
This is known and works as expected. As long as there are no clients of nn.Embedding, there is no point in touching registry.py.
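
If a client of nn.Embedding does appear later, and assuming registry.py holds a plain mapping from float module types to wrapper classes (the actual registry API is not shown in this thread, so every name below is hypothetical), registration would presumably be a one-line addition along these lines:

# Hypothetical sketch only; the real registry.py API may differ.
import torch.nn as nn
from tico.quantization.wrapq.wrappers.nn.quant_embedding import QuantEmbedding

MODULE_WRAPPERS = {
    # ... existing entries for other nn modules ...
    nn.Embedding: QuantEmbedding,  # lets PTQQuantizer resolve a wrapper for Embedding
}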
