
Enable hqq_scale_only by default for XNNPACK #211

Merged
mergennachin merged 1 commit into main from hqq_enable on Feb 4, 2026

Conversation

@mergennachin (Collaborator)

No description provided.

@mergennachin (Collaborator, Author)

cc @metascroy @psiddh @lucylq

@mergennachin (Collaborator, Author)

See previous result in pytorch/executorch#14834

return Int8DynamicActivationIntxWeightConfig(
    weight_dtype=torch.int4,
    weight_granularity=granularity,
    intx_choose_qparams_algorithm="hqq_scale_only",
)
Contributor:

Should we also add this to the other quant methods, like IntxWeightOnlyConfig?
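A sketch of what the reviewer's suggestion might look like, mirroring the keyword names from the diff above. The import path and the exact IntxWeightOnlyConfig signature (in particular whether the granularity keyword matches) are assumptions here, not confirmed torchao API:

```python
import torch
from torchao.quantization import IntxWeightOnlyConfig  # assumed import path


def weight_only_config(granularity):
    # Same qparams algorithm as the dynamic-activation config in the diff;
    # keyword names are assumed to mirror that snippet.
    return IntxWeightOnlyConfig(
        weight_dtype=torch.int4,
        granularity=granularity,
        intx_choose_qparams_algorithm="hqq_scale_only",
    )
```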

Add configurable qparams_algorithm parameter to quantize_model_() with
"hqq_scale_only" as default. HQQ (Half-Quadratic Quantization) provides
better accuracy by minimizing reconstruction error during quantization.

Applied to IntxWeightOnlyConfig and Int8DynamicActivationIntxWeightConfig.
Int4WeightOnlyConfig (4w with packing) keeps hardcoded "hqq" due to
different API. UIntxWeightOnlyConfig (fpa4w) unchanged as it lacks this
parameter.

Users can override globally via:
  from optimum.exporters.executorch import quantization
  quantization.DEFAULT_QPARAMS_ALGORITHM = "affine"
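The scale-only HQQ idea described above can be illustrated with a toy NumPy sketch: start from a plain min-max scale, then alternate between rounding weights to the integer grid and re-fitting the scale in closed form (for fixed codes q, the s minimizing ||w - s*q||^2 is <w, q> / <q, q>). This is an illustration of the general technique only, not torchao's implementation; `quantize_scale_only` is a hypothetical helper:

```python
import numpy as np


def quantize_scale_only(w, bits=4, n_iters=10):
    """Toy scale-only qparams refinement (HQQ-style sketch).

    Alternates between rounding to the symmetric int grid and
    re-fitting the scale via least squares to shrink the
    reconstruction error ||w - s * q||^2.
    """
    qmax = 2 ** (bits - 1) - 1          # 7 for symmetric int4
    s = np.abs(w).max() / qmax          # plain min-max starting scale
    for _ in range(n_iters):
        q = np.clip(np.round(w / s), -qmax - 1, qmax)  # integer codes
        s = float(np.dot(w, q) / np.dot(q, q))         # closed-form scale
    return q.astype(np.int8), s


w = np.array([0.11, -0.42, 0.9, 0.05, -0.77])
q, s = quantize_scale_only(w)
err = float(np.abs(w - s * q).max())    # max reconstruction error
```

Only the scale moves during refinement; the zero point stays fixed at zero, which matches the "scale_only" naming in the config.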
@mergennachin mergennachin merged commit 4c62ed7 into main Feb 4, 2026
66 of 83 checks passed
@mergennachin mergennachin deleted the hqq_enable branch February 4, 2026 16:24