Support torchao MPS 4-bit quantization by manuelcandales · Pull Request #197 · huggingface/optimum-executorch

manuelcandales · 2025-12-16T20:25:48Z

This pull request extends the quantization and device support for the ExecuTorch export pipeline:

Added fpa4w (floating point activation, 4-bit weight for MPS backend) as a valid choice for the --qlinear and --qlinear_encoder command-line arguments.
Added mps as a valid choice for the --device argument.
Integrated the UIntxWeightOnlyConfig from torchao.experimental.quant_api.

larryliu0820 · 2025-12-16T20:35:45Z

optimum/commands/export/executorch.py

        "--qlinear",
        type=str,
-        choices=["8da4w", "4w", "8w", "8da8w", "8da4w,8da8w"],
+        choices=["8da4w", "4w", "8w", "8da8w", "8da4w,8da8w", "fpa4w"],


Does fpa4w work on backends other than metal?

No, it only works with Metal.

Can you add a check, then, that rejects --qlinear fpa4w unless --device mps is passed at the same time?
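The requested check could look roughly like the sketch below. It is a minimal illustration, not the code merged in this PR: the helper names build_parser and validate_quant_device are hypothetical, and the --device choices other than mps are assumed for the example. Only the --qlinear choices and the fpa4w/mps pairing come from the diff and discussion above.

```python
import argparse

# Choices for --qlinear as shown in the diff; "fpa4w" is the new entry.
QLINEAR_CHOICES = ["8da4w", "4w", "8w", "8da8w", "8da4w,8da8w", "fpa4w"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the export command's argument parser."""
    parser = argparse.ArgumentParser(prog="export-sketch")
    parser.add_argument("--qlinear", type=str, choices=QLINEAR_CHOICES, default=None)
    # Assumption: the encoder flag accepts the same set of values.
    parser.add_argument("--qlinear_encoder", type=str, choices=QLINEAR_CHOICES, default=None)
    # Assumption: "cpu" and "cuda" are illustrative; the PR only states that "mps" was added.
    parser.add_argument("--device", type=str, choices=["cpu", "cuda", "mps"], default=None)
    return parser


def validate_quant_device(args: argparse.Namespace) -> None:
    """Reject fpa4w unless the target device is mps (fpa4w only works with Metal)."""
    for flag in ("qlinear", "qlinear_encoder"):
        if getattr(args, flag) == "fpa4w" and args.device != "mps":
            raise ValueError(f"--{flag} fpa4w requires --device mps, got --device {args.device}")
```

Running the validation right after argument parsing means an unsupported combination fails before any model loading or export work starts.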

larryliu0820 · 2025-12-16T20:37:03Z

optimum/exporters/executorch/quantization.py

                )
+            if quant_config_key == "fpa4w":
+                # Need to import to load the ops
+                import torchao.experimental.ops.mps  # noqa: F401


nit: should we import this in torchao.experimental.quant_api, so that from torchao.experimental.quant_api import UIntxWeightOnlyConfig satisfies the import requirement?

import torchao.experimental.ops.mps will raise an error if the op library isn't found. The Metal ops are not built in torchao by default. For that reason, I thought it would be clearer to have an explicit import that loads the ops, rather than loading them as a side effect of importing the config.

That's also why I import torchao.experimental.ops.mps only if quant_config_key == "fpa4w".
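The explicit, conditional import described above can be sketched as follows. This is an illustration under stated assumptions, not the code in quantization.py: the helper maybe_load_mps_ops and its importer parameter are hypothetical, and because the thread does not say which exception type the import raises, the sketch catches any exception and re-raises it with a hint.

```python
import importlib
from typing import Callable, Optional

# Module named in the diff; importing it loads the Metal ops.
MPS_OPS_MODULE = "torchao.experimental.ops.mps"


def maybe_load_mps_ops(
    quant_config_key: Optional[str],
    importer: Callable[[str], object] = importlib.import_module,
) -> bool:
    """Import the MPS op library only when fpa4w is requested.

    Returns True if the import was attempted and succeeded, False if it was skipped.
    The importer parameter is a hypothetical hook that allows testing without torchao.
    """
    if quant_config_key != "fpa4w":
        # Other quantization configs never touch the Metal op library.
        return False
    try:
        importer(MPS_OPS_MODULE)
    except Exception as exc:  # Assumption: the exact exception type is not stated in the thread.
        raise RuntimeError(
            "fpa4w needs the torchao Metal ops, which are not built by default; "
            f"importing {MPS_OPS_MODULE} failed"
        ) from exc
    return True
```

Keeping the import behind the fpa4w branch means users of the other quantization configs are unaffected when the Metal ops have not been built.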

Support torchao MPS 4-bit quantization

0f1ed9d

larryliu0820 self-requested a review December 16, 2025 20:34

larryliu0820 reviewed Dec 16, 2025

Validate fpa4w quantization requires MPS device

d0d993a

larryliu0820 approved these changes Dec 17, 2025

JacobSzwejbka merged commit 96394e4 into huggingface:main Dec 17, 2025
43 of 83 checks passed

JacobSzwejbka merged 2 commits into huggingface:main from manuelcandales:manuel/mps-int4-quant
