
Conversation

chunhuanMeng (Contributor) commented on Dec 5, 2025:

Updates the SYCL kernel compilation flags in the set_build_flags macro to better control floating-point behavior and enable fused multiply-add (FMA) optimizations for both MSVC and GNU compilers.

Compiler flag changes for floating-point behavior and FMA:

  • For MSVC: Added /Qfma to enable FMA instructions, and /Qftz- to disable flush-to-zero mode.
  • For GNU: Replaced several fine-grained floating-point flags with -fno-fast-math for strict floating-point compliance and -fma to enable FMA instructions (see the sketch below).
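A minimal CMake sketch of what the updated branch in set_build_flags could look like; only the flag names come from the description above and the SYCL_KERNEL_OPTIONS variable from the reviewed diff further down, while the if(MSVC) guard and overall structure are assumptions, not the exact change:

# Hypothetical sketch: the MSVC/GNU split and its placement inside
# set_build_flags are assumed from the description, not copied from the diff.
if(MSVC)
  # Enable FMA instruction generation and keep denormals (disable flush-to-zero).
  set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} /Qfma /Qftz-)
else()
  # Strict floating-point semantics, with FMA contraction enabled separately.
  set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-fast-math -fma)
endif()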

Copilot AI review requested due to automatic review settings December 5, 2025 05:44
Copilot AI left a comment:

Pull request overview

This PR optimizes math-related compiler flags for GNU compiler builds by simplifying floating-point behavior control. The change replaces four specific floating-point flags with two more comprehensive options.

Key Changes:

  • Consolidated multiple floating-point flags into -fno-fast-math and -fma for simpler and more predictable floating-point behavior
  • Maintained strict floating-point semantics while enabling fused multiply-add optimizations


@chunhuanMeng added the windows_ci ("Only for Windows CI trigger") label on Dec 24, 2025
@chunhuanMeng changed the title from "Optimize mah related build option" to "Optimize math related build option" on Dec 25, 2025
EikanWang (Contributor) commented:
@chunhuanMeng , I suppose CUDA also enables FMA in the PyTorch build system, right?

chunhuanMeng (Author) replied:

> @chunhuanMeng , I suppose CUDA also enables FMA in the PyTorch build system, right?
Yes, CUDA enables FMA by default.
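For reference, a minimal sketch of how that default is usually expressed on the CUDA side, assuming nvcc's --fmad option (which defaults to true); the CUDA_NVCC_FLAGS list below is illustrative and not PyTorch's actual build code:

# nvcc fuses multiply-add by default (--fmad=true); listing it only makes the
# default explicit. Building with --fmad=false instead disables contraction,
# which helps when isolating numerical differences caused by FMA.
list(APPEND CUDA_NVCC_FLAGS --fmad=true)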

EikanWang (Contributor) commented:
Please help collect the performance data.

set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-associative-math)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-approx-func)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -Wno-absolute-value)
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fno-fast-math)
A reviewer (Contributor) commented on these lines:

@chunhuanMeng , would it be better to enable -ffp-contract=fast?

chunhuanMeng (Author) replied:

-ffp-contract=fast enables floating-point contraction, allowing FMA formation; -fma has the same effect.
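A minimal sketch of the two interchangeable spellings in the GNU branch; which one the SYCL toolchain prefers is an assumption here, not stated in the thread:

# Either line allows FMA formation; append one of them, not both.
set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -fma)
# set(SYCL_KERNEL_OPTIONS ${SYCL_KERNEL_OPTIONS} -ffp-contract=fast)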

EikanWang (Contributor) commented:

Before landing the PR, let's collect performance data.

@EikanWang requested a review from kdrozd-dev on January 5, 2026 12:52
@intel deleted 3 comments from the github-actions bot on Jan 7, 2026
mengfei25 (Contributor) commented:

No performance regression: across the 3 dynamo benchmark suites tested, overall eager is 1.000x and inductor is 1.000x compared with the main branch (2ce9db8).

PR run: https://github.com/intel/torch-xpu-ops/actions/runs/20703251204
main run: https://github.com/intel/torch-xpu-ops/actions/runs/20703257119

github-actions bot commented Jan 7, 2026

Performance outliers, please check!

  • 🔴 Ratio in [-1, 80%): likely a regression

Category                      Model     Target vs. Baseline [Eager]  Target vs. Baseline [Inductor]
torchbench_bfloat16_training  resnet18  0.882706                     0.748951
