Add fltflt division and fltflt operator overloads #1121
Conversation
This PR includes several enhancements for float-float support:

- Add division support for the float-float (fltflt) type
- Add operator overloads for +, -, *, /, <, >, etc.
- Add a matx::sqrt() overload for the fltflt type and introduce a dispatch function for sqrt() that can use this function
- Update the sar_bp operator kernel to use the overloaded operator types
- Add extensive unit testing for the float-float functions, including checks on the number of effective mantissa bits

Signed-off-by: Thomas Benson <tbenson@nvidia.com>
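As a rough illustration of what the new overloads enable (a sketch only: the constructors, header path, and namespace placement below are assumptions, not the PR's exact API), fltflt arithmetic that previously required free functions such as fltflt_add and fltflt_mul can now be written directly:

```cpp
// Hypothetical usage sketch. Assumes a float -> fltflt constructor and the
// operators described in the PR; exact names and headers may differ in MatX.
#include "matx/kernels/fltflt.h"

__MATX_HOST__ __MATX_DEVICE__ void fltflt_example(float x, float y) {
  matx::fltflt a(x);
  matx::fltflt b(y);

  matx::fltflt sum  = a + b;             // was fltflt_add(a, b)
  matx::fltflt prod = a * b;             // was fltflt_mul(a, b)
  matx::fltflt quot = a / b;             // division support added in this PR
  matx::fltflt root = matx::sqrt(prod);  // new sqrt overload with dispatch

  if (a < b) {                           // comparison overloads
    float  f = static_cast<float>(sum);  // conversion operators
    double d = static_cast<double>(sum);
    (void)f; (void)d;
  }
  (void)quot; (void)root;
}
```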
/build
Greptile Summary

Enhances float-float (fltflt) support with division operations, operator overloads, and sqrt functionality. The PR modernizes the fltflt API by adding C++ operator overloads (+, -, *, /, ==, !=, <, >, <=, >=) and conversion operators, allowing more natural arithmetic expressions.
Confidence Score: 5/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User Code
    participant Operator as Operator Overload
    participant FltFlt as fltflt Function
    participant Detail as detail:: Helpers
    participant CUDA as CUDA Intrinsics
    User->>Operator: a + b (fltflt)
    Operator->>FltFlt: fltflt_add(a, b)
    FltFlt->>FltFlt: fltflt_two_sum(a.hi, b.hi)
    FltFlt->>Detail: fadd_rn(a, b)
    Detail->>CUDA: __fadd_rn (device) or a+b (host)
    CUDA-->>Detail: result
    Detail-->>FltFlt: sum and error
    FltFlt->>FltFlt: fltflt_fast_two_sum
    FltFlt-->>Operator: normalized fltflt
    Operator-->>User: result
    User->>Operator: a * b (fltflt)
    Operator->>FltFlt: fltflt_mul(a, b)
    FltFlt->>FltFlt: fltflt_two_prod_fma(a.hi, b.hi)
    FltFlt->>Detail: fmaf_rn(a, b, c)
    Detail->>CUDA: __fmaf_rn (device) or ::fmaf (host)
    CUDA-->>Detail: product
    Detail-->>FltFlt: error-free product
    FltFlt-->>Operator: result
    Operator-->>User: result
    User->>Operator: a / b (fltflt)
    Operator->>FltFlt: fltflt_div(a, b)
    FltFlt->>Detail: fdividef_rn(1.0f, b.hi)
    Detail->>CUDA: __fdividef (device) or a/b (host)
    CUDA-->>Detail: approximation
    FltFlt->>FltFlt: refine with Newton iteration
    FltFlt-->>Operator: high-precision result
    Operator-->>User: result
    User->>Operator: sqrt(a) (fltflt)
    Operator->>FltFlt: matx::sqrt → fltflt_sqrt
    FltFlt->>Detail: fltflt_rsqrt(a.hi)
    Detail->>CUDA: rsqrtf (device) or 1/sqrtf (host)
    CUDA-->>Detail: reciprocal sqrt
    FltFlt->>FltFlt: refine with correction
    FltFlt-->>Operator: high-precision sqrt
    Operator-->>User: result
```
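For context on the diagram's fltflt_two_sum and fltflt_two_prod_fma steps: these match the classic error-free transformations used in double-float arithmetic (Knuth's TwoSum and the FMA-based TwoProd). Below is a minimal host-only sketch of the technique; the PR's helpers route through the fadd_rn/fmaf_rn wrappers shown above rather than plain float operations, so details differ.

```cpp
#include <cmath>
#include <utility>

// Knuth TwoSum: s = fl(a + b) and e is the exact rounding error,
// so a + b == s + e holds exactly (barring overflow).
std::pair<float, float> two_sum(float a, float b) {
  float s  = a + b;
  float bb = s - a;                      // portion of b absorbed into s
  float e  = (a - (s - bb)) + (b - bb);  // recover what rounding dropped
  return {s, e};
}

// FMA-based TwoProd: p = fl(a * b) and e is the exact error term,
// recovered in one fused multiply-add: a * b == p + e exactly.
std::pair<float, float> two_prod_fma(float a, float b) {
  float p = a * b;
  float e = std::fmaf(a, b, -p);
  return {p, e};
}
```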
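The division and sqrt paths in the diagram follow the usual approximate-then-refine pattern: a fast hardware estimate of the reciprocal (or reciprocal square root) followed by a Newton step. A plain-float sketch of the refinement idea follows; the PR's fltflt_div and fltflt_sqrt operate on the hi/lo pair and compute the residual with error-free primitives, so this is illustrative only.

```cpp
#include <cmath>

// One Newton iteration for 1/b: y' = y * (2 - b*y) roughly doubles
// the number of correct bits in the estimate y.
float refined_reciprocal(float b) {
  float y = 1.0f / b;          // stands in for the fast __fdividef estimate
  return y * (2.0f - b * y);   // Newton refinement step
}

// One Newton iteration for 1/sqrt(a): y' = y * (1.5 - 0.5*a*y*y),
// after which sqrt(a) is approximately a * y'.
float refined_sqrt(float a) {
  float y = 1.0f / std::sqrt(a);      // stands in for the rsqrtf estimate
  y = y * (1.5f - 0.5f * a * y * y);  // Newton refinement step
  return a * y;
}
```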
6 files reviewed, 2 comments
include/matx/kernels/fltflt.h (Outdated)
```cpp
fltflt::operator double() const {
  return fltflt_to_double(*this);
}

fltflt::operator float() const {
  return fltflt_to_float(*this);
}
```
syntax: Conversion operator definitions missing __MATX_HOST__ __MATX_DEVICE__ __MATX_INLINE__ decorators
Suggested change:

```cpp
__MATX_HOST__ __MATX_DEVICE__ __MATX_INLINE__ fltflt::operator double() const {
  return fltflt_to_double(*this);
}

__MATX_HOST__ __MATX_DEVICE__ __MATX_INLINE__ fltflt::operator float() const {
  return fltflt_to_float(*this);
}
```
include/matx/kernels/sar_bp.cuh (Outdated)
```diff
  bin = static_cast<loose_compute_t>(
-     fltflt_to_float(fltflt_mul(diffR_ff, dr_inv_fltflt)) + bin_offset);
+     static_cast<loose_compute_t>(diffR_ff * dr_inv_fltflt) + bin_offset);
```
style: Redundant static_cast<loose_compute_t> - the inner cast already returns loose_compute_t
Suggested change:

```cpp
bin = static_cast<loose_compute_t>(diffR_ff * dr_inv_fltflt) + bin_offset;
```