@llvm-beanz
Collaborator

This updates the linalg spec to treat the AttributedMatrixRef objects as value objects in the SSA graph. This should address concerns about object lifetimes.

Fixes #756

Collaborator

tex3d left a comment

Treating the matrix value as the value of a matrix will almost completely fix the problems I was having; however, there are some remaining issues that need to be decided.

First, for operations that modify a matrix, we must do one of two things:

  • return the modified value (as @dx.op.matrixAccumulate does), or
  • accept a pointer to the matrix type instead of a value, forcing explicit storage to be declared - through an alloca or global (static).
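As a rough illustration of the two options, here is a sketch in plain C++ rather than DXIL. The names `Matrix`, `setElementValue`, and `setElementInPlace` are hypothetical and are not part of the spec; the four-element array is only a stand-in for matrix contents.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a matrix value; not the DXIL type.
struct Matrix {
  std::array<float, 4> elems{};
};

// Option 1: take a value and return the modified value (SSA-friendly).
Matrix setElementValue(Matrix m, std::size_t idx, float v) {
  m.elems[idx] = v;
  return m;
}

// Option 2: take a pointer to explicitly declared storage and modify it
// in place.
void setElementInPlace(Matrix *m, std::size_t idx, float v) {
  m->elems[idx] = v;
}

void demoSetElement() {
  Matrix a; // all zeros
  Matrix b = setElementValue(a, 1, 2.0f);
  assert(a.elems[1] == 0.0f); // the original value is unchanged
  assert(b.elems[1] == 2.0f); // the result carries the update

  Matrix storage; // plays the role of an alloca or static global
  setElementInPlace(&storage, 1, 2.0f);
  assert(storage.elems[1] == 2.0f);
}
```

In the first form every intermediate matrix is a distinct value the compiler can follow; in the second the mutation is visible as an ordinary store to declared storage.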

Thread-scoped matrices will not fit this scheme well, since the point of a thread-scoped matrix is to track the connection between one operation and another. In the pure SSA case without phi/select, it's possible to track the value and skip creating a thread-local matrix, but the required restrictions may appear arbitrary to a user, or be overly restrictive in HLSL if we attempt to define high-level language restrictions and enforce them in Sema.

If a value is stored and loaded, or it's used in a phi or select, this tracking will be obfuscated, and fusing the final operation in the back end may be difficult, or even impossible.

Since such restrictions would only apply to thread-scoped matrix objects, they would appear very odd and disconnected from most basic language constructs you can reason about.

Having thought about this for a little while, I don't think there is a good clean solution other than exposing fused operations directly to the user, rather than pretending you can have a local thread-scoped matrix value of this type.

Separating the operations can introduce a variety of potential problems not only for later translation, but also for reasoning about when memory accesses occur, in what control flow branch a memory operation is performed, and so on.

The compiler will also generate a permutation of typed matrix handles with names
of the format `%dx.types.AttributedMatrixRef<mangling>`. The mangling scheme for
The compiler will generate a permutation of typed matrix handles with names of
the format `%dx.types.AttributedMatrixRef<mangling>`. The mangling scheme for
Collaborator

If this type represents an abstract/proxy value of a matrix, why keep the "Ref" language?

Collaborator Author

I'm happy to change the naming to anything that makes sense. I was minimizing churn.

Collaborator

To me, this PR changes the meaning of the type from a reference to some memory managed elsewhere to a proxy value of a matrix. It makes breaking changes to most operations, and eliminates the generic type and bitcasting.

Therefore, this seems like the right time to change the name; otherwise it will generate more confusion down the road, and it will become harder to change the longer we leave it.

Contributor

On the topic of names, do we need Attribute? Why not just dx.types.Matrix?

Contributor

pow2clk commented Jan 14, 2026

It seems that no unattributed matrixrefs remain. Is that correct? There is still some language under "DXIL Validation" that refers to matrixref, creatematrix, and bitcasts (edit: sorry, no bitcasts).

Collaborator

tex3d commented Jan 12, 2026

Additional follow-up, because I think we're still not on the same page regarding this issue.

I think your intent was to keep the reference to some driver-managed storage concept while making operations that produce a matrix return a new ref value. This isn't enough to solve the value-tracking/lifetime/storage issue I was referring to when operations can modify the referred-to matrix contents (as @dx.op.matrixSetElement still does with this update).

This breaks the idea that you could track values, manage storage requirements, and optimize operations in the compiler - the memory semantics of the matrix are still opaque to the compiler.

Operations that use the ref as input still have to be considered potentially loading from any memory (ReadOnly), and operations that use the ref to modify the matrix have to be considered potentially writing any memory. Even operations that return a ref value technically have to be considered ones that write to memory, since the matrix content is not the value itself.

It would still be difficult for the compiler to track storage requirements if a matrix ref value was stored to/loaded from alloca or used in select/phi, since that's still like storing/loading a pointer, pointing to some invisibly managed mutable blob.

Instead, if we consider the matrix type a proxy of the matrix value itself, then an llvm::Value is immutable by nature, and to declare a mutable matrix, you need explicit storage, such as an alloca or global (static). Then the memory semantics appear ordinary to the compiler, and it can reason about the operations and perform useful optimizations and transformations on them safely. If an operation needs to modify the contents of a matrix, such as with @dx.op.matrixSetElement, it can take a pointer to the mutable matrix storage. The opaque DXIL matrix proxy type can then easily be replaced with the required type in the driver, such as a vector type, and compilation can proceed as normal using standard IR constructs.
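The difference between the two models can be sketched with a loose C++ analogy. The types `MatrixRef` and `MatrixValue` below are hypothetical illustrations, not the DXIL types; the shared pointer only stands in for "storage managed somewhere else".

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Reference model: the handle points at storage managed elsewhere.
struct MatrixRef {
  std::shared_ptr<std::vector<float>> storage;
};

// Value model: the object is the matrix contents.
struct MatrixValue {
  std::vector<float> elems;
};

void demoAliasing() {
  MatrixRef r1{std::make_shared<std::vector<float>>(4, 0.0f)};
  MatrixRef r2 = r1;                // copies the handle, not the contents
  (*r2.storage)[0] = 1.0f;          // a write through one handle...
  assert((*r1.storage)[0] == 1.0f); // ...is visible through the other

  MatrixValue v1{std::vector<float>(4, 0.0f)};
  MatrixValue v2 = v1;              // copies the contents
  v2.elems[0] = 1.0f;
  assert(v1.elems[0] == 0.0f);      // v1 is unaffected
}
```

With the reference model, any operation on one handle may change what every copy of that handle observes, which is why each use has to be treated as a potential memory access. With the value model, a copy is independent and mutation requires naming the storage explicitly.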

If we change this to be a value proxy instead of a reference to some external memory accessed by the operation, as it is today, then there's more to change in this document.

The real sticky problem will still be the coop-vector operations and thread-scoped matrices. A thread-scoped matrix is never intended to be a real matrix stored per-thread; it must only be used to connect operations that the driver backend must fuse. I don't think there is any clean way to keep the existing API design and truly fix this problem. See my prior review comment for more.

llvm-beanz and others added 10 commits January 13, 2026 11:43
Co-authored-by: Ashley Coleman <ascoleman@microsoft.com>
Co-authored-by: Tex Riddell <texr@microsoft.com>
Not sure how that got dropped.
Collaborator

tex3d left a comment

I believe you addressed my concerns about treating the DXIL matrix type as a reference to external memory by updating the @dx.op.matrixSetElement operation.

The concern about coop-vector operations and thread-scoped matrix remains, but can be addressed separately later.

The type still contains an i8*, which might imply that it acts as a pointer to memory somewhere else, when it should now be a value proxy representing a particular matrix value.

I'd prefer we drop the "Ref" in the name, but that's just a name, so I'm approving these changes now.

%dx.types.MatrixRef, ; matrix destination
%dx.types.MatrixRef, ; matrix source
immarg i1, ; transpose
declare %dx.types.AttributedMatrixRef<mangling> @dx.op.copyConvertMatrix.[MatTy].[MatTy](
Contributor

Should the overload be [MatTy1].[MatTy2]?

Contributor

pow2clk commented Jan 14, 2026

I actually like this better than the create/annotate model!

It does seem like a consequence of this is that all or nearly all of the header implementations will be just return __builtin_<something>(param1, param2, ...). If that's the case, does a header implementation still gain us anything?

immarg i32, ; opcode
%dx.types.MatrixRef, ; matrix
[Ty] ; fill value
)
Contributor

This potentially applies to other calls as well: is this going to be readonly, readwrite, or something else? If it's not readwrite, I suspect two fills that create distinct matrices will get merged. If it is, then I don't see how we could remove unused fills.
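To make the trade-off concrete, here is a toy C++ model of the two optimizations in question. It is not LLVM's API: `FillCall`, `mayWriteMemory`, `countAfterMerge`, and `countAfterDeadCallRemoval` are hypothetical names, and the single boolean is a simplified stand-in for a call's memory attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a fill call; not LLVM's API.
struct FillCall {
  float fillValue;
  bool mayWriteMemory; // simplified stand-in for the memory attribute
  bool resultUsed;
};

// CSE-style merge: a later call may reuse an earlier result only when
// neither call may write memory and the arguments match.
std::size_t countAfterMerge(const std::vector<FillCall> &calls) {
  std::vector<FillCall> kept;
  for (const FillCall &c : calls) {
    bool merged = false;
    for (const FillCall &k : kept)
      if (!c.mayWriteMemory && !k.mayWriteMemory &&
          c.fillValue == k.fillValue)
        merged = true;
    if (!merged)
      kept.push_back(c);
  }
  return kept.size();
}

// DCE-style removal: an unused call can be dropped only when it has no
// memory side effects.
std::size_t countAfterDeadCallRemoval(const std::vector<FillCall> &calls) {
  std::size_t n = 0;
  for (const FillCall &c : calls)
    if (c.resultUsed || c.mayWriteMemory)
      ++n;
  return n;
}

void demoFillAttributes() {
  std::vector<FillCall> pure = {{0.0f, false, true}, {0.0f, false, true}};
  std::vector<FillCall> writes = {{0.0f, true, false}, {0.0f, true, false}};
  assert(countAfterMerge(pure) == 1);             // identical fills merge
  assert(countAfterMerge(writes) == 2);           // writers stay distinct
  assert(countAfterDeadCallRemoval(writes) == 2); // unused writers remain
}
```

Under this simplified model, merging identical side-effect-free fills is only a problem if the results are meant to be distinct mutable objects; if the result is an immutable value, sharing it would be harmless.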

Comment on lines +1155 to +1158
declare %dx.types.AttributedMatrixRef<mangling> @dx.op.matrixSetElement.[MatTy].[Ty](
immarg i32, ; opcode
i32, ; thread-local index
[Ty] ; value
Contributor

I'm not sure I understand how SetElement interacts with the AttributedMatrixRef. It only partially modifies a matrix, so it won't have a full matrix value to return?

Collaborator

You're right, you caught the missing parameter!

Suggested change
declare %dx.types.AttributedMatrixRef<mangling> @dx.op.matrixSetElement.[MatTy].[Ty](
immarg i32, ; opcode
i32, ; thread-local index
[Ty] ; value
declare %dx.types.AttributedMatrixRef<mangling> @dx.op.matrixSetElement.[MatTy].[Ty](
immarg i32, ; opcode
%dx.types.AttributedMatrixRef<mangling>, ; input matrix
i32, ; thread-local index
[Ty] ; value

Development

Successfully merging this pull request may close these issues.

[0035] Ill defined memory/lifetime semantics for linalg::Matrix

6 participants