
@saikubairkota

This is the first attempt at implementing a linear system solver for CUDA cores, i.e., for solving linear systems on NVIDIA GPUs. We have implemented two types of solvers, CUDADenseSolver and CUDASparseSolver, each designed for the purpose its name suggests.

The CUDADenseSolver is built on the cuSolverDN library, while the CUDASparseSolver uses the cuDSS library. Both solvers support 32-bit and 64-bit floating-point data and rely on direct methods (Cholesky or LU) for solving the linear system, which the user must choose according to the matrix type.
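
For reference, a minimal sketch of what the Cholesky path through cuSolverDN looks like; the function name and the `d_A`/`d_b` variables are hypothetical, and error checking is elided:

```cpp
#include <cusolverDn.h>

// solve A x = b via Cholesky; d_A is n x n, column-major, SPD, and d_b holds
// b on entry and x on exit (all already resident on the device)
void cholesky_solve(int n, double * d_A, double * d_b)
{
    cusolverDnHandle_t handle;
    cusolverDnCreate(&handle);

    // query and allocate the factorization workspace
    int lwork = 0;
    cusolverDnDpotrf_bufferSize(handle, CUBLAS_FILL_MODE_LOWER, n, d_A, n, &lwork);
    double * d_work = nullptr;
    cudaMalloc(&d_work, lwork * sizeof(double));
    int * d_info = nullptr;
    cudaMalloc(&d_info, sizeof(int));

    // factor A = L * L^T, then solve L * L^T * x = b in place on d_b
    cusolverDnDpotrf(handle, CUBLAS_FILL_MODE_LOWER, n, d_A, n, d_work, lwork, d_info);
    cusolverDnDpotrs(handle, CUBLAS_FILL_MODE_LOWER, n, /* nrhs = */ 1, d_A, n, d_b, n, d_info);

    cudaFree(d_info);
    cudaFree(d_work);
    cusolverDnDestroy(handle);
}
```
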

Currently, the implementations are verified and tested with a single MPI process (CPU process) and a single GPU instance. Other configurations, such as multiple MPI processes with a single GPU or with multiple GPUs, will require further modifications to handle data exchange and load distribution and to ensure correct solver behavior.

saikubairkota and others added 30 commits June 6, 2025 18:19
…implementation

This is the first attempt at implementing a CUDA-based dense solver. The code doesn't compile yet, since we have not yet made the necessary modifications to the build system to use the CUDA compiler. We will make those changes in the next commit.
… checking for validity

Removed the negative index comparison before adding the element to the stiffness matrix: the size_t type is unsigned anyway and can never be negative, so the comparison is redundant.
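
For illustration, the shape of the redundant check (variable names hypothetical):

```cpp
// before: index is a size_t, so `index >= 0` is always true
if (index >= 0 && index < size) {
    matrix(index) += value;
}

// after: the upper-bound comparison alone suffices
if (index < size) {
    matrix(index) += value;
}
```
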
… compilation error

For some reason, when CUDADenseSolver.cu is compiled with the nvcc compiler, it fails with the error:

```
/usr/local/pyre/include/pyre/tensor/traits.h:30:18: error: expected identifier before ‘>’ token
   30 |         template <int N>
      |                  ^
/usr/local/pyre/include/pyre/tensor/traits.h:41:18: error: expected identifier before ‘>’ token
   41 |         template <int N>
      |                  ^
/usr/local/pyre/include/pyre/tensor/algebra.icc:1232:50: error: expected identifier before ‘>’ token
 1232 | template <pyre::tensor::square_matrix_c matrixT>
      |                                                  ^
```

Apparently, the nvcc compiler is not able to compile some of the template code in the pyre library that we use in CUDADenseSolver.cu to define the real type (the constrained template parameter in algebra.icc, for instance, uses C++20 concepts syntax). This could be because nvcc doesn't support all the features of the C++ standard. We hardcoded mito::real to double for now to avoid this compilation error. We should figure out why this is happening and fix it later.
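
The workaround amounts to something like this (a sketch; the actual alias name in the source may differ):

```cpp
// temporary workaround: nvcc chokes on the pyre template code behind mito::real,
// so hardcode the underlying real type for now
// using real_type = mito::real;   // fails to compile under nvcc
using real_type = double;          // hardcoded until the nvcc issue is resolved
```
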
…DenseSolver instances

Enabled APIs and factories to create CUDADenseSolver instances in the CUDA backend. We also corrected the CUDADenseSolver instantiation in the factories.h file.
…eric support

Replaced the double data type with mito::real in the CUDADenseSolver to support generic types requested by the user.

Note that the code now compiles only with nvcc compatibility changes in the pyre library, as some C++20 functionalities, such as lambda functions, are not supported by the CUDA compiler.
…ent data types

Added the cusolver_traits struct to support double and float data type implementations in the CUDADenseSolver.
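
A sketch of what such a traits struct can look like, here specialized for the Cholesky routines (the member names are hypothetical; the cuSOLVER entry points themselves are real):

```cpp
#include <cusolverDn.h>

template <typename T>
struct cusolver_traits;

// map each supported scalar type to the matching cuSOLVER functions
template <>
struct cusolver_traits<double> {
    static constexpr auto potrf_bufferSize = cusolverDnDpotrf_bufferSize;
    static constexpr auto potrf = cusolverDnDpotrf;
    static constexpr auto potrs = cusolverDnDpotrs;
};

template <>
struct cusolver_traits<float> {
    static constexpr auto potrf_bufferSize = cusolverDnSpotrf_bufferSize;
    static constexpr auto potrf = cusolverDnSpotrf;
    static constexpr auto potrs = cusolverDnSpotrs;
};
```

The solver can then call `cusolver_traits<real_type>::potrf(...)` without branching on the scalar type.
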
…ter in constructor

Introduced the `SolverType` enum to distinguish between different solver types in the CUDA backend. Added solver type as a parameter in the `CUDADenseSolver` constructor so the user can choose the solver type at runtime.
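
A sketch of the shape of this interface (signature details are hypothetical):

```cpp
// the direct methods the user can choose from
enum class SolverType { CHOLESKY, LU };

template <typename realT>
class CUDADenseSolver {
  public:
    // the solver type is picked at construction time
    CUDADenseSolver(int n, SolverType type) : _n(n), _type(type) {}

  private:
    int _n;
    SolverType _type;
};
```
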
… host

We are storing the matrix in column-major order on the host, since the cuSolver library expects matrices in column-major order; this way we avoid doing a transpose later on the GPU.
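
Concretely, for an n x n matrix stored flat, entry (i, j) lands at a column-major offset (names hypothetical):

```cpp
// column-major: entry (row i, column j) of an n x n matrix lives at j * n + i,
// which is the layout cuSolver (like LAPACK) expects
host_matrix[j * n + i] += value;
```
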
Added checks with tolerances, as the results were very close but not exactly equal, which is expected for floating-point operations.
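
I.e., something along these lines (the tolerance value and the `x_computed`/`x_expected` names are illustrative):

```cpp
#include <cassert>
#include <cmath>

// compare against the expected solution with a tolerance rather than exact equality
constexpr double tolerance = 1e-12;
assert(std::abs(x_computed[i] - x_expected[i]) < tolerance);
```
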
… {mito} headers

This is achieved, for example, by ensuring that class {CUDADenseSolver} uses {double} instead of {mito::real} as the underlying {real_type}.
… {real_type}

This ensures that the solver can still be instantiated with a {mito::real} substitution, while keeping mito objects and CUDA objects compiled separately, the former with {gcc} and the latter with {nvcc}.
The template argument of class {CUDADenseSolver} is now required to be a type representing a real value.
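
A sketch of the constraint, here using the standard {std::floating_point} concept (the actual concept in the source may differ):

```cpp
#include <concepts>

// the template argument must be a type representing a real value
template <std::floating_point realT>
class CUDADenseSolver {
  public:
    using real_type = realT;
    // ...
};
```
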
…ds in header to introduce default arguments

Moved the constructors and methods of `CUDADenseSolver` into the header so we could introduce default arguments. There are also a few formatting edits in this commit.
Made the CUDADenseSolver a derived class of CUDASolver to leverage common functionality and improve code reuse.
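
The resulting hierarchy, in sketch form (member details are hypothetical):

```cpp
#include <concepts>
#include <cuda_runtime.h>

// common CUDA resource handling lives in the base class
template <std::floating_point realT>
class CUDASolver {
  protected:
    cudaStream_t _stream;
    // ... shared buffers, handles, utilities ...
};

// the dense solver inherits the shared machinery
template <std::floating_point realT>
class CUDADenseSolver : public CUDASolver<realT> {
    // ... cuSolverDN-specific factorization and solve ...
};
```
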
…ice memory

Similar to the HostArray class, we added a DeviceArray class to make allocating and deallocating device memory easier to manage.
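
A minimal sketch of such an RAII wrapper (the real class likely has more functionality; error checking is elided):

```cpp
#include <cstddef>
#include <cuda_runtime.h>

template <typename T>
class DeviceArray {
  public:
    // allocate device memory on construction
    explicit DeviceArray(std::size_t size) : _size(size) {
        cudaMalloc(&_data, size * sizeof(T));
    }
    // free it on destruction, so no manual cleanup is needed
    ~DeviceArray() { cudaFree(_data); }
    // non-copyable: the wrapper owns the allocation
    DeviceArray(const DeviceArray &) = delete;
    DeviceArray & operator=(const DeviceArray &) = delete;

    T * data() { return _data; }
    std::size_t size() const { return _size; }

  private:
    T * _data = nullptr;
    std::size_t _size = 0;
};
```
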
…of Host, Device Arrays & moved few common members to base class

We made multiple changes in the CUDASolver and CUDADenseSolver classes in this commit:

1. We moved a few common methods and attributes to the base class CUDASolver so we can reuse them in the future for the CUDASparseSolver class.
2. Used the HostArray and DeviceArray classes for managing host and device memory, respectively (see the usage sketch after this list). This change removed a lot of code clutter related to memory management.
3. Removed many of the methods that freed/initialized memory, as the HostArray and DeviceArray classes now manage memory automatically.
4. A few other methods and attributes that are no longer needed were also removed.
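
A usage sketch of the two wrappers together (names hypothetical; assumes a `HostArray` with the same interface as the `DeviceArray` sketched above):

```cpp
// allocate the right-hand side on host and device; both are freed automatically
HostArray<double> rhs(n);
DeviceArray<double> d_rhs(n);

// ... fill rhs on the host ...

// stage the data on the device for the solver
cudaMemcpy(d_rhs.data(), rhs.data(), n * sizeof(double), cudaMemcpyHostToDevice);
```
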
…r implementation

This is the first attempt at implementing a sparse linear solver in CUDA. We used an Eigen sparse matrix to store the system matrix on the host side. The memory arrays we developed for the dense solver were reused to store the right-hand side and solution vectors on both the host and the device. We used the cuDSS library to solve the linear system.
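
As a sketch of how the pieces fit together (names hypothetical; error checking and device transfers elided): a compressed row-major Eigen sparse matrix already exposes the CSR arrays that cuDSS consumes, and cuDSS then solves in three phases.

```cpp
#include <Eigen/Sparse>

// assemble the system matrix on the host, then compress to CSR
int n = 100;  // illustrative system size
Eigen::SparseMatrix<double, Eigen::RowMajor> A_host(n, n);
// ... insert coefficients ...
A_host.makeCompressed();

const int * row_offsets = A_host.outerIndexPtr();   // n + 1 entries
const int * col_indices = A_host.innerIndexPtr();   // nnz entries
const double * values = A_host.valuePtr();          // nnz entries

// after copying these arrays to the device and wrapping them in cuDSS matrix
// objects, the solve proceeds in three phases:
//   cudssExecute(handle, CUDSS_PHASE_ANALYSIS, config, data, A, x, b);
//   cudssExecute(handle, CUDSS_PHASE_FACTORIZATION, config, data, A, x, b);
//   cudssExecute(handle, CUDSS_PHASE_SOLVE, config, data, A, x, b);
```
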