Implement csc_view, implement transposed.
#30
3da8d45 to de12da2
There are a few important features missing from this PR:
These will come in a follow-up PR. For now, I think merging in GPU implementations is more pressing than adding additional features.
TEST(CscView, SpMV_Ascaled) {
  using T = float;
  using I = spblas::index_t;
Do we need to test SpMV_xscaled as well?
Sure, just added.
At some point we should write a new test to more exhaustively examine all combinations of matrices and views. I think we should allow the vendor backends to mature a little more first, though.
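As a rough illustration of what such a combined test could check, here is a minimal sketch of a reference CSC SpMV where either A or x carries the scale factor; both variants should produce the same y. The `csc_matrix` struct, `reference_spmv`, and `example_matrix` are hypothetical stand-ins written for this sketch and are not part of spblas.

```cpp
#include <cassert>
#include <vector>

// Minimal owning CSC storage, for illustration only (not spblas::csc_view).
struct csc_matrix {
  std::vector<float> values;  // nonzero values, column by column
  std::vector<int> colptr;    // size ncols + 1
  std::vector<int> rowind;    // row index of each nonzero
  int nrows;
  int ncols;
};

// Reference y = (alpha_a * A) * (alpha_x * x) for a CSC matrix.
std::vector<float> reference_spmv(const csc_matrix& a, float alpha_a,
                                  const std::vector<float>& x, float alpha_x) {
  assert(static_cast<int>(x.size()) == a.ncols);
  std::vector<float> y(a.nrows, 0.0f);
  for (int j = 0; j < a.ncols; ++j) {
    for (int k = a.colptr[j]; k < a.colptr[j + 1]; ++k) {
      y[a.rowind[k]] += (alpha_a * a.values[k]) * (alpha_x * x[j]);
    }
  }
  return y;
}

// The 2 x 3 matrix [[1, 0, 2], [0, 3, 0]] in CSC form.
csc_matrix example_matrix() {
  return {{1.0f, 3.0f, 2.0f}, {0, 1, 2, 3}, {0, 1, 0}, 2, 3};
}
```

A test over all matrix/view combinations could compare each backend result against a reference like this one, once for the "A scaled" case and once for the "x scaled" case.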
template <matrix M>
  requires __detail::is_csr_view_v<M>
oneapi::mkl::sparse::matrix_handle_t create_matrix_handle(sycl::queue& q,
Do we want to keep calling this create_matrix_handle? Thinking ahead to a future with matrix_opt, it may be better named create_or_retrieve_matrix_handle, which at that point might more simply be called get_matrix_handle.
I think it makes more sense to add get_matrix_handle once we merge your matrix_opt PR and the matrix_opt actually might have a matrix handle inside it.
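To make the naming question concrete, here is a minimal sketch of the "create or retrieve" pattern being discussed. Everything in it is hypothetical: `handle_t` stands in for a vendor handle, `matrix_with_handle` stands in for whatever matrix_opt ends up being, and nothing here reflects the actual matrix_opt PR.

```cpp
#include <cassert>
#include <optional>

// Stand-in for a vendor handle type (hypothetical).
using handle_t = int;

// Stand-in for a matrix object that may cache a handle (hypothetical).
struct matrix_with_handle {
  std::optional<handle_t> handle;
  int creations = 0;  // counts how often a handle was actually created
};

// Always creates a fresh handle; the caller owns the result.
inline handle_t create_handle(matrix_with_handle& m) {
  ++m.creations;
  return 42;  // placeholder value for the sketch
}

// Returns the cached handle if present, otherwise creates and caches one.
inline handle_t get_handle(matrix_with_handle& m) {
  if (!m.handle) {
    m.handle = create_handle(m);
  }
  return *m.handle;
}
```

With a cache in place, "get" describes the caller-visible behavior better than "create", since repeated calls do not create anything new.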
template <typename T>
armpl_status_t (*create_spmat_csc)(armpl_spmat_t*, armpl_int_t, armpl_int_t,
                                   const armpl_int_t*, const armpl_int_t*,
                                   const T*, armpl_int_t);
template <>
inline constexpr auto create_spmat_csc<float> = &armpl_spmat_create_csc_s;
template <>
inline constexpr auto create_spmat_csc<double> = &armpl_spmat_create_csc_d;
template <>
inline constexpr auto create_spmat_csc<std::complex<float>> =
    &armpl_spmat_create_csc_c;
template <>
inline constexpr auto create_spmat_csc<std::complex<double>> =
    &armpl_spmat_create_csc_z;
Do you prefer this kind of template specialization over the kind I have in https://github.com/SparseBLAS/spblas-reference/pull/23/files#diff-6ac61c9e281330753acb230d10581803329208dd22ef5d9fdb5492b845c8acf1R19-R42, for instance?
No, I actually prefer what you're doing there with IESparseBLAS. I think Chris originally wrote this code, but the two styles are pretty much equivalent.
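For readers comparing the styles, here is a small sketch of two ways to dispatch from a scalar type to a type-suffixed C entry point. The functions `create_s` and `create_d` are hypothetical stand-ins, not ArmPL calls, and the second style (overloaded wrappers) is only one common alternative; it is an assumption and may differ from what the linked PR does.

```cpp
#include <cassert>

// Stand-ins for type-suffixed C entry points (hypothetical).
// Each returns a tag so the caller can see which one was selected.
inline int create_s(const float*, int) { return 1; }
inline int create_d(const double*, int) { return 2; }

// Style 1: a variable template of function pointers, specialized per type.
template <typename T>
using create_fn_t = int (*)(const T*, int);

template <typename T>
inline constexpr create_fn_t<T> create_fn = nullptr;
template <>
inline constexpr create_fn_t<float> create_fn<float> = &create_s;
template <>
inline constexpr create_fn_t<double> create_fn<double> = &create_d;

// Style 2: overloaded wrappers; overload resolution picks the entry point.
inline int create(const float* values, int n) { return create_s(values, n); }
inline int create(const double* values, int n) { return create_d(values, n); }
```

Both styles resolve at compile time, which is consistent with the remark that they are pretty much equivalent; the difference is mostly in how the call site reads (`create_fn<T>(...)` versus `create(...)`).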
template <typename M>
  requires(__detail::is_csc_view_v<M>)
auto column(M&& m, typename std::remove_cvref_t<M>::index_type column_index) {
  using O = typename std::remove_cvref_t<M>::offset_type;
We probably need to go through and update all the
using T = tensor_scalar_t<A>;
using I = tensor_index_t<A>;
using O = tensor_offset_t<A>;
usages everywhere to include a std::remove_cvref_t<> call, similar to what you are doing here.
Actually tensor_scalar_t<A>, etc. will automatically apply a std::remove_cvref_t. I think the only reason I don't use it here is include order.
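A minimal sketch of how an alias can strip cv-qualifiers and references internally, so call sites with a forwarding reference `M&&` do not need their own `std::remove_cvref_t`. The names `toy_view`, `toy_scalar_t`, and `deduces_float` are hypothetical; this shows the general technique under that assumption, not the actual spblas definition of `tensor_scalar_t`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical view type exposing member typedefs.
template <typename T, typename I = int, typename O = I>
struct toy_view {
  using scalar_type = T;
  using index_type = I;
  using offset_type = O;
};

// The alias strips references and cv-qualifiers before looking up the
// member type. (C++17 spelling of std::remove_cvref_t, which is C++20.)
template <typename A>
using toy_scalar_t =
    typename std::remove_cv_t<std::remove_reference_t<A>>::scalar_type;

// With a forwarding reference, M may deduce to `toy_view<float>&` or
// `const toy_view<float>&`; the alias still resolves to float.
template <typename M>
constexpr bool deduces_float(M&&) {
  return std::is_same_v<toy_scalar_t<M>, float>;
}

static_assert(std::is_same_v<toy_scalar_t<const toy_view<float>&>, float>);
static_assert(std::is_same_v<toy_scalar_t<toy_view<double>&&>, double>);
```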
template <matrix M>
  requires(__detail::is_csr_view_v<M>)
auto transposed(M&& m) {
  return csc_view<tensor_scalar_t<M>, tensor_index_t<M>, tensor_offset_t<M>>(
      m.values(), m.rowptr(), m.colind(), {m.shape()[1], m.shape()[0]},
      m.size());
}

template <matrix M>
  requires(__detail::is_csc_view_v<M>)
auto transposed(M&& m) {
  return csr_view<tensor_scalar_t<M>, tensor_index_t<M>, tensor_offset_t<M>>(
      m.values(), m.colptr(), m.rowind(), {m.shape()[1], m.shape()[0]},
      m.size());
}
Here is where the relationship between CSR and CSC through transpose (not conjugate transpose, though) comes into play: nRows and nCols are swapped, and rowptr <-> colptr and rowind <-> colind are reinterpreted.
If you had
csr_view<float> a(/* ... */);
// Explicit transpose class
transposed_view<csr_view<float>> a_t = transposed(a);
// Versus csc_view
csc_view<float> a_csc_t = transposed(a);
and a.shape() was (10, 20), wouldn't a_t.shape() and a_csc_t.shape() both be (20, 10)? It seems to me the two should behave identically (except that the explicit one has a .base() and the CSC one doesn't).
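The CSR <-> CSC relationship in this thread can be checked with a small self-contained sketch: reinterpreting the three CSR arrays of A as CSC arrays, with the shape swapped, gives exactly the transpose of A without copying any data. `toy_csr`, `toy_csc`, and `at` are hypothetical stand-ins for this sketch, not the spblas view classes.

```cpp
#include <cassert>

// Non-owning toy views (hypothetical; not spblas::csr_view / csc_view).
struct toy_csr {
  const float* values;
  const int* rowptr;
  const int* colind;
  int nrows;
  int ncols;
};

struct toy_csc {
  const float* values;
  const int* colptr;
  const int* rowind;
  int nrows;
  int ncols;
};

// Same arrays, swapped shape: rowptr becomes colptr, colind becomes rowind.
inline toy_csc transposed(const toy_csr& a) {
  return {a.values, a.rowptr, a.colind, a.ncols, a.nrows};
}

// Element lookup in CSR: scan the nonzeros of row i.
inline float at(const toy_csr& a, int i, int j) {
  for (int k = a.rowptr[i]; k < a.rowptr[i + 1]; ++k) {
    if (a.colind[k] == j) return a.values[k];
  }
  return 0.0f;
}

// Element lookup in CSC: scan the nonzeros of column j.
inline float at(const toy_csc& a, int i, int j) {
  for (int k = a.colptr[j]; k < a.colptr[j + 1]; ++k) {
    if (a.rowind[k] == i) return a.values[k];
  }
  return 0.0f;
}
```

For a 2 x 3 input the result has shape 3 x 2, and entry (j, i) of the result equals entry (i, j) of the input. Values are not conjugated, which is why this is a plain transpose rather than a conjugate transpose.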
// y = A * (alpha * x)
multiply(a, scaled(2.f, x), y);

fmt::print("{}\n", spblas::__backend::values(y));
Are you sure we want to print out all 100 x 10 = 1000 values of y? Maybe we want a "print dense matrix" routine that prints the first 4 rows (first 4 columns, ..., last 4 columns), then "...", then the last 4 rows (first 4 columns, ..., last 4 columns), nicely formatted with columns aligned at the commas?
We could do a similar thing for a dense vector, printing only the first 4 elements and then the last 4 elements, and likewise for a sparse matrix, printing the first 4 rows and the last 4 rows.
Maybe it could take a parameter block_size (default = 4) that sets how many entries to print at each end?
Yeah, I think writing a pretty printer and then using it in the examples would be nice. (We can implement an ostream / fmt formatter.) I think that probably belongs in another PR, though.
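As a starting point for that follow-up, here is a minimal sketch of the truncated printing idea for a dense vector: show the first and last `block_size` elements and elide the middle. The function name `format_truncated` is hypothetical, and the sketch uses `std::ostringstream` rather than an fmt formatter to stay dependency-free.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Formats v as "[a, b, ..., y, z]", keeping block_size elements at each end.
// Vectors with at most 2 * block_size elements are printed in full.
template <typename T>
std::string format_truncated(const std::vector<T>& v,
                             std::size_t block_size = 4) {
  std::ostringstream os;
  const std::size_t n = v.size();

  // Writes v[first], ..., v[last - 1] separated by commas.
  auto write_range = [&](std::size_t first, std::size_t last) {
    for (std::size_t i = first; i < last; ++i) {
      if (i != first) os << ", ";
      os << v[i];
    }
  };

  os << "[";
  if (n <= 2 * block_size) {
    write_range(0, n);
  } else {
    write_range(0, block_size);
    os << ", ..., ";
    write_range(n - block_size, n);
  }
  os << "]";
  return os.str();
}
```

The same head/tail selection could be applied per row and per column for the dense matrix case, or to the first and last rows of a sparse matrix.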
spencerpatty left a comment
We should talk through some of my suggestions and questions posted in the review before I give full approval.
Co-authored-by: Spencer Patty <spencer.patty@intel.com>
Implement csc_view, which supports CSC, as well as transposed, which currently transposes between CSR <-> CSC.
This PR:
- Implements csc_view.
- Implements transposed, which transposes between CSR <-> CSC.