I'm curious whether there have been any recent discussions on migrating the storage type from cuco::pair_type<cuda::atomic<key_type>, cuda::atomic<mapped_type>>* to cuco::pair_type<key_type, mapped_type>* and then using libcu++'s new cuda::atomic_ref<T> for thread-safe table manipulation when needed (also addressed in NVIDIA/libcudacxx#110).
Let's start this thread to collect everything related to this topic in one place.
I will update the top post regularly so people don't have to scroll through everything.
- Starting with release 1.9.0, libcu++'s atomic_ref will support floating point types (see "Add atomics for floating point types", libcudacxx#286), as well as <4B types (they should already be available, iirc), though FP16 is still WIP.
- Large type support (>8B) is currently blocked by "Enable user-provided lock table for atomic_ref<T>" (cccl#990) and "Added support for most of <mutex>" (libcudacxx#113). The latter is planned to land in 1.9.0.
- Constraints imposed by the standard:
  The lifetime of an object must exceed the lifetime of all atomic_refs that references the object. While any atomic_ref instances referencing an object exists, the object must be exclusively accessed through these atomic_ref instances. No subobject of an object referenced by an atomic_ref object may be concurrently referenced by any other atomic_ref object.
  Atomic operations applied to an object through an atomic_ref are atomic with respect to atomic operations applied through any other atomic_ref referencing the same object.
- We need to refactor some components, e.g., probe_sequence defines the slot type as a pair of cuda::atomics. However, it seems that it is not actually using any of the atomic operations.
As I am currently refactoring the static_reduction_map PR (#98), I would suggest using this as an opportunity to test this atomic_ref approach in our codebase and see if we run into any problems before refactoring all of the data structures.