Commit 65ff57f

leofang, kkraus14, cpcloud, and Copilot authored
Prepare for cuda.core v0.5.0 release (#1392)
Co-authored-by: leofang <5534781+leofang@users.noreply.github.com>
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Co-authored-by: Phillip Cloud <417981+cpcloud@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
1 parent f83eff2 commit 65ff57f

5 files changed: +75 -50 lines changed


cuda_core/cuda/core/_version.py

Lines changed: 1 addition & 1 deletion
@@ -2,4 +2,4 @@
 #
 # SPDX-License-Identifier: Apache-2.0

-__version__ = "0.4.2"
+__version__ = "0.5.0"

cuda_core/docs/source/api.rst

Lines changed: 1 addition & 0 deletions
@@ -26,6 +26,7 @@ CUDA runtime
 Event
 MemoryResource
 DeviceMemoryResource
+GraphMemoryResource
 PinnedMemoryResource
 ManagedMemoryResource
 LegacyPinnedMemoryResource
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+.. SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+.. SPDX-License-Identifier: Apache-2.0
+
+.. currentmodule:: cuda.core.experimental
+
+``cuda.core`` 0.5.0 Release Notes
+=================================
+
+
+Highlights
+----------
+
+- Added memory management support (allocation, deallocation, copy, and fill) for CUDA graphs.
+- Added :class:`PinnedMemoryResource` and :class:`ManagedMemoryResource` for advanced memory management.
+- Added peer access control to :class:`DeviceMemoryResource`.
+- Reduced Python overhead and improved performance for calling :func:`launch`, constructing :class:`LaunchConfig`, and accessing :class:`DeviceMemoryResource` attributes.
+
+
+Breaking Changes
+----------------
+
+The support for setting :attr:`VirtualMemoryResourceOptions.handle_type` to ``"win32"`` is removed. Please reach out to us on GitHub if you have a use case.
+
+The following APIs have been deprecated and will be removed in 0.6.0:
+
+- ``cuda.core.experimental.system.driver_version`` has been replaced with
+  ``cuda.core.experimental.system.get_driver_version()``.
+- ``cuda.core.experimental.system.num_devices`` has been replaced with
+  ``cuda.core.experimental.system.get_num_devices()``.
+- ``cuda.core.experimental.system.devices`` has been replaced with
+  ``cuda.core.experimental.Device.get_all_devices()``.
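Editorial sketch of the migration pattern in the deprecation list above: an attribute read (``system.num_devices``) becomes a function call (``system.get_num_devices()``). The module ``fake_system``, the placeholder return values, and the module-level ``__getattr__`` shim are all assumptions made for illustration. They are not how ``cuda.core`` implements the deprecation, and the sketch needs no GPU.

```python
import types
import warnings

# Hypothetical stand-in for a "system" module. NOT cuda.core's real code.
fake_system = types.ModuleType("fake_system")


def get_driver_version():
    """New-style accessor; the tuple is a placeholder value."""
    return (13, 0)


def get_num_devices():
    """New-style accessor; the count is a placeholder value."""
    return 2


# Map each deprecated attribute name to the function that replaces it.
_DEPRECATED = {
    "driver_version": get_driver_version,
    "num_devices": get_num_devices,
}


def _module_getattr(name):
    """Module-level __getattr__ (PEP 562): serve old names with a warning."""
    if name in _DEPRECATED:
        replacement = _DEPRECATED[name]
        warnings.warn(
            f"{name} is deprecated; call {replacement.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return replacement()
    raise AttributeError(f"module 'fake_system' has no attribute {name!r}")


fake_system.get_driver_version = get_driver_version
fake_system.get_num_devices = get_num_devices
fake_system.__getattr__ = _module_getattr

# Old style: attribute access still works but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = fake_system.num_devices

# New style: an explicit function call, no warning.
new_value = fake_system.get_num_devices()
```

Calling code that adopts the function form ahead of time should need no further change when the deprecated attributes are removed in 0.6.0.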
+
+Other changes:
+
+- The :meth:`utils.StridedMemoryView.__init__` constructor is deprecated in favor of the new ``from_*`` classmethods, see below.
+- Support for Python 3.9 and 3.13t is dropped.
+
+
+New features
+------------
+
+- Added :class:`GraphMemoryResource` for allocating and deallocating memory when building a CUDA graph.
+- Added :class:`PinnedMemoryResource` and :class:`PinnedMemoryResourceOptions` for managing host-pinned memory pools with optional IPC support.
+- Added :class:`ManagedMemoryResource` and :class:`ManagedMemoryResourceOptions` for managing unified memory pools accessible from both host and device.
+- Added :meth:`Buffer.fill` method for efficient memory initialization, supporting ``int``, ``bytes``, and general buffer protocol objects.
+- :class:`Buffer` can now wrap external memory allocations with an owner object.
+- Added alternative constructors :meth:`~utils.StridedMemoryView.from_buffer`, :meth:`~utils.StridedMemoryView.from_dlpack`, and :meth:`~utils.StridedMemoryView.from_cuda_array_interface`
+  and a new property :attr:`~utils.StridedMemoryView.size` for :class:`~utils.StridedMemoryView`.
+- Added :meth:`ProgramOptions.as_bytes` and :meth:`LinkerOptions.as_bytes` public APIs for converting options to backend-specific byte representations.
+- Updated :class:`Device` constructor to accept either a :class:`Device` instance or a device ordinal (``int``).
+- Added :meth:`Device.get_all_devices` classmethod.
+- IPC-imported buffers can now be re-exported to other processes.
+
+
+New examples
+------------
+
+None.
+
+
+Fixes and enhancements
+----------------------
+
+- Most CUDA resources can be hashed now.
+- Python ``bool`` objects are now converted to C++ ``bool`` type when passed as kernel arguments (previously converted to ``int``).
+- Restored v0.3.x :class:`MemoryResource` behaviors and missing MR attributes for backward compatibility.
+- Added warning when multiprocessing start method is set to ``'fork'``.
+- Fixed potential memory leaks when DLPack capsule creation is interrupted.
+- Fixed :class:`VirtualMemoryResource` on Windows platforms.
+- Fixed NVRTC program name handling on Windows to avoid filesystem issues.
+- Improved test determinism by replacing OS sleep with GPU nanosleep kernel in event timing tests.
+- Fixed CUDA graph issues with ``cuda-python==12.6.*``.
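Editorial sketch for the ``bool`` kernel-argument note in the list above. In Python, ``bool`` is a subclass of ``int``, so any dispatch on argument type has to test for ``bool`` first, and the resulting C types usually differ in size. The helper ``pack_scalar`` is hypothetical and uses only the standard-library ``ctypes`` module; it is not ``cuda.core``'s argument-handling code.

```python
import ctypes


def pack_scalar(value):
    """Pick a C type for a Python scalar (hypothetical helper)."""
    # bool is a subclass of int, so this check must come before the int check;
    # otherwise True would silently be packed as a C int.
    if isinstance(value, bool):
        return ctypes.c_bool(value)
    if isinstance(value, int):
        return ctypes.c_int(value)
    raise TypeError(f"unsupported scalar type: {type(value).__name__}")


flag = pack_scalar(True)   # packed as a C bool
count = pack_scalar(1)     # packed as a C int

# On mainstream platforms a C bool is typically 1 byte and a C int 4 bytes,
# so a kernel parameter declared as bool expects the narrower value.
flag_size = ctypes.sizeof(flag)
count_size = ctypes.sizeof(count)
```

A kernel whose signature declares an ``int`` flag while the caller passes a Python ``bool`` may therefore need an explicit integer on the Python side after this change.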

cuda_core/docs/source/release/0.5.x-notes.rst

Lines changed: 0 additions & 48 deletions
This file was deleted.

cuda_core/pixi.toml

Lines changed: 1 addition & 1 deletion
@@ -68,7 +68,7 @@ cu12 = { features = ["cu12", "test", "cython-tests"], solve-group = "cu12" }
 # TODO: check if these can be extracted from pyproject.toml
 [package]
 name = "cuda-core"
-version = "0.4.2"
+version = "0.5.0"

 [package.build]
 backend = { name = "pixi-build-python", version = "*" }
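Editorial sketch: this commit bumps the version string by hand in both `_version.py` and `pixi.toml`, and the TODO comment above suggests the duplication is a known concern. One way to guard against the two drifting apart is a small consistency check like the one below. The function names are hypothetical, the inline strings stand in for the real file contents, and the parsing is a deliberately simple regex and line scan rather than a full TOML parser.

```python
import re

# Stand-ins for the contents of the two files touched by this commit.
VERSION_PY = '__version__ = "0.5.0"\n'
PIXI_TOML = (
    "[package]\n"
    'name = "cuda-core"\n'
    'version = "0.5.0"\n'
    "\n"
    "[package.build]\n"
    'backend = { name = "pixi-build-python", version = "*" }\n'
)


def version_from_py(text):
    """Extract the string assigned to __version__ in a Python source file."""
    match = re.search(r'^__version__\s*=\s*"([^"]+)"', text, re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def version_from_pixi(text):
    """Extract `version` from the [package] table only."""
    in_package = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Entering a new table; only [package] itself is of interest.
            in_package = stripped == "[package]"
            continue
        if in_package:
            match = re.match(r'version\s*=\s*"([^"]+)"', stripped)
            if match:
                return match.group(1)
    raise ValueError("no version found in [package] table")


def versions_in_sync(py_text, pixi_text):
    """Return True when both files declare the same version."""
    return version_from_py(py_text) == version_from_pixi(pixi_text)


in_sync = versions_in_sync(VERSION_PY, PIXI_TOML)
```

In a real checkout the two strings would be read from the files themselves, and a check like this could run in CI before tagging a release.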
