|
| 1 | +.. SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved. |
| 2 | +.. SPDX-License-Identifier: Apache-2.0 |
| 3 | + |
| 4 | +.. currentmodule:: cuda.core.experimental |
| 5 | + |
| 6 | +``cuda.core`` 0.5.0 Release Notes |
| 7 | +================================= |
| 8 | + |
| 9 | + |
| 10 | +Highlights |
| 11 | +---------- |
| 12 | + |
| 13 | +- Added memory management support (allocation, deallocation, copy, and fill) for CUDA graphs. |
| 14 | +- Added :class:`PinnedMemoryResource` and :class:`ManagedMemoryResource` for advanced memory management. |
| 15 | +- Added peer access control to :class:`DeviceMemoryResource`. |
| 16 | +- Reduced Python overhead and improved performance for calling :func:`launch`, constructing :class:`LaunchConfig`, and accessing :class:`DeviceMemoryResource` attributes. |
| 17 | + |
| 18 | + |
| 19 | +Breaking Changes |
| 20 | +---------------- |
| 21 | + |
| 22 | +Support for setting :attr:`VirtualMemoryResourceOptions.handle_type` to ``"win32"`` has been removed. Please reach out to us on GitHub if you have a use case for it. |
| 23 | + |
| 24 | +The following APIs have been deprecated and will be removed in 0.6.0: |
| 25 | + |
| 26 | +- ``cuda.core.experimental.system.driver_version`` has been replaced with |
| 27 | + ``cuda.core.experimental.system.get_driver_version()``. |
| 28 | +- ``cuda.core.experimental.system.num_devices`` has been replaced with |
| 29 | + ``cuda.core.experimental.system.get_num_devices()``. |
| 30 | +- ``cuda.core.experimental.system.devices`` has been replaced with |
| 31 | + ``cuda.core.experimental.Device.get_all_devices()``. |
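The three replacements listed above can be applied mechanically. The following is a minimal migration sketch, not an official recipe: ``describe_system`` is a hypothetical helper name, and the function returns ``None`` when ``cuda.core`` cannot be imported so that the snippet stays importable on machines without it. The return types of the new calls are not stated in these notes, so the sketch does not assume them.

```python
def describe_system():
    """Collect driver and device information using the 0.5.0 replacements.

    Returns None when cuda.core is not installed (an assumption made so the
    sketch can be imported anywhere); otherwise returns a 3-tuple.
    """
    try:
        from cuda.core.experimental import Device, system
    except ImportError:
        return None

    driver_version = system.get_driver_version()  # was: system.driver_version
    num_devices = system.get_num_devices()        # was: system.num_devices
    devices = Device.get_all_devices()            # was: system.devices
    return driver_version, num_devices, devices


if __name__ == "__main__":
    info = describe_system()
    if info is None:
        print("cuda.core is not installed")
    else:
        print(info)
```

Calling the helper on a machine without a working CUDA driver may still raise, since the queries themselves need the driver.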
| 32 | + |
| 33 | +Other changes: |
| 34 | + |
| 35 | +- The :meth:`utils.StridedMemoryView.__init__` constructor is deprecated in favor of the new ``from_*`` classmethods (see New features below). |
| 36 | +- Support for Python 3.9 and 3.13t has been dropped. |
| 37 | + |
| 38 | + |
| 39 | +New features |
| 40 | +------------ |
| 41 | + |
| 42 | +- Added :class:`GraphMemoryResource` for allocating and deallocating memory when building a CUDA graph. |
| 43 | +- Added :class:`PinnedMemoryResource` and :class:`PinnedMemoryResourceOptions` for managing host-pinned memory pools with optional IPC support. |
| 44 | +- Added :class:`ManagedMemoryResource` and :class:`ManagedMemoryResourceOptions` for managing unified memory pools accessible from both host and device. |
| 45 | +- Added :meth:`Buffer.fill` for efficient memory initialization, supporting ``int``, ``bytes``, and general buffer protocol objects. |
| 46 | +- :class:`Buffer` can now wrap external memory allocations with an owner object. |
| 47 | +- Added alternative constructors :meth:`~utils.StridedMemoryView.from_buffer`, :meth:`~utils.StridedMemoryView.from_dlpack`, and :meth:`~utils.StridedMemoryView.from_cuda_array_interface` |
| 48 | + and a new property :attr:`~utils.StridedMemoryView.size` for :class:`~utils.StridedMemoryView`. |
| 49 | +- Added :meth:`ProgramOptions.as_bytes` and :meth:`LinkerOptions.as_bytes` public APIs for converting options to backend-specific byte representations. |
| 50 | +- Updated :class:`Device` constructor to accept either a :class:`Device` instance or a device ordinal (``int``). |
| 51 | +- Added :meth:`Device.get_all_devices` classmethod. |
| 52 | +- IPC-imported buffers can now be re-exported to other processes. |
| 53 | + |
| 54 | + |
| 55 | +New examples |
| 56 | +------------ |
| 57 | + |
| 58 | +None. |
| 59 | + |
| 60 | + |
| 61 | +Fixes and enhancements |
| 62 | +---------------------- |
| 63 | + |
| 64 | +- Most CUDA resource objects are now hashable. |
| 65 | +- Python ``bool`` objects are now converted to C++ ``bool`` type when passed as kernel arguments (previously converted to ``int``). |
| 66 | +- Restored v0.3.x :class:`MemoryResource` behaviors and previously missing memory resource attributes for backward compatibility. |
| 67 | +- Added a warning when the multiprocessing start method is set to ``'fork'``. |
| 68 | +- Fixed potential memory leaks when DLPack capsule creation is interrupted. |
| 69 | +- Fixed :class:`VirtualMemoryResource` on Windows platforms. |
| 70 | +- Fixed NVRTC program name handling on Windows to avoid filesystem issues. |
| 71 | +- Improved test determinism by replacing OS sleep with a GPU nanosleep kernel in event timing tests. |
| 72 | +- Fixed CUDA graph issues with ``cuda-python==12.6.*``. |
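The ``'fork'`` warning listed above concerns how worker processes are started. The following is a minimal sketch of one way to avoid the ``'fork'`` start method, assuming the application controls process creation and that ``'spawn'`` is acceptable for it. ``make_spawn_context`` is a hypothetical helper name, not part of ``cuda.core``, and the sketch uses only the standard library.

```python
import multiprocessing as mp


def make_spawn_context():
    """Return a multiprocessing context whose workers start via 'spawn'.

    Using a context leaves the process-wide default start method untouched.
    """
    return mp.get_context("spawn")


if __name__ == "__main__":
    ctx = make_spawn_context()
    # Processes, queues, and pools created from ctx use 'spawn', not 'fork'.
    print(ctx.get_start_method())
```

Creating processes through the returned context (for example ``ctx.Process(...)``) keeps the choice local to the code that needs it, rather than changing the default for the whole program.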