Comparison with expandable_segments in pytorch/c10? #12

https://github.com/pytorch/pytorch/pull/96995

https://github.com/pytorch/pytorch/blob/95a86ed9ca107329151e0dc172386d50dd3471c6/c10/cuda/CUDACachingAllocator.cpp#L311-L324
> The expandable_segments:True option is used to enable/disable this behavior. We
use cuda's low-level memory APIs, which are similar to mmap, to extend the
memory segments. These APIs separate the allocation of physical memory
(cuMemCreate) from the allocation of virtual address space (cuMemAddressReserve)
and the associate between them cuMemMap/cuMemSetAccess.
>
> When we allocate a new segment, we allocate enough address space to map
basically the entire physical memory of the GPU (there is 256TiB of address
space), but we only map enough physical memory to handle the current amount of
memory needed by the program. As more is requested, we add more physical memory
to the segment. This can work at the granularity of GPU pages which are 2MiB
currently.
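
To make the comparison concrete, here is a rough Python model of the bookkeeping the quoted comment describes: one large virtual range reserved up front, with physical pages mapped into it only as demand grows, in 2 MiB steps. It is only a sketch. `ExpandableSegmentModel`, `ensure_mapped`, and `round_up_to_pages` are hypothetical names for illustration, not PyTorch or CUDA APIs, and no GPU memory is touched. The comments note which driver call each step would loosely correspond to.

```python
from dataclasses import dataclass

# 2 MiB GPU page granularity, as stated in the quoted comment.
PAGE_SIZE = 2 * 1024 * 1024


def round_up_to_pages(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Return the number of whole pages needed to cover nbytes."""
    return (nbytes + page_size - 1) // page_size


@dataclass
class ExpandableSegmentModel:
    """Toy model (hypothetical): a reserved range with a growing mapped prefix."""

    reserved_bytes: int          # analogous to the range from cuMemAddressReserve
    page_size: int = PAGE_SIZE
    mapped_pages: int = 0        # pages currently backed by physical memory

    @property
    def mapped_bytes(self) -> int:
        return self.mapped_pages * self.page_size

    def ensure_mapped(self, needed_bytes: int) -> int:
        """Grow the mapped prefix to cover needed_bytes; return pages added.

        In a real allocator each added page would roughly correspond to
        cuMemCreate + cuMemMap + cuMemSetAccess on the next slice of the range.
        """
        if needed_bytes > self.reserved_bytes:
            raise MemoryError("request exceeds the reserved address range")
        target_pages = round_up_to_pages(needed_bytes, self.page_size)
        added = max(0, target_pages - self.mapped_pages)
        self.mapped_pages += added
        return added


# Reserve a large virtual range; nothing is mapped yet.
seg = ExpandableSegmentModel(reserved_bytes=80 * 1024**3)
first = seg.ensure_mapped(3 * 1024 * 1024)   # 3 MiB needs 2 pages
second = seg.ensure_mapped(5 * 1024 * 1024)  # 5 MiB needs 3 pages, so 1 more
```

The point of the model is that the segment's base address never changes while it grows, which is what separates this approach from allocating a fresh, separate segment per request. In PyTorch itself the behavior is switched on through the allocator configuration, e.g. `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`.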

