
Solution for CUDA out of memory, trying to allocate 133792483MiB #27


Description

@liuyifan22

Thanks to the authors for the great work. I have run into this "trying to allocate several hundred TB of GPU memory" error before.

This often happens because you compiled the lib on one type of GPU and then call it on another type. If the two GPU types do not share the same architecture, the call fails and requests an amount of memory that only a top cluster could provide.

The solution is: uninstall the current lib, remove the built wheels entirely (you can delete the whole folder and clone it again), and set this:

export TORCH_CUDA_ARCH_LIST="8.0;8.6;8.9;9.0"

Choose the architectures according to your needs. For example, if your test machine has compute capability 8.0 and your compute machine has 9.0, write "8.0;9.0".
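As a small illustration of how that value is put together, here is a hypothetical helper (not part of PyTorch) that formats compute-capability tuples into a TORCH_CUDA_ARCH_LIST string. On a machine with PyTorch installed you could collect the tuples via torch.cuda.get_device_capability(); here they are passed in directly:

```python
# Hypothetical helper: build a TORCH_CUDA_ARCH_LIST value from
# (major, minor) compute-capability tuples of every GPU you target.
import os


def arch_list(capabilities):
    """Deduplicate, sort, and format capabilities as e.g. '8.0;9.0'."""
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


# e.g. test machine is 8.0, compute cluster is 9.0
value = arch_list([(8, 0), (9, 0), (8, 0)])
print(value)  # 8.0;9.0

# Set it for the current process before rebuilding the extension
os.environ["TORCH_CUDA_ARCH_LIST"] = value
```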

The solution comes from https://blog.csdn.net/m0_57143158/article/details/143103426?spm=1001.2014.3001.8078#comments_35737374
(Chinese readers can refer to that post directly.)
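Putting the steps above together, the rebuild procedure might look like the following sketch. The repository URL and package name are placeholders, not the actual lib from this issue; substitute your own:

```shell
# Sketch of the rebuild, assuming the extension is cloned as ./the-extension
# and installs a package named the-extension (both names are placeholders).
pip uninstall -y the-extension      # remove the currently installed lib
rm -rf the-extension                # delete the old checkout and its built wheels
git clone https://github.com/someone/the-extension.git
cd the-extension

# List every compute capability you will run on, e.g. 8.0 and 9.0
export TORCH_CUDA_ARCH_LIST="8.0;9.0"

pip install .                       # recompiles the CUDA kernels for both archs
```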
