CUDA out of memory #5

@kimyoungji

Description

I ran unconditioned inference as follows:

$ python main/test.py --cfg_dir utils/config/samples/cascaded_ldm

and got the following error:

Traceback (most recent call last):
File "main/test.py", line 58, in
trainer.run_test()
File "/home/diffindscene/trainer/cascaded_ldm_trainer.py", line 244, in run_test
self.run_test_uncond()
File "/home/diffindscene/trainer/cascaded_ldm_trainer.py", line 306, in run_test_uncond
outdata_dict = model.restoration(data)
File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/diffindscene/model/ms_ldm/multiscale_latent_diffusion.py", line 451, in restoration
occ_l2 = self.decode_occ_lv2(result, occ_l1, quant2_voxel)
File "/home/diffindscene/model/ms_ldm/multiscale_latent_diffusion.py", line 340, in decode_occ_lv2
quant1, diff_b, id_b = self.first_stage_module.quantize1(quant1_)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/diffindscene/model/auto_encoder/ms_vqgan/quantize.py", line 102, in forward
soft_one_hot = F.gumbel_softmax(logits, tau=temp, dim=1, hard=hard)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py", line 1902, in gumbel_softmax
ret = y_hard - y_soft.detach() + y_soft
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 23.65 GiB total capacity; 20.89 GiB already allocated; 1.23 GiB free; 21.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
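
Following the hint at the end of the error message, one option that could be tried is the max_split_size_mb setting of the PyTorch caching allocator (the value 128 below is only an illustrative guess):

$ PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 python main/test.py --cfg_dir utils/config/samples/cascaded_ldm

That said, fragmentation may not be the root cause here, since the failed allocation is 4.00 GiB against only about 1.23 GiB free.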
