Description:
When an HPO embedding cache already exists, the process raises a “Can't initialize NVML” warning, and the computation falls back to the CPU (see Example 1).
However, if no embedding cache exists, the error does not occur and inference runs on the GPU as expected (see Example 2).
Environment:
This behavior was observed on a Slurm-managed HPC cluster (HMS Biogrid). The exact environment setup may be a contributing factor, but I’m not entirely sure.
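To help narrow down whether the Slurm allocation itself is the contributing factor, here is a minimal, dependency-free diagnostic I can run inside the same job before launching inference.py. All names here are illustrative, not part of PhenoGPT2:

```python
import os
import shutil

def gpu_env_report():
    """Collect environment details relevant to the NVML failure."""
    return {
        # Slurm normally sets CUDA_VISIBLE_DEVICES for GPU allocations;
        # an empty or missing value means no GPU was granted to the job.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "SLURM_JOB_ID": os.environ.get("SLURM_JOB_ID"),
        # If nvidia-smi is absent from PATH, the node (or container)
        # lacks the user-space driver tools, which also breaks NVML.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
    }

if __name__ == "__main__":
    for key, value in gpu_env_report().items():
        print(f"{key}: {value}")
```

If the report differs between a run that hits the cache and one that rebuilds it, the environment rather than the cache is the likely culprit.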
Example 1
Executing: python.phenogpt2 /home/ch262025/PhenoGPT2/inference.py -i "/home/ch262025/PhenoGPT2/data/example/task_list_subset.json" -o "/home/ch262025/PhenoGPT2/data/results/example_testing" -model_dir "/programs/local/biogrids/phenogpt2/models/PhenoGPT2-EHR" -index "0" -negation --text_only
/programs/x86_64-linux/phenogpt2/51acdf1/.pixi/envs/default/lib/python3.11/site-packages/torch/cuda/__init__.py:129: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 803: system has unsupported display driver / cuda driver combination (Triggered internally at /home/conda/feedstock_root/build_artifacts/libtorch_1744247799952/work/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
/programs/x86_64-linux/phenogpt2/51acdf1/.pixi/envs/default/lib/python3.11/site-packages/torch/cuda/__init__.py:734: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
`torch_dtype` is deprecated! Use `dtype` instead!
Detected existing HPO Database Embeddings
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:00<00:00, 79.09it/s]
start phenogpt2
/home/ch262025/PhenoGPT2/data/results/example_testing
use_vision: False
0%| | 0/10 [00:00<?, ?it/s]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
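The repeated attention-mask warnings above appear in both runs and are separate from the NVML issue: because the pad token equals the EOS token (128001), the mask cannot be inferred from the input ids and must be supplied explicitly. A plain-Python sketch of how such a mask is derived from right-padded token ids (token values are illustrative, no transformers dependency):

```python
PAD_TOKEN_ID = 128001  # same value as eos_token_id in the log above

def build_attention_mask(token_ids, pad_token_id=PAD_TOKEN_ID):
    """Mark real tokens with 1 and padding with 0.

    When pad == eos, a library cannot distinguish trailing padding from
    a genuine end-of-sequence token, which is exactly why the warning
    asks the caller to pass the mask rather than have it inferred.
    """
    return [0 if t == pad_token_id else 1 for t in token_ids]

# Illustrative batch: the second sequence is right-padded to length 5.
batch = [
    [101, 2009, 2003, 2307, 102],
    [101, 2307, 128001, 128001, 128001],
]
masks = [build_attention_mask(seq) for seq in batch]
# masks -> [[1, 1, 1, 1, 1], [1, 1, 0, 0, 0]]
```

Passing such a mask (e.g. via the `attention_mask` argument to `generate`) should silence these warnings, though it does not affect the CPU fallback.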
Example 2
Executing: python.phenogpt2 /home/ch262025/PhenoGPT2/inference.py -i "/home/ch262025/PhenoGPT2/data/example/task_list_subset.json" -o "/home/ch262025/PhenoGPT2/data/results/example_testing" -model_dir "/programs/local/biogrids/phenogpt2/models/PhenoGPT2-EHR" -index "0" -negation --text_only
No existing HPO Database Embeddings are stored - Running embedding now
Embedding HPO database:: 0%| | 0/40451 [00:00<?, ?it/s]Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
Embedding HPO database:: 0%| | 1/40451 [00:00<1:59:52, 5.62it/s]
Embedding HPO database:: 0%| | 19/40451 [00:00<08:13, 81.94it/s]
...
Embedding HPO database:: 100%|██████████| 40451/40451 [03:04<00:00, 219.69it/s]
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards: 25%|██▌ | 1/4 [00:01<00:03, 1.13s/it]
Loading checkpoint shards: 50%|█████ | 2/4 [00:02<00:02, 1.11s/it]
Loading checkpoint shards: 75%|███████▌ | 3/4 [00:03<00:01, 1.10s/it]
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00, 1.26it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00, 1.10it/s]
start phenogpt2
/home/ch262025/PhenoGPT2/data/results/example_testing
use_vision: False
0%| | 0/10 [00:00<?, ?it/s]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
2025-10-02 16:14:02.715 | DEBUG | PyRuSH.PyRuSHSentencizer:predict:100 - ....
...
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
Merging all chunks together: 0it [00:00, ?it/s]
Merging all chunks together: 1it [00:00, 32768.00it/s]
10%|█ | 1/10 [00:06<00:58, 6.53s/it]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.
...
Merging all chunks together: 0it [00:00, ?it/s]
Merging all chunks together: 1it [00:00, 38130.04it/s]
100%|██████████| 10/10 [00:38<00:00, 3.29s/it]
100%|██████████| 10/10 [00:38<00:00, 3.89s/it]
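For what it's worth, one way to rule out the cache file itself: the two branches should go through identical device-selection logic regardless of whether the cache is hit. A hedged pure-Python stand-in for such a load-or-build pattern (`embed_fn`, the pickle layout, and the function name are all hypothetical, not PhenoGPT2's actual code):

```python
import os
import pickle

def load_or_build_embeddings(cache_path, embed_fn, device):
    """Load cached embeddings if present, else build and cache them.

    The key property: `device` is decided by the caller on *every* run,
    so a cache hit and a cache miss share the same device-selection
    logic. In the reported bug the two branches apparently diverge,
    with the cache-hit path falling back to CPU.
    """
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            embeddings = pickle.load(f)
        # Re-homing the data here (e.g. tensor.to(device) in the real
        # code) would keep the cache portable across CPU and GPU runs.
    else:
        embeddings = embed_fn()
        with open(cache_path, "wb") as f:
            pickle.dump(embeddings, f)
    return embeddings, device
```

If the real loader instead restores device state saved at embed time (e.g. `torch.load` without `map_location`), that could explain why only the cache-hit path trips the driver check, but I have not confirmed this against the source.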