Question: How do we monitor pod/process GPU usage? #20

Thank you for your dedication to developing a GPU memory oversubscription solution, which has been immensely beneficial to our work.

I've run local tests with several processes; however, the GPU utilization data reported by nvidia-smi appears to be rather coarse-grained. After reviewing the README, I didn't find a finer-grained monitoring approach, such as Prometheus metrics.

Could you offer some suggestions for monitoring GPU usage by individual pods and processes?
