Handle multiple GPUs in CDI spec generation from CSV #1461
elezar merged 9 commits into NVIDIA:main
Conversation
Force-pushed from 2f9fcb8 to 57ef289
ArangoGutierrez
left a comment
LGTM, just 2 non-blocking nits
Force-pushed from b12a6f1 to b827872
Pull Request Test Coverage Report for Build 20024248479 (Coveralls)
ArangoGutierrez
left a comment
LGTM - I'll now proceed to review the spin-off PRs
Force-pushed from b416704 to a8d7b65
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the way we construct a discoverer for Tegra systems to be more flexible in terms of how the sources of the mount specs are specified. This allows for subsequent changes such as adding (or removing) mount specs at the point of construction. Signed-off-by: Evan Lezar <elezar@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows CDI specs to be generated for multiple devices when using CSV mode. This can be used in cases where a Tegra-based system consists of an iGPU and a dGPU. This behavior can be opted out of using the disable-multiple-csv-devices feature flag. This can be specified by adding the --feature-flags=disable-multiple-csv-devices command line option to the nvidia-ctk cdi generate command, or for the automatic CDI spec generation by adding NVIDIA_CTK_CDI_GENERATE_FEATURE_FLAGS=disable-multiple-csv-devices to the /etc/nvidia-container-toolkit/nvidia-cdi-refresh.env file. Signed-off-by: Evan Lezar <elezar@nvidia.com>
```go
func isIntegratedGPUID(id device.Identifier) bool {
	_, err := uuid.Parse(string(id))
	return err == nil
}
```
Question -- would this method not also return true for a discrete GPU identifier? The isIntegratedGPUID method name seems to indicate this check is unique to integrated GPUs...
EDIT: Okay, I think I see why this method is needed. Based on reading other parts of the code, I am assuming id.IsGpuUUID() returns false for integrated GPU UUIDs? Some more context would help here.
This is one of those heuristics I keep mentioning. Currently, UUIDs for discrete GPUs have a GPU- prefix (MIG- for MIG devices), and on all Tegra-based systems that I have had access to the UUID is a "standard" UUID for example:
$ nvidia-smi -L
GPU 0: Orin (nvgpu) (UUID: 1833c8b5-9aa0-5382-b784-68b7e77eb185)
We have been pushing the NVML team for an "IsIntegrated" API, but have not had a commitment.
Let me update the function comment.
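The prefix-based heuristic described here can be sketched as a standalone illustration (this is not the toolkit's actual implementation, which parses the identifier with a UUID library; the regular expression below just stands in for that parse):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// plainUUID matches a bare UUID with no vendor prefix, e.g. the
// "standard" UUIDs reported by Tegra iGPUs.
var plainUUID = regexp.MustCompile(`^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$`)

// isIntegratedGPUID applies the heuristic from the discussion above:
// discrete GPU UUIDs carry a "GPU-" prefix (MIG devices a "MIG-" prefix),
// while the Tegra-based systems observed so far report a bare UUID.
func isIntegratedGPUID(id string) bool {
	if strings.HasPrefix(id, "GPU-") || strings.HasPrefix(id, "MIG-") {
		return false
	}
	return plainUUID.MatchString(id)
}

func main() {
	// UUID from the nvidia-smi -L output above: bare UUID, so integrated.
	fmt.Println(isIntegratedGPUID("1833c8b5-9aa0-5382-b784-68b7e77eb185")) // true
	// A hypothetical discrete GPU identifier with the "GPU-" prefix.
	fmt.Println(isIntegratedGPUID("GPU-1833c8b5-9aa0-5382-b784-68b7e77eb185")) // false
}
```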
Ah okay, this makes sense now! Thanks for the additional context and raising the PR.
```go
if pciInfo.Bus != 1 {
	return false, nil
}
return pciInfo.Device == 0, nil
```
Question -- does this mean that integrated GPUs (even though they are not attached to the PCI bus) always appear to have a PCI address of 0000:01:00 (domain:bus:device)?
nit: as a reader, this may be easier to grok if rewritten as

```go
if pciInfo.Domain == 0 && pciInfo.Bus == 1 && pciInfo.Device == 0 {
	return true, nil
}
return false, nil
```
At least in the case of Thor-based systems that I have had access to, this has been the case. Orin-based systems that I have had access to do not support getting PCI information. I will update the implementation for clarity.
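The clarified check could look something like the following sketch (the pciInfo struct and function name here are stand-ins, not the toolkit's real types):

```go
package main

import "fmt"

// pciInfo is a stand-in for the PCI properties returned by the device
// library; the actual type in the toolkit differs.
type pciInfo struct {
	Domain uint32
	Bus    uint32
	Device uint32
}

// isIntegratedGPUPCIAddress encodes the observation above: on the
// Thor-based systems inspected, the iGPU shows up at PCI address
// 0000:01:00 (domain:bus:device).
func isIntegratedGPUPCIAddress(p pciInfo) bool {
	return p.Domain == 0 && p.Bus == 1 && p.Device == 0
}

func main() {
	fmt.Println(isIntegratedGPUPCIAddress(pciInfo{Domain: 0, Bus: 1, Device: 0})) // true
	fmt.Println(isIntegratedGPUPCIAddress(pciInfo{Domain: 0, Bus: 2, Device: 0})) // false
}
```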
```go
		csvDeviceNodeDiscoverer,
	},
	featureFlags: l.featureFlags,
})
```
Question -- What are the differences between the device specs generated for dGPUs and iGPUs? Is the addition of control device nodes (e.g. nvidiactl, nvidia-uvm) the main difference?
The device specs generated for iGPUs depend entirely on the contents of the /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv file that is constructed by the platform team. For example, on an Orin-based system I have:
```
$ cat /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv
dev, /dev/dri/card*
dev, /dev/dri/renderD*
dir, /dev/dri/by-path
dev, /dev/fb0
dev, /dev/fb1
dev, /dev/host1x-fence
dev, /dev/nvhost-as-gpu
dev, /dev/nvhost-ctrl-gpu
dev, /dev/nvhost-ctrl-nvdla0
dev, /dev/nvhost-ctrl-nvdla1
dev, /dev/nvhost-ctrl-pva0
dev, /dev/nvhost-ctxsw-gpu
dev, /dev/nvhost-dbg-gpu
dev, /dev/nvhost-gpu
dev, /dev/nvhost-nvsched-gpu
dev, /dev/nvhost-power-gpu
dev, /dev/nvhost-prof-ctx-gpu
dev, /dev/nvhost-prof-dev-gpu
dev, /dev/nvhost-prof-gpu
dev, /dev/nvhost-sched-gpu
dev, /dev/nvhost-tsg-gpu
dev, /dev/nvgpu/igpu0/as
dev, /dev/nvgpu/igpu0/channel
dev, /dev/nvgpu/igpu0/ctrl
dev, /dev/nvgpu/igpu0/ctxsw
dev, /dev/nvgpu/igpu0/dbg
dev, /dev/nvgpu/igpu0/nvsched
dev, /dev/nvgpu/igpu0/power
dev, /dev/nvgpu/igpu0/prof
dev, /dev/nvgpu/igpu0/prof-ctx
dev, /dev/nvgpu/igpu0/prof-dev
dev, /dev/nvgpu/igpu0/sched
dev, /dev/nvgpu/igpu0/tsg
dev, /dev/nvidia-modeset
dev, /dev/nvidia0
dev, /dev/nvidiactl
dev, /dev/nvmap
dev, /dev/nvsciipc
dev, /dev/v4l2-nvdec
dev, /dev/v4l2-nvenc
```
This file is provided by the nvidia-l4t-init package:
```
$ dpkg -S /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv
nvidia-l4t-init: /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv
```
Note that this includes /dev/nvidia0 and /dev/nvidiactl for this system. In the case of Thor-based systems, this would include /dev/nvidia0, /dev/nvidia1, and /dev/nvidiactl.
For the purpose of this discussion, then, the primary difference between the device nodes for the two devices is that the dGPU includes the /dev/nvidia-uvm and /dev/nvidia-uvm-tools devices that are required for actually running CUDA applications. On a Thor-based system using nvgpu, the container also needs access to the OTHER device nodes mentioned in the CSV file. We currently include all of them, but this list could probably be reduced.
Also note that on a Thor-based system that includes a dGPU, the second (rendering) device node for the iGPU is /dev/nvidia2 and NOT /dev/nvidia1.
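For illustration, a minimal parser for lines in this CSV format might look like the following sketch (the real toolkit parser handles more mount-spec types and edge cases; the names here are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// mountSpec is a simplified view of one devices.csv entry: a type
// ("dev", "dir", ...) and a host path.
type mountSpec struct {
	Type string
	Path string
}

// parseCSVLine splits a "type, /path" line from devices.csv into its
// two fields, reporting false for lines that do not have both.
func parseCSVLine(line string) (mountSpec, bool) {
	parts := strings.SplitN(line, ",", 2)
	if len(parts) != 2 {
		return mountSpec{}, false
	}
	return mountSpec{
		Type: strings.TrimSpace(parts[0]),
		Path: strings.TrimSpace(parts[1]),
	}, true
}

func main() {
	lines := []string{"dev, /dev/nvidia0", "dir, /dev/dri/by-path", "malformed"}
	for _, line := range lines {
		if spec, ok := parseCSVLine(line); ok {
			fmt.Printf("%s -> %s\n", spec.Type, spec.Path)
		}
	}
}
```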
```go
	// device level.
	additionalDiscoverers: []discover.Discover{
		(*nvmllib)(l).controlDeviceNodeDiscoverer(),
		csvDeviceNodeDiscoverer,
```
Question -- Conceptually speaking, why do we have to add the csvDeviceNodeDiscoverer here? I ask since the fullGPUDeviceSpecGenerator will, by default, construct and use a device node discoverer here.
We add this because, in addition to the "standard" dGPU device nodes that are returned by the fullGPUDiscoverer that we construct as linked, we ALSO need access to (at least some of) the device nodes defined in the CSV file. The csvDeviceNodeDiscoverer in this case should be filtering out the specific device nodes (e.g. /dev/nvidia0 and /dev/nvidia2) associated with the iGPU.
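The filtering described here can be sketched as a simple exclusion over the CSV-derived device-node paths (the function and names below are illustrative, not the toolkit's actual API):

```go
package main

import "fmt"

// filterOutPaths returns the device-node paths with the given paths
// removed, e.g. dropping the iGPU's /dev/nvidia0 and /dev/nvidia2
// entries when those are already handled at the device level.
func filterOutPaths(paths []string, exclude ...string) []string {
	skip := make(map[string]bool, len(exclude))
	for _, e := range exclude {
		skip[e] = true
	}
	var out []string
	for _, p := range paths {
		if !skip[p] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	nodes := []string{"/dev/nvidia0", "/dev/nvidia2", "/dev/nvmap", "/dev/nvidia-modeset"}
	fmt.Println(filterOutPaths(nodes, "/dev/nvidia0", "/dev/nvidia2"))
}
```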