
Handle multiple GPUs in CDI spec generation from CSV#1461

Merged
elezar merged 9 commits into NVIDIA:main from elezar:dgpu-on-nvgpu on Dec 8, 2025
Conversation

@elezar (Member) commented Nov 17, 2025

This change allows CDI specs to be generated for multiple
devices when using CSV mode. This can be used in cases where
a Tegra-based system contains both an iGPU and a dGPU.

This behavior can be disabled with the disable-multiple-csv-devices
feature flag. This can be specified by adding the

            --feature-flags=disable-multiple-csv-devices

command line option to the nvidia-ctk cdi generate command or to the
automatic CDI spec generation by adding

    NVIDIA_CTK_CDI_GENERATE_FEATURE_FLAGS=disable-multiple-csv-devices

to the /etc/nvidia-container-toolkit/nvidia-cdi-refresh.env file.
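The flag value is a comma-separated list, so it can be checked in the usual way. The following Go sketch illustrates the idea; hasFeatureFlag is a hypothetical helper for illustration, not the toolkit's actual parser:

```go
package main

import (
	"fmt"
	"strings"
)

// hasFeatureFlag reports whether name appears in a comma-separated
// feature-flag list, such as the value of the (assumed) environment
// variable NVIDIA_CTK_CDI_GENERATE_FEATURE_FLAGS. The exact parsing
// rules in the toolkit may differ.
func hasFeatureFlag(flags string, name string) bool {
	for _, f := range strings.Split(flags, ",") {
		if strings.TrimSpace(f) == name {
			return true
		}
	}
	return false
}

func main() {
	flags := "disable-multiple-csv-devices"
	fmt.Println(hasFeatureFlag(flags, "disable-multiple-csv-devices"))
}
```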

@ArangoGutierrez (Collaborator) left a comment

LGTM, just 2 non-blocking nits

@coveralls commented Dec 3, 2025

Pull Request Test Coverage Report for Build 20024248479

Details

  • 30 of 366 (8.2%) changed or added relevant lines in 9 files are covered.
  • 4 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.6%) to 37.031%

Changes Missing Coverage (Covered Lines / Changed or Added Lines / %):
  internal/platform-support/tegra/tegra.go        0 / 7    0.0%
  internal/platform-support/tegra/csv.go          4 / 12   33.33%
  pkg/nvcdi/full-gpu-nvml.go                      0 / 9    0.0%
  pkg/nvcdi/common-nvml.go                        0 / 13   0.0%
  internal/platform-support/tegra/options.go      0 / 28   0.0%
  internal/platform-support/tegra/filter.go       10 / 44  22.73%
  internal/platform-support/tegra/mount_specs.go  15 / 63  23.81%
  pkg/nvcdi/lib-csv.go                            0 / 189  0.0%

Files with Coverage Reduction (New Missed Lines / %):
  internal/platform-support/tegra/tegra.go  1  0.0%
  pkg/nvcdi/lib-csv.go                      1  0.0%
  pkg/nvcdi/full-gpu-nvml.go                2  18.85%

Totals:
  Change from base Build 20024159196: -0.6%
  Covered Lines: 5197
  Relevant Lines: 14034

💛 - Coveralls

@elezar (Member, Author) commented Dec 3, 2025

I have split two of the commits originally included here into their own PRs: #1511 and #1512

ArangoGutierrez previously approved these changes Dec 4, 2025

@ArangoGutierrez (Collaborator) left a comment

LGTM - I'll now proceed to review the spin-off PRs

@elezar elezar force-pushed the dgpu-on-nvgpu branch 2 times, most recently from b416704 to a8d7b65 on December 4, 2025 13:25
@ArangoGutierrez ArangoGutierrez self-requested a review December 4, 2025 13:46
@ArangoGutierrez ArangoGutierrez dismissed their stale review December 4, 2025 13:47

Updated commits

@elezar elezar added the tegra label Dec 8, 2025
Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change updates the way we construct a discoverer for tegra systems
to be more flexible in terms of how the SOURCES of the mount specs can
be specified. This allows for subsequent changes like adding (or removing)
mount specs at the point of construction.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change allows CDI specs to be generated for multiple
devices when using CSV mode. This can be used in cases where
a Tegra-based system contains both an iGPU and a dGPU.

This behavior can be disabled with the disable-multiple-csv-devices
feature flag. This can be specified by adding the

	--feature-flags=disable-multiple-csv-devices

command line option to the nvidia-ctk cdi generate command or to the
automatic CDI spec generation by adding

NVIDIA_CTK_CDI_GENERATE_FEATURE_FLAGS=disable-multiple-csv-devices

to the /etc/nvidia-container-toolkit/nvidia-cdi-refresh.env file.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@ArangoGutierrez (Collaborator) left a comment

LGTM

@elezar elezar merged commit 923fa9b into NVIDIA:main Dec 8, 2025
16 checks passed
@elezar elezar deleted the dgpu-on-nvgpu branch December 8, 2025 13:27
Comment on lines +295 to +298
func isIntegratedGPUID(id device.Identifier) bool {
_, err := uuid.Parse(string(id))
return err == nil
}
Contributor:

Question -- would this method not also return true for a discrete GPU identifier? The isIntegratedGPUID method name seems to indicate this is unique to integrated GPUs...

EDIT: Okay I think I see why this method is needed. Based on reading other parts of the code, I am assuming id.IsGpuUUID() returns false for integrated GPU UUIDs? Some more context would help here.

Member (Author):

This is one of those heuristics I keep mentioning. Currently, UUIDs for discrete GPUs have a GPU- prefix (MIG- for MIG devices), and on all Tegra-based systems that I have had access to the UUID is a "standard" UUID for example:

$ nvidia-smi -L
GPU 0: Orin (nvgpu) (UUID: 1833c8b5-9aa0-5382-b784-68b7e77eb185)

We have been pushing the NVML team for an "IsIntegrated" API, but have not had a commitment.

Let me update the function comment.
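The heuristic described here can be sketched in standalone Go: since discrete GPU UUIDs carry a GPU- prefix (MIG- for MIG devices), an identifier that parses as a plain RFC 4122-style UUID is assumed to belong to an integrated GPU. This is an illustrative reimplementation, not the toolkit's code, and it uses a regexp rather than the uuid package so it stays stdlib-only:

```go
package main

import (
	"fmt"
	"regexp"
)

// plainUUID matches a bare RFC 4122-style UUID with no vendor prefix.
var plainUUID = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// isIntegratedGPUID mirrors the heuristic described above: a "GPU-" or
// "MIG-" prefixed UUID fails the plain-UUID check and is treated as a
// discrete (or MIG) device; a bare UUID is assumed to be an iGPU.
func isIntegratedGPUID(id string) bool {
	return plainUUID.MatchString(id)
}

func main() {
	// Example UUID from the nvidia-smi output quoted above.
	fmt.Println(isIntegratedGPUID("1833c8b5-9aa0-5382-b784-68b7e77eb185")) // prints: true
	fmt.Println(isIntegratedGPUID("GPU-1833c8b5-9aa0-5382-b784-68b7e77eb185"))
}
```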

Member (Author):

Created #1674 to update the comment.

Contributor:

Ah okay, this makes sense now! Thanks for the additional context and raising the PR.

if pciInfo.Bus != 1 {
return false, nil
}
return pciInfo.Device == 0, nil
Contributor:

Question -- does this mean that integrated GPUs (even though they are not attached to the PCI bus) always appear to have a PCI address of 0000:01:00 (domain:bus:device)?

nit: as a reader, this may be easier to grok if rewritten as

Suggested change
return pciInfo.Device == 0, nil
if pciInfo.Domain == 0 && pciInfo.Bus == 1 && pciInfo.Device == 0 {
return true, nil
}
return false, nil

Member (Author):

At least in the case of Thor-based systems that I have had access to, this has been the case. Orin-based systems that I have had access to do not support getting PCI information. I will update the implementation for clarity.
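The combined form the reviewer suggests can be sketched as a standalone function. The pciInfo struct and function name below are stand-ins for illustration; the heuristic (iGPU at PCI address 0000:01:00, domain:bus:device) is only known to hold on the Thor-based systems discussed above:

```go
package main

import "fmt"

// pciInfo is a minimal stand-in for the NVML PCI info structure used in
// the diff; only the fields relevant to the heuristic are included.
type pciInfo struct {
	Domain uint32
	Bus    uint32
	Device uint32
}

// isIntegratedPCIAddress reports whether the device sits at PCI address
// 0000:01:00, which on the observed Thor-based systems is where the
// iGPU appears. This is a sketch, not the merged implementation.
func isIntegratedPCIAddress(p pciInfo) bool {
	return p.Domain == 0 && p.Bus == 1 && p.Device == 0
}

func main() {
	fmt.Println(isIntegratedPCIAddress(pciInfo{Domain: 0, Bus: 1, Device: 0}))
	fmt.Println(isIntegratedPCIAddress(pciInfo{Domain: 0, Bus: 1, Device: 1}))
}
```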

csvDeviceNodeDiscoverer,
},
featureFlags: l.featureFlags,
})
Contributor:

Question -- What are the differences between the device specs generated for dGPUs and iGPUs? Is the addition of control device nodes (e.g. nvidiactl, nvidia-uvm) the main difference?

Member (Author):

The device specs generated for iGPUs depend entirely on the contents of the /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv file that is constructed by the platform team. For example, on an Orin-based system I have:

$ cat /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv
dev, /dev/dri/card*
dev, /dev/dri/renderD*
dir, /dev/dri/by-path
dev, /dev/fb0
dev, /dev/fb1
dev, /dev/host1x-fence
dev, /dev/nvhost-as-gpu
dev, /dev/nvhost-ctrl-gpu
dev, /dev/nvhost-ctrl-nvdla0
dev, /dev/nvhost-ctrl-nvdla1
dev, /dev/nvhost-ctrl-pva0
dev, /dev/nvhost-ctxsw-gpu
dev, /dev/nvhost-dbg-gpu
dev, /dev/nvhost-gpu
dev, /dev/nvhost-nvsched-gpu
dev, /dev/nvhost-power-gpu
dev, /dev/nvhost-prof-ctx-gpu
dev, /dev/nvhost-prof-dev-gpu
dev, /dev/nvhost-prof-gpu
dev, /dev/nvhost-sched-gpu
dev, /dev/nvhost-tsg-gpu
dev, /dev/nvgpu/igpu0/as
dev, /dev/nvgpu/igpu0/channel
dev, /dev/nvgpu/igpu0/ctrl
dev, /dev/nvgpu/igpu0/ctxsw
dev, /dev/nvgpu/igpu0/dbg
dev, /dev/nvgpu/igpu0/nvsched
dev, /dev/nvgpu/igpu0/power
dev, /dev/nvgpu/igpu0/prof
dev, /dev/nvgpu/igpu0/prof-ctx
dev, /dev/nvgpu/igpu0/prof-dev
dev, /dev/nvgpu/igpu0/sched
dev, /dev/nvgpu/igpu0/tsg
dev, /dev/nvidia-modeset
dev, /dev/nvidia0
dev, /dev/nvidiactl
dev, /dev/nvmap
dev, /dev/nvsciipc
dev, /dev/v4l2-nvdec
dev, /dev/v4l2-nvenc

This file is provided by the nvidia-l4t-init package:

$ dpkg -S  /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv
nvidia-l4t-init: /etc/nvidia-container-runtime/host-files-for-container.d/devices.csv

Note that this includes /dev/nvidia0 and /dev/nvidiactl for this system. In the case of Thor-systems, this would include /dev/nvidia0, /dev/nvidia1, and /dev/nvidiactl.

For the purpose of this discussion, then, the primary difference between the device nodes for the two devices is that the dGPU includes the /dev/nvidia-uvm and /dev/nvidia-uvm-tools devices that are required for actually running CUDA applications. On a Thor-based system using nvgpu, the container also needs access to the OTHER device nodes mentioned in the CSV file. We currently include all of them, but this list could probably be reduced.

Also note that on a Thor-based system that includes a dGPU, the second (rendering) device node for the iGPU is /dev/nvidia2 and NOT /dev/nvidia1.
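The devices.csv format quoted above is simple: one "type, path" pair per line. A minimal Go reader for that shape might look like the following; mountSpec and parseCSV are illustrative names, not the toolkit's actual CSV parser:

```go
package main

import (
	"fmt"
	"strings"
)

// mountSpec represents one line of a host-files-for-container.d CSV
// file: a type ("dev", "dir", "lib", ...) and a host path.
type mountSpec struct {
	Type string
	Path string
}

// parseCSV parses "type, path" lines as shown in the devices.csv
// example above, skipping blank lines and comments.
func parseCSV(contents string) []mountSpec {
	var specs []mountSpec
	for _, line := range strings.Split(contents, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		parts := strings.SplitN(line, ",", 2)
		if len(parts) != 2 {
			continue
		}
		specs = append(specs, mountSpec{
			Type: strings.TrimSpace(parts[0]),
			Path: strings.TrimSpace(parts[1]),
		})
	}
	return specs
}

func main() {
	csv := "dev, /dev/nvidia0\ndev, /dev/nvidiactl\ndir, /dev/dri/by-path\n"
	for _, s := range parseCSV(csv) {
		fmt.Printf("%s -> %s\n", s.Type, s.Path)
	}
}
```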

// device level.
additionalDiscoverers: []discover.Discover{
(*nvmllib)(l).controlDeviceNodeDiscoverer(),
csvDeviceNodeDiscoverer,
@cdesiniotis (Contributor) commented Feb 22, 2026

Question -- Conceptually speaking, why do we have to add the csvDeviceNodeDiscoverer here? I ask since the fullGPUDeviceSpecGenerator will, by default, construct and use a device node discoverer here.

Member (Author):

We add this because in addition to the "standard" dGPU device nodes that are returned by the fullGPUDiscoverer that we construct as linked, we ALSO need access to (at least some of) the device nodes defined in the CSV file. The csvDeviceNodeDiscoverer in this case should be filtering out the specific device nodes (e.g. /dev/nvidia0 and /dev/nvidia2) associated with the iGPU.
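The filtering described here amounts to subtracting the iGPU-specific nodes from the CSV-discovered set. A minimal sketch, with a hypothetical filterDeviceNodes helper and a hard-coded exclusion set where the real code derives it from the detected iGPU:

```go
package main

import "fmt"

// filterDeviceNodes returns the paths from the CSV-discovered device
// nodes that are not in the exclusion set. Illustrative only; the
// toolkit builds the exclusion set from the detected iGPU devices.
func filterDeviceNodes(paths []string, exclude map[string]bool) []string {
	var out []string
	for _, p := range paths {
		if !exclude[p] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	csvNodes := []string{"/dev/nvidia0", "/dev/nvidia2", "/dev/nvidiactl", "/dev/nvmap"}
	// On the Thor example above, /dev/nvidia0 and /dev/nvidia2 belong to the iGPU.
	iGPUNodes := map[string]bool{"/dev/nvidia0": true, "/dev/nvidia2": true}
	fmt.Println(filterDeviceNodes(csvNodes, iGPUNodes)) // prints: [/dev/nvidiactl /dev/nvmap]
}
```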

