
Add pkg/nvpassthrough for binding GPUs to the vfio-pci driver #83

Open
cdesiniotis wants to merge 1 commit into NVIDIA:main from cdesiniotis:nvpassthrough

Conversation

Contributor

@cdesiniotis cdesiniotis commented Jan 26, 2026

This is mostly a direct port from https://github.com/NVIDIA/k8s-driver-manager/tree/fd043d8f5f74a26b04f83f1eb11b659d402e94de/internal/nvpassthrough

The idea is to reuse this code in any component that needs to prepare GPUs for passthrough, e.g. by binding them to the vfio-pci driver. The NVIDIA DRA driver is an example of such a component: it needs to switch between the nvidia driver and the vfio-pci driver when allocating GPUs for passthrough (as opposed to standard containers).

For reference, here is some sample code that uses this Go module:
https://github.com/NVIDIA/k8s-driver-manager/blob/fd043d8f5f74a26b04f83f1eb11b659d402e94de/cmd/vfio-manage/bind.go#L125-L147
https://github.com/NVIDIA/k8s-driver-manager/blob/fd043d8f5f74a26b04f83f1eb11b659d402e94de/cmd/vfio-manage/unbind.go#L114-L135

@cdesiniotis cdesiniotis force-pushed the nvpassthrough branch 2 times, most recently from 8fbb237 to efd6002 January 26, 2026 22:28
@cdesiniotis
Contributor Author

cc @varunrsekar

@cdesiniotis cdesiniotis requested a review from zvonkok January 26, 2026 23:34
Comment on lines +99 to +109
modAliasPath := filepath.Join(device.Path, "modalias")
modAliasContent, err := os.ReadFile(modAliasPath)
if err != nil {
return "", fmt.Errorf("failed to read modalias file for %s: %w", device.Address, err)
}

modAliasStr := strings.TrimSpace(string(modAliasContent))
modAlias, err := parseModAliasString(modAliasStr)
if err != nil {
return "", fmt.Errorf("failed to parse modalias string %q for device %q: %w", modAliasStr, device.Address, err)
}
Member

This doesn't have to be in the "copy" commit, but does it make sense to factor this into a function? It seems as if it's just the modAlias that we're actually interested in.

Comment on lines +111 to +128
kernelVersion, err := getKernelVersion()
if err != nil {
return "", fmt.Errorf("failed to get kernel version: %w", err)
}

modulesAliasFilePath := filepath.Join(libModulesRoot, kernelVersion, "modules.alias")
modulesAliasContent, err := os.ReadFile(modulesAliasFilePath)
if err != nil {
return "", fmt.Errorf("failed to read file %s: %w", modulesAliasFilePath, err)
}

// Get all vfio aliases from the modules.alias file
// (all lines starting with 'alias vfio_pci:')
vfioAliases := getVFIOAliases(string(modulesAliasContent))
if len(vfioAliases) == 0 {
n.logger.Debugf("No vfio_pci entries found in modules.alias file, falling back to default vfio-pci driver")
return vfioPCIDriverName, nil
}
Member

This also seems like it should be a getVfioAliases function.

vfioAliases := getVFIOAliases(string(modulesAliasContent))
if len(vfioAliases) == 0 {
n.logger.Debugf("No vfio_pci entries found in modules.alias file, falling back to default vfio-pci driver")
return vfioPCIDriverName, nil
Member

Question: Since we return early here, should we get the available vfio alias first before getting the device alias?

// the vfio-pci driver is loaded first and that an auxiliary graphics
// device also get bound to the vfio-pci driver.
func (n *nvpassthrough) BindToVFIODriver(device *nvpci.NvidiaPCIDevice) error {
vfioDriverName, err := n.FindBestVFIOVariant(device)
Member

Is there any case in which we would want to specify a SPECIFIC driver instead of always using "best"?

Comment on lines +171 to +179
driverDir := filepath.Join(pciDriversRoot, vfioDriverName)
if _, err := os.Stat(driverDir); err != nil {
vfioDriverNameNormalized := strings.ReplaceAll(vfioDriverName, "_", "-")
driverDir = filepath.Join(pciDriversRoot, vfioDriverNameNormalized)
if _, err := os.Stat(driverDir); err != nil {
return fmt.Errorf("failed to find directory for vfio driver %s at %s, is the module loaded?", vfioDriverName, pciDriversRoot)
}
vfioDriverName = vfioDriverNameNormalized
}
Member

Should this be a function? Alternatively, should we have a vfioDriver type that encapsulates the differences in names when loading the module and checking for the driver directory?
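To make the second option concrete, here is a rough sketch of what such a type could look like. Everything in it is hypothetical: the `vfioDriver` type, its `candidates` and `resolve` methods, and the `exists` predicate, which stands in for the `os.Stat` check against `pciDriversRoot` so the example runs without sysfs.

```go
package main

import (
	"fmt"
	"strings"
)

// vfioDriver is a hypothetical type (not in the PR) that hides the difference
// between a kernel module name and the name of its driver directory.
type vfioDriver struct {
	moduleName string // name as used when loading the module, e.g. "vfio_pci"
}

// candidates lists the directory names to try, in order: the module name
// as-is, then the variant with underscores replaced by dashes.
func (d vfioDriver) candidates() []string {
	normalized := strings.ReplaceAll(d.moduleName, "_", "-")
	if normalized == d.moduleName {
		return []string{d.moduleName}
	}
	return []string{d.moduleName, normalized}
}

// resolve returns the first candidate for which exists reports true.
func (d vfioDriver) resolve(exists func(name string) bool) (string, error) {
	for _, name := range d.candidates() {
		if exists(name) {
			return name, nil
		}
	}
	return "", fmt.Errorf("failed to find a driver directory for %s, is the module loaded?", d.moduleName)
}

func main() {
	// Stand-in for the driver directories present on the host.
	loaded := map[string]bool{"vfio-pci": true}
	exists := func(name string) bool { return loaded[name] }

	name, err := vfioDriver{moduleName: "vfio_pci"}.resolve(exists)
	if err != nil {
		panic(err)
	}
	fmt.Printf("module %q resolved to driver directory %q\n", "vfio_pci", name)
}
```

With this shape the rename happens in one place, which would also be a natural spot for a log line when the resolved name differs from the module name.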

if _, err := os.Stat(driverDir); err != nil {
return fmt.Errorf("failed to find directory for vfio driver %s at %s, is the module loaded?", vfioDriverName, pciDriversRoot)
}
vfioDriverName = vfioDriverNameNormalized
Member

Should we log this change? Something like Binding ORIGINAL as MODIFIED?

Comment on lines +200 to +212
if auxDev.Driver == vfioDriverName {
return nil
}

n.logger.Infof("Binding graphics auxiliary device %s to driver: %s", auxDev.Address, vfioDriverName)

if err := unbind(auxDev.Address); err != nil {
return fmt.Errorf("failed to unbind graphics auxiliary device %s: %w", auxDev.Address, err)
}
if err := bind(auxDev.Address, vfioDriverName); err != nil {
return fmt.Errorf("failed to bind graphics auxiliary device %s to %s: %w", auxDev, vfioDriverName, err)
}

Member

This logic is the same as for the original device. Does it make sense to implement a function that does this? (We may have to implement it against a local interface that returns the current driver and address of a device.)
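A minimal sketch of that idea, under the assumption that both the GPU and its auxiliary device can expose an address and a current driver. The `boundDevice` interface, the `pciFunction` type, and the `driverOps` struct are all hypothetical names; `unbind` and `bind` are injected as function values here only so the example runs without touching sysfs.

```go
package main

import "fmt"

// boundDevice is a hypothetical local interface covering what the shared
// logic needs from either the GPU or its auxiliary device.
type boundDevice interface {
	GetAddress() string
	GetDriver() string
}

// pciFunction is a stand-in device type used only for this example.
type pciFunction struct {
	address string
	driver  string
}

func (p pciFunction) GetAddress() string { return p.address }
func (p pciFunction) GetDriver() string  { return p.driver }

// driverOps carries the low-level operations so they can be faked in tests.
type driverOps struct {
	unbind func(address string) error
	bind   func(address, driver string) error
}

// rebind moves dev to driver and is a no-op if it is already bound to it.
func (o driverOps) rebind(dev boundDevice, driver string) error {
	if dev.GetDriver() == driver {
		return nil
	}
	if err := o.unbind(dev.GetAddress()); err != nil {
		return fmt.Errorf("failed to unbind device %s: %w", dev.GetAddress(), err)
	}
	if err := o.bind(dev.GetAddress(), driver); err != nil {
		return fmt.Errorf("failed to bind device %s to %s: %w", dev.GetAddress(), driver, err)
	}
	return nil
}

func main() {
	ops := driverOps{
		unbind: func(address string) error { fmt.Println("unbind", address); return nil },
		bind:   func(address, driver string) error { fmt.Println("bind", address, "->", driver); return nil },
	}
	gpu := pciFunction{address: "0000:01:00.0", driver: "nvidia"}
	aux := pciFunction{address: "0000:01:00.1", driver: "vfio-pci"}
	for _, dev := range []boundDevice{gpu, aux} {
		if err := ops.rebind(dev, "vfio-pci"); err != nil {
			panic(err)
		}
	}
}
```

Formatting errors from the device's address inside the shared function also avoids passing a whole struct to a `%s` verb by accident.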

// UnbindFromDriver unbinds the provided NVIDIA PCI Device from
// any driver it is currently bound to. This function also ensures
// an auxiliary graphics device is also unbound.
func (n *nvpassthrough) UnbindFromDriver(device *nvpci.NvidiaPCIDevice) error {
Member

With this function name, I would expect the device to be unbound from a specific driver. Should we rename the function?

return nil
}

func bind(device string, driver string) error {
Member

nit: bind seems to be called with an address? Device means something else in the context of this package. Does renaming the device parameter to address make things clearer?

Comment on lines +291 to +299
parts := strings.Split(entry.Name(), consumerPrefix)
if len(parts) != 2 {
continue
}

address := parts[1]
if address == "" {
continue
}
Member

The strings.Cut function was recently pointed out to me. Does:

Suggested change
- parts := strings.Split(entry.Name(), consumerPrefix)
- if len(parts) != 2 {
- 	continue
- }
- address := parts[1]
- if address == "" {
- 	continue
- }
+ _, address, ok := strings.Cut(entry.Name(), consumerPrefix)
+ if !ok || address == "" {
+ 	continue
+ }

make this simpler to read and maintain?
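One caveat worth checking before applying the suggestion: the two forms agree on ordinary names but not when the separator occurs more than once. The sketch below compares them. The value of `consumerPrefix` is an assumption made for illustration (the real constant is not shown in this thread), and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// consumerPrefix is an assumed value for illustration only.
const consumerPrefix = "consumer:pci:"

// addressViaSplit mirrors the original logic: exactly two parts are required.
func addressViaSplit(name string) (string, bool) {
	parts := strings.Split(name, consumerPrefix)
	if len(parts) != 2 || parts[1] == "" {
		return "", false
	}
	return parts[1], true
}

// addressViaCut mirrors the suggestion: everything after the first match.
func addressViaCut(name string) (string, bool) {
	_, address, ok := strings.Cut(name, consumerPrefix)
	if !ok || address == "" {
		return "", false
	}
	return address, true
}

func main() {
	names := []string{
		"consumer:pci:0000:01:00.1",              // ordinary entry
		"consumer:pci:",                          // prefix with no address
		"unrelated-entry",                        // prefix absent
		"consumer:pci:consumer:pci:0000:01:00.1", // prefix repeated
	}
	for _, name := range names {
		s, sok := addressViaSplit(name)
		c, cok := addressViaCut(name)
		fmt.Printf("%q split=(%q,%v) cut=(%q,%v)\n", name, s, sok, c, cok)
	}
}
```

Neither form checks that the prefix sits at the start of the name. If that matters, `strings.CutPrefix` (Go 1.20 and later) may be a closer fit.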

Comment on lines +307 to +316
auxDev := &nvidiaPCIAuxDevice{
Path: path,
Address: address,
}

driver, err := getDriver(path)
if err != nil {
return nil, fmt.Errorf("failed to get driver for graphics auxiliary device %s: %w", address, err)
}
auxDev.Driver = driver
Member

Minor nit: What makes this different from any other device? Is the (Path, Address, Driver) tuple not common to ANY PCI device? Does this mean that we could refactor nvpci to implement such a device and then use it here? (Or is there possibly already an upstream implementation that we can leverage for this logic?)

return fmt.Errorf("failed to clear driver_override for %s: %w", device, err)
}

driverPath := filepath.Join(pciDevicesRoot, device, "driver")
Member

nit: We also have the getDriver function that accepts filepath.Join(root, addr).

Comment on lines +324 to +332
func getDriver(devicePath string) (string, error) {
driver, err := filepath.EvalSymlinks(filepath.Join(devicePath, "driver"))
switch {
case os.IsNotExist(err):
return "", nil
case err == nil:
return filepath.Base(driver), nil
}
return "", err
Member

This logic seems duplicated over the codebase including the nvpci and nvmdev packages. Does it make sense to assess whether a refactor of the three packages would be beneficial?

return km
}

func (km *kernelModules) list(searchKey string) error {
Member

Question: Why is the list function not used?


package nvpassthrough

type basicLogger interface {
Member

nit: should this be in an internal/logger package?

ma := &modAlias{}
var before, after string
var found bool
after = input[1:] // cut leading 'v'
Member

Question: Why not strings.TrimPrefix(input, "v")? That would make the intent clear from the code and not require a comment.

Comment on lines +62 to +66
split := strings.SplitN(input, ":", 2)
if len(split) != 2 {
return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input)
}
input = split[1]
Member

Suggested change
- split := strings.SplitN(input, ":", 2)
- if len(split) != 2 {
- 	return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input)
- }
- input = split[1]
+ _, input, ok := strings.Cut(input, "pci:")
+ if !ok {
+ 	return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input)
+ }

Alternatively, since we're not checking for the pci: prefix at all anyway, does it not make sense to just use:

    input = strings.TrimPrefix(input, "pci:")

var found bool
after = input[1:] // cut leading 'v'

before, after, found = strings.Cut(after, "d")
Member

From the spec for the string we're processing, it seems we're dealing with fixed-length segments. Should we use these lengths when parsing?
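A sketch of what fixed-width parsing could look like for a device's own modalias string. It assumes the widths visible in the test input in this PR (8 hex digits for `v`, `d`, `sv`, `sd` and 2 for `bc`, `sc`, `i`); the `field` type, the `layout` table, and `parseFixedModAlias` are hypothetical names. It deliberately rejects wildcard patterns such as those in modules.alias, which would need separate handling.

```go
package main

import (
	"fmt"
	"strings"
)

// field describes one marker and the assumed number of characters after it.
type field struct {
	marker string
	width  int
}

// layout is the assumed field order and widths of a PCI modalias string.
var layout = []field{
	{"v", 8}, {"d", 8}, {"sv", 8}, {"sd", 8}, {"bc", 2}, {"sc", 2}, {"i", 2},
}

// parseFixedModAlias is a hypothetical parser that consumes each marker and
// then exactly width characters, returning the values keyed by marker.
func parseFixedModAlias(modalias string) (map[string]string, error) {
	rest, ok := strings.CutPrefix(modalias, "pci:")
	if !ok {
		return nil, fmt.Errorf("modalias %q does not start with 'pci:'", modalias)
	}
	values := make(map[string]string, len(layout))
	for _, f := range layout {
		next, found := strings.CutPrefix(rest, f.marker)
		if !found || len(next) < f.width {
			return nil, fmt.Errorf("modalias %q: malformed %q field", modalias, f.marker)
		}
		values[f.marker] = next[:f.width]
		rest = next[f.width:]
	}
	if rest != "" {
		return nil, fmt.Errorf("modalias %q: unexpected trailing data %q", modalias, rest)
	}
	return values, nil
}

func main() {
	values, err := parseFixedModAlias("pci:v000010DEd00002941sv000010DEsd00002046bc03sc02i00")
	if err != nil {
		panic(err)
	}
	fmt.Println(values["v"], values["d"], values["sv"], values["sd"], values["bc"], values["sc"], values["i"])
}
```

Parsing by position also sidesteps a hazard of delimiter-based parsing here: `d` is itself a valid hexadecimal digit, so cutting on the first `d` relies on the vendor value never containing one.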

expectedError: true,
},
{
description: "more than one semicolon delimiter",
Member

Suggested change
- description: "more than one semicolon delimiter",
+ description: "more than one colon delimiter",

},
{
description: "no wildcards",
input: "pci:v000010DEd00002941sv000010DEsd00002046bc03sc02i00",
Member

@elezar elezar Jan 27, 2026

Are tests that have varying length non-wildcard strings between the signifiers valid?

2 participants