Add pkg/nvpassthrough for binding GPUs to the vfio-pci driver#83
Add pkg/nvpassthrough for binding GPUs to the vfio-pci driver#83cdesiniotis wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
8fbb237 to
efd6002
Compare
This is mostly a direct port from https://github.com/NVIDIA/k8s-driver-manager/tree/fd043d8f5f74a26b04f83f1eb11b659d402e94de/internal/nvpassthrough Signed-off-by: Christopher Desiniotis <cdesiniotis@nvidia.com>
efd6002 to
58de427
Compare
|
cc @varunrsekar |
| modAliasPath := filepath.Join(device.Path, "modalias") | ||
| modAliasContent, err := os.ReadFile(modAliasPath) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to read modalias file for %s: %w", device.Address, err) | ||
| } | ||
|
|
||
| modAliasStr := strings.TrimSpace(string(modAliasContent)) | ||
| modAlias, err := parseModAliasString(modAliasStr) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to parse modalias string %q for device %q: %w", modAliasStr, device.Address, err) | ||
| } |
There was a problem hiding this comment.
This doesn't have to be in the "copy" commit, but does it make sense to factor this into a function. It seems as if it's just the modAlias that we're actually interested in.
| kernelVersion, err := getKernelVersion() | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to get kernel version: %w", err) | ||
| } | ||
|
|
||
| modulesAliasFilePath := filepath.Join(libModulesRoot, kernelVersion, "modules.alias") | ||
| modulesAliasContent, err := os.ReadFile(modulesAliasFilePath) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to read file %s: %w", modulesAliasFilePath, err) | ||
| } | ||
|
|
||
| // Get all vfio aliases from the modules.alias file | ||
| // (all lines starting with 'alias vfio_pci:') | ||
| vfioAliases := getVFIOAliases(string(modulesAliasContent)) | ||
| if len(vfioAliases) == 0 { | ||
| n.logger.Debugf("No vfio_pci entries found in modules.alias file, falling back to default vfio-pci driver") | ||
| return vfioPCIDriverName, nil | ||
| } |
There was a problem hiding this comment.
This also seems like it should be a getVfioAliases function.
| vfioAliases := getVFIOAliases(string(modulesAliasContent)) | ||
| if len(vfioAliases) == 0 { | ||
| n.logger.Debugf("No vfio_pci entries found in modules.alias file, falling back to default vfio-pci driver") | ||
| return vfioPCIDriverName, nil |
There was a problem hiding this comment.
Question: Since we return early here, should we get the available vfio alias first before getting the device alias?
| // the vfio-pci driver is loaded first and that an auxiliary graphics | ||
| // device also get bound to the vfio-pci driver. | ||
| func (n *nvpassthrough) BindToVFIODriver(device *nvpci.NvidiaPCIDevice) error { | ||
| vfioDriverName, err := n.FindBestVFIOVariant(device) |
There was a problem hiding this comment.
Is there any case in which we would want to specify a SPECIFIC driver instead of always using "best"?
| driverDir := filepath.Join(pciDriversRoot, vfioDriverName) | ||
| if _, err := os.Stat(driverDir); err != nil { | ||
| vfioDriverNameNormalized := strings.ReplaceAll(vfioDriverName, "_", "-") | ||
| driverDir = filepath.Join(pciDriversRoot, vfioDriverNameNormalized) | ||
| if _, err := os.Stat(driverDir); err != nil { | ||
| return fmt.Errorf("failed to find directory for vfio driver %s at %s, is the module loaded?", vfioDriverName, pciDriversRoot) | ||
| } | ||
| vfioDriverName = vfioDriverNameNormalized | ||
| } |
There was a problem hiding this comment.
Should this be a function? Alternatively, should we have a vfioDriver type that encapsulates the differences in names when loading the module and checking for the driver directory?
| if _, err := os.Stat(driverDir); err != nil { | ||
| return fmt.Errorf("failed to find directory for vfio driver %s at %s, is the module loaded?", vfioDriverName, pciDriversRoot) | ||
| } | ||
| vfioDriverName = vfioDriverNameNormalized |
There was a problem hiding this comment.
Should we log this change? Something like Binding ORIGINAL as MODIFIED?
| if auxDev.Driver == vfioDriverName { | ||
| return nil | ||
| } | ||
|
|
||
| n.logger.Infof("Binding graphics auxiliary device %s to driver: %s", auxDev.Address, vfioDriverName) | ||
|
|
||
| if err := unbind(auxDev.Address); err != nil { | ||
| return fmt.Errorf("failed to unbind graphics auxiliary device %s: %w", auxDev.Address, err) | ||
| } | ||
| if err := bind(auxDev.Address, vfioDriverName); err != nil { | ||
| return fmt.Errorf("failed to bind graphics auxiliary device %s to %s: %w", auxDev, vfioDriverName, err) | ||
| } | ||
|
|
There was a problem hiding this comment.
This logic is that same as for the original device. Does it make sense to implement a function that does this. (we may have to implement it against a local interface that returns the current driver and address of a device).
| // UnbindFromDriver unbinds the provided NVIDIA PCI Device from | ||
| // any driver it is currently bound to. This function also ensures | ||
| // an auxiliary graphics device is also unbound. | ||
| func (n *nvpassthrough) UnbindFromDriver(device *nvpci.NvidiaPCIDevice) error { |
There was a problem hiding this comment.
With this function name, I would expect the device to be unbound from a specific driver. Should we rename the function?
| return nil | ||
| } | ||
|
|
||
| func bind(device string, driver string) error { |
There was a problem hiding this comment.
nit: bind seems to be called with an address? Device means something else in the context of this package. Does renaming the device paramter to address make things clearer?
| parts := strings.Split(entry.Name(), consumerPrefix) | ||
| if len(parts) != 2 { | ||
| continue | ||
| } | ||
|
|
||
| address := parts[1] | ||
| if address == "" { | ||
| continue | ||
| } |
There was a problem hiding this comment.
The strings.Cut function was recently pointed out to me. Does:
| parts := strings.Split(entry.Name(), consumerPrefix) | |
| if len(parts) != 2 { | |
| continue | |
| } | |
| address := parts[1] | |
| if address == "" { | |
| continue | |
| } | |
| _, address, ok := strings.Cut(entry.Name(), consumerPrefix) | |
| if !ok || address == "" { | |
| continue | |
| } |
make this simpler to read and maintain?
| auxDev := &nvidiaPCIAuxDevice{ | ||
| Path: path, | ||
| Address: address, | ||
| } | ||
|
|
||
| driver, err := getDriver(path) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get driver for graphics auxiliary device %s: %w", address, err) | ||
| } | ||
| auxDev.Driver = driver |
There was a problem hiding this comment.
Minor nit: What makes this different from any other device? Is the (Path, Address, Driver) tuple not common to ANY PCI device? Does this mean that we could refactor nvpci to implement such a device and then use it here. (Or is there possibly already an upstream implementation that we can leverage for this logic)?
| return fmt.Errorf("failed to clear driver_override for %s: %w", device, err) | ||
| } | ||
|
|
||
| driverPath := filepath.Join(pciDevicesRoot, device, "driver") |
There was a problem hiding this comment.
nit: We also have the getDriver function that accepts filepath.Join(root, addr).
| func getDriver(devicePath string) (string, error) { | ||
| driver, err := filepath.EvalSymlinks(filepath.Join(devicePath, "driver")) | ||
| switch { | ||
| case os.IsNotExist(err): | ||
| return "", nil | ||
| case err == nil: | ||
| return filepath.Base(driver), nil | ||
| } | ||
| return "", err |
There was a problem hiding this comment.
This logic seems duplicated over the codebase including the nvpci and nvmdev packages. Does it make sense to assess whether a refactor of the three packages would be beneficial?
| return km | ||
| } | ||
|
|
||
| func (km *kernelModules) list(searchKey string) error { |
There was a problem hiding this comment.
Question: Why is the list function not used?
|
|
||
| package nvpassthrough | ||
|
|
||
| type basicLogger interface { |
There was a problem hiding this comment.
nit: should this be in an internal/logger package?
| ma := &modAlias{} | ||
| var before, after string | ||
| var found bool | ||
| after = input[1:] // cut leading 'v' |
There was a problem hiding this comment.
Question: Why not strings.TrimPrefix(input, "v")? That would make the intent clear from the code and not require a comment.
| split := strings.SplitN(input, ":", 2) | ||
| if len(split) != 2 { | ||
| return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input) | ||
| } | ||
| input = split[1] |
There was a problem hiding this comment.
| split := strings.SplitN(input, ":", 2) | |
| if len(split) != 2 { | |
| return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input) | |
| } | |
| input = split[1] | |
| _, input, ok := strings.Cut(input, "pci:", 2) | |
| if !ok { | |
| return nil, fmt.Errorf("unexpected number of parts in modalias after trimming 'pci:' prefix: %s", input) | |
| } |
Alternatively, sicne we're not checking for the pci: prefix at all anyway, does it not make sense to just use:
input := strings.TrimPrefix(input, "pci:")
| var found bool | ||
| after = input[1:] // cut leading 'v' | ||
|
|
||
| before, after, found = strings.Cut(after, "d") |
There was a problem hiding this comment.
From the spec for the string we're processing, it seems we're dealing with fixed-length segments. Should we use these lengths when parsing?
| expectedError: true, | ||
| }, | ||
| { | ||
| description: "more than one semicolon delimiter", |
There was a problem hiding this comment.
| description: "more than one semicolon delimiter", | |
| description: "more than one colon delimiter", |
| }, | ||
| { | ||
| description: "no wildcards", | ||
| input: "pci:v000010DEd00002941sv000010DEsd00002046bc03sc02i00", |
There was a problem hiding this comment.
Are tests that have varying length non-wildcard strings between the signifiers valid?
This is mostly a direct port from https://github.com/NVIDIA/k8s-driver-manager/tree/fd043d8f5f74a26b04f83f1eb11b659d402e94de/internal/nvpassthrough
The idea is to reuse this code in any component that needs to prepare GPUs for passthrough, e.g. by binding them to the
vfio-pcidriver. The NVIDIA DRA driver is an example of such component -- it needs to switch between thenvidiadriver andvfio-pcidriver when allocating GPUs for passthrough (as opposed to standard containers).For reference, here is some sample code that uses this Go module:
https://github.com/NVIDIA/k8s-driver-manager/blob/fd043d8f5f74a26b04f83f1eb11b659d402e94de/cmd/vfio-manage/bind.go#L125-L147
https://github.com/NVIDIA/k8s-driver-manager/blob/fd043d8f5f74a26b04f83f1eb11b659d402e94de/cmd/vfio-manage/unbind.go#L114-L135