Add configurable --sysfs-root for NUMA detection and PCI discovery #1836

Open
giuliocalzo wants to merge 1 commit into NVIDIA:main from giuliocalzo:feat/sysfs-root

@giuliocalzo

Summary

  • Adds --sysfs-root / $SYSFS_ROOT (default /sys) as a shared flag for the device plugin and GFD, matching the pattern used by --nvidia-driver-root
  • Threads the sysfs root into GetNumaNode for GPU and MIG device NUMA topology, and into GFD pciutil for vGPU PCI scanning
  • Adds Helm values (sysfsRoot, sysfsHostPath) for injecting SYSFS_ROOT and optionally mounting a synthetic PCI sysfs tree

Closes #1833

Motivation

Mock-GPU test environments (e.g. k8s-test-infra render-pci-sysfs) need the plugin to read NUMA affinity from a synthetic PCI sysfs tree rather than the hardcoded /sys path. Without this, GetNumaNode silently reports no NUMA node and devices are advertised without topology.

Test plan

  • go test ./internal/rm/ ./internal/vgpu/ ./api/config/v1/
  • go build ./cmd/nvidia-device-plugin/ ./cmd/gpu-feature-discovery/
  • Helm template renders SYSFS_ROOT and sysfs-root volume when sysfsRoot + sysfsHostPath are set
  • k8s-test-infra integration with synthetic PCI sysfs tree (downstream)

Enable mock-GPU test environments to supply synthetic PCI sysfs trees
for device-plugin NUMA topology and GFD vGPU PCI scanning without
changing production defaults.

Signed-off-by: Giulio Calzolari <gcalzolari@nvidia.com>
@giuliocalzo

Author

@tariq1890 @cdesiniotis Could you please review? This implements #1833 (--sysfs-root for NUMA detection and GFD PCI scanning).

@tariq1890

Contributor

Thanks, @giuliocalzo! Is this primarily motivated by test frameworks? I was wondering whether there are scenarios where a live environment would have a custom sysfs root.

@giuliocalzo

giuliocalzo commented Jun 8, 2026

Author

Hi @tariq1890, the primary motivation is test frameworks, specifically mock-GPU environments in k8s-test-infra that use synthetic PCI sysfs trees from render-pci-sysfs (see #1833 and NVIDIA/k8s-test-infra#264 for the DRA driver counterpart).

Today GetNumaNode reads a hardcoded /sys/bus/pci/devices/<busID>/numa_node. When NVML reports bus IDs that aren't backed by host sysfs (simulated GPUs), the read fails silently and devices are advertised without NUMA topology. The same applies to GFD vGPU PCI scanning in internal/vgpu/pciutil.go.

I don't expect live production clusters to set a custom sysfs root. The default stays /sys, so existing deployments are unchanged. The flag is there so CI/test infra can mount a rendered tree at an alternate path (e.g. /rendered-sysfs) without shadowing the real /sys mount per-pod.

Happy to trim scope or adjust naming if you'd prefer to keep this test-infra-only in documentation.

@tariq1890

Contributor

Thank you for the context. In general, I'd like to avoid making significant changes to the code if their only purpose is to get the project to work with mocking and test frameworks.

Have you looked into afero? Linking the docs here.
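The filesystem-abstraction approach suggested here can be illustrated without changing any CLI flags. The sketch below does not use afero; to stay dependency-free it swaps in the standard library's `io/fs` interface and `testing/fstest.MapFS`, which serve a similar purpose for read-only access. The names `readNumaNode` and `fakeSysfs` are hypothetical and do not come from the PR.

```go
package main

import (
	"io/fs"
	"strconv"
	"strings"
	"testing/fstest"
)

// readNumaNode is a hypothetical helper that reads a device's numa_node
// through an fs.FS rooted at the sysfs directory, so the caller decides
// whether that is the real tree or an in-memory fake.
func readNumaNode(fsys fs.FS, busID string) (int, error) {
	raw, err := fs.ReadFile(fsys, "bus/pci/devices/"+busID+"/numa_node")
	if err != nil {
		return -1, err
	}
	return strconv.Atoi(strings.TrimSpace(string(raw)))
}

// fakeSysfs builds an in-memory tree containing a single PCI device
// whose numa_node file holds the given node number.
func fakeSysfs(busID string, node int) fs.FS {
	return fstest.MapFS{
		"bus/pci/devices/" + busID + "/numa_node": &fstest.MapFile{
			Data: []byte(strconv.Itoa(node) + "\n"),
		},
	}
}
```

A unit test could call `readNumaNode(fakeSysfs("0000:3b:00.0", 1), "0000:3b:00.0")`, while production code could pass `os.DirFS("/sys")`. Note that this covers in-process tests only; it would not by itself let a running pod read a rendered tree mounted at another path.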

@eliranw

eliranw commented Jun 9, 2026


Hi, just chiming in that we also plan on using this for e2e testing for KAI-scheduler together with NVML-Mock for NUMA awareness, so this would be really helpful for us. It looks like the change is in line with the pattern used by --nvidia-driver-root, as @giuliocalzo wrote. Happy to talk through it more if that would help.

@tariq1890

Contributor

> It looks like the change is in line with the pattern used by --nvidia-driver-root, as @giuliocalzo wrote.

A configurable NVIDIA driver root is needed to support the driver container use case, whereas a custom sysfs root is not motivated by a use case that could exist in live environments. Modifying the k8s-device-plugin CLI interface to accommodate a test framework seems like an architectural smell to me.

@dims

dims commented Jun 15, 2026

Contributor

@tariq1890 looking at https://www.kernel.org/doc/html/v4.19/admin-guide/sysfs-rules.html I see:

> sysfs is always at /sys
> Parsing /proc/mounts is a waste of time. Other mount points are a system configuration bug you should not try to solve. For test cases, possibly support a SYSFS_PATH environment variable to overwrite the application’s behavior, but never try to search for sysfs. Never try to mount it, if you are not an early boot script.

Could we strip down the changes in this PR to a minimal SYSFS_PATH env var as suggested there?

Here's an attempt with a few more suggestions:
dims@a1dd26b

including:

  • using go-nvlib/nvpci
  • calling NVML's GetNumaNodeId


Development

Successfully merging this pull request may close these issues.

[Feature]: Add a configurable --sysfs-root for device NUMA detection (GetNumaNode)

4 participants