Conversation
I'm marking this as ready, but it depends on the mentioned PR.
| "/dev/nvidia-uvm", | ||
| "/dev/nvidia-uvm-tools", | ||
| "/dev/nvidia-modeset", | ||
| "/dev/dxg", // WSL2: DXG device (GPU via DirectX kernel driver, injected by CDI) |
@pimlock when considering Tegra-based systems as in #625, the list of device nodes (and other paths) is much longer and is also system dependent. As such, I don't think that hardcoding this list is feasible. Would it be possible to process the container config instead to get the list of device nodes that we expect to access?
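A minimal sketch of the reviewer's suggestion, deriving the device-node allow-list from the container config instead of a hardcoded constant. The struct shape and names here are hypothetical simplifications (the real OCI runtime spec nests device entries under `linux.devices` in `config.json`):

```rust
// Hypothetical, simplified config shape; the OCI runtime spec nests
// device entries under `linux.devices` in config.json.
struct DeviceEntry {
    path: String,
}

struct ContainerConfig {
    devices: Vec<DeviceEntry>,
}

/// Derive the set of device nodes to permit from the container config
/// rather than a hardcoded list, so Tegra-style systems with many
/// system-dependent nodes are covered automatically.
fn expected_device_nodes(cfg: &ContainerConfig) -> Vec<&str> {
    cfg.devices.iter().map(|d| d.path.as_str()).collect()
}

fn main() {
    let cfg = ContainerConfig {
        devices: vec![
            DeviceEntry { path: "/dev/dxg".to_string() },
            DeviceEntry { path: "/dev/nvhost-ctrl".to_string() },
        ],
    };
    assert_eq!(
        expected_device_nodes(&cfg),
        vec!["/dev/dxg", "/dev/nvhost-ctrl"]
    );
    println!("{:?}", expected_device_nodes(&cfg));
}
```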
@pimlock I have updated this PR to use the v0.19.1 release of the Device Plugin instead of a SHA. The e2e test failures seem unrelated to the WSL2 changes (although they may be due to the device plugin version bump).
v0.19.1 includes WSL2 CDI spec compatibility fixes. See NVIDIA/k8s-device-plugin#1671.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
On WSL2, NVIDIA GPUs are exposed through the DXG kernel driver (/dev/dxg) rather than the native nvidia* devices. CDI injects /dev/dxg as the sole GPU device node, plus GPU libraries under /usr/lib/wsl/.

has_gpu_devices() previously only checked for /dev/nvidiactl, which does not exist on WSL2, so GPU enrichment never ran. This meant /dev/dxg was never permitted by Landlock and /proc write access (required by CUDA for thread naming) was never granted.

Fix by:
- Extending has_gpu_devices() to also detect /dev/dxg
- Adding /dev/dxg to GPU_BASELINE_READ_WRITE (device nodes need O_RDWR)
- Adding /usr/lib/wsl to GPU_BASELINE_READ_ONLY for CDI-injected GPU library bind-mounts that may not be covered by the /usr parent rule across filesystem boundaries

The existing path existence check in enrich_proto_baseline_paths() ensures all new entries are silently skipped on native Linux where these paths do not exist.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
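The detection change described in the commit message can be sketched as follows. The constant and the helper are illustrative, not the PR's actual implementation; only the candidate paths and the `has_gpu_devices()` name come from the source:

```rust
use std::path::Path;

/// Candidate GPU control nodes: /dev/nvidiactl on native Linux,
/// /dev/dxg on WSL2, where the GPU is exposed via the DXG kernel
/// driver instead of the nvidia* device nodes.
const GPU_CONTROL_NODES: &[&str] = &["/dev/nvidiactl", "/dev/dxg"];

/// True if any of the candidate paths exists on this host.
fn any_existing(candidates: &[&str]) -> bool {
    candidates.iter().any(|p| Path::new(p).exists())
}

/// Illustrative detection: GPU enrichment runs if any known
/// control node is present.
fn has_gpu_devices() -> bool {
    any_existing(GPU_CONTROL_NODES)
}

fn main() {
    assert!(any_existing(&["/"])); // "/" always exists
    assert!(!any_existing(&["/no/such/device/node"])); // missing node
    println!("gpu devices detected: {}", has_gpu_devices());
}
```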
… checks

Add ClusterRole and ClusterRoleBinding so the openshell service account can list nodes at the cluster scope, which is required by the GPU node capacity check in the Kubernetes driver.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
Summary
Adds GPU sandbox support for WSL2-based systems. On WSL2, NVIDIA GPUs are exposed through the DXG kernel driver (`/dev/dxg`) rather than the native `nvidia*` devices, and GPU libraries are injected by CDI into `/usr/lib/wsl/` rather than standard Linux paths. Two changes are required:
1. **Device plugin version bump**: bumps `ghcr.io/nvidia/k8s-device-plugin` to `v0.19.1`, which includes upstream fixes for WSL2 CDI spec compatibility. See "wsl: report a single \"all\" device to kubelet" (k8s-device-plugin#1671).
2. **Landlock baseline**: `has_gpu_devices()` previously only checked for `/dev/nvidiactl`, which does not exist on WSL2, so GPU enrichment never ran. This left `/dev/dxg` (the WSL2 GPU device node) and `/proc` write access (required by CUDA for thread naming) unpermitted by Landlock. Fixed by extending GPU detection to also check `/dev/dxg`, adding it to the read-write baseline, and adding `/usr/lib/wsl` to the read-only baseline for CDI-injected GPU libraries.

The existing path existence checks in the enrichment logic ensure all new baseline entries are silently skipped on native Linux where these paths do not exist.
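The "silently skipped" behaviour can be sketched as an existence filter over the baseline lists. The constant names mirror the summary, but their contents and the filtering helper are assumptions for illustration:

```rust
use std::path::Path;

// Illustrative baseline contents; the real lists live in
// crates/openshell-sandbox/src/lib.rs.
const GPU_BASELINE_READ_WRITE: &[&str] =
    &["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/dxg"];
const GPU_BASELINE_READ_ONLY: &[&str] = &["/usr/lib/wsl"];

/// Keep only baseline entries that exist on this host, so WSL2-only
/// paths are dropped on native Linux and native-only paths are
/// dropped on WSL2.
fn existing_paths<'a>(candidates: &[&'a str]) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|p| Path::new(p).exists())
        .collect()
}

fn main() {
    // A path that always exists survives; a bogus one is skipped.
    assert_eq!(existing_paths(&["/", "/no/such/path"]), vec!["/"]);
    let rw = existing_paths(GPU_BASELINE_READ_WRITE);
    let ro = existing_paths(GPU_BASELINE_READ_ONLY);
    println!("rw baseline: {:?}, ro baseline: {:?}", rw, ro);
}
```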
Related Issue
Closes #404
Depends on #495 and #503.
Changes
- `deploy/kube/gpu-manifests/nvidia-device-plugin-helmchart.yaml`: bump device plugin Helm chart to `v0.19.1`
- `crates/openshell-sandbox/src/lib.rs`: extend `has_gpu_devices()` to detect `/dev/dxg`; add `/dev/dxg` to `GPU_BASELINE_READ_WRITE` and `/usr/lib/wsl` to `GPU_BASELINE_READ_ONLY`

Testing

- `mise run pre-commit` passes
- `test_gpu_sandbox_reports_available_gpu` passes