I am running Kubernetes on Docker Desktop with WSL2 and trying to set up GPU monitoring using the NVIDIA GPU Operator and NVIDIA Device Plugin.
What I Have Done:
GPU Confirmed Working in WSL2

`nvidia-smi` works correctly and detects my NVIDIA RTX 4070 GPU.

Running a CUDA container works fine:

```
docker run --rm --gpus all nvidia/cuda:12.6.2-base-ubuntu22.04 nvidia-smi
```

✅ Shows the correct CUDA version and GPU details.
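In case it is relevant, I can also check whether the NVML library is visible inside that same container (a quick sketch; the image tag matches the one above, and `ldconfig -p` / `nvidia-smi -L` are just generic checks, nothing Kubernetes-specific):

```
# Confirm that libnvidia-ml is registered inside the container and that
# NVML-based GPU enumeration works there
docker run --rm --gpus all nvidia/cuda:12.6.2-base-ubuntu22.04 \
  sh -c "ldconfig -p | grep -i libnvidia-ml && nvidia-smi -L"
```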
Issue: GPU Not Detected in Kubernetes

```
kubectl get nodes -o=jsonpath='{.items[*].status.allocatable}'
```

does not show the GPU.

```
kubectl logs -n gpu-operator -l app=nvidia-device-plugin-daemonset
```

shows `NVML not found`. The NVIDIA GPU Operator and Device Plugin are running but not detecting the GPU.
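For reference, this is how I am inspecting the node and the plugin pods (assuming the default Docker Desktop node name `docker-desktop`; adjust if yours differs):

```
# Does the node advertise nvidia.com/gpu under Capacity/Allocatable at all?
kubectl describe node docker-desktop | grep -iA 5 nvidia

# List the operator's pods and pull more of the device plugin log for context
kubectl get pods -n gpu-operator -o wide
kubectl logs -n gpu-operator -l app=nvidia-device-plugin-daemonset --tail=50
```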
What I Have Tried:
Ensured `nvidia-container-runtime` is set correctly.

Edited `/etc/docker/daemon.json`:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```
What I Need Help With:
- Why is Kubernetes not detecting the GPU?
- Why does `nvidia-device-plugin` fail with `could not load NVML library`?
- Is there a special configuration needed for WSL2 to work with the Kubernetes GPU Operator?
- Are there any alternative debugging steps to confirm NVML is correctly installed? (For the kind of checks I mean, see the sketch after this list.)
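To make the last question concrete, these are the host-side checks I have in mind (the `/usr/lib/wsl/lib` path is an assumption based on where WSL2 normally exposes the Windows driver's user-space libraries):

```
# Is NVML known to the dynamic linker inside the WSL2 distro?
ldconfig -p | grep -i libnvidia-ml

# WSL2 usually surfaces the driver libraries here
ls -l /usr/lib/wsl/lib/ | grep -i nvidia-ml

# Does a full NVML-backed query succeed on the host?
nvidia-smi -q | head -n 20
```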
System Information:
OS: Ubuntu 24.04 LTS (WSL2)
Kubernetes: Docker Desktop with WSL2
NVIDIA Driver: 566.36
CUDA Version: 12.7 (Confirmed in `nvidia-smi`)
NVIDIA Container Toolkit: Installed (`nvidia-container-toolkit`, latest version)
NVIDIA GPU: RTX 4070 Laptop GPU
Docker Runtime: output of `docker info | grep -i runtime`:

```
Runtimes: io.containerd.runc.v2 nvidia runc
Default Runtime: runc
```
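If it helps, the same information can be printed without grep via `docker info`'s template output (the field names are my assumption from the Docker Info structure, so please double-check):

```
# Show only the default runtime and the registered runtimes
docker info --format 'Default runtime: {{.DefaultRuntime}}'
docker info --format 'Runtimes: {{json .Runtimes}}'
```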
Any Help is Appreciated!
If anyone has successfully set up NVIDIA GPU Operator in WSL2 with Kubernetes, please share insights!