Issues: NVIDIA/gpu-operator
NOTICE: Containers losing access to GPUs with error: "Failed ...
#485
opened Feb 7, 2023 by
cdesiniotis
Open
Issues list
For some reason, the ubuntu24.04 daemonset is selecting a 22.04 binary driver image (reopens #722)
#1436
opened May 10, 2025 by
doctorpangloss
Issue: GPU Operator Fails on Jetson Orin (ARM64) — Needed for Kai Scheduler
#1433
opened May 8, 2025 by
Ashwinraj2000
Prometheus unable to scrape stats as the scrape port annotation is not set on the dcgm-exporter service
#1421
opened Apr 25, 2025 by
DominicWatson
Facing issue with DCGM exporter due to Nvidia GPU Operator initialization problem
#1405
opened Apr 23, 2025 by
jaipreetnagpal
Multi-GPU allocation with precise control in shared environment
#1400
opened Apr 16, 2025 by
FourierMourier
Is it possible to enable MIG only on specific nodes when using the GPU Operator?
#1399
opened Apr 14, 2025 by
larcane97
Everything seems to be ok, but it doesn't work Ubuntu 24.04, Operator v25.3.0
#1398
opened Apr 12, 2025 by
blumfontein
GPU Operator v25.3.0 with DCGM exporter v4.1.1-2: DCGM_FI_PROF_GR_ENGINE_ACTIVE': metric not enabled
#1397
opened Apr 12, 2025 by
gseidlerhpe
4 tasks
Clarification on Automatic Component Cleanup When Node Labels Change (e.g., container ↔ vm-passthrough)
#1392
opened Apr 10, 2025 by
kingeasternsun
Nvidia Operator fails to detect the vGPU devices on OpenShift Cluster with A100 GPU node
#1375
opened Mar 30, 2025 by
sderohan
MicroK8s containerd-template.toml is wrong when docker is installed in parallel
#1367
opened Mar 27, 2025 by
s-bernhardt
nvidia-operator-validator toolkit-validation fails Init:CrashLoopBackOff
#1364
opened Mar 25, 2025 by
RangaSamudrala
Containers get stuck starting up after driver upgrade from 560 to 570
#1361
opened Mar 24, 2025 by
dasantonym