Important Note: NVIDIA AI Enterprise customers can get support from NVIDIA Enterprise support. Please open a case here.
Describe the bug
We are trying to install the operator on OKD, but we get this error:
```json
{"level":"error","ts":"2025-05-19T13:49:04Z","msg":"Reconciler error","controller":"clusterpolicy-controller","object":{"name":"gpu-cluster-policy"},"namespace":"","name":"gpu-cluster-policy","reconcileID":"a535d3ea-ebc7-4c22-9e62-7c372c6814c0","error":"failed to handle OpenShift Driver Toolkit Daemonset for version 39.20240210.3.0: ERROR: failed to get destination directory for custom repo config: distribution not supported"}
```
We have an air-gapped environment, so we are trying to use the repo config option:

```yaml
repoConfig:
  configMapName: repo-config
```
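For reference, the `repo-config` ConfigMap we point to looks roughly like this (the namespace, repo file name, and mirror URL below are placeholders for our internal mirror, not real values):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: repo-config
  # assumed operator namespace; adjust to your install
  namespace: nvidia-gpu-operator
data:
  # a .repo file the operator should mount into the driver container
  custom-mirror.repo: |
    [custom-mirror]
    name=Internal package mirror (placeholder)
    baseurl=https://mirror.example.com/repo
    enabled=1
    gpgcheck=0
```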
But we noticed that "fedora" is missing from the Map:
`gpu-operator/internal/state/driver_volumes.go`, lines 33 to 39 at commit `349cf4f`
Details: the node OS identifies itself with `ID=fedora`.
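To confirm what ID the operator sees, one can extract the `ID` field the same way it is read from `/etc/os-release`; here is a sketch against sample Fedora CoreOS values (the file contents below are assumed, not copied from a real node):

```shell
# Sample /etc/os-release content for an OKD (Fedora CoreOS) node (assumed values):
os_release='NAME="Fedora CoreOS"
ID=fedora
VARIANT_ID=coreos
VERSION_ID=39'

# Extract the ID field, which is the key used for the destination-directory lookup:
printf '%s\n' "$os_release" | awk -F= '$1 == "ID" { print $2 }'
# → fedora
```

On a live node, replacing the sample string with `cat /etc/os-release` gives the actual value.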
Is this a bug, or is it intentional? Is OKD (Fedora-based) supported?
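For context, the error message suggests the destination directory comes from a lookup table keyed by OS ID, and `fedora` has no entry. A minimal Go sketch of that pattern (the map contents and paths here are illustrative, not the operator's actual code):

```go
package main

import (
	"fmt"
	"os"
)

// destDirs maps an /etc/os-release ID to the directory where a custom
// repo config would be mounted. Illustrative entries only; the real map
// lives in internal/state/driver_volumes.go and lacks a "fedora" key.
var destDirs = map[string]string{
	"rhel":   "/etc/yum.repos.d", // hypothetical path
	"rhcos":  "/etc/yum.repos.d", // hypothetical path
	"ubuntu": "/usr/local/repos", // hypothetical path
}

// destDirFor returns the destination directory for the given OS ID,
// or an error mirroring the one seen in the reconciler log.
func destDirFor(osID string) (string, error) {
	dir, ok := destDirs[osID]
	if !ok {
		// This is the branch an OKD node with ID=fedora would hit:
		return "", fmt.Errorf("failed to get destination directory for custom repo config: distribution not supported")
	}
	return dir, nil
}

func main() {
	if _, err := destDirFor("fedora"); err != nil {
		// The unsupported-distribution error surfaces here.
		fmt.Fprintln(os.Stderr, err)
	}
}
```

If this reading is right, adding a `fedora` entry (or treating it like `rhcos`) would be the fix.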
To Reproduce
Install the operator in an air-gapped environment with a custom repo config.
Expected behavior
Successful install on air-gapped OKD.
Environment (please provide the following information):
Information to attach (optional if deemed irrelevant)
kubectl get pods -n OPERATOR_NAMESPACE
kubectl get ds -n OPERATOR_NAMESPACE
kubectl describe pod -n OPERATOR_NAMESPACE POD_NAME
kubectl logs -n OPERATOR_NAMESPACE POD_NAME --all-containers
nvidia-smi
From the driver container: kubectl exec DRIVER_POD_NAME -n OPERATOR_NAMESPACE -c nvidia-driver-ctr -- nvidia-smi
journalctl -u containerd > containerd.log
Collecting full debug bundle (optional):
NOTE: please refer to the must-gather script for debug data collected.
This bundle can be submitted to us via email: [email protected]
Thanks a lot.