
custom host/driver/dev roots change

Jiaming Xu requested to merge Dragoncell/gpu-operator:master-gke into master

Add support for custom host, driver, and dev roots.

Use Cases:

  1. hostDriver = "false", hostRoot = "/", driver root = "/run/nvidia/driver", dev root = "/run/nvidia/driver/dev"
  2. hostDriver = "true", hostRoot = "/", driver root (system based, e.g. Linux) = "/usr/bin", dev root = "/"
  3. hostDriver = "true", hostRoot = "/", driver root = "/home/kubernetes/bin/nvidia", dev root = "/"
  4. hostDriver = "true", hostRoot = "/custom-root", driver root (system based, e.g. Linux) = "/custom-root/usr/bin", dev root = "/custom-root/dev"

Changes in detail:

  1. Add custom configuration for the host, driver, and dev roots through Helm
  2. In the container toolkit, introduce the dev root based on https://github.com/NVIDIA/nvidia-container-toolkit/issues/209 and https://github.com/NVIDIA/nvidia-container-toolkit/pull/360
  3. In the device plugin, additionally configure NVIDIA_CTK_PATH, LD_LIBRARY_PATH, and PATH for custom driver root cases
  4. In the validator, introduce checkChrootForDriverRoot to validate whether the driver root can be chrooted into, based on the dev dir

Related GitHub issue: https://github.com/NVIDIA/gpu-operator/issues/659
