
Update DSL to output the first element instead of the last in case of failure

Renaud Gaubert requested to merge dsl into master

This is a follow-up to merge request: https://gitlab.com/nvidia/container-toolkit/nvidia-container-runtime/merge_requests/18

The goal is to make error reporting clearer by reporting the condition that actually explains the failure. Today, when someone with a driver supporting CUDA 10.0 tries to run a CUDA 10.1 container, they get:

➜  libnvidia-container git:(master) ✗ docker run --rm -it --gpus all nvidia/cuda:10.1-devel-ubuntu18.04 bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --video --compute --utility --require=cuda>=10.1 brand=tesla,driver>=384,driver<385 brand=tesla,driver>=396,driver<397 brand=tesla,driver>=410,driver<411 --pid=17195 /var/lib/docker/overlay2/0e229add40237047a6b3d95593733dbff530cf35bc7e8b8d832630c7e1a18d40/merged]\\\\nnvidia-container-cli: requirement error: unsatisfied condition: brand = tesla\\\\n\\\"\"": unknown.

With this specific change, the error message becomes:

➜  libnvidia-container git:(master) ✗ docker run --rm -it --gpus all nvidia/cuda:10.1-devel-ubuntu18.04 bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --video --compute --utility --require=cuda>=10.1 brand=tesla,driver>=384,driver<385 brand=tesla,driver>=396,driver<397 brand=tesla,driver>=410,driver<411 --pid=17195 /var/lib/docker/overlay2/0e229add40237047a6b3d95593733dbff530cf35bc7e8b8d832630c7e1a18d40/merged]\\\\nnvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.1\\\\n\\\"\"": unknown.

With both changes applied, this becomes:

➜  libnvidia-container git:(master) ✗ docker run --rm -it --gpus all nvidia/cuda:10.1-devel-ubuntu18.04 bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=10.1\\n": unknown.
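The difference between the first two messages can be illustrated with a small sketch. This is not the libnvidia-container code (which is written in C); it is a simplified Python model that assumes space-separated requirements are alternatives (OR) and comma-separated conditions within one requirement must all hold (AND). The `HOST` values and the names `holds`, `first_failure`, and `check` are hypothetical and exist only for this example.

```python
import operator
import re

# Hypothetical host: a non-Tesla GPU with a driver that supports CUDA 10.0.
HOST = {"cuda": "10.0", "driver": "410.48", "brand": "geforce"}

REQUIRE = (
    "cuda>=10.1 "
    "brand=tesla,driver>=384,driver<385 "
    "brand=tesla,driver>=396,driver<397 "
    "brand=tesla,driver>=410,driver<411"
)

OPS = {
    ">=": operator.ge, "<=": operator.le, "!=": operator.ne,
    "=": operator.eq, ">": operator.gt, "<": operator.lt,
}


def as_version(text):
    """Turn '410.48' into (410, 48) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def holds(cond, host):
    """Evaluate a single condition such as 'driver>=384' against the host."""
    key, op, want = re.fullmatch(r"(\w+)\s*(>=|<=|!=|=|>|<)\s*(\S+)", cond).groups()
    have = host[key]
    if key in ("cuda", "driver"):
        return OPS[op](as_version(have), as_version(want))
    return OPS[op](have, want)


def first_failure(group, host):
    """Return the first unsatisfied condition of an AND group, or None."""
    for cond in group.split(","):
        if not holds(cond, host):
            return cond
    return None


def check(require, host, report="first"):
    """Return None if any alternative is satisfied, else one failed condition.

    report="last" models the old behaviour, report="first" the new one.
    """
    failures = []
    for group in require.split():
        failed = first_failure(group, host)
        if failed is None:
            return None  # one alternative is fully satisfied
        failures.append(failed)
    if not failures:
        return None
    return failures[0] if report == "first" else failures[-1]


print("old:", check(REQUIRE, HOST, report="last"))
print("new:", check(REQUIRE, HOST, report="first"))
```

Under these assumptions the "last" mode reports the brand condition from the final alternative, while the "first" mode reports `cuda>=10.1`, which is the condition a user is most likely able to act on.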

Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>

Edited by Renaud Gaubert
