Enable GPU limit requests for kubernetes executor
Status update (2023-01-24)
- There is a new issue to follow, "Specify GPU resource limits in the Kubernetes executor", which is currently a candidate for 15.10. I am not sure yet whether the pattern established in this MR can still be used to solve the problem.
- The one risk is that 15.9 already includes two other high-priority, in-depth features for the Kubernetes executor: "Use Kubernetes Secrets instead of ConfigMaps" and "Support activeDeadlineSeconds for pods spawned by kubernetes executor".
- That is one of the reasons why the GPU resource limits feature will land in 15.10 at the earliest.
What does this MR do?
This MR lets users request one or more GPUs (or similar resources from any vendor) in the Kubernetes executor. It also builds on the resource overwrite support from !874 (merged) and extends it to GPUs. In our case, we prefer to run the long-running compile jobs on any host and request a GPU-equipped host only for the shorter test runs.
Fixes #3464.
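For context, Kubernetes exposes GPUs as extended resources (for example `nvidia.com/gpu`) that a container requests through `resources.limits`. The sketch below is only an illustration of the kind of pod spec the executor would need to produce; it is not the code in this MR. The helper `container_with_gpu`, the pod name, and the image are hypothetical.

```python
import json


def container_with_gpu(name, image, gpu_resource="nvidia.com/gpu", gpu_count=1):
    """Return a Kubernetes container spec (as a dict) that asks for GPUs.

    Hypothetical helper for illustration only. GPUs are extended resources,
    so they go under `limits`; Kubernetes uses the limit as the request
    when no request is given.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be a positive integer")
    return {
        "name": name,
        "image": image,
        "resources": {"limits": {gpu_resource: str(gpu_count)}},
    }


# Assumed names: "gpu-test-job" and "example/image:latest" are placeholders.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gpu-test-job"},
    "spec": {
        "restartPolicy": "Never",
        "containers": [container_with_gpu("build", "example/image:latest")],
    },
}

print(json.dumps(pod, indent=2))
```

Passing a different `gpu_resource` covers devices from other vendors, as long as the matching device plugin is installed on the cluster nodes.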
Why was this MR needed?
The default kubernetes executor does not support any GPU resource specification.
Are there points in the code the reviewer needs to double check?
Not specifically in the code, but it might make sense to merge !874 (merged) first (I have posted a link in that discussion that fixes the remaining build issues), after which this change will be much smaller.
Does this MR meet the acceptance criteria?
- Documentation created/updated
- Added tests for this feature/bug
- In case of conflicts with master, branch was rebased