Extensible Runner execution
Problem to solve
Some runner administrators need the ability to inject custom code as part of runner execution. This can be used to provision or set up the build environment in a secure way, to execute an enterprise compliance process, to configure secure variables, or otherwise to set up the execution in a way that is specific to that runner (for example, physical hardware resets or configuration for a physical testing device). This is not currently possible: the only execution entrypoint is the .gitlab-ci.yml file, so custom code cannot be injected at runtime in a secure, per-runner, consistent and tamper-proof way.
There are also situations where the user needs a kind of environment we don't support (for example, VMware), where the runner would need to provision an environment to be used later. Adding custom script execution shims makes this possible as well, and removes the burden on GitLab of maintaining these as official, deeply integrated executors. We've received several classes of requests for these:
- Using shared hardware in specific compliant ways when abstractions such as Kubernetes and Docker are not available. For example:
- Setting the UID of the user before running user scripts
- Restricting environments in different ways
- Preparing a non-docker container-type environment (LXC, podman, etc.)
- HPC (high-performance computing) use cases where there are specific hardware and security requirements that may require use-based
- Virtualization or container hardware or software that we don't license today or can't include in our GitLab.com infrastructure. For example:
The primary users of this feature will be enterprise administrators/ops teams who are injecting scripts to handle compliance operations for their teams, and/or are managing unique virtualization or hardware environments that are not officially supported.
We will add the ability to shim in a script to be run prior to each stage of the runner's job execution (in a mode we are calling the generic executor). There are three stages that any runner implements:
- Prepare: This is before the .gitlab-ci.yml is run, and would be the point to interact with libvirt or any other environment provisioning tool to create the execution environment.
- Run: This is once the environment is provisioned and the .gitlab-ci.yml content is just about to be executed. If you have a special build environment that needs to be set up within the provisioned environment (for example secure tokens or physical hardware resets), this would be the place to implement it. The run_script shim is executed for each step within the run stage, such as build_script (where the script entries are joined together) and upload_artifacts_on_failure. Each step name is available as an argument to the custom script, so per-step behaviors can be defined.
- Cleanup: This is when the job has completed (any status, pass or fail) and allows for inserting code to deprovision any environments that were built.
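As an illustration of per-step behavior in the Run stage, here is a minimal sketch of how a run script might dispatch on the step name it receives. It is written in Python on the assumption that the configured script can be any executable. The handler functions and their messages are hypothetical, and only the step names build_script and upload_artifacts_on_failure are taken from the description above.

```python
# Sketch only: handler names and messages are hypothetical illustrations.

def prepare_build(script_path):
    # Hypothetical hook: work to do before the user's build script runs.
    return f"reset test hardware, then run {script_path}"


def collect_failure_artifacts(script_path):
    # Hypothetical hook: work to do before artifacts are uploaded on failure.
    return f"gather diagnostics, then run {script_path}"


# Map step names to the per-step behavior defined for them.
STEP_HANDLERS = {
    "build_script": prepare_build,
    "upload_artifacts_on_failure": collect_failure_artifacts,
}


def plan_step(step_name, script_path):
    """Describe what to do for one step; steps without a handler run unchanged."""
    handler = STEP_HANDLERS.get(step_name)
    if handler is None:
        return f"run {script_path}"
    return handler(script_path)
```

Steps that have no entry in the table fall through to the default, so new step names do not break the script.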
This will be implemented on the runner side, and will look like the following in the runner configuration:
    [[runners]]
      executor = "custom"
      [runners.custom]
        prepare_exec = "/path/to/prepare/script.sh"
        run_exec = "/path/to/execute/script.sh"
        cleanup_exec = "/path/to/cleanup/script.sh"
For the above, the execution would be to:
- execute prepare to create an environment for the given identifier,
- run execute multiple times to run the specific shell commands,
- run cleanup at the end to tear down the environment and free resources.
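The ordering above can be sketched as a small driver. This is not the runner's implementation, only a model of the sequence under stated assumptions: the function name run_job, the argument order passed to each script, and the injectable runner parameter are all hypothetical.

```python
import subprocess


def run_job(prepare_exec, run_exec, cleanup_exec, job_id, steps, runner=subprocess.run):
    """Model the prepare -> run (per step) -> cleanup sequence.

    steps is a list of (script_path, step_name) pairs.
    The argument order given to each script is an assumption for illustration.
    """
    try:
        # Prepare once to create the environment for this job identifier.
        runner([prepare_exec, job_id], check=True)
        # Run once per step; a failing step stops the remaining steps.
        for script_path, step_name in steps:
            runner([run_exec, job_id, script_path, step_name], check=True)
    finally:
        # Cleanup runs whether the job passed or failed.
        runner([cleanup_exec, job_id], check=False)
```

Placing prepare inside the try block is a design choice in this sketch: cleanup still runs if prepare fails partway and leaves a half-built environment behind.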
Each of these scripts would receive the full configuration and the script to execute:
- the unique identifier of the job,
- the path to the user script to execute,
- information about the current execution stage: cloning, artifacts, user-provided script, etc.
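A script could collect these three inputs into one structure before acting on them. The sketch below assumes they arrive as positional command-line arguments in the order job identifier, script path, stage. That order, and the names Invocation and parse_invocation, are hypothetical, since the proposal does not fix how the values are passed.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    job_id: str       # unique identifier of the job
    script_path: str  # path to the user script to execute
    stage: str        # current execution stage, e.g. cloning or artifacts


def parse_invocation(argv):
    """Parse the assumed positional arguments into an Invocation."""
    parser = argparse.ArgumentParser(description="custom executor shim (sketch)")
    parser.add_argument("job_id")
    parser.add_argument("script_path")
    parser.add_argument("stage")
    args = parser.parse_args(argv)
    return Invocation(args.job_id, args.script_path, args.stage)
```

For example, parse_invocation(["42", "/tmp/job.sh", "build_script"]) yields an Invocation whose stage field is "build_script".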
An administrator would be responsible for maintaining the lifecycle of these resources, including removing stale resources if the runner is killed or dies.
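One way an administrator might handle stale resources is a periodic sweep. The sketch below assumes each job's environment is a directory under a common root and treats a directory as stale once its modification time exceeds an age limit. The directory layout, the function name, and the age-based rule are all assumptions for illustration.

```python
import os
import shutil
import time


def remove_stale_environments(root, max_age_seconds, now=None):
    """Delete environment directories under root older than max_age_seconds.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            shutil.rmtree(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Environments that are not plain directories (virtual machines, containers, physical devices) would need the equivalent listing and removal calls of their own tooling.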
Permissions and Security
This does not change the security model of the GitLab runner and its interaction with GitLab. Users with access to the runner itself, or with the ability to modify its configuration, are already capable of executing code there with the same privileges.
While this issue should generally document the interface and how to use the Generic Executor, there is also a follow-up issue to document some specific examples to give the wider community a jump start in coding against this executor. #4257 (closed) will cover both the examples and the documentation of the custom executor.
What does success look like, and how can we measure that?
For the MVC, we must handle the vital case of being able to set the UID of the job that is running. After that, we should determine the priority of other use cases and if this method did or did not answer those use cases.
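As a rough idea of what the UID case could look like inside a run script, the sketch below starts a command under a given user ID using the user parameter of Python's subprocess.run (available on POSIX systems from Python 3.9). The function name run_as_user is hypothetical, and switching to a UID other than the caller's own normally requires elevated privileges.

```python
import subprocess


def run_as_user(command, uid):
    """Run command with its user ID set to uid and return the completed process.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    return subprocess.run(
        command,
        user=uid,             # the child switches to this UID before exec
        check=True,
        capture_output=True,
        text=True,
    )
```

A run script could call this with the path to the user script as the command and a UID chosen by the administrator's policy.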