Extensible Runner execution

Problem to solve

Some runner administrators need the ability to inject custom code as part of runner execution. This can be used to provision or set up the build environment in a secure way, to execute an enterprise compliance process, to configure secure variables, or otherwise to set up the execution in a way that is specific to that runner (for example, physical hardware resets or configuration for a physical testing device). This is not currently possible: the only execution entrypoint is the .gitlab-ci.yml, so code cannot be injected at runtime in a secure, per-runner, consistent, and tamper-proof way.

There are also situations where the user needs a kind of environment we don't support (for example, libvirt or VMware), where the runner would need to provision an environment to be used later. Adding custom script execution shims makes this possible and removes the burden on GitLab of maintaining these as official, deeply integrated executors. We've received several classes of requests for these:

Using shared hardware in specific compliant ways when abstractions such as Kubernetes and Docker are not available. For example:
- Setting the UID of the user before running user scripts
- Restricting environments in different ways
- Preparing a non-docker container-type environment (LXC, podman, etc.)
HPC (high-performance computing) use cases where there are specific hardware and security requirements that may require use-based
- Batch jobs through technology such as Slurm and DRMAA
Virtualization or container hardware or software that we don't license today or can't include in our GitLab.com infrastructure. For example:
- VMware
- libvirt
- z/OS
- podman
- LXC
- oVirt
- jails

Intended users

The primary users of this feature will be enterprise administrators/ops teams who are injecting scripts to handle compliance operations for their teams, and/or are managing unique virtualization or hardware environments that are not officially supported.

Proposal

We will add the ability to shim in a script to be run prior to each stage of the runner's job execution (in a mode we are calling the generic executor). There are three stages that any runner implements:

Prepare: This is before the .gitlab-ci.yml is run, and would be the point to interact with libvirt or any other environment provisioning tool to create the execution environment.
Run: This is once the environment is provisioned and the .gitlab-ci.yml content is just about to be executed. If you have a special build environment that needs to be set up within the provisioned environment (for example, secure tokens or physical hardware resets), this is the place to implement it. The run_script shim is executed for each step within the run stage: prepare_executor, prepare_script, get_sources, restore_cache, download_artifacts, build_script (before_script and script joined together), after_script, archive_cache, upload_artifacts_on_success, and upload_artifacts_on_failure. Each step name is available as an argument to the custom script, so per-step behaviors can be defined.
Cleanup: This is when the job has completed (any status, pass or fail) and allows for inserting code to deprovision any environments that were built.
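
As a rough illustration of per-step behavior in the run stage, the sketch below dispatches on the step name. The argument order (job identifier, script path, step name), the function names, and the "sandboxed"/"plain" policy are assumptions made for this example; the proposal does not fix them. It also assumes the generated step script can be run with sh.

```shell
#!/bin/sh
# Hypothetical run-stage shim. Assumed interface (not fixed by this proposal):
#   $1 = unique job identifier
#   $2 = path to the script to execute for this step
#   $3 = step name, e.g. get_sources or build_script

# Print the handling chosen for a step (illustrative policy only).
step_action() {
    case "$1" in
        build_script|after_script)
            echo "sandboxed" ;;   # user-provided code: apply extra setup first
        prepare_executor|prepare_script|get_sources|restore_cache|download_artifacts|archive_cache|upload_artifacts_on_success|upload_artifacts_on_failure)
            echo "plain" ;;       # other steps: run as-is
        *)
            echo "unknown" ;;
    esac
}

run_step() {
    job_id="$1"
    script_path="$2"
    step_name="$3"
    echo "job $job_id: step $step_name" >&2
    case "$(step_action "$step_name")" in
        sandboxed)
            # Site-specific setup would go here (secure tokens, hardware reset).
            sh "$script_path" ;;
        plain)
            sh "$script_path" ;;
        *)
            echo "unknown step: $step_name" >&2
            return 1 ;;
    esac
}

# Run only when called with arguments, so the functions can also be sourced.
if [ "$#" -ge 3 ]; then
    run_step "$1" "$2" "$3"
fi
```

Keeping the step-to-policy mapping in one small function makes it easy for an administrator to audit which steps receive special treatment.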

This will be implemented on the runner side, and will look like the following in the runner configuration:

[[runners]]
executor = "custom"

[runners.custom]
prepare_exec = "/path/to/prepare/script.sh"
run_exec = "/path/to/execute/script.sh"
cleanup_exec = "/path/to/cleanup/script.sh"

For the above, the execution would be to:

execute prepare to create an environment for the given identifier,
run execute multiple times to run the specific shell commands,
run cleanup at the end to tear down the environment and free resources.
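
The call order above can be sketched as a small driver that stands in for the runner. PREPARE_EXEC, RUN_EXEC, CLEANUP_EXEC, script_dir, and the argument order are hypothetical placeholders for this example. For simplicity it stops at the first failing step; how the real runner treats failures (for example upload_artifacts_on_failure) is not modelled here.

```shell
#!/bin/sh
# Illustrative driver mimicking the order in which the three shims are called.
# PREPARE_EXEC, RUN_EXEC and CLEANUP_EXEC are hypothetical variables standing
# in for the configured prepare_exec, run_exec and cleanup_exec paths.

run_job() {
    job_id="$1"
    script_dir="$2"     # hypothetical directory with one script per step
    shift 2             # remaining arguments: step names, in order
    rc=0

    if "$PREPARE_EXEC" "$job_id"; then
        for step in "$@"; do
            # Simplification: stop at the first failing step.
            if ! "$RUN_EXEC" "$job_id" "$script_dir/$step" "$step"; then
                rc=1
                break
            fi
        done
    else
        rc=1
    fi

    # Cleanup always runs, whether the job passed or failed.
    "$CLEANUP_EXEC" "$job_id"
    return "$rc"
}
```

Running cleanup unconditionally mirrors the intent that environments are deprovisioned for any job status.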

Each of these scripts would receive the configuration and the script to execute:

the unique identifier of the job,
the path to the user script to execute,
information about the current execution stage: cloning, artifacts, user-provided script, etc.

An administrator would be responsible for maintaining the lifecycle of these resources, including removing stale resources if the runner is killed or dies.
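
One possible way to handle stale resources is a periodic sweep, sketched below. It assumes the prepare script creates one directory per job under a common root; that layout, the function name, and the age threshold are illustrative assumptions, and environments such as VMs would need their own teardown command instead of rm.

```shell
#!/bin/sh
# Hypothetical sweep for environments left behind by a killed runner.
# Assumes one directory per job under a common root (illustrative layout).

sweep_stale() {
    env_root="$1"       # directory holding one sub-directory per job
    max_age_min="$2"    # remove environments not modified for this many minutes

    # -mindepth/-maxdepth/-mmin are available in GNU and BSD find,
    # but are not part of strict POSIX.
    find "$env_root" -mindepth 1 -maxdepth 1 -type d \
        -mmin +"$max_age_min" -exec rm -rf {} +
}
```

A function like this could be called from a scheduled task, with a threshold comfortably longer than the longest expected job.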

Permissions and Security

This does not change the security model of the GitLab runner and its interaction with GitLab. Users with access to the runner itself or the ability to modify its configuration are already capable of executing code there with the same privileges.

Documentation

While this issue should generally document the interface and how to use the Generic Executor, there is also a follow-up issue to document some specific examples to give the wider community a jump start in coding against this executor. #4257 (closed) will cover both the examples and the documentation of the custom executor.

What does success look like, and how can we measure that?

For the MVC, we must handle the vital case of being able to set the UID of the job that is running. After that, we should determine the priority of the other use cases and whether this method addresses them.

Links / references

See a list of related issues and merge requests below.
#4346 (closed) (scheduled for the same release) is a related item for supporting setting the build folder, which is needed for some use cases.

Edited Jul 16, 2019 by Steve Xuereb