Proposal: concrete+git integration
Summary
With the introduction of concrete (behind FF_CONCRETE) (!6410 (merged)), we are migrating job execution stages (get_sources, cache, artifacts, cleanup) from the abstract shell into a step-runner function that runs entirely inside the build container. This removes the need for the "predefined" helper container to orchestrate these stages on behalf of the build.
However, the predefined container previously provided key dependencies (git, CA certificates, bash) that concrete now needs access to from within the build container. We need a strategy for making these dependencies available that is reliable, efficient, and does not regress existing user workflows.
Background
How things work today (abstract shell)
The Docker and Kubernetes executors run two containers:
- Predefined (helper) container: based on our helper image (gitlab-runner-helper). Contains git, SSL certs, bash, and the runner helper binary. Responsible for get_sources, cache operations, and artifact transfers.
- Build container: the user-provided image. Runs user scripts (script, before_script, after_script).
The abstract shell generates shell scripts and decides which container executes each one. This split causes several long-standing problems:
- Files created by the predefined container may have different ownership than those expected by the build container, requiring UID/GID detection and chown operations.
- The runner has no knowledge of the target environment (OS, filesystem layout), making path handling fragile.
- Each script runs in isolation, so the runner must micromanage execution order.
- Every new feature must work across bash, sh, pwsh, and powershell.
How step-runner and concrete changes this
Concrete runs everything in the build container. The step-runner is bootstrapped into the build environment, and all stages execute there. This eliminates the file ownership problem entirely and gives us full knowledge of the execution environment.
The trade-off is that we now need to bring our dependencies into the build container, rather than relying on the predefined container having them pre-installed.
Problem
The build container is user-provided and may not contain git, appropriate CA certificates, or other dependencies that the helper image previously supplied. We need to bootstrap these into the build environment.
Additionally, users can customise the helper image in two ways:
- Self-hosted registries: Users mirror our images to internal registries. They expect flavour selection (alpine, ubuntu) to continue working.
- Extended helper images: Users install additional tools (e.g., ssh, rsync) into custom helper images so that pre_get_sources_script or post_get_sources_script have access to them.
For case (1), once we add the bundled dependencies to our standard flavours, self-hosted users will pick them up on their next image pull.
For case (2), concrete changes where pre/post clone scripts execute. They now run in the build container, which does not have the user's custom helper image tools. This is a breaking change for those users.
Existing solution
A solution for the common case is already merged: !6504 (Bundle git and CA certificates for concrete runner).
We introduce a new concrete helper image flavour that bundles a statically-linked git and CA certificates. During bootstrap, these are copied alongside the step-runner binary into the build container. All stages that previously ran in the predefined container (git operations, cache-archiver/extractor, artifacts-uploader/downloader) are wrapped to use the bundled git and CA certs via PATH and SSL_CERT_FILE manipulation.
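The PATH and SSL_CERT_FILE manipulation described above can be pictured with a minimal sketch. It is written in Python purely for illustration and is not the code from !6504; the function name `bundled_env`, the bundle directory, and its `bin/` and `certs/ca-certificates.crt` layout are all assumptions.

```python
import os


def bundled_env(base_env, bundle_dir):
    """Return a copy of base_env that prefers the bundled git and CA certs.

    bundle_dir and its layout (bin/, certs/ca-certificates.crt) are
    hypothetical; they are not taken from !6504.
    """
    env = dict(base_env)  # never mutate the caller's environment
    bundled_bin = os.path.join(bundle_dir, "bin")
    existing = env.get("PATH", "")
    # Prepend so the bundled git is found before any git in the user image.
    env["PATH"] = (bundled_bin + os.pathsep + existing) if existing else bundled_bin
    # Point TLS clients that honour SSL_CERT_FILE at the bundled CA file.
    env["SSL_CERT_FILE"] = os.path.join(bundle_dir, "certs", "ca-certificates.crt")
    return env
```

Under these assumptions, a wrapped stage would be started with the returned mapping as its environment, leaving the environment seen by user scripts untouched.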
This works for the common case but does not address users who extended the helper image with additional tools for pre/post clone scripts.
Handling the breaking change for extended helper images
Preferred approach: Detect, warn, and deprecate
We can detect when a user has both overridden the helper image and configured a pre_get_sources_script or post_get_sources_script. When this combination is present, we print a warning directing the user to an issue where they can see the suggested migration path and provide feedback, regardless of whether the change actually affects them.
This follows the same pattern we used for the removal of the Windows cmd shell. We would:
- Add detection logic that identifies affected configurations.
- Print a deprecation warning in the job log with a link to a migration guide.
- Give users plenty of notice.
- Make the breaking change at a major milestone.
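As a rough illustration of the detection step, the sketch below flags a job only when both conditions hold. It is Python for illustration only; `JobConfig`, `deprecation_warning`, and `MIGRATION_URL` are hypothetical names, and the real runner configuration is structured differently.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder; the proposal does not name the issue to link to.
MIGRATION_URL = "<migration-issue-url>"


@dataclass
class JobConfig:
    """Hypothetical, simplified view of the settings relevant to detection."""
    helper_image: Optional[str] = None  # set only when the default is overridden
    pre_get_sources_script: Optional[str] = None
    post_get_sources_script: Optional[str] = None


def deprecation_warning(cfg: JobConfig) -> Optional[str]:
    """Return a warning when the helper image is overridden AND a pre/post
    get_sources script is configured; otherwise return None."""
    overridden = bool(cfg.helper_image and cfg.helper_image.strip())
    has_hooks = any(
        s and s.strip()
        for s in (cfg.pre_get_sources_script, cfg.post_get_sources_script)
    )
    if not (overridden and has_hooks):
        return None
    return (
        "WARNING: a custom helper image is configured together with "
        "pre/post get_sources scripts. These scripts will run in the build "
        "container in a future major release. See " + MIGRATION_URL
    )
```

A check like this cannot tell a mirrored image from an extended one, so it will also warn users who only mirror the helper image. That matches the intent above of warning regardless of whether the change actually affects them.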
This is my preferred route because the number of users who both override the helper image and rely on custom tools in pre/post clone scripts is likely small, and we are allowed to make breaking changes where necessary with appropriate notice.
Alternative: proot for helper filesystem isolation
If the deprecation approach is not acceptable and we need to preserve backwards compatibility, we could use proot to run pre/post clone scripts with the helper image's filesystem as their root.
proot is a userspace implementation of chroot that uses ptrace to intercept filesystem syscalls. It requires no kernel features beyond ptrace, which we expect to be available in most container environments, including locked-down Docker and Kubernetes pods, although this still needs validation. We would:
- Extract the helper image filesystem during bootstrap.
- Bundle a static proot binary alongside git and the step-runner.
- Run pre/post clone scripts under proot -r <helper_rootfs>, giving them access to all tools the user installed in their custom helper image.
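The last step above could look roughly like the following sketch, which only assembles the command line. It is Python for illustration; `proot_command` is a hypothetical helper, and binding the build directory into the helper rootfs and running the script through /bin/sh are assumptions of this sketch rather than decisions made in this proposal.

```python
from typing import List


def proot_command(helper_rootfs: str, build_dir: str, script: str,
                  proot_bin: str = "proot") -> List[str]:
    """Build an argv that runs `script` with helper_rootfs as its root.

    -r sets the guest root filesystem, -b makes a host path visible at the
    same location inside it, and -w sets the initial working directory.
    """
    return [
        proot_bin,
        "-r", helper_rootfs,
        # Assumption: the script needs the checked-out sources, so the build
        # directory is bound into the helper rootfs.
        "-b", build_dir,
        "-w", build_dir,
        # Assumption: the helper rootfs provides /bin/sh.
        "/bin/sh", "-c", script,
    ]
```

Because the script runs inside the build container as the build user, files it writes to the bound build directory keep that user's ownership.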
The performance overhead from syscall interception is real but tolerable for pre/post clone scripts, which are typically short-lived. The main open question is whether ptrace is reliably permitted across the range of container security profiles (seccomp, AppArmor, SELinux) our users run. This needs validation.
Alternative: Sidecar command proxy
Run a lightweight daemon in a container based on the user's helper image and proxy pre/post clone script execution to it.
I don't think this is ideal. It reintroduces the file ownership problem for any files the sidecar creates in shared volumes, adds significant architectural complexity, and arguably undermines the core benefit of step-runner+concrete (everything running in one container).
Proposal
Ship the current concrete flavour with bundled git and CA certs. Pursue the detect-and-deprecate approach for users with extended helper images, targeting a major milestone for the breaking change. If user feedback during the deprecation period reveals that this would cause significant disruption, fall back to the proot approach, provided ptrace proves reliable across the container security profiles our users run.
Open questions:
- Do we have data on how many users extend the helper image with custom tools vs. simply mirroring it?
- Should we add the git and CA cert bundles to all standard flavours (alpine, ubuntu), or deprecate flavour selection entirely and consolidate on the concrete image? If concrete provides everything the runner needs (git, certs, step-runner), the other flavours exist only to support users who extended them with custom tools. Deprecating flavours would simplify the image matrix and CI pipeline significantly, but would need the same detect-warn-deprecate cycle described above.
- Which major milestone should we target for the breaking change?