Proposal: Stop manually installing/building 3rd party dependencies directly in runner docker images
Summary
Instead of manually installing or building 3rd party packages directly in the Dockerfiles for runner images, as we do now, build rpm/deb/apk packages for those 3rd party packages, push them to package-cloud (or some other package repository), point the image OS's package manager to that package repository, and install those packages via the package manager.
Current Situation
In a number of the docker images created in the runner project, we resort to installing 3rd party packages manually. Installing packages manually means one of:
- Downloading a binary tarball from the upstream project, extracting it, and moving it to its final location (and maybe adjusting permissions).
- Downloading a source tarball from the upstream project, building it, and moving it to its final location. This also requires first installing the required build tool-chain into the image.
All this happens in https://gitlab.com/gitlab-org/gitlab-runner/-/tree/main/dockerfiles, for example:
- https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/dockerfiles/ci/Dockerfile#L23
- https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/dockerfiles/runner-helper/Dockerfile.fips#L10
- https://gitlab.com/gitlab-org/gitlab-runner/-/blob/main/dockerfiles/runner/ubi-fips/Dockerfile#L15
We do this for a number of reasons:
- FedRAMP compliance: the latest release of a package provided by the image OS's package manager has known vulnerabilities. A newer version fixes the vulnerabilities, but such versions take time to make it to the OS's package repo, and for some image versions (e.g. some older alpines), the images are never updated upstream to include the package versions with the vulnerability fixes.
- The latest release of a package provided by the image OS's package manager is older than we need; we need specific features in releases newer than that provided by the package manager.
- The image OS's package manager does not provide a package at all.
Problems With This Approach
- Cluttered Dockerfiles that either "manually" install a package directly in the Dockerfile, or call a script to do so. In some cases the clutter extends to the final image, which may include tools used to extract or build the package that are not necessary at runtime.
- In some cases manual installation makes pipeline jobs long (e.g. building git from source took ~3 hours, and often timed out).
- Not particularly scalable or flexible WRT dealing with CVE vulnerability reports when the best solution is to install a package manually. This exacerbates the above two problems.
- Prevents running the runner pipeline in so-called "air gapped" environments, which some customers want to do.
Proposal
Instead of manually installing 3rd party packages directly in the Dockerfile as we do now, we could:
- Create a new project in which we build rpm/deb/apks of the packages we currently install manually. We'd have to do this for all OS/arch combinations for which we currently install manually.
- Push these packages to our package-cloud instance, or some other package repository. If using our existing package-cloud instance, we would probably want these packages to be in a separate, dedicated namespace (e.g. runner-image-deps) to avoid unintentionally clobbering 3rd party packages when e.g. installing runner via package-cloud.
- In the runner images, add our package repository to the image OS's package manager configuration, and install the packages via the package manager as usual.
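As a rough illustration of the last step, the sketch below renders the repository configuration each package manager would need. It is only a sketch: the host `packages.example.com`, the URL layout under it, the `gpgkey` path, and the keyring location are all hypothetical placeholders, not the layout of our package-cloud instance. Only the namespace name `runner-image-deps` comes from the proposal.

```python
# Sketch only: BASE_URL, the path layout, and key locations are hypothetical.
REPO_NAME = "runner-image-deps"
BASE_URL = "https://packages.example.com/runner-image-deps"


def yum_repo_file(distro: str, release: str) -> str:
    """Contents of /etc/yum.repos.d/runner-image-deps.repo for rpm-based images."""
    lines = [
        f"[{REPO_NAME}]",
        f"name={REPO_NAME}",
        # $basearch is expanded by yum/dnf at install time, not by Python.
        f"baseurl={BASE_URL}/{distro}/{release}/$basearch",
        "enabled=1",
        "gpgcheck=1",
        f"gpgkey={BASE_URL}/gpgkey",
    ]
    return "\n".join(lines) + "\n"


def apt_sources_line(
    distro: str,
    codename: str,
    keyring: str = "/usr/share/keyrings/runner-image-deps.gpg",  # hypothetical path
) -> str:
    """One-line entry for /etc/apt/sources.list.d/runner-image-deps.list."""
    return f"deb [signed-by={keyring}] {BASE_URL}/{distro} {codename} main\n"


def apk_repositories_line(alpine_version: str) -> str:
    """Line to append to /etc/apk/repositories on alpine images."""
    return f"{BASE_URL}/alpine/v{alpine_version}/main\n"


if __name__ == "__main__":
    print(yum_repo_file("el", "8"), end="")
    print(apt_sources_line("debian", "bookworm"), end="")
    print(apk_repositories_line("3.19"), end="")
```

With files like these in place, a Dockerfile would only need the usual `dnf install`, `apt-get install`, or `apk add` call. For apk, the repository's public signing key would also have to be placed in `/etc/apk/keys`.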
Benefits
- Would enable us to better deal with future CVE issues for which the best solution is to install a version of a package NOT provided by the image OS's package manager. In this case we'd add this 3rd party package to the project that builds and publishes the rpm/deb/apk packages, and install it via the OS's package manager as usual.
- Separate building of 3rd party packages (which only has to happen once, and can be slow) from installing them. This would avoid long runner pipeline jobs (at least for this reason).
- Would facilitate building in air gapped environments by mirroring our package repo (with artifactory or similar), which is fairly common industry practice. Not all customers may like this option, but it is a workable option.
- Puts us in a better position WRT the next FedRAMP certified base image, since we can easily build packages for any app for all OSs we support.
Risks
- Making rpm/deb/apk packages might be involved and difficult to get right (e.g. getting dependencies right might be tricky).
- If our package-cloud instance is not a good place to host packages, we'd have to find another place to host them.
  - package-cloud claims to support apk repositories, but our instance appears to not be set up for apk.
Possible Implementation Plan
- De-risk by implementing the proposal end-to-end for one 3rd party package, for one package type (rpm).
  - Not including updating runner images at this point.
  - Add some minimal CI testing to ensure built packages are successfully installed and run.
- If that goes well, extend to the same package for other package types (deb/apk).
- If that goes well, expand to create packages for other dependencies for each package type.
- Add testing:
- Update Runner docker images to install 3rd party packages from package-cloud.
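The "minimal CI testing" step in the plan above could be sketched as follows: for a given package type, build a `docker run` invocation that installs the package from the configured repository and then runs it once. This is a sketch under assumptions, not a definitive implementation. The function names are hypothetical, the image is assumed to already have our repository configured, and rpm-based images are assumed to ship `dnf` (minimal UBI images use `microdnf` instead).

```python
import shlex
from typing import List

# Standard package-manager install commands, keyed by package type.
# Assumption: rpm-based test images provide dnf.
INSTALL_COMMANDS = {
    "rpm": "dnf install -y {pkg}",
    "deb": "apt-get update && apt-get install -y {pkg}",
    "apk": "apk add --no-cache {pkg}",
}


def smoke_test_script(package_type: str, package: str, version_cmd: str) -> str:
    """Shell script that installs the package, then runs it once."""
    install = INSTALL_COMMANDS[package_type].format(pkg=shlex.quote(package))
    # set -e makes the script fail if either the install or the run fails.
    return f"set -e; {install}; {version_cmd}"


def docker_run_args(image: str, package_type: str, package: str, version_cmd: str) -> List[str]:
    """Argument vector for a throwaway container that runs the smoke test."""
    script = smoke_test_script(package_type, package, version_cmd)
    return ["docker", "run", "--rm", image, "sh", "-c", script]


if __name__ == "__main__":
    # Hypothetical image name; pass the result to subprocess.run(..., check=True) in CI.
    print(docker_run_args("example/rpm-test-image:latest", "rpm", "git", "git --version"))
```

A non-zero exit code from the container would fail the CI job, which is enough to catch packages that do not install or do not start.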
References
Edited by Axel von Bertoldi