Backend: Define pull policy for image in gitlab-ci.yml

Release notes

Pull policy lets you define how Docker images are pulled: always (the default behavior) ensures that the image is always pulled, never disables image pulling completely, and if-not-present pulls an image only if a local version does not exist. Previously, users could define the pull policy at the runner level only. In this release we've added the pull_policy keyword to .gitlab-ci.yml, which allows you to define different pull policies at the pipeline level. This feature does not support shared runners.

Description

Currently, the image pull policy is defined at the GitLab Runner level. See gitlab-runner!115 (merged).

Proposal

It should be possible to define it in gitlab-ci.yml. Example:

build:
  image:
    name: $CI_REGISTRY/dreamhost/kubectl-helm-docker
    pull_policy: always # available: always, if-not-present, never
  script:
  - ...

The desired pull policy is context-dependent. We need to use the very latest images in some cases, and we can use a cached version in other cases. Because it's context-dependent, it belongs in .gitlab-ci.yml. 😃

Because of the security concerns described in this issue, this feature will not be available to .com users who use shared runners. It will be available only to self-managed instances and to .com users with their own registered runners. For more information, please read through the security considerations and this comment.

Security considerations

Let's consider a shared runner (e.g. a group runner or an instance runner) that can be used by multiple groups and that executes many subsequent jobs in the same environment (e.g. a docker+machine executor with MaxBuilds greater than 1).

And let's consider that one of the groups, group A, wants to use an image named registry.gitlab.example.com/group-a/private-project:latest as the base of the job. The image contains private code that should not be revealed, and because of that it is built and stored in a private project. To pull this image, a user must be logged in to GitLab's image registry and must have access to the group-a/private-project project.

A good example is developer-A, a member of group-a/private-project, who pushes code to a public project group-a/public-project. The pipeline and jobs are created as the user developer-A and inherit all of that user's permissions, including permissions to the group-a/private-project project. The job that uses image: registry.gitlab.example.com/group-a/private-project:latest authenticates to the registry with gitlab-ci-token:${CI_JOB_TOKEN} credentials, gets the permissions of developer-A, pulls the image, and starts running.

Now, let's consider that another developer, developer-B, who is not a member of group-a or group-a/private-project, wants to get access to the proprietary, secret content of group-a/private-project through the content of the built image. The user creates a project developer-B/steal-content-of-group-A-private-project with a job defined with image: registry.gitlab.example.com/group-a/private-project:latest and, in the job script, scans all of the directories and tries to extract as many details as possible.

Now, what are the possible scenarios?

pull_policy set to if-not-present

The Runner first asks Docker whether the image registry.gitlab.example.com/group-a/private-project:latest is present locally. If it is, the image is used as is and the container is created. No docker pull equivalent is executed, which means that anyone who has access to that Docker Engine can use any locally available image. In the context of GitLab CI and our story above, it means that developer-B can successfully run a job that uses the registry.gitlab.example.com/group-a/private-project:latest image (one the user normally has no access to!) as the base, allowing the user to get all of the secret content from it.

pull_policy set to always

The Runner doesn't care whether the registry.gitlab.example.com/group-a/private-project:latest image is present locally. A docker pull equivalent is executed to pull the image from the registry. If the image is already present locally, the upside is that the pull only compares which layers need to be downloaded, and if the image in the registry has not been updated, nothing is actually downloaded. But in any case, authentication is performed.

This means that in the case of developer-A (the allowed one), authentication using CI_JOB_TOKEN credentials will pass and the image will be pulled (or confirmed to be at the newest version locally).

But in the case of developer-B (the thief), authentication using CI_JOB_TOKEN will fail, which will fail the job, so neither the pull nor the execution of the script will ever happen!

So the value of pull_policy may or may not open a security gap that allows an unauthorized actor to get access to private code. This must be taken into consideration if we want to make pull_policy configurable from the .gitlab-ci.yml file as well.
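The two scenarios above can be modeled with a small Python sketch. It is only an illustration under stated assumptions, not Runner code: the Registry and Host classes, their methods, and the access-control dictionary are hypothetical names introduced here, and the behavior of never on a missing image is an assumption of the model.

```python
from dataclasses import dataclass, field

IMAGE = "registry.gitlab.example.com/group-a/private-project:latest"


@dataclass
class Registry:
    """Hypothetical registry: maps an image name to the users allowed to pull it."""
    acl: dict

    def pull(self, image, user):
        # Every pull authenticates the job's user, like a docker pull would.
        if user not in self.acl.get(image, set()):
            raise PermissionError(f"{user} may not pull {image}")
        return image


@dataclass
class Host:
    """Hypothetical shared runner host that keeps images between jobs."""
    registry: Registry
    local_images: set = field(default_factory=set)

    def start_job(self, image, user, pull_policy):
        if pull_policy == "always":
            # The registry is always contacted, so authentication always runs.
            self.registry.pull(image, user)
            self.local_images.add(image)
        elif pull_policy == "if-not-present":
            # The registry is contacted only when the image is missing locally.
            if image not in self.local_images:
                self.registry.pull(image, user)
                self.local_images.add(image)
        elif pull_policy == "never":
            # Assumption of this model: a missing image is an error.
            if image not in self.local_images:
                raise LookupError(f"{image} is not available locally")
        else:
            raise ValueError(f"unknown pull policy: {pull_policy!r}")
        return f"job for {user} runs in {image}"


if __name__ == "__main__":
    host = Host(Registry({IMAGE: {"developer-A"}}))
    # developer-A is authorized: the image is pulled and stays on the host.
    print(host.start_job(IMAGE, "developer-A", "if-not-present"))
    # developer-B is not authorized, but no pull happens, so the job runs.
    print(host.start_job(IMAGE, "developer-B", "if-not-present"))
    # With always, the pull authenticates developer-B and fails the job.
    try:
        host.start_job(IMAGE, "developer-B", "always")
    except PermissionError as err:
        print(f"job failed: {err}")
```

In this model the leak depends entirely on the order of jobs: developer-B's if-not-present job only succeeds because developer-A's earlier job left the image on the shared host.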

While I totally see all of the benefits of having this setting configurable in .gitlab-ci.yml, especially for "private" Runners that are handling specific cases of a specific team, the administrator of the Runner should be able to decide whether support for this feature is allowed or not.

It's similar to the namespace overwrite feature in the Kubernetes executor. It's useful and powerful, but it creates a security risk. Because of that, we've added a configuration field in config.toml where the runner administrator can decide whether namespace overriding is allowed and, optionally, which other namespaces can be set.

The above security concerns are documented at https://docs.gitlab.com/runner/security/#usage-of-private-docker-images-with-if-not-present-pull-policy and https://docs.gitlab.com/runner/executors/docker.html#how-pull-policies-work.

Implementation

GitLab (backend-weight2 for Rails)
- Add support for parsing pull_policy property in image element of .gitlab-ci.yml and returning it in JobResponse.
Runner
- Add pull_policy field in JobResponse struct in Runner (gitlab-runner!966 (closed)).
- Leverage pull_policy in Kubernetes executor (gitlab-runner!966 (closed)).
- Leverage pull_policy in Docker executor.
- Allow GitLab Runner administrators to restrict allowed pull policies by adding an allowed_pull_policies configuration under Docker and k8s executor configuration sections in config.toml.
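The steps above can be sketched end to end in Python: normalize the image element of a job, then check the requested policy against what the runner administrator allows. This is a rough sketch, not the Rails or Runner implementation. The function names parse_image and resolve_pull_policy are hypothetical, and treating an empty allowed_pull_policies list as "no restriction" is an assumption made for the example.

```python
VALID_PULL_POLICIES = ("always", "if-not-present", "never")


def parse_image(entry):
    """Normalize a job's `image` value into a name and an optional pull_policy.

    `image` may be a plain string or a mapping with a `name` key.
    """
    if isinstance(entry, str):
        return {"name": entry, "pull_policy": None}
    policy = entry.get("pull_policy")
    if policy is not None and policy not in VALID_PULL_POLICIES:
        raise ValueError(f"invalid pull_policy: {policy!r}")
    return {"name": entry["name"], "pull_policy": policy}


def resolve_pull_policy(requested, allowed, runner_default="always"):
    """Pick the policy a job runs with, or reject a disallowed request.

    requested: pull_policy from .gitlab-ci.yml, or None if the job sets none.
    allowed: the administrator's allowed_pull_policies list. An empty list
             means "no restriction" here, which is an assumption of this sketch.
    """
    if requested is None:
        # The job did not ask for anything: keep the runner-level policy.
        return runner_default
    if allowed and requested not in allowed:
        raise PermissionError(
            f"pull_policy {requested!r} is not allowed on this runner"
        )
    return requested


if __name__ == "__main__":
    image = parse_image({
        "name": "$CI_REGISTRY/dreamhost/kubectl-helm-docker",
        "pull_policy": "always",
    })
    print(resolve_pull_policy(image["pull_policy"], ["always", "if-not-present"]))
```

Rejecting a disallowed request, rather than silently falling back to the runner default, keeps the failure visible to the job author. Whether the real feature should fail or fall back is a design decision this sketch does not settle.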

Documentation

The documentation should mention that this feature is available to self-managed users and to .com users who are not using shared runners (that is, who have runners registered to their group or project), since it requires admin access to the runner.

Links / references

gitlab-runner!115 (merged)

@markpundsack @ayufan @grzesiek @markglenfletcher


Edited May 29, 2022 by Dov Hershkovitch