Backend: Define pull policy for image in gitlab-ci.yml
### Release notes
Pull policies let you define different behavior when pulling Docker images: `always` (the default behavior) ensures that the image is always pulled, `never` disables image pulling completely, and `if-not-present` pulls an image only if a local version does not exist. Previously, users could define the [pull policy](https://docs.gitlab.com/runner/executors/docker.html#how-pull-policies-work) at the runner level only. In this release we've added the `pull_policy` keyword to `.gitlab-ci.yml`, which allows you to define different pull policies at the pipeline level. This feature does not support shared runners.
### Description
Currently, image pull policy is defined on the GitLab runner level. https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/115
### Proposal
It should be possible to define it in `.gitlab-ci.yml`. Example:
```yaml
build:
  image:
    name: $CI_REGISTRY/dreamhost/kubectl-helm-docker
    pull_policy: always # available: always, if-not-present, never
  script:
    - ...
```
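As a rough illustration of what accepting this keyword involves, the sketch below normalizes an `image` entry that is either a plain string or a mapping with `name` and an optional `pull_policy`. The function name `normalize_image`, the `alpine:latest` image, and the error messages are hypothetical and only serve the example; this is not the GitLab parser.

```python
# Hypothetical sketch: validate the proposed `image` entry of a job.
VALID_PULL_POLICIES = {"always", "if-not-present", "never"}


def normalize_image(entry):
    """Return a dict with 'name' and 'pull_policy' (None when not set)."""
    # Short form, for example `image: alpine:latest`
    if isinstance(entry, str):
        return {"name": entry, "pull_policy": None}
    if not isinstance(entry, dict) or "name" not in entry:
        raise ValueError("image must be a string or a mapping with a 'name' key")
    policy = entry.get("pull_policy")
    if policy is not None and policy not in VALID_PULL_POLICIES:
        raise ValueError(f"unknown pull_policy: {policy!r}")
    return {"name": entry["name"], "pull_policy": policy}


job = {
    "image": {
        "name": "$CI_REGISTRY/dreamhost/kubectl-helm-docker",
        "pull_policy": "always",
    },
}
print(normalize_image(job["image"]))
```

Leaving `pull_policy` as `None` when the job does not set it keeps the runner-level setting in charge by default.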
The desired pull policy is context-dependent. We need to use the very latest images in some cases, and we can use a cached version in other cases. Since it's context-dependent, it belongs in `.gitlab-ci.yml`. :smiley:
Because of the security concerns described in this issue, this feature will not be available to GitLab.com users who use shared runners. It will be available only for self-managed instances and for GitLab.com users with their own registered runners. For more information, see the [security considerations](https://gitlab.com/gitlab-org/gitlab/-/issues/21619#security-considerations) and [this comment](https://gitlab.com/gitlab-org/gitlab/-/issues/21619#note_416546568).
### Security considerations
Let's consider a shared runner (for example, a group runner or an instance runner) that can be used by multiple groups and that executes many subsequent jobs in the same environment (for example, a `docker+machine` executor with `MaxBuilds` greater than 1).
And let's consider that one of the groups - `group A` - wants to use an image named `registry.gitlab.example.com/group-a/private-project:latest` as the base of the job. The image contains private code that should not be revealed, and because of that it is built and stored in a private project. To pull this image, one must be logged in to GitLab's image registry **and** must have access to the `group-a/private-project` project.
A good example is a developer `developer-A`, a member of `group-a/private-project`, who pushes code to a public project `group-a/public-project`. The pipeline and jobs are created as the user `developer-A` and inherit all of that user's permissions, including permissions to the `group-a/private-project` project. The job that uses `image: registry.gitlab.example.com/group-a/private-project:latest` authenticates to the registry with `gitlab-ci-token:${CI_JOB_TOKEN}` credentials, gets the permissions of `developer-A`, pulls the image, starts the job, and proceeds.
Now, let's consider that another developer - `developer-B` - who is a member of neither `group-a` nor `group-a/private-project`, wants to get access to the proprietary, secret content of `group-a/private-project` through the content of the built image. The user creates a project `developer-B/steal-content-of-group-A-private-project` with a job defined with `image: registry.gitlab.example.com/group-a/private-project:latest` and, in the job script, scans all of the directories and tries to extract as many details as possible.
Now, what are the possible scenarios?
1. **`pull_policy` set to `if-not-present`**
   The runner first asks Docker whether the image `registry.gitlab.example.com/group-a/private-project:latest` is present locally. **If it is, then it's used as is and the container is created**. There is no `docker pull` equivalent, which means that anyone who has access to that Docker Engine can use any locally available image. In the context of GitLab CI and our story above, it means that `developer-B` can successfully run a job that uses the `registry.gitlab.example.com/group-a/private-project:latest` image (the one that user normally has no access to!) as the base, which lets the user get all of the secret content from it.
1. **`pull_policy` set to `always`**
   The runner doesn't care whether the `registry.gitlab.example.com/group-a/private-project:latest` image is present locally. A `docker pull` equivalent is executed to pull the image from the registry. If the image is already present locally, the upside is that the pull is limited to comparing which layers need to be downloaded, and if the image in the registry has not been updated, nothing is actually downloaded. **But in any case, the authentication will be done**.
   This means that for `developer-A`, the allowed user, authentication using `CI_JOB_TOKEN` credentials will pass and the image will be pulled (or confirmed to be the newest version locally).
   **But for `developer-B`, the thief, authentication using `CI_JOB_TOKEN` will fail, which fails the job, so neither the pull nor the execution of the script will ever happen!**
So the `pull_policy` setting may or may not open a security hole that allows an unauthorized actor to access private code. This must be taken into consideration if we want to make `pull_policy` configurable from the `.gitlab-ci.yml` file as well.
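The two scenarios above can be reproduced with a small simulation. This is a sketch under stated assumptions, not runner code: `Registry`, `Host`, `PullDenied`, and `developer_b_gets_image` are hypothetical names, the registry check stands in for `CI_JOB_TOKEN` authentication, and the host's image cache is assumed to survive between consecutive jobs.

```python
class PullDenied(Exception):
    """Raised when the registry rejects the job's credentials."""


class Registry:
    def __init__(self, acl):
        # acl maps an image name to the set of users allowed to pull it
        self.acl = acl

    def pull(self, image, user):
        if user not in self.acl.get(image, set()):
            raise PullDenied(f"{user} may not pull {image}")
        return image


class Host:
    """A runner host whose local image cache is shared by consecutive jobs."""

    def __init__(self, registry):
        self.registry = registry
        self.local_images = set()

    def prepare_image(self, image, user, pull_policy):
        if pull_policy == "never":
            if image not in self.local_images:
                raise LookupError(f"{image} not found locally")
            return image
        if pull_policy == "if-not-present" and image in self.local_images:
            # No registry round trip, so no authentication happens here.
            return image
        # "always", or "if-not-present" with a cold cache: authenticate and pull.
        self.registry.pull(image, user)
        self.local_images.add(image)
        return image


IMAGE = "registry.gitlab.example.com/group-a/private-project:latest"
registry = Registry({IMAGE: {"developer-A"}})


def developer_b_gets_image(policy):
    host = Host(registry)
    # The legitimate job runs first and leaves the image in the local cache.
    host.prepare_image(IMAGE, "developer-A", policy)
    try:
        host.prepare_image(IMAGE, "developer-B", policy)
        return True
    except PullDenied:
        return False


print(developer_b_gets_image("if-not-present"))  # True: cached image is reused
print(developer_b_gets_image("always"))          # False: authentication fails
```

The only difference between the two outcomes is whether the cached image short-circuits the registry call, which is exactly where authentication happens.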
While I totally see all of the benefits of having this setting configurable in `.gitlab-ci.yml`, especially for "private" Runners that are handling specific cases of a specific team, the administrator of the Runner should be able to decide whether support for this feature is allowed or not.
It's like the `namespace overwrite` feature in the Kubernetes executor. It's useful and powerful, but it creates a security risk. Because of that, we've added a configuration field in `config.toml` where the runner administrator can decide whether namespace overriding is allowed and - optionally - which other namespaces can be set.
The above security concerns are documented at https://docs.gitlab.com/runner/security/#usage-of-private-docker-images-with-if-not-present-pull-policy and https://docs.gitlab.com/runner/executors/docker.html#how-pull-policies-work.
### Implementation
- GitLab (~"backend-weight::2" for Rails)
  - [ ] Add support for parsing `pull_policy` property in [`image` element](https://gitlab.com/gitlab-org/gitlab/-/blob/798e3475429800a09716819075635d3426298225/lib/gitlab/ci/config/entry/image.rb) of `.gitlab-ci.yml` and [returning it](https://gitlab.com/gitlab-org/gitlab/blob/master/app/services/ci/register_job_service.rb) in `JobResponse`.
- Runner
  - [ ] Add [`pull_policy` field](https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/966/diffs#784d16d046262daaf0974ce8a61882a52cbfdc85_195_195) in [`JobResponse` struct](https://gitlab.com/gitlab-org/gitlab-runner/blob/3b94d48643a8bbbe6373ea68461a9414faed5979/common/network.go#L292) in Runner (gitlab-runner!966).
  - [ ] Leverage `pull_policy` in Kubernetes executor (gitlab-runner!966).
  - [ ] Leverage `pull_policy` in Docker executor.
  - [ ] Allow GitLab Runner administrators to restrict allowed pull policies by adding an `allowed_pull_policies` configuration under Docker and k8s executor configuration sections in `config.toml`.
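
The last item could behave roughly as sketched below. The function name `effective_pull_policy` is hypothetical, and the fallback used when `allowed_pull_policies` is empty (only the runner's own configured policy is accepted) is an assumption chosen to keep the default conservative; the issue does not specify it.

```python
def effective_pull_policy(requested, runner_default, allowed_pull_policies):
    """Pick the pull policy for a job, or raise if the job asks for a disallowed one."""
    # Assumption: with no explicit allow-list, only the runner's own policy is allowed.
    allowed = allowed_pull_policies or [runner_default]
    if requested is None:
        # The job did not set `pull_policy`, so the runner-level setting applies.
        return runner_default
    if requested not in allowed:
        raise PermissionError(
            f"pull_policy {requested!r} is not one of the allowed policies: {allowed}"
        )
    return requested


# The administrator allows two policies; the job may choose either of them.
print(effective_pull_policy(None, "always", ["always", "if-not-present"]))
print(effective_pull_policy("if-not-present", "always", ["always", "if-not-present"]))
```

With this shape, a runner that never sets `allowed_pull_policies` rejects any job that requests a policy other than the one the administrator configured.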
### Documentation
The documentation should mention that this feature is available for self-managed instances and for GitLab.com users who are not using shared runners (that is, who have runners registered to their group or project), since it requires admin access to the runner.
### Links / references
https://gitlab.com/gitlab-org/gitlab-runner/merge_requests/115
@markpundsack @ayufan @grzesiek @markglenfletcher