Make pipeline permissions more controllable and flexible

Problem to solve

At the moment, pipelines run with the access of the user (personal access token) who is running the pipeline, but with hard-coded limitations applied to reduce the risk of security problems. For example, write_repository is limited, which prevents a pipeline from pushing changes back to the repository it is working on.

These are sensible basic defaults, but some pipelines need the ability to do things like write to the repository, write to the registry, access the API, and so on. GitLab needs a way for project administrators to raise the level of access back up to the full capabilities of the running user in situations where that is needed.

It's important to call out that this method will not make it possible to elevate a user's permissions beyond the permissions they already have.

Communication with GitLab

Users use Personal Access Tokens to communicate with GitLab, but these are not secure enough.

The problem here is that, in order to communicate with GitLab, users need to use Personal Access Tokens (PATs). An alternative would be a $CI_JOB_TOKEN, but these tokens cannot access the API because their scopes cannot be configured the way PAT scopes can.

Because a $CI_JOB_TOKEN, which GitLab injects into the context of a CI/CD build, cannot be used with the API, users fall back on personal access tokens. There are a few important differences between PATs and job tokens.

The most important difference is that a $CI_JOB_TOKEN is an ephemeral token: it can be revoked once a build is complete, and it cannot be used unless the build is running. Another important difference is that a $CI_JOB_TOKEN is different for every build and for every user.

Users who inject PATs into their builds inject the same PAT every time, and their colleagues can easily see that token, so it is not secure enough. Every build receives the same PAT, and the token is not revoked when a build is done.

Intended users

Devon (DevOps Engineer)

Further details

This issue originated out of the comment thread at #20416 (comment 216069060).

The current set of scopes that are possible are as follows:

Proposal

GitLab is already an OAuth 2.0 authorization/resource server and its tokens are already OAuth 2.0 access tokens, so we can use its full power to solve this issue.

There are three main components to us delivering this feature:

Change the backend to use JWT instead of access tokens for build permissions
Expose the necessary JWKs endpoint(s) and facilities to allow external systems to validate the authenticity of the minted tokens
Design and implement a UX for people to control which scopes are given for the resources (things like read_repository, read_registry, read_api, write_repository, etc.; basically the ones you can choose from in the access token dialog).

A resource is defined as a project, or possibly a group, and scopes are then applied to resources.

A user will never be able to get more permissions through this mechanism than the user already has, so even if read_api is turned on in the scope, a user who does not have that ability does not get it for free. Elevation beyond the user's own permissions could perhaps come in a future iteration, but we are not touching it now.
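The rule above amounts to a set intersection between the scopes configured for a resource and the abilities the user already has. A minimal sketch, assuming scopes and abilities can be compared by name; the function name `effective_scopes` is hypothetical and not part of GitLab:

```python
def effective_scopes(configured_scopes, user_abilities):
    """Return the scopes a job token would actually carry.

    Hypothetical helper: a scope is granted only when it is both
    configured for the resource AND already held by the user, so the
    result can never exceed the user's own permissions.
    """
    return set(configured_scopes) & set(user_abilities)


# read_api is switched on in the settings, but the user lacks it,
# so it is dropped from the result.
configured = {"read_repository", "write_repository", "read_api"}
user_can = {"read_repository", "write_repository"}
print(sorted(effective_scopes(configured, user_can)))
# → ['read_repository', 'write_repository']
```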

We are also not supporting different scope evaluation depending on the target environment, branch, etc., but we will limit all destructive operations (all writes on the API, except registry image creation) to protected jobs. This is because pipelines can run a ton of untrusted code, and with fork pipelines running in the parent project we will run even more.
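As a rough illustration of that gate, the sketch below allows reads everywhere, allows the one exempted write everywhere, and allows every other write only in protected jobs. The `Operation` type, the operation names, and the exemption set are all hypothetical, and the sketch models only the protected-job check, not scope evaluation:

```python
from dataclasses import dataclass

# Hypothetical name for the single write the text exempts from the gate.
UNPROTECTED_WRITE_EXCEPTIONS = {"create_registry_image"}


@dataclass(frozen=True)
class Operation:
    name: str
    is_write: bool


def is_allowed(op: Operation, job_is_protected: bool) -> bool:
    """Apply only the protected-job gate; scope checks happen elsewhere."""
    if not op.is_write:
        return True  # reads are not restricted by this gate
    if op.name in UNPROTECTED_WRITE_EXCEPTIONS:
        return True  # registry image creation is exempt
    return job_is_protected  # all other writes need a protected job
```

With this shape, `is_allowed(Operation("push_commit", True), job_is_protected=False)` is rejected, while the same call for `create_registry_image` is accepted.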

New Scopes

There are a few scopes that may need to be added for additional functionality of this feature:

read_package_registry: a new scope. Currently users can grant read/write permission to the package registries only via the api scope.
write_package_registry: also a new scope, which will help avoid granting the full api scope just to write to the package registries.

Implementation Plan

From @ayufan at #20416 (comment 230466323)

Change ci_builds.token to be a JWT.
The JWT expiry time is max(runner_timeout, project_timeout) + 1h?
Make the JWT hold information about build_id/pipeline_id/project_id/user_id and the assigned scopes and abilities granted, similar to how we do that for the container registry.
Construct the other JWT scopes from project settings, where you are allowed to specify additional abilities (scopes) for other projects that you want to access.
The UI is open for discussion, but ideally I would see it as: add a new project, then choose the scope (likely a simple CRUD interface).

The biggest change is switching the token to a JWT. This makes further processing of scopes much simpler to implement across all parts of the system. It improves the security of runner communication by creating a short-lived token, and it allows new scopes and projects to be handled in a generic way. It is also efficient, as handling a JWT is very cheap.
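To make the plan concrete, here is a minimal sketch of minting and checking such a token. It uses HS256 with a shared secret only so that it runs on the Python standard library; a real deployment that exposes JWKs endpoints would sign with an asymmetric key instead. The claim names, the function names, and the assumption that timeouts are given in seconds are all illustrative, not GitLab's actual format:

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(secret: bytes, signing_input: str) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def token_lifetime_seconds(runner_timeout: int, project_timeout: int) -> int:
    # max(runner_timeout, project_timeout) + 1h, timeouts assumed in seconds.
    return max(runner_timeout, project_timeout) + 3600


def mint_job_jwt(secret, *, build_id, pipeline_id, project_id, user_id,
                 scopes, runner_timeout, project_timeout, now=None) -> str:
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {  # hypothetical claim names
        "iat": issued_at,
        "exp": issued_at + token_lifetime_seconds(runner_timeout, project_timeout),
        "build_id": build_id,
        "pipeline_id": pipeline_id,
        "project_id": project_id,
        "user_id": user_id,
        "scopes": sorted(scopes),
    }
    parts = [_b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
             for part in (header, payload)]
    signing_input = ".".join(parts)
    return signing_input + "." + _sign(secret, signing_input)


def verify_job_jwt(secret, token, now=None) -> dict:
    """Check signature and expiry, then return the payload claims."""
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(_sign(secret, signing_input), signature):
        raise ValueError("bad signature")
    payload_b64 = signing_input.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    current = int(time.time()) if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

Because the scopes travel inside the signed payload, any part of the system holding the verification key can read them without a database lookup, which is one way to read the "cheap to handle" argument above.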

Permissions and Security

Even for the MVC we need a way to disable this feature at the group and instance level. This is a requirement in compliance driven environments where elevating access (even to the user's full access level) may not be permitted. It should be on by default, but easy to disable under these scenarios.
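One way to read this requirement is as a two-level switch in which an unset value means "on" and an explicit disable at either level wins. A minimal sketch under that assumption; the function and parameter names are hypothetical:

```python
from typing import Optional


def elevation_enabled(instance_setting: Optional[bool] = None,
                      group_setting: Optional[bool] = None) -> bool:
    """Resolve whether scope elevation is available for a project.

    None means "not configured", which falls back to the default (on).
    An explicit False at the instance or group level disables the feature.
    """
    if instance_setting is False or group_setting is False:
        return False
    return True
```

Here `elevation_enabled()` is true out of the box, while `elevation_enabled(instance_setting=False, group_setting=True)` stays false, so a group cannot re-enable what the instance has turned off.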

Documentation

Testing

What does success look like, and how can we measure that?

What is the type of buyer?

Future Improvements

In the near term, implementing permission controls using OAuth2 scopes could likely be done in a way robust enough to handle operations against a single project. I think this becomes much trickier if a given job wanted to, say, clone another repo (or repos), update it with changes, and push it back up. Policy-based access control seems like a better fit in the mid to long term, as it presents an opportunity to unify the different token types. Maybe we could express the scopes as policies under the hood to make this easier in the future.
The fractured CI identity model is likely going to need to be addressed regardless of how permissions are expressed. When a CI_JOB_TOKEN hits the API, who does the API think it is? It's not the user. It is a robot with some subset of the user's permissions. No entity exists in GitLab yet to represent this. (plug for #24123 (comment 215458587))
This feature and the necessary refactoring present a good opportunity to introduce a first-class expirable credential API.

Links / references

Edited Jun 04, 2020 by Grzegorz Bizon