Developers need to perform actions on multiple projects at once, e.g. running CI on a project that uses submodules from another project, or using a docker image from one project in a CI pipeline for another project. We need to provide a way to allow this access that is both secure and convenient.
Per-build CI tokens act as the user that triggers the build, but with read-only access outside of the current project.
It works for git clone, git-LFS, git sub-modules, docker pull, and images: foo, but not for triggers (because they're user-less) or any API calls (e.g. fetching build artifacts from another SHA/project).
The scope is read-only outside of the current project, and the token expires when the build is finished so exposure is limited.
This slightly increases the risk around runners: the tokens they have access to are slightly more powerful (access to more than one project), so protecting them is even more important (always use HTTPS, protect the runner instances, protect project registration tokens).
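As a rough illustration of how a build script might use the per-build token to clone a dependency, here is a minimal sketch. The `gitlab-ci-token` username comes from this discussion; the environment variable name `CI_BUILD_TOKEN`, the host `gitlab.example.com` and the project path are assumptions made for the example and may differ on your installation.

```python
import os
from urllib.parse import quote


def clone_url(host: str, project_path: str, token: str) -> str:
    """Build an HTTPS clone URL that authenticates as the gitlab-ci-token user."""
    # Percent-encode the token so special characters cannot break the URL.
    return f"https://gitlab-ci-token:{quote(token, safe='')}@{host}/{project_path}.git"


if __name__ == "__main__":
    # CI_BUILD_TOKEN is an assumed variable name; use whatever your runner exports.
    token = os.environ.get("CI_BUILD_TOKEN", "example-token")
    print(clone_url("gitlab.example.com", "group/shared-library", token))
```

The resulting URL can be passed to `git clone`. Since the token travels in the URL, this only makes sense over HTTPS.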
Proposal
We will generate a token for every build,
The token will be unique across GitLab,
GitLab Runner will receive a payload with the Build ID and the generated Build Token,
For cloning sources, accessing LFS objects and pulling docker images we will use the gitlab-ci-token:build-token credentials (as we do now),
We will authenticate this token by looking for a Ci::Build with that build-token,
We will authorize access to the resource by taking from Ci::Build the person who ran this build. It could be the pusher of a git push, the person who retried a build, or the person who merged a change,
We will run normal authorization checks for a resource as that user, basically allowing the gitlab-ci-token:build-token credentials to be used as temporary credentials of the build,
These credentials will only be valid while the build is in the running state, meaning it is assigned to a runner and is being processed by it,
Authorized actions will be read-only; we will block read-write operations,
There will be one exception: docker push to the project for which the Ci::Build was created will be possible. This is to maintain backward compatibility,
Basically a Ci::Build will have access to everything the user triggering the build has access to,
Artifacts will be downloaded using the same build token delivered in the payload.
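The steps above can be sketched roughly as follows. This is an in-memory illustration only, not GitLab's implementation: the `User` and `Build` classes, the action names and the `readable_projects` set are hypothetical stand-ins for `Ci::Build` and the normal authorization checks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    name: str
    readable_projects: set = field(default_factory=set)  # projects this user may read


@dataclass
class Build:
    build_id: int
    token: str
    project: str      # project the build was created for
    user: User        # person who triggered the build
    state: str = "pending"


def find_build(builds, token) -> Optional[Build]:
    """Authenticate: find the build owning this token, only while it is running."""
    for build in builds:
        if build.token == token and build.state == "running":
            return build
    return None


def authorized(build: Build, action: str, project: str) -> bool:
    """Authorize as the triggering user: read-only, plus docker push to own project."""
    if action in ("git_clone", "lfs_fetch", "docker_pull"):
        return project in build.user.readable_projects
    if action == "docker_push":
        return project == build.project
    return False  # every other write operation is blocked
```

Once the build leaves the running state, `find_build` no longer returns it, so the credentials stop working without any explicit revocation step.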
Triggers
Currently build triggers are user-less. We will have to make triggers execute as a specific user. We will extend build triggers similarly to how we implement Mirroring: allow specifying a user who is running the build.
Problems
This opens a potential security hole, because it gives CI credentials much wider permissions,
This puts much more trust in the security of runners. We currently trust runners as long as a person doesn't use shell and doesn't use docker in --privileged mode,
This puts much more trust in the security of the data channel: the runner receiving the payload. We do have customers using HTTP for accessing GitLab. However, this has the same problems as connecting from the browser.
Summary
Given the above problems, this seems to be a future-proof solution:
We can allow the same credentials to be used for accessing the GitLab API,
Basically this approach doesn't seem to have edge cases,
It works well for users that are external and have limited permissions,
It will work with private, internal and public projects,
CI will only get read-only permissions (with one exception: read-write to its own project's container registry),
Admin users will have to be direct members of the project for CI to access sources.
Should CI tokens act as the person triggering the pipeline? Then we could use that user's permissions for submodules and images. It might mean that some people will trigger a pipeline and have a pipeline break, but maybe that's actually appropriate. Otherwise they could hijack the CI script and print/manipulate whatever stuff they otherwise wouldn't have access to.
I imagine sometimes it will be better to know who caused the action, but sometimes, you really want to know that it was automated.
Should CI tokens act as the person triggering the pipeline? Then we could use that user's permissions for submodules and images. It might mean that some people will trigger a pipeline and have a pipeline break, but maybe that's actually appropriate. Otherwise they could hijack the CI script and print/manipulate whatever stuff they otherwise wouldn't have access to.
I don't agree. It will lead to hard-to-notice build problems when someone doesn't have permission to some of the projects.
Added example use cases of downloading an archive, interacting with an API endpoint, and pushing tags.
Also "The biggest technical roadblock to this seems to be that the CI token is currently only checked in Git auth code. Requests outside of the repository do not route through this code so it would have to be moved and/or rewritten."
@markpundsack I did another round of thinking, and what you described above, having CI tokens act as the user that triggers the build, seems like the most appropriate and simplest to use.
I understand and appreciate the use case; our omnibus-gitlab builds are an example of a CI job that pulls from several other projects on the same GitLab server. (We made that work by distributing deploy keys to the runners outside of GitLab CI.)
I am not sure if it is wise to give the runner (temporary) read access to other projects as if they are a regular (human) GitLab user. Then who would that user be? Would the build run differently if the user is an admin?
Could we use GitLab groups to give runners access to projects other than the one the build was triggered on? We have the 'share with group' feature in EE so I think you could go a long way with this.
Couldn't you have a special-case user for gitlab-ci which a GitLab admin could define the level of permission for? This hypothetical user would be a normal user in every other respect, and would default to read-only access across all projects which are not marked "private". If you make it impossible to change the username (could be as obvious as gitlab-ci), it would then be very clear in audit logs and the like that the reads are coming from the gitlab-ci subsystem.
Full disclosure, I'm coming from an in-house installation context, where we share code across the company. If this suggestion is deemed inappropriate for open source (gitlab.com) use, it could also be feature-switchable. A simple checkbox saying "Give gitlab-ci runners read-only access to all non-private projects" would suffice for me, and I suspect many others in my position.
@ajk8 I see why extending read access to 'public' and 'internal' users would make sense. I think using an actual 'User' object is not a good idea but that is more of an implementation detail.
So a CI runner would be able to:
read the current project
write docker images to the current project
read projects from the current group
read internal projects
read public projects
That might be reasonable. It would still be a big change in the security model; a specific runner attached to one project could all of a sudden download all the 'internal' code on a company's GitLab server. People may have arrangements where they assume this can not happen.
@jacobvosmaer-gitlab It seems to me that if a project owner has made their project available for a set of otherwise-anonymous users to clone (internal, public), then it follows that an automated "user" with the express purpose of building, testing, and deploying code would fall under that set of otherwise-anonymous users. Put a different way, I'm not sure how gitlab's CI system is any more dangerous than a human actor with the same access.
That being said, there's no reason that a project owner couldn't be presented with an option to disable this access.
One more thing, looking at the list of permissions that you have laid out (thanks for being so explicit!), I think it is critical that the following are added:
pull docker images from internal projects
pull docker images from public projects
This would enable us to build runner images in one project, and consume them in another.
What do you think about using User permissions instead?
Yes, if you run a build as an admin (you do git push), you also get much wider permissions for your CI builds. However, this also solves the problem of external users, who then need to have explicit access to resources.
I initially assumed the build token would allow me to read from my dependent projects.
Each developer currently has 2 options:
Create a deploy key for their runner and add it to each of the dependent projects (fork and upstream). O(N)
Create an ssh key for their runner and add it to their account. O(1)
Although deploy keys are the secure choice for provisioning read access, using an account level ssh key may be preferred due to its simplicity and familiarity.
Alternative solutions include:
Support managing deploy keys across projects (project group deploy keys?).
Support configuring read-only ssh keys at the account level.
@ayufan I am not sure if associating a build with a User is a good idea, that is how I came up with the other ideas in the first place. I think it is a very big change with hard-to-oversee consequences. It is very different from how other CI systems work. We would end up with an unusual and surprising (to users coming from e.g. Jenkins) security model.
You talk about solving a problem with 'external users'. Can you explain to me what that problem is?
I should probably make it explicit that this approach isn't seen as complete, but the biggest bang-for-buck that will solve a lot of problems right now. We should create an EE issue to provide finer-grained control over CI build permissions, but I believe the current approach will be adequate for CE.
@jacobvosmaer-gitlab Yes, there may be some unusual surprises, but one good thing is that people can use the existing permission system to work around whatever they encounter. e.g. if someone runs a build but doesn't have access to underlying dependencies, well, just add them!
One problem with your proposal is that write docker images to the current project and read projects from the current group might be more permissions than the runner should be allowed to have. e.g. someone who is given access to one project in a group writes a malicious branch to download some other project they otherwise wouldn't have access to, but the runner does have access to. So if you're going to filter that permission by the user anyway, you're halfway to implementing exactly what this issue is about anyway. :) The only obvious thing your proposal does is limit the runner to a single group, whereas the user may have access to multiple groups. If that's important to our users, then perhaps we could consider further restricting runners to only access things in the same group, but my gut feeling is that it's unnecessary and would just lead to more issues where people complain about the restriction.
As a bit of hard-learned product philosophy, small teams with free products should generally get the most permissive flow. Enterprises need, and pay for, restrictions.
We had a call today with @jacobvosmaer-gitlab. I'm preparing a summary of our call, where we discussed a lot of potential problems and use cases. Stay tuned :)
@markpundsack having talked with @ayufan yesterday I now also think associating builds with users is the least worst solution. For future reference, the 'external users problem' Kamil talked about above is the fact that 'external' users on a GitLab server are not allowed to see 'internal' projects. So they should also not be allowed to create builds that access internal projects.
(By the way, @ayufan , how would 'external' users be prevented from seeing build results on projects they are allowed to access, but where the build pulls in a project they are not allowed to see?)
My main concern with associating builds with users was and remains that it creates a whole new line of attack on the already shaky security model of CI.
This token is used by Runner to register as a specific one,
This token is used to git clone sources as the gitlab-ci-token user,
This token has limited permissions: it only allows cloning the current project's sources, fetching the current project's LFS objects, and fetching container images for that project,
This token also allows pushing container images to that project,
This token also allows access to public projects of the GitLab instance.
Alternatives
We discussed a few alternatives:
Allow the CI token to access internal projects and projects from the same group. This concept introduces a security problem. GitLab has external users, who can be added as members of a project. With that concept we have no way to block an external user from accessing internal projects. It also introduces another problem: if a user is only a member of a project (not the group), he could download all sources from the group, and this seems to be a privilege escalation which is basically unsolvable.
We discussed introducing a service user for each project. CI would use the service user to access GitLab. This service user would have to be explicitly added to all other projects to which it should have access. It's secure, but it complicates permission management, because a user (master of the project) would have to ask a master of another project to add the service user.
We then discussed adding an additional relation between projects. When CI first accesses a project for which the relation has not been created, the master of the project would receive a notification that he needs to approve the relation. The master would have to go to the project, go to Pipelines and click Approve on each project whose sources we would like to access from this one (effectively access to any private or internal project referenced by a git submodule or private docker image). This solution seems very good from a security perspective, but bad from a user experience perspective. It introduces a lot of new features that we would have to build: adding relations between projects, adding a UI for approving relations, adding a UI to see where this project is approved, and allowing such permissions to be revoked easily.
We then discussed the proposal of using User permissions to fetch sources.
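To make the difference between the first alternative and the user-permission proposal concrete, here is a simplified sketch. The `Project` and `User` classes and both functions are hypothetical, and group membership is deliberately left out of the user model. It shows how a token scoped to the runner would let an external user, who is a member of only one project, read a sibling project and an internal project that the user's own permissions deny.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    path: str
    group: str
    visibility: str  # "private", "internal" or "public"


@dataclass
class User:
    name: str
    external: bool = False
    member_of: set = field(default_factory=set)  # project paths with direct membership


def runner_scoped_read(current: Project, target: Project) -> bool:
    """Alternative 1: the CI token may read internal, public and same-group projects."""
    return target.visibility in ("internal", "public") or target.group == current.group


def user_scoped_read(user: User, target: Project) -> bool:
    """User permissions: the build may read only what the triggering user may read."""
    if target.visibility == "public":
        return True
    if target.visibility == "internal" and not user.external:
        return True
    return target.path in user.member_of


if __name__ == "__main__":
    app = Project("team/app", "team", "private")
    secret = Project("team/secret", "team", "private")
    contractor = User("contractor", external=True, member_of={"team/app"})
    # The runner-scoped rule grants what the user-scoped rule denies.
    print(runner_scoped_read(app, secret), user_scoped_read(contractor, secret))
```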
Using user permissions
We discussed the security implications of such a feature:
it makes the security model open; currently the security model is closed,
we discussed stability of builds: Jacob argued that the stability of builds would depend on the person triggering a build, and a build would fail if the user did not have permissions to other projects. At first this seems bad from a UX perspective, but then we came to agreement that this is OK, because the permission denied message will be clearly shown in the build log,
we discussed the problem of admin accounts. We assumed that in most organisations admins are normal developers who use CI, and they may be concerned that builds run by them would potentially have access to all repositories. We agreed that this seems to be an important problem. One of the proposed solutions was to lock down CI builds executed by admins: we would allow fetching sources only as long as the admin is a direct member of the group on which he runs a build. He would not receive permissions just because he is an admin, which is how it works now. This would affect only CI builds,
we had concerns about the understandability of the feature. Jacob pointed out that such a model may not be clear to users. My argument was that to me it seems clearer than what we have now: your builds have the same access as you,
since this is a significant feature, it should also be properly documented to make people aware of how the system works,
we also argued about the trustworthiness of code pushed to CI. With this change, the user who does git push or clicks Merge is basically signing the commit, and CI runs in his name. This opens new attack vectors: getting someone else to execute untrusted code and downloading all repositories. I argued that we have a similar problem even now, and that the trustworthiness of any CI solution is problematic. Since we run a lot of external dependencies, we will never be able to verify everything. In the end it is up to the person who does the merge to be careful with what he accepts. This problem doesn't only affect the proposed changes, but also any code that we ship, or any code that has access to production environments. For example, malicious code could dump the production database, and currently we have no easy way to mitigate this kind of attack. This seems to be a much wider problem than just the permission changes that I'm proposing. It becomes a much bigger problem when we also add to the equation the possibility of doing production deployments from CI,
we also discussed that the proposal opens a new way to manage resources:
e.g. it allows us to tie access to Secure Variables / a Runner to the permissions of the user. We could effectively introduce a feature that restricts a Runner to be used only by a Master of the project. Someone could choose to set it up this way because the Runner does deploys and is not intended to be used by other developers,
this also allows us to make some jobs executable only by some users. Consider a deployment to production: right now anyone with developer access can run the action to deploy to production. We could effectively limit jobs by the permissions of users,
in the end we came to agreement that this is a big change, and we are not fully comfortable with how it changes the permission model of CI, but this seems to be the best approach out of what we discussed.
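The admin rule discussed above could look something like the following sketch. It is a simplification under stated assumptions: the `User` class and both functions are hypothetical, and membership is modelled as a flat set of project paths rather than GitLab's real project and group memberships.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    admin: bool = False
    member_of: set = field(default_factory=set)  # projects with direct membership


def can_read(user: User, project: str) -> bool:
    """Regular access: an admin may read everything."""
    return user.admin or project in user.member_of


def ci_can_read(user: User, project: str) -> bool:
    """Access inside a CI build: admin status is ignored, only membership counts."""
    return project in user.member_of
```

With this split, an admin browsing the UI keeps full access, while a build triggered by the same admin can only fetch from projects where the admin was added explicitly.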
@ayufan is alternative 2 something that could be implemented quickly and later complemented by other solutions? In addition to being secure, it has the added benefit of explicit dependencies - you can tell an abandoned project from an active one just by looking at its users. It's actually exactly what you need to make projects and dependencies manageable. Otherwise you'd be asking yourself: can I kill this project, does anyone use it? With alternative 2 it's a no-brainer.
There's no quick solution for 2. and 3., but 2. seems to be simpler in implementation.
To be safe, the 2. solution would have to require you to add a service user to each internal project, because it will not define how we should support external users, so it basically has the same flaws as 1.
I kind of feel that the best solution is to use user permissions and build better permission management in EE if this is really needed.
@nikolai1 @ayufan In solution 2 would each fork of each project need to have its service worker added to each protected project it depends on? That seems likely to become unmanageable as team size or protected dependency count grows.
I don't feel that auto-access to group projects and internal projects is that big of a problem - that's why you'd get explicit dependencies, i.e. any kind of access would be solvable as opposed to current situation, where no amount of tinkering can help you. If you skipped implementing auto-access and only introduced service accounts, then external users' situation would be no worse than it is now, but other users (arguably, main users of your Gitlab) would profit greatly, right?
Coupling users to build systems seems extremely inappropriate - so you have this intern who's only allowed to play with one project, but he can't build it, because he doesn't have access to the base libraries? That shouldn't happen, on the contrary, CI should be his best friend. CI is not you, it's your mentor, if you know what I mean.
Coupling users to build systems seems extremely inappropriate - so you have this intern who's only allowed to play with one project, but he can't build it, because he doesn't have access to the base libraries? That shouldn't happen, on the contrary, CI should be his best friend. CI is not you, it's your mentor, if you know what I mean.
It's interesting, because it can also mean that you just should not have access to these sources. And you can consider it as a permission escalation, because the developer is in full control of the build script, so it can actually fetch or use the sources inappropriately.
@ayufan I think a bit of confusion comes from the fact that a project permission means both source and binary access (docker base images). Our current non-working use case is using base images containing an application server for downstream projects. No source access is needed and in some cases it is unwanted.
@nikolai1 in your example of the student intern with limited access, if the CI has more access than them they effectively have full read access to that library because the student can read CI output. This is the same problem as 'external users' I think, where a CI 'agent' (for lack of a better word) leaks its read permissions to users less privileged than the CI.
@ayufan I am starting to think that no matter what we do we may want to explicitly track build-project 1-to-many relations for each individual build, to see what projects are pulled in by the build. That way we can define at least whether a user is allowed to see the build results.
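A minimal sketch of such tracking, assuming an in-memory store: the `BuildAccessLog` class and its method names are hypothetical, and the rule that a viewer must be able to read every project the build pulled in is one possible reading of the suggestion above, not a decided policy.

```python
from collections import defaultdict


class BuildAccessLog:
    """Records which projects each build pulled from (a 1-to-many relation)."""

    def __init__(self):
        self._projects_by_build = defaultdict(set)

    def record(self, build_id, project):
        """Note that the given build read from the given project."""
        self._projects_by_build[build_id].add(project)

    def projects_for(self, build_id):
        """Return a copy of the set of projects the build pulled in."""
        return set(self._projects_by_build.get(build_id, set()))

    def may_view_results(self, build_id, readable_projects):
        """A viewer may see results only if they can read every project pulled in."""
        return self.projects_for(build_id) <= set(readable_projects)
```

For example, after `log.record(7, "team/app")` and `log.record(7, "team/lib")`, a viewer who can read only `team/app` would be refused the results of build 7.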
Thanks a lot @ayufan, @markpundsack , @jacobvosmaer-gitlab and everyone else involved for carefully considering all of the security and UX implications of this change. I really appreciate your transparency in this difficult decision; it helps a lot to explain why the user permission solution was chosen.
This was a complex, controversial problem that a lot of people have been mulling over for a long time (I opened https://gitlab.com/gitlab-org/gitlab-ce/issues/18107 over a year ago!) So thank you for not letting the difficulty keep this one on the backlog. GitLab is truly excellent software.
The biggest problem with this change is that it assumes a user who can check out a repository's code, or requires it as part of some other project's build, has access to the whole project. A GitLab project is a lot more than just the code.
I have project/app and project/test. The app's build requires access to test, but not all developers need to know the complete detail and everything about project/test. How do I go about this? With CI_TOKEN I could just use that to clone test when building app. Is it possible any more, other than with horrible SSH hacks?
@omeid that sounds orthogonal to how GitLab permissions are organized: they go from "guest" (see metadata but not the code) to read access and increasing amounts of read/write access to the code. "See just the code" is not on that scale. https://docs.gitlab.com/ce/user/permissions.html
I was excited to see this change in 8.12. I would like to use sonar-scanner in my projects. There exists a plugin for sonar that, once analyzed, will go back to gitlab and annotate commits with details about violations found. The problem I have is that I need to authenticate sonar to gitlab. Currently, the way to do that is a secret variable that contains a key that allows sonar to reach back to gitlab via the API. This key can be set for all projects (in which case, this one special user needs to have developer access to all projects you want sonar to annotate commits with), or it can be per project (same thing though, developer access, secret key, and any developer could then reveal the key with 'env').
So I was hoping the gitlab-ci-token (which now has permissions in the context of the user who launched the build, which is perfect for this) would have access to the API -- enough access to annotate the commits. Unfortunately I see gitlab-ci-token does not support API access, so we are stuck with the much less secure, and much more cumbersome method of maintaining the key as secure variables, or with some "shared" user that gets added to your repository just to annotate commits.
Alternatively, we can store this magic-user's key in sonar itself and use the same user across all projects, but this means giving this one user account developer access to all projects in order for sonar commit annotation to work. While this is slightly more secure in that it doesn't reveal the key to developers via env, it's still not ideal (ideally the user who launches the job would have their own user annotate their own commits with the failures found in sonar). The main reason it's not ideal is because it makes the fork/mr workflow super cumbersome -- you fork, do your work, and if you push your build will fail, because sonar fails, because you forgot to add SpecialMagicUser as a developer to your fork.
I wondered if it would be worth my time to rewrite the sonar-scanner plugin so that it can annotate the commits directly in the runner via git, instead of using the API as the plugin does now. However, I see the gitlab-ci-token does not allow annotating the commits locally with git inside the runner, either.
Does anyone know the best way to resolve this, or if we can +1 minimal API access for gitlab-ci-token (since it runs in the context of the correct user, who already has the permissions I'm seeking)?
Is there any update or estimate on when Triggers will be supported? I have submodules triggering the parent project to run a CI build, however due to the permissions problem it will only succeed when manually triggered at the parent project level. Would like to retire the old CI system and rely entirely on Gitlab.