Managing inbound and outbound permissions together for a limited CI_JOB_TOKEN requires coordination across projects. Project owners may not know of each other at all, so they resort to sending an email or opening an issue, if they can, in the project they need to access.
Proposal
Remove capability for users to Limit CI_JOB_TOKEN access scope
Intended users
Feature Usage Metrics
This page may contain information related to upcoming products, features and functionality.
It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
Implementation tasks
Backend
We can remove all references to direction from the Scope and Allowlist classes
We can create a data migration to remove all outbound Ci::JobToken::ProjectScopeLink records from the table.
We can remove the direction column on the Ci::JobToken::ProjectScopeLink related table and adjust the unique index on that table
We will need to ensure all GraphQL documentation no longer references outbound as much as possible without fully removing the direction argument and outbound enum.
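The first two backend steps can be sketched against an in-memory SQLite table. This is only an illustration in Python, not the Rails migration itself: the table name, column names, index names, and enum values below are hypothetical stand-ins, and the final column drop is left out.

```python
import sqlite3

# Hypothetical enum values; the real mapping lives in the Rails model.
OUTBOUND, INBOUND = 0, 1


def remove_outbound_links(conn):
    """Delete outbound links, then replace the unique index with one
    that no longer includes the direction column."""
    cur = conn.execute(
        "DELETE FROM project_scope_links WHERE direction = ?", (OUTBOUND,)
    )
    deleted = cur.rowcount
    conn.execute("DROP INDEX IF EXISTS idx_links_source_target_direction")
    conn.execute(
        "CREATE UNIQUE INDEX idx_links_source_target "
        "ON project_scope_links (source_project_id, target_project_id)"
    )
    conn.commit()
    return deleted


def demo():
    """Build a tiny table, run the cleanup, and return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE project_scope_links ("
        "source_project_id INTEGER, target_project_id INTEGER, direction INTEGER)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX idx_links_source_target_direction "
        "ON project_scope_links (source_project_id, target_project_id, direction)"
    )
    conn.executemany(
        "INSERT INTO project_scope_links VALUES (?, ?, ?)",
        [(1, 2, OUTBOUND), (1, 2, INBOUND), (3, 1, INBOUND)],
    )
    deleted = remove_outbound_links(conn)
    rows = conn.execute(
        "SELECT source_project_id, target_project_id "
        "FROM project_scope_links ORDER BY source_project_id"
    ).fetchall()
    return deleted, rows
```

Deleting the outbound rows before rebuilding the index matters: a source/target pair that exists in both directions would otherwise violate the narrower unique index.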
Frontend
Remove outbound from token_access_app.vue
Delete outbound_token_access.vue
Delete outbound mutations
Delete outbound queries
Delete outbound spec
Update token access app spec
Once inbound is the default direction, update related mutations
Remove inbound naming prefix on mutations/queries (optional)
We will need to switch the default to create :inbound links for the GraphQL mutations
We will need to ensure all GraphQL documentation no longer references outbound as much as possible without fully removing the direction argument and outbound enum.
I'm adding a weight of 2 since it requires a data and schema migration.
Front end
@pburdette, could you point this from a front end perspective.
@jheimbuck_gl, and I discussed earlier that we will not migrate the data from the outbound side to the equivalent inbound side restriction and instead delete all outbound data.
In other words, if project A is in Project B's outbound allowlist then it would be possible to create the same access by migrating project B into project A's inbound allowlist. But we don't need to.
@carolinesimpson I think a feature flag could make sense. It's slightly more work but has the advantage of de-risking the change and making the exact release time more in our control.
Suppose you have a project named A which has the outbound scope feature enabled. After removing the flow, the outbound allowlist created by the customer would no longer restrict what project A's CI_JOB_TOKEN could access.
When this flow is removed, unless a customer takes action to secure projects via the inbound allowlist on the other side, they could find that their job token can now access projects that they had explicitly intended not to allow it to access.
For instance, if project A had project B and C in the outbound allowlist then that allows project A to access only project B and C as long as the user running the job has permission. To create the equivalent restrictions they could enable project B and C's inbound allowlists and add Project A to project B's inbound allowlist and add project A to project C's inbound allowlist to create the same project accessibility via project A's job token.
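The manual translation described above amounts to inverting the outbound mapping. Here is a minimal sketch in Python, assuming the allowlists are available as plain dictionaries; the function name and data shape are hypothetical and not part of GitLab.

```python
from __future__ import annotations

from collections import defaultdict


def outbound_to_inbound(outbound: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert outbound allowlists into equivalent inbound allowlists.

    `outbound` maps a source project to the projects its job token may reach.
    The result maps each target project to the sources it would have to allow.
    """
    inbound: dict[str, set[str]] = defaultdict(set)
    for source, targets in outbound.items():
        for target in targets:
            inbound[target].add(source)
    return dict(inbound)


# Project A's outbound allowlist contains B and C.
equivalent = outbound_to_inbound({"A": {"B", "C"}})
```

Each project that appears as a key in the result would also need its inbound allowlist enabled for the restriction to take effect.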
@jreporter, In our example from above, project A would be able to access anything that the user can access through the CI_JOB_TOKEN and is no longer restricted to accessing just project B and C
thanks @allison.browne - so it sounds like for existing projects this would be no change, but for new projects it would mean the CI_JOB_TOKEN has more permissions?
I've the same question as @jreporter - What would be the impact for new projects? Do they have to be explicitly added to the inbound scope on the other side? Taking the same example, say if the new Project is called D would Project A and B have to add D to their inbound scope or else D wouldn't be able to call Project A and B? Is that the correct assumption?
@iamricecake Can you take some time on a focus friday or something to reassess the current weight for this? It wasn't our team that added the weight, so I just want to make sure we agree with the assessment.
@artem_ptushkin for %17.0 the toggle for Limit access to this project will no longer be available. By default, all projects will limit access and must use the allowlist to grant access.
@mgandres it looks like we've deprecated Mutation.ciJobTokenScopeAddProject.direction (docs) as well as CiJobTokenScopeType.outbound_allowlist (docs)
But it doesn't look like we've deprecated Mutation.ciJobTokenScopeRemoveProject.direction (docs).
The deprecation documentation says that we need to deprecate parts of the schema but keep them available for at least 6 months, and then we can remove them in a major release.
So it looks like we are good to remove the fields that we've deprecated already, but if we want to remove Mutation.ciJobTokenScopeRemoveProject.direction then we'll have to deprecate it first.
@shampton what does this mean for our plans in %17.0 - and is a specific deprecation notice required for Mutation.ciJobTokenScopeRemoveProject.direction?
@shampton if we're removing the ability for customers to specify the direction, then we should remove it. I wonder if it was just missed. I'll wait for @mgandres to confirm.
@shampton @jocelynjane No, the endpoint is still being used for the inbound permissions for limited CI_JOB_TOKEN, which we're still supporting. We should keep it.
However, we can probably deprecate the direction argument. It's used here with direction: INBOUND. Since we're not supporting OUTBOUND anymore, INBOUND would be the default so there's no need to declare it every time.
@mgandres since the current default for that argument is OUTBOUND which is going to be no longer supported, I think there needs to be some deprecation notice here.
@jocelynjane even if we deprecate this right now, we are less than six months from %17.0. WDYT?
Yes, I think we need to deprecate the direction @mgandres @shampton. Since we know that as of %17.0 the direction is no longer supported, we should update the documentation at a minimum so our users know this setting is essentially nil. Then we can follow the deprecation process.
@shampton do I need to create a deprecation notice similar to the previous?
@jocelynjane yes we'll need a deprecation notice. We'll also need an engineer to update the GraphQL documentation and schema to note that it is deprecated.
@shampton @mgandres @iamricecake - I'm targeting the deprecation notice for %16.8. I have the issue to update the docs scheduled for 16.9 right now (as I know 16.8 is a busy milestone and we have a lot of OOOs with the holidays), but if anyone is feeling so inclined to move it into %16.8 as well, I would not object!
The most important use case of CI_JOB_TOKEN is authentication with our container and package registries.
Package registry authentication is often stored in language-specific configuration files (like ~/.pypirc, settings.xml, composer.json, etc.). These will contain the string CI_JOB_TOKEN, enabling detection with advanced search. Detecting the target registry (in terms of project) and cross-referencing its token access list is non-trivial. Some heuristic path extraction may be possible but I wouldn't bet on its accuracy for all registry types.
Container and package registry authentication can be performed in the CI file directly as a script and will usually contain the string CI_JOB_TOKEN. These scripts could of course also be separate files. Path extraction isn't easier than in the case above.
Other use cases of the token, like calling APIs and cloning other repos, follow the same signature.
Container registry authentication can happen automatically using the job token when using the image keyword. CI_JOB_TOKEN will not appear as a string in the CI file. Path extraction is much simpler in this case, potentially enabling cross-referencing with the token access list of the image's repository.
Based on this, we need a combination of advanced search to find all occurrences of CI_JOB_TOKEN, flagging those projects as potentially affected. These need manual migration because we can't reliably resolve which projects they access. In addition, CI files need to be parsed, examining if images in other projects' registries are used. For these projects, the token access list needs to be checked to ensure that the requesting project is on it. This could feasibly be automated.
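A rough sketch of that combined check, in Python, might look like the following. It assumes the CI file contents and the token access lists have already been fetched into dictionaries; the function names, the data shapes, and the regular expression are illustrative assumptions, and the pattern only catches literal `image:` or `name:` entries on a known registry host.

```python
from __future__ import annotations

import re
from typing import Optional


def image_repositories(ci_text: str, registry_host: str) -> set[str]:
    """Collect repository paths of images pulled from the given registry host."""
    pattern = re.compile(
        r"^\s*(?:-\s*)?(?:image|name):\s*[\"']?"
        + re.escape(registry_host)
        + r"/([^\s:@\"']+)",
        re.MULTILINE,
    )
    return set(pattern.findall(ci_text))


def owning_project(repository: str, known_projects: set[str]) -> Optional[str]:
    """Return the longest known project path that prefixes the repository path."""
    candidates = [
        p for p in known_projects
        if repository == p or repository.startswith(p + "/")
    ]
    return max(candidates, key=len, default=None)


def audit(ci_files: dict[str, str], allowlists: dict[str, set[str]], registry_host: str):
    """Split projects into 'needs manual review' and 'missing allowlist entry'.

    ci_files:   project path -> CI file text
    allowlists: project path -> projects on its token access list
    """
    manual_review: set[str] = set()
    missing: set[tuple[str, str]] = set()
    for project, text in ci_files.items():
        # A literal CI_JOB_TOKEN cannot be resolved to a target reliably.
        if "CI_JOB_TOKEN" in text:
            manual_review.add(project)
        # Image references can be cross-referenced automatically.
        for repo in image_repositories(text, registry_host):
            target = owning_project(repo, set(allowlists))
            if target and target != project and project not in allowlists[target]:
                missing.add((project, target))
    return manual_review, missing
```

Images referenced through variables or `include`d files would slip past this pattern, so the result is a lower bound rather than a complete list.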
An alternative approach would be listing projects that have the token list currently deactivated, so the access to them would be restricted once this rolls out. We would then search for the project paths in advanced search to find our potentially affected projects. The issue here is similar to above: Paths can be somewhat obscured by the use of variables or project IDs in the case of API access. I tend towards this approach, as it may scale better.
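The alternative can be sketched as a plain substring search, again in Python. It assumes the list of projects with a deactivated token list and the searchable file contents are already in memory; in practice advanced search would supply the matches, and all names here are hypothetical.

```python
from __future__ import annotations


def potentially_affected(
    unrestricted_targets: set[str],
    project_files: dict[str, list[str]],
) -> dict[str, set[str]]:
    """Map each project to the unrestricted target paths its files mention.

    unrestricted_targets: paths of projects whose token list is deactivated
    project_files:        project path -> contents of its searchable files
    """
    hits: dict[str, set[str]] = {}
    for project, texts in project_files.items():
        found = {
            target
            for target in unrestricted_targets
            if target != project and any(target in text for text in texts)
        }
        if found:
            hits[project] = found
    return hits
```

Plain substring matching over-reports when one path is a prefix of another (for example `group/lib` inside `group/library`), and it under-reports references made through variables or numeric project IDs, which is the limitation noted above.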
Hello we encountered an interesting issue with a US Government customer today.
The customer utilizes a single project to manage their container registry images and authenticate with a project access token. A DOCKER_AUTH_CONFIG instance-level environment variable is created to authenticate with the project's container registry. When jobs are initiated, we observed authentication failures when attempting to pull images from said project.
After some troubleshooting, we discovered that in the project's CI/CD settings, Token Access > Limit access from this project was still enabled, even though the setting is deprecated. After disabling this setting, authentication to the project's container registry succeeded.
I came across this issue from our documentation -- does this fall under the scope of this discussion?
@dcoy What version were they on, I'm guessing something above 16.3 and they were still able to disable the setting via the UI, right? I'm guessing these authentication failures were changed behavior after an upgrade? Or was this just something they set up for the first time and couldn't get working as they thought it should?
I don't recall if Limit access from this project and Limit access to this project were mutually exclusive… but I don't think so. That's the only way I could really make sense of this, though. Seems very odd that the outbound limit would impact inbound auth requests to the container registry.
@dcoy @JamesRLopes what version did the customer upgrade from? I know we have had some other CI job token changes and I wonder if those changes are causing this problem.
To prepare for this change, users on GitLab.com or self-managed GitLab 15.9 or later can enable the Allow access setting now and add the other projects. It will not be possible to disable the setting in 18.0 or later.
Can this decision be rethought please?
We have a lot of repositories in a shared top-level group (e.g. CI components, packages, container images, ...). How are we supposed to manage adding every new project (dozens a day) to the allowlists of all these (also dozens) shared repositories?
There is nothing security critical in those repositories and we actively want them to be accessible from all other groups and projects on our self-hosted EE instance.
You could add all your top-level groups to the allowlist of your shared repositories (since #415519 (closed)), that way any newly created projects in those top-level groups should automatically have access. This is also possible via API (since #435903 (closed)), so it could be automated.
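Automating this could look roughly like the sketch below, which only builds the requests and does not send them. The endpoint path (`/projects/:id/job_token_scope/groups_allowlist`) and the `target_group_id` parameter are written from memory of the GitLab REST API and should be treated as assumptions to verify against the API documentation for your version; the function name, IDs, and host are placeholders.

```python
from __future__ import annotations

from urllib.parse import urlencode
from urllib.request import Request


def allowlist_group_requests(
    base_url: str,
    token: str,
    shared_project_ids: list[int],
    group_ids: list[int],
) -> list[Request]:
    """Build one POST request per (shared project, top-level group) pair."""
    requests = []
    for project_id in shared_project_ids:
        # Assumed endpoint; check the job token scope API docs before use.
        url = f"{base_url}/api/v4/projects/{project_id}/job_token_scope/groups_allowlist"
        for group_id in group_ids:
            requests.append(
                Request(
                    url,
                    data=urlencode({"target_group_id": group_id}).encode(),
                    headers={"PRIVATE-TOKEN": token},
                    method="POST",
                )
            )
    return requests
```

Each returned request could then be passed to `urllib.request.urlopen`. Because access is granted per top-level group, projects created later inside those groups would not need their own entries.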
In self-managed there's also going to be an instance-level setting to disable the enforcement: #496647 (closed)
Our plans on this have gone through a few iterations – @jocelynjane has already prepared an update for the deprecation notice you quoted here, but it has not been merged yet. Hopefully this should address your concerns, @lovetheguitar!