It would be great if we were able to prefix all tokens for self-managed instances to distinguish them from GitLab SaaS and other self-hosted instances.
This issue was automatically tagged with the label ~"group::authentication and authorization" by TanukiStan, a machine learning classification model, with a probability of 0.22.
If this label is incorrect, please tag this issue with the correct group label as well as automation:ml wrong to help TanukiStan learn from its mistakes.
Thanks for the issue @nejc! I would actually advocate for the opposite: removing the option to configure the Personal Access Token prefix. Scanning tools look for glpat- and anyone who changes that is potentially more exposed.
Is it something important for Siemens to be able to use custom prefixes?
Yes, this is very important, as we use a custom prefix to detect leaked credentials that belong to our GitLab instance, and I'm pretty sure many security-aware customers have set a specific prefix to do the same.
@dcouture to give a bit of context, if we are able to determine the token is from our instance, we can more easily proactively auto-revoke it if the user does not respond to security alerts instead of attempting this on any glpat- token that may come from any instance in the wild.
We have a central scanning service that extends gitleaks config, and we provide our pattern when entering secrets scanning programs.
Currently I'm not sure if the new tokens are being picked up by security tools as they're a bit more specific; at least for gitleaks I've had to add them myself. I see the point in general, though, and we understand that there is a trade-off between having tokens discoverable and being able to act on them.
The prefix feature for PATs also predates the default glpat- prefix added later, by around a year (13.7 -> 14.5), so IMO it would be awkward to remove it.
Oh that's unfortunate about Gitleaks. I know it was added to our secret detection tools (gitlab-org/security-products/analyzers/secrets!174 (merged)) but the process to bring it back to the upstream gitleaks project should be improved. Thanks for the context!
@dcouture we've discussed this a bit in our team and thought of another approach that might work better for both SaaS and self-hosted, i.e. discoverable and easy to act on for everyone.
How about, instead of configurable prefixes for every type of token, having a single instance-level token prefix prepended? I know, a prefix for a prefix. This would avoid having a bunch of configuration options, and security tools would detect them all, but additionally tokens from specific instances could still be identified.
This would then be applied to all tokens that can be prefixed.
For example, adding SIE- as an instance prefix, you'd get this for all kinds of tokens (this is just a proposal, some of them don't exist atm) -> See table in description #388379
..and so on. This would add more padding to token length, but I see this already happened with the pipeline trigger token and seems like it's not an issue. WDYT?
It would still cause false negatives in tools that do automatic revocation / validation of tokens but I think it's better yes! And if it's better for your internal tooling I'm not opposed to it from the security point of view. I'll let the product folks chime in with their opinion!
@dcouture The idea would be that GitLab documents that anything ending in the expected format decided by GitLab e.g. glpat-XXXX would be detected as a GitLab token. A simple regex could ignore the prefix (which would have a certain max-size and allowed chars), does that cover your concern?
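A minimal sketch of the "simple regex" idea above, in Python. The concrete limits are assumptions made for illustration and are not taken from this thread: the instance prefix is assumed to be 1 to 20 alphanumeric characters followed by a hyphen, and the token body is assumed to be at least 20 characters from `[A-Za-z0-9_-]`. The function name `find_pats` is hypothetical.

```python
import re

# Assumed format (not confirmed anywhere in this thread):
#   optional instance prefix: 1-20 alphanumeric chars plus a hyphen
#   fixed GitLab part:        "glpat-"
#   token body:               20 or more chars from [A-Za-z0-9_-]
GITLAB_PAT = re.compile(
    r"(?<![A-Za-z0-9_-])"                     # do not start in the middle of a word
    r"(?:(?P<instance>[A-Za-z0-9]{1,20})-)?"  # optional instance prefix, e.g. "SIE-"
    r"glpat-(?P<body>[A-Za-z0-9_-]{20,})"     # the part GitLab would document
)


def find_pats(text):
    """Return (instance_prefix_or_None, full_token) for every match in text."""
    return [(m.group("instance"), m.group(0)) for m in GITLAB_PAT.finditer(text)]


sample = "token=SIE-glpat-abcdefghij0123456789 other glpat-ABCDEFGHIJ0123456789"
for instance, token in find_pats(sample):
    print(instance, token)
```

With a pattern of this shape a scanner still detects every token ending in the GitLab-defined format, while the captured `instance` group tells an operator whether the token belongs to their own instance.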
@dlouzan It would be detected, but some tools attempt to automatically validate the token by doing an API request (TruffleHog does that, and I wonder if our automatic revocation feature supports prefixes yet), and they'd disregard those tokens if they don't handle the unexpected prefix.
@nejc Great idea, that also reduces the config options! Could you maybe extend the table above with what exists and what doesn't? I was also thinking about keeping the PAT as is, because it has already been out in the wild for a while, e.g., instead of SIE-glpat- just SIE-.
I agree in this case it seems like the instance prefix might even be preferable for non-SaaS, (at least for auto revocation), so it doesn't attempt to auto-revoke all kinds of glpats, but only those on the current instance (since it presumably will be too hard to also detect a base URL for the self-revoke API from just the token leak itself).
@dcouture the question is if we agree on this approach with the GitLab product staff, whether we should add the token-type prefixes first for all tokens, or go ahead with the instance prefix already regardless of the existing prefixes?
@hsutor from our perspective (self-hosted) we would first focus on contributing an instance prefix for all tokens (regardless of whether they have a specific hardcoded token type prefix already), and then later add hardcoded prefixes for specific types still missing (Prefix all authentication tokens for easier det... (&8923)). WDYT?
FYI @connorgilbert @amarpatel: while I don't think the proposal negatively affects the auto revocation on SaaS, it's worth noting the prefix change if we offer the service on self-managed in the future.
@hsutor the idea is to add this to all token types. So the relevant column is the 3rd one, i.e. "Self-hosted prefix for all tokens".
I realize this is in addition to adding hardcoded prefixes for each token type which is a separate effort still in progress (hence the proposed table of how this would look with both prefixes if an instance prefix is added). I've edited out some of the old issue description to avoid confusion hopefully.
@adil.farrukh yes this is actually also part of our motivation, to make self-managed auto-revocation easier (whether this is native in GitLab or as an integration with another service via the API such as GitLab & GitHub secrets scanning partnerships when tokens leak elsewhere).
@hsutor @adil.farrukh I wanted to let you know that I've started working on this feature. That MR starts with support for feed tokens, as they are easy to test and to show the concept. I'll add support for other token types in follow-up MRs, once this approach has been approved.
As discussed, this MR proposes to add an instance-wide prefix that is set to gl by default. The new prefix format is: #{instance_prefix}#{token_type_prefix}. By default, instance_prefix is the first part of the current prefix (gl). We can now customize the instance prefix to create a new prefix, while keeping the rest of the prefix unchanged. For a feed token with the instance_prefix of my-company-name, we'd get: my-company-name-ft-.
@nwittstruck Thank you and sounds good! I went back looking for our discussion on persisting history around the prefixes for revocation, and based on Admin Token API does not work with historic cus... (#504989) seems like we'd hold off on that until greater demand. So we are good to go ahead with this work, and get the team's reviews.
I've added (hopefully) everyone responsible for the respective token types to the implementation status, as I've started working on this and the last ping was 2 years ago.
I think the table should be mostly accurate. If I've pinged someone by mistake, please let me know, I'll fix the table.
There is no action necessary from the people I've tagged right now, this is just for awareness.
You can find more about the current state of the implementation in this thread or in the documentation.
@nwittstruck FWIW, group::pipeline security only owns the CI job token through the upcoming allowlist deprecation in 18.0. Anything forward-looking beyond the deprecation (i.e. the future of the CI job token) will be owned by group::authentication.
To set expectations, GitLab product managers or team members can't make any promise if they will proceed with this. However, we believe everyone can contribute, and welcome you to work on this proposed change, feature or bug fix. There is a bias for action, so you don't need to wait. Try and spin up that merge request yourself.
Nejc Habjan changed title from Make hardcoded token prefixes configurable to Add single instance token prefix (was: Make hardcoded token prefixes configurable)
Nejc Habjan changed the description
I wanted to clarify that automatic revocation of leaked PATs works on both GitLab.com and self-managed, but today requires the glpat- prefix. The automatic revocation of PATs works differently from external secrets like AWS; for PATs it's handled directly within the GitLab application, so it's not limited to GitLab.com.
We're interested in injecting the instance-specific prefixes (see comment above, #388379 (comment 1322979107), referring to #345606) but don't currently have an ETA for that based on our other work in flight.
It would be a great addition to these efforts to be able to differentiate between Personal Access Tokens, Group Access Tokens and Project Access Tokens.
My observation is that junior CI developers are putting their own user's personal access tokens into CI variables. It would therefore be helpful to have distinct patterns, e.g. glgat- and glpat-, that could be used to check for personal access tokens in CI variables.
FYI: I've updated the title and description to propose having a prefix available for all tokens that is entirely distinct from the default prefix (glpat- etc).
This is based on our finding that simply adding another prefix instead of changing it completely causes too many duplicates and false positives in security scanners like gitleaks and GitLab secret detection etc. (e.g. SIE-glpat-123 and glpat-123 both match, which wouldn't happen with something more distinct like SIE-pat-123).
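The duplicate-match problem described above can be reproduced with a small Python sketch. The rules below are deliberately simplified stand-ins, not the actual gitleaks or GitLab secret detection patterns, and the helper name `matching_rules` is hypothetical.

```python
import re

# Simplified stand-ins for scanner rules (assumption: real rules are stricter).
DEFAULT_RULE = re.compile(r"glpat-[0-9A-Za-z_-]+")
NESTED_INSTANCE_RULE = re.compile(r"SIE-glpat-[0-9A-Za-z_-]+")   # prefix added in front
DISTINCT_INSTANCE_RULE = re.compile(r"SIE-pat-[0-9A-Za-z_-]+")   # prefix fully replaced


def matching_rules(token, rules):
    """Return the names of all rules that fire on the given token."""
    return [name for name, pattern in rules.items() if pattern.search(token)]


nested = {"default": DEFAULT_RULE, "instance": NESTED_INSTANCE_RULE}
distinct = {"default": DEFAULT_RULE, "instance": DISTINCT_INSTANCE_RULE}

# Prepending the instance prefix: both rules fire on the same token.
print(matching_rules("SIE-glpat-123", nested))    # ['default', 'instance']

# A fully distinct prefix: only the instance rule fires.
print(matching_rules("SIE-pat-123", distinct))    # ['instance']
```

Because `glpat-` is a substring of `SIE-glpat-`, any rule written for the default prefix also fires on the instance-prefixed token, which is where the duplicate findings come from.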