Discussion: Standardized Authentication and Authorization between internal services and the GitLab Rails monolith
Summary
GitLab has a monolithic architecture, which makes authentication straightforward in most cases: use current_user to access the logged-in user in gitlab and make authorization requests.
When an external service wants to access GitLab resources on behalf of a user, the default recommendation is to configure GitLab as an OAuth identity provider. If the client wants more information about the user making the request, all that is required to "add" OpenID Connect to the standard GitLab OAuth flow is to have the OAuth application request the openid scope. When that scope is requested, the OAuth token response will include an id_token attribute whose value is an OIDC JWT whose claims are configured here.
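To make the id_token part concrete, here is a minimal sketch of how a client could read the claims out of a token response. The response values, the example.com issuer, and the helper names are made up for illustration, and the sketch deliberately skips signature verification, which a real client must perform before trusting any claim.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Encode a dict the way a JWT segment is encoded (helper for the demo)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token response, shaped like an OAuth response that
# was requested with the openid scope. All values are invented.
example_id_token = ".".join([
    _b64url({"alg": "RS256", "typ": "JWT"}),
    _b64url({"sub": "42", "iss": "https://gitlab.example.com"}),
    "signature-not-checked-here",
])
token_response = {
    "access_token": "example-access-token",
    "token_type": "Bearer",
    "scope": "openid",
    "id_token": example_id_token,
}

claims = decode_jwt_payload(token_response["id_token"])
print(claims["sub"], claims["iss"])
```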
When an external service is being built by a GitLab team, OAuth or OAuth + OIDC are also options. But in some cases, the authentication flow does not fit the standard OAuth use-case. I've seen many teams dealing with this issue recently and wonder if we should standardize how authentication is handled in these scenarios to prevent a proliferation of similar-but-different approaches.
There will of course always be edge cases and specific needs for different scenarios, but there is definitely overlap as well.
Current approaches
Distributed Tracing
GitLab Observability Backend (GOB) utilizes an instance-wide, trusted GitLab OAuth Application. Because the OAuth Application is trusted, the end user does not see a window requesting their permission to authorize, and the result is an invisible/seamless OAuth flow.
The benefits of this approach:
- OAuth access tokens are only valid for 2 hours before they must be refreshed, so if they leak the vulnerability window is small.
- OAuth access tokens have observability-specific scopes, so these tokens are compliant with the principle of least privilege.
- Seamless UX; requires no interaction from the end user.
The downsides of this approach:
- The seamless UX is unexpected for an OAuth authorization flow; the OAuth protocol was explicitly designed to provide delegated access without sharing passwords. By bypassing user consent, we are subverting the primary intent of the protocol.
- Access tokens are opaque (they carry no claims about the user): the client (GOB) must make requests to the server (gitlab) in order to determine access to resources. This may mean that many requests are made to gitlab, risking a self-DOS. To mitigate this issue, GOB had to implement caching mechanisms on their side so that every authorization check doesn't need to make a request to gitlab.
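The caching mitigation mentioned above could look roughly like the following. This is not GOB's actual implementation; the class name, the check callback, and the 60 second TTL are assumptions chosen for illustration. The idea is simply to remember each authorization decision for a short time so repeated checks do not hit gitlab.

```python
import time
from typing import Callable, Dict, Tuple


class AuthorizationCache:
    """Caches allow/deny decisions for a fixed time-to-live (hypothetical helper)."""

    def __init__(self, check: Callable[[str, str], bool],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._check = check      # the expensive call to the upstream server
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the demo can control time
        self._entries: Dict[Tuple[str, str], Tuple[float, bool]] = {}

    def allowed(self, token: str, resource: str) -> bool:
        key = (token, resource)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]     # still fresh: no upstream request
        decision = self._check(token, resource)
        self._entries[key] = (now + self._ttl, decision)
        return decision


# Demo with a stand-in for the real authorization request.
upstream_calls = []


def fake_upstream_check(token: str, resource: str) -> bool:
    upstream_calls.append((token, resource))
    return resource.startswith("group/")


fake_now = [0.0]
cache = AuthorizationCache(fake_upstream_check, ttl_seconds=60,
                           clock=lambda: fake_now[0])

first = cache.allowed("tok", "group/project")    # goes upstream
second = cache.allowed("tok", "group/project")   # served from cache
fake_now[0] = 61.0                               # entry has expired
third = cache.allowed("tok", "group/project")    # goes upstream again
print(len(upstream_calls))
```

The trade-off is that a permission change on the gitlab side is not seen until the cached entry expires, so the TTL bounds both the load reduction and the staleness.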
Code Suggestions (Web IDE) for GitLab.com (SaaS) customers
Code suggestions: Get a token and send it to mo... (gitlab-web-ide#141 - closed)
The GitLab web frontend makes an API request to a use-case specific API endpoint, which creates and returns a JWT for the current_user. The JWT is passed in the authorization header of requests to the Code Suggestions Model Gateway.
The benefits of this approach:
- JWTs are stateless, which allows the Code Suggestions Model Gateway to validate tokens without making requests to the GitLab API. (related conversation)
- JWT expiration is 1 hour, which is short enough that if a user's permissions are changed, the JWT will expire in a reasonable amount of time.
- Seamless UX; requires no interaction from the end user.
The downsides of this approach:
- Required implementation of a custom gitlab API endpoint for this single use-case.
- No way to revoke a JWT aside from rotating JWKS (and we do not rotate those regularly)
- No explicit user consent in the flow; explicit consent granted via OAuth could have mitigated the need for separate user and group-level settings for enabling code suggestions.
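The "stateless" property and the revocation downside can both be seen in a small sketch. The mention of JWKS suggests the real tokens are signed with an asymmetric key; to stay within the standard library, this sketch swaps in HS256 with a shared secret, and the function names, secret, and claims are invented for illustration. The point is that the receiving service validates signature and expiry locally, with no request back to GitLab, and that nothing short of changing the key invalidates a token before its exp.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed JWT (stand-in for the issuing endpoint)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url_encode(signature)


def validate_token(token: str, secret: bytes, now: float) -> dict:
    """Check signature and expiry locally; raise ValueError if either fails."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


SECRET = b"demo-secret-not-for-production"   # assumption: shared secret
issued_at = 1_700_000_000                    # fixed timestamp for the demo
token = issue_token({"sub": "user-42", "exp": issued_at + 3600}, SECRET)

# One minute after issuance the token validates without any network call.
claims = validate_token(token, SECRET, now=issued_at + 60)
print(claims["sub"])
```

Until the 1 hour expiry passes, every call to validate_token with the same key succeeds, which is why revocation effectively requires rotating the key material.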
Code suggestions for Self-managed instances
Note: due to the need for self-managed instances to communicate with GitLab.com, this scenario is unusual and therefore pretty different from the other projects listed in this issue. As a result, I am not going to enumerate the pro/con list.
First iteration: authenticate with a GitLab.com Personal Access Token (PAT) from the user account of a self-managed (SM) instance admin.
Second iteration: Customers.gitlab.com issues one JWT per GitLab SM instance. SM users make requests to the AI abstraction layer (which is a GitLab API endpoint) using a Personal Access Token (PAT). SM instances forward these requests to GitLab-hosted AI services with the instance JWT on behalf of SM end-users.
Third iteration: TBD, will involve the proposed AI gateway.
Container registry
https://gitlab.com/gitlab-org/container-registry/-/blob/master/docs-gitlab/auth-request-flow.md
The Docker client makes an authentication request to the GitLab Container Registry. If the request is unauthorized, the Docker client makes a GET /jwt/auth request to gitlab, which returns a JWT whose claims depend on the requested scopes, among other things.
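As a rough illustration of the client side of this flow: in the registry token authentication scheme, the unauthorized response carries a WWW-Authenticate: Bearer challenge naming the token endpoint (realm), the service, and the scope, and the client turns that into the token request. The hostname, service name, and scope below are example values, and the two helper functions are hypothetical, not part of any Docker or GitLab library.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header_value: str) -> dict:
    """Parse a 'Bearer key="value",...' challenge into a dict."""
    scheme, _, params = header_value.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError("not a Bearer challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def build_token_url(challenge: dict) -> str:
    """Build the GET URL the client would call to obtain a JWT."""
    query = {key: challenge[key] for key in ("service", "scope") if key in challenge}
    return challenge["realm"] + "?" + urlencode(query)


# Example challenge as it might appear on a 401 response (values invented).
challenge = parse_bearer_challenge(
    'Bearer realm="https://gitlab.example.com/jwt/auth",'
    'service="container_registry",'
    'scope="repository:group/project:pull"'
)
token_url = build_token_url(challenge)
print(token_url)
```

The client would then send its credentials with the request to that URL and retry the original registry request with the returned JWT as a Bearer token.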
There is a similar auth flow for Dependency Proxy, which is described in detail here.
Remote development
Remote development currently injects Personal Access Tokens (PATs) into a workspace and uses the PAT for authenticating requests to GitLab. PATs are long-lived and have a wider scope than necessary, so the team is looking for alternatives.
OAuth was explored as an option. If OAuth were used in a typical OAuth fashion, the end-user would first need to click a button in GitLab to generate a workspace, then, once it was spun up, visit their empty workspace and initiate an OAuth flow. Once that flow was complete, the generated OAuth access token would be used to clone the repository. This would be a slow, clunky UX.
The next approach considered was using an OAuth access token but generating it from within gitlab without user consent and sending the access token along with the workspace creation request so that the repository could be cloned without further user interaction. While this would work from a technical standpoint, with OAuth it is unusual for the authorization server (gitlab) to both request the credentials and generate the credentials. If there is no user interaction, is it even OAuth?
You could make the same case that using a trusted OAuth client, as Distributed Tracing does, has zero user interaction and so has the same problem. The fact that this use-case called for the authorization server itself to kick off the OAuth token creation felt even more unusual, however.
Another benefit of using JWTs for this use-case is that they can have claims that give further information about the current user.
As of August 15, 2023, the approach has not been finalized, but the team is leaning toward using JWTs rather than OAuth access tokens.