Authentication: iteration 2
This is a follow-up to https://gitlab.com/groups/gitlab-org/-/epics/10604, in which we already explored several means of authentication for the code suggestions feature on self-managed (SM) instances. Authentication requires two distinct flows, which we call `Leg 1` and `Leg 2`:
1. **Leg 1: SM end-users auth'ing with their SM instance.** This involves their client of choice (e.g. VS Code) logging in to the SM instance and obtaining a token to make further requests.
2. **Leg 2: SM admins registering with GitLab Inc.** Since GitLab Inc runs the code suggest service that is used by self-managed end-users, requests to it require a token that is issued by us instead of the SM instance. This requires a separate connection or registration step between `SM instance <=> GitLab-hosted service`.
The goal of this epic is to document the status quo, point out its drawbacks and limitations, and describe how to evolve it from here.
## Status quo (iteration 1)
For iteration 1, we chose the path of least resistance to get something to market faster. It involves two kinds of PATs, mapping to the two paths/legs outlined above:
1. One "admin PAT" created by the SM admin on SaaS, for use in leg 2. It is then stored in the SM instance's application settings and used to proxy requests to SaaS.
2. `N` user PATs created by SM users on the SM instance, for use in leg 1. They are used to have their clients talk to the SM instance, which then re-maps these requests to SaaS using the admin PAT from step 1.
The actual flow involved re-using an existing endpoint on SaaS (gitlab-rails), `/api/v4/code_suggestions/tokens`, to obtain a JWT for use in code suggestion clients on behalf of a SM end-user. The way this is implemented is by having the client send the request for this token to the SM instance (auth'ed by their own PAT), which then forwards it to SaaS (auth'ed by the admin PAT). SaaS issues the JWT, and the SM instance returns it to the SM end-user's client. From there on, the paths of SaaS and SM end-users converge and are basically indistinguishable, since all the model gateway sees is a SaaS JWT.
<details>
<summary>Flow chart</summary>
```mermaid
sequenceDiagram
autonumber
participant A as SM admin
participant U as SM user
participant VS as VS Code
participant SM as SM GitLab
participant GL as SaaS GitLab
participant CS as Code Suggest
Note over A,GL: Admin persona
A->>GL: Create Personal Access Token (PAT)
GL-->>SM: PAT
SM->>SM: store PAT in application settings
Note over U,CS: Developer persona
U->>SM: Create PAT
SM-->>U: PAT
U->>VS: Configure with PAT
VS->>VS: store PAT
loop Use code suggestions
alt JWT token missing or invalid
VS->>SM: Authenticate user with PAT
SM->>GL: Get JWT with admin PAT
GL-->>SM: JWT
SM-->>VS: JWT
else
VS->>CS: get code suggestions with JWT
CS-->>VS: code suggestions
end
end
```
</details>
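The two-PAT flow can be sketched as follows. This is a minimal in-memory model, not the gitlab-rails implementation: the class names `SaaS` and `SelfManagedInstance`, the token strings, and the opaque `saas-jwt-for-...` value are all hypothetical stand-ins.

```python
class SaaS:
    """Hypothetical stand-in for GitLab SaaS, which issues code suggestion JWTs."""

    def __init__(self, valid_admin_pats):
        self.valid_admin_pats = set(valid_admin_pats)

    def issue_jwt(self, pat):
        # Models /api/v4/code_suggestions/tokens; the JWT is an opaque string here.
        if pat not in self.valid_admin_pats:
            raise PermissionError("unknown SaaS PAT")
        return f"saas-jwt-for-{pat}"


class SelfManagedInstance:
    """Hypothetical stand-in for the SM instance that proxies token requests."""

    def __init__(self, saas, admin_pat, user_pats):
        self.saas = saas
        self.admin_pat = admin_pat        # leg 2: stored in application settings
        self.user_pats = set(user_pats)   # leg 1: created by SM users

    def request_jwt(self, user_pat):
        # Leg 1: authenticate the end-user against the SM instance.
        if user_pat not in self.user_pats:
            raise PermissionError("unknown SM user PAT")
        # Leg 2: forward to SaaS, authenticated by the admin PAT.
        return self.saas.issue_jwt(self.admin_pat)


saas = SaaS(valid_admin_pats=["admin-pat"])
sm = SelfManagedInstance(saas, admin_pat="admin-pat", user_pats=["alice-pat", "bob-pat"])
jwt_alice = sm.request_jwt("alice-pat")
jwt_bob = sm.request_jwt("bob-pat")
```

In this model every SM user ends up with a JWT issued against the same admin PAT, so SaaS cannot tell the users of one instance apart.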
This approach, while it was the fastest route to success, has some drawbacks and also violates several points of our [desired architecture for AI services](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/161 "Unified AI Gateway Architecture"):
- **SM instances coupled to SaaS for auth.** This is an architectural anti-pattern since GitLab SaaS must not be considered "special" and be relied on by self-managed instances. It also incurs scaling risk since SaaS must absorb SM instance traffic. Since we already know based on their subscription whether they are eligible to use an add-on, this could likely be handled exclusively by CustomersDot instead of SaaS.
- **Instance == SaaS account.** Since a PAT is used for SM to SaaS auth, we currently equate "customer instance" with a particular SaaS user created by the SM admin. This is undesirable for a variety of reasons (poor customer UX, telemetry tied to user not customer, less control over access tokens once issued). While some of these could be addressed using `Service Accounts` instead, that would not fix the previous issue, i.e. it would not remove the dependency on SaaS.
## Proposed alternatives
### Option 1: Single instance token
**UPDATE: we ended up choosing this approach.**
This option aims to be an iterative solution toward the new [architectural proposal](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/161 "Unified AI Gateway Architecture") for how to deliver AI services. It is only an incremental step because it would not yet talk to the proposed AI gateway for auth, but to CustomersDot. This concession is made because the proposed new service does not yet exist, and we want to maintain velocity. If the APIs are developed carefully, we should eventually be able to push this logic behind the AI gateway instead.
Key ideas:
- CDot is in charge of **issuing and signing** tokens
- CDot issues **one JWT per instance**, regularly refreshed based on subscription status
- SM users make requests to the SM AI abstraction layer using **PATs**
- SM instances forward these requests to GitLab-hosted AI services with the **instance JWT** on behalf of SM end-users
```mermaid
sequenceDiagram
autonumber
participant U as SM user
participant VS as VS Code
participant SM as SM GitLab
participant CD as CustomersDot
participant AI as AI gateway<br/>(for now CS model gateway)
Note over SM,CD: instance-to-instance auth
loop cron job
SM->>CD: sync license data
Note over CD,SM: Token validity tied to subscription
CD-->>SM: seat data + JWT instance access token (IJWT)
SM->>SM: store seat data + IJWT
end
Note over U,AI: Developer persona
U->>SM: create PAT
SM-->>U: PAT
U->>VS: configure with PAT
loop Use code suggestions
Note over VS,AI: All requests via AI abstraction layer
VS->>SM: get code suggestions with PAT
SM->>SM: auth user with PAT
SM->>SM: verify user assigned to seat
SM->>AI: fetch code suggestions with IJWT
AI->>AI: validate IJWT
AI-->>SM: code suggestions
SM-->>VS: code suggestions
end
```
* **1-3:** The sync would for now re-use the existing [`SyncSeatLinkWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/e4f942238223a5f4485553157150b32017ca8716/ee/app/workers/sync_seat_link_worker.rb). This worker runs periodically to exchange an instance's license data for subscription data. This is already a form of instance auth since it requires the self-managed instance to post license data, including the license key, which acts as a form of "token". In the response, we could then add the code suggestions add-on seats they purchased through this subscription, plus a JWT that the instance stores locally. The JWT's TTL must somehow be tied to both the sync frequency and the subscription lifetime.
* **4-6:** As before.
* **7-13:** Code suggestions are now fetched from the SM instance via the user's PAT. The instance first auths the user as before, ensures that the user is assigned to a seat, then makes a request to the AI gateway (for now: the CS model gateway, until we have that new service). Note that this goes exclusively through the [AI abstraction layer](https://docs.gitlab.com/ee/development/ai_features.html#abstraction-layer), i.e. the client never sees the service(s) underneath; all they use are gitlab-rails GraphQL queries and REST API calls. The response is then routed back all the way to the client.
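
The request handling in the steps above can be sketched roughly as follows. This is an illustrative in-memory model only: `SMInstance`, `FakeAIGateway`, `Unauthorized`, and all token values are hypothetical names, and real token validation would verify a JWT signature rather than compare strings.

```python
from dataclasses import dataclass, field


class Unauthorized(Exception):
    """Raised when the PAT is unknown, the user has no seat, or the IJWT is invalid."""


@dataclass
class FakeAIGateway:
    """Hypothetical stand-in for the AI gateway (for now the CS model gateway)."""
    accepted_ijwt: str

    def fetch_suggestions(self, ijwt, prompt):
        # Step 11: the gateway validates the instance token, not the user.
        if ijwt != self.accepted_ijwt:
            raise Unauthorized("invalid IJWT")
        return f"suggestion for {prompt!r}"


@dataclass
class SMInstance:
    """Hypothetical stand-in for the SM instance's AI abstraction layer."""
    gateway: FakeAIGateway
    ijwt: str                                        # stored by the sync (step 3)
    pat_to_user: dict = field(default_factory=dict)
    seated_users: set = field(default_factory=set)   # seat data from the sync

    def code_suggestions(self, pat, prompt):
        user = self.pat_to_user.get(pat)             # step 8: auth user with PAT
        if user is None:
            raise Unauthorized("unknown PAT")
        if user not in self.seated_users:            # step 9: seat check
            raise Unauthorized(f"{user} has no seat")
        # Step 10: forward with the instance token; the PAT never leaves the instance.
        return self.gateway.fetch_suggestions(self.ijwt, prompt)


gateway = FakeAIGateway(accepted_ijwt="ijwt-123")
instance = SMInstance(
    gateway=gateway,
    ijwt="ijwt-123",
    pat_to_user={"pat-alice": "alice", "pat-bob": "bob"},
    seated_users={"alice"},
)
result = instance.code_suggestions("pat-alice", "def add(")
```

Here `pat-bob` authenticates successfully but is rejected at the seat check, so the gateway is never contacted on that user's behalf.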
#### Benefits/drawbacks
* Good: We can reuse the seat link sync to auth an instance
* Bad: We still never really "see" the SM end-user since we are still making requests using the instance token. **TODO: Is this even necessary or desired?**
* Bad (maybe): Tying the instance token (IJWT) to subscription validity means we may have to sync more frequently than we do currently, and the worker has known performance issues
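
One way to reason about the TTL constraint, sketched under assumptions: the token expires at whichever comes first, the subscription end or a small multiple of the sync interval after issuance. The function name `ijwt_expiry`, the `grace_syncs` factor, and the example dates are hypothetical and not taken from CustomersDot.

```python
from datetime import datetime, timedelta, timezone


def ijwt_expiry(issued_at, subscription_end, sync_interval, grace_syncs=2):
    """Return a candidate expiry for the instance token.

    The token should outlive at least one missed sync (hence grace_syncs),
    but never outlive the subscription itself.
    """
    if issued_at >= subscription_end:
        raise ValueError("subscription already expired; no token should be issued")
    return min(issued_at + grace_syncs * sync_interval, subscription_end)


now = datetime(2023, 7, 1, tzinfo=timezone.utc)
daily = timedelta(days=1)

# Subscription far in the future: the sync frequency bounds the TTL.
mid_term = ijwt_expiry(now, datetime(2024, 1, 1, tzinfo=timezone.utc), daily)

# Subscription ends in 12 hours: the subscription bounds the TTL.
near_end = ijwt_expiry(now, now + timedelta(hours=12), daily)
```

Under this model, a shorter sync interval narrows the window in which a lapsed subscription still holds a valid token, at the cost of running the worker more often.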
### Current Status of Option 1 implementation
| Steps | Status | Team | Issue | Notes |
|-------|--------|------|-------|-------|
| 1,3 | Complete | App Perf | https://gitlab.com/gitlab-org/gitlab/-/issues/416996+, https://gitlab.com/gitlab-org/gitlab/-/issues/416468+ | |
| 2 | Complete | App Perf | https://gitlab.com/gitlab-org/customers-gitlab-com/-/issues/6860+ | |
| 4-6 | Already Exists | | | |
| 7 | Started | Create: IDE | https://gitlab.com/gitlab-org/gitlab/-/issues/416738 | VS Code will require an update to support this new flow, but should also remain backwards compatible with the flow released with 16.1 |
| 8,10,13 | Complete | App Perf | https://gitlab.com/gitlab-org/gitlab/-/issues/417152 | |
| 9 | Started | Fulfillment | https://gitlab.com/gitlab-org/fulfillment-meta/-/issues/1410 | |
| 11 | Testing ongoing | App Perf | https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/182+ | Final end-to-end testing is required once the IDE extensions are ready |
### Option 2: Per end-user token
**UPDATE: we ended up discarding this.**
The main differentiator here is that there is no instance token. Instead, we lazily request end-user JWTs that must work when proxying requests via the AI abstraction layer. I am not sure if this approach is feasible since both the SM instance and GitLab Inc services would be required to validate these tokens. It could perhaps be realized by regularly fetching a signing key from CDot, which is similar to how the CS model gateway currently performs auth.
Key ideas:
- CDot is in charge of **issuing and signing** tokens
- CDot issues **one JWT per user**, refreshed lazily when end-users make requests and the token has expired
- SM instance and GitLab-hosted services **verify** these tokens
- SM users make requests to the SM AI abstraction layer using their **JWTs**
- SM instances forward these requests to GitLab-hosted AI services with **the same JWT** on behalf of SM end-users
```mermaid
sequenceDiagram
autonumber
participant U as SM user
participant VS as VS Code
participant SM as SM GitLab
participant CD as CustomersDot
participant AI as AI gateway<br/>(for now CS model gateway)
loop Sync signing key
SM->>CD: request JWKS (public signing key)
CD-->>SM: JWKS
SM->>SM: store JWKS
end
Note over U,AI: Developer persona
U->>SM: create PAT
SM-->>U: PAT
U->>VS: configure with PAT
loop Use code suggestions
Note over VS,AI: All requests via AI abstraction layer
alt No user JWT (UJWT) stored
VS->>SM: request user UJWT with PAT
SM->>SM: auth user with PAT
SM->>SM: verify user assigned to seat
SM->>CD: exchange license key for UJWT
CD->>CD: sign UJWT with JWKS private key
CD-->>SM: UJWT
SM-->>VS: UJWT
VS->>VS: store UJWT
else
VS->>SM: get code suggestions with UJWT
SM->>SM: validate UJWT against JWKS
alt Token expired
SM-->>VS: Respond unauthorized
Note over SM,VS: Repeat steps 7-14
Note over SM,VS: GOTO 15
else
SM->>SM: verify user assigned to seat
SM->>AI: fetch code suggestions with UJWT
AI->>AI: validate UJWT against JWKS
AI-->>SM: code suggestions
SM-->>VS: code suggestions
end
end
end
```
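
The client-side branches of this diagram (no UJWT stored, token expired) can be sketched as below. This is a simplified model under stated assumptions: `FakeInstance` and `Client` are hypothetical, token expiry is modelled as a use counter instead of wall-clock time, and signing and JWKS validation are elided entirely.

```python
class Unauthorized(Exception):
    pass


class FakeInstance:
    """Hypothetical SM instance that issues short-lived UJWTs and checks them."""

    def __init__(self, valid_pats, ttl=2):
        self.valid_pats = set(valid_pats)
        self.ttl = ttl          # number of requests a token stays valid (stand-in for time)
        self.issued = 0
        self.remaining = {}     # token -> remaining uses

    def request_ujwt(self, pat):
        # Steps 7-13: auth with PAT, then obtain a signed token (signing elided).
        if pat not in self.valid_pats:
            raise Unauthorized("unknown PAT")
        self.issued += 1
        token = f"ujwt-{self.issued}"
        self.remaining[token] = self.ttl
        return token

    def code_suggestions(self, ujwt, prompt):
        # Steps 15-16: validate the token; an exhausted token counts as expired.
        if self.remaining.get(ujwt, 0) <= 0:
            raise Unauthorized("token expired")
        self.remaining[ujwt] -= 1
        return f"suggestion for {prompt!r}"


class Client:
    """Hypothetical IDE client that stores the UJWT and refreshes it lazily."""

    def __init__(self, instance, pat):
        self.instance = instance
        self.pat = pat
        self.ujwt = None

    def suggest(self, prompt):
        if self.ujwt is None:                                   # alt: no UJWT stored
            self.ujwt = self.instance.request_ujwt(self.pat)
        try:
            return self.instance.code_suggestions(self.ujwt, prompt)
        except Unauthorized:                                    # alt: token expired
            self.ujwt = self.instance.request_ujwt(self.pat)    # repeat steps 7-14
            return self.instance.code_suggestions(self.ujwt, prompt)  # GOTO 15


instance = FakeInstance(valid_pats=["pat-alice"], ttl=2)
client = Client(instance, "pat-alice")
results = [client.suggest(f"line {i}") for i in range(5)]
```

With a token that survives two requests, five suggestions require three token issuances, which gives a feel for how the token lifetime drives traffic to the issuer.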
#### Benefits/drawbacks
* Good: We don't need a sync worker or even an instance token
* Good: We can issue tokens per end-user instead of per instance, which provides us with more fine-grained control
* Bad: More complicated auth flow. The UJWT would have to be accepted by both the SM instance and our own service infra, since the AI layer would use it to forward requests. This requires a regular public key exchange
* Bad (potentially): If CDot is the `Issuer` for end-user tokens, the kinds of claims we can embed are severely limited considering that CDot knows nothing about SM end-users.
## Questions/comments
- Note that, despite the architecture proposed here, we will still require SaaS user accounts for every SM admin due to https://gitlab.com/groups/gitlab-org/-/epics/8905.