Create an endpoint for clients to auth and get a user JWT
If a client should be able to connect directly to AI Gateway, it first needs a JWT, which it will use to authenticate against AI Gateway.
Create a REST API endpoint that clients can use to obtain connection info for sending code completion requests. This connection info should include:
- a short-term JWT which can be used to authenticate to AI Gateway
- eventually additional headers (X-Gitlab-Instance-Id, X-Gitlab-Realm, X-Gitlab-Global-User-Id, X-Gitlab-Host-Name)
- also expires_at (in seconds since the epoch), which would be useful for the client (#452044 (comment 1841150484))
- also the AI Gateway URL
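As an illustration of why expires_at is useful to the client, here is a minimal sketch of an expiry check. It assumes expires_at arrives as a string of seconds since the epoch, as in the example payload in this issue; the function name isExpired and the decision to treat an unparseable value as expired are assumptions made for the sketch.

```typescript
// Hypothetical helper: decides whether the connection info must be refreshed.
// `expiresAt` is seconds since the UNIX epoch, serialized as a string.
// `nowMs` is injectable so the check can be tested deterministically.
function isExpired(expiresAt: string, nowMs: number = Date.now()): boolean {
  const seconds = Number(expiresAt);
  // Assumption: a value that cannot be parsed is treated as already expired.
  if (isNaN(seconds)) {
    return true;
  }
  return seconds * 1000 <= nowMs;
}
```

A client could call this before each request and fetch fresh connection info when it returns true.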
The client uses a PAT (personal access token) to access this endpoint. Internally, Rails sends a request to Cloud Connector / AI Gateway to issue the short-term token and passes it (plus any additional details mentioned above) to the client.
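The client side of this call could be sketched as follows. The endpoint path and the use of POST are hypothetical, since this issue does not name either; the PRIVATE-TOKEN header is the standard way to pass a PAT to the GitLab REST API. The function only builds the request description and performs no network I/O.

```typescript
interface ConnectionInfoRequest {
  url: string;
  method: "POST"; // assumption: the HTTP method is not specified in this issue
  headers: Record<string, string>;
}

// Hypothetical path, used for illustration only.
const CONNECTION_INFO_PATH = "/api/v4/code_suggestions/connection_info";

function buildConnectionInfoRequest(instanceUrl: string, pat: string): ConnectionInfoRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = instanceUrl.replace(/\/+$/, "");
  return {
    url: base + CONNECTION_INFO_PATH,
    method: "POST",
    headers: {
      "PRIVATE-TOKEN": pat, // PAT authentication against the GitLab REST API
      Accept: "application/json",
    },
  };
}
```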
Thought: Ideally the response should be generic and easily extensible, so that the client needs minimal business logic to connect to different endpoints. One idea is described in &12224 (comment 1768842483): the client does/will do "request categorization" (it detects whether a request is a completion or a generation). The endpoint would return connection details for each type, something like:
{
  default: {
    url: "self-managed code suggestions URL" # default, fallback URL for suggestion requests for which no specific type was detected
  },
  code_completion: {
    base_url: "ai gateway base URI", # e.g. https://cloud.gitlab.com/ai
    jwt_token: XYZ,
    expires_at: "1712060286", # UNIX epoch
    headers: {
      X-Gitlab-Instance-Id: ...,
      X-Gitlab-Realm: ...,
      X-Gitlab-Global-User-Id: ...,
      X-Gitlab-Host-Name: ...
    }
  },
  ... other payload data
}
Then, when the client needs to send a code suggestion request, it just checks this mapping for an endpoint specific to the request type. If there is one, it uses that endpoint; otherwise it uses the "default" endpoint as a fallback.
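The lookup with fallback described above can be sketched like this. The type names (EndpointInfo, ConnectionInfo) and the function name selectEndpoint are hypothetical; the field names follow the example payload in this issue, and all fields are optional because the "default" and "code_completion" entries carry different keys.

```typescript
interface EndpointInfo {
  url?: string;
  base_url?: string;
  jwt_token?: string;
  expires_at?: string;
  headers?: Record<string, string>;
}

// "default" is always present; other request types are optional.
interface ConnectionInfo {
  default: EndpointInfo;
  [requestType: string]: EndpointInfo | undefined;
}

function selectEndpoint(info: ConnectionInfo, requestType: string): EndpointInfo {
  // Own-property check so inherited keys such as "toString" never match.
  const specific = Object.prototype.hasOwnProperty.call(info, requestType)
    ? info[requestType]
    : undefined;
  return specific ?? info.default;
}
```

With this shape, supporting a new request type only requires the server to add another key to the response; the client logic stays the same.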
Note: the endpoint information does not aim to contain the whole set of params needed for sending a request to AI Gateway. Params are also specific to the endpoint version (e.g. the v3 completion endpoint has different params than the v2 completion endpoint), so only base_url will be included in the endpoint information, and it will be up to the client to build the final URL (by adding /v[23]/code/completions to the base URL, depending on which version the client supports/uses).
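Building the final URL on the client might look like the sketch below. The function and type names are hypothetical; the path suffix and the two versions come from the note above.

```typescript
// Versions mentioned in this issue: /v2/code/completions and /v3/code/completions.
type CompletionApiVersion = "v2" | "v3";

function buildCompletionsUrl(baseUrl: string, version: CompletionApiVersion): string {
  // Tolerate a trailing slash on base_url so the result has no "//" in the path.
  const base = baseUrl.replace(/\/+$/, "");
  return base + "/" + version + "/code/completions";
}
```

For example, with base_url set to https://cloud.gitlab.com/ai, a client that supports v2 would send requests to https://cloud.gitlab.com/ai/v2/code/completions.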
Important: this feature must be implemented behind a feature flag which is not enabled until short-term JWT creation is finished (#455607).
JWT creation
To unblock work on this and the frontend issues which depend on it, we decided that in the first step (this issue) we will return just an instance-level JWT (similar to !147246 (diffs)). This will be implemented behind a feature flag and only for development purposes. Then, when AI Gateway supports creation of short-term JWTs, we will update the logic which obtains the short-term JWT (tracked by #455607).
A very basic POC: !147246 (merged)