Running Chat / Anthropic behind a proxy API in AI gateway
Problem to solve
Duo Chat is currently only available on .com. To make it available for self-managed users, we want to move it behind the AI Gateway.
Proposal
In a first iteration, we could build endpoints that proxy the request to the AI provider. The self-managed gitlab-rails instance would proxy the request to the ai-gateway, along with a Service Access Token it gets from cdot. This token is similar to the one used for code suggestions (see gitlab-org&11036 (closed)).
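As a rough illustration of the proxy step, the sketch below turns an incoming gitlab-rails request into an upstream provider request. It is not the AI Gateway's actual code: `PROVIDER_URL`, `PROVIDER_API_KEY`, the `x-api-key` header name, and `build_upstream_request` are all assumptions made for the example. The point it shows is that the caller's Service Access Token is checked at the gateway and is not forwarded, while the provider credential lives only in the gateway.

```python
import json
import urllib.request

# Hypothetical values: the real gateway configuration is not described in this issue.
PROVIDER_URL = "https://provider.example/v1/complete"
PROVIDER_API_KEY = "gateway-held-secret"


def build_upstream_request(headers: dict, body: bytes) -> urllib.request.Request:
    """Turn a request from gitlab-rails into a request for the AI provider."""
    auth = headers.get("Authorization", "")
    token = auth[len("Bearer "):].strip() if auth.startswith("Bearer ") else ""
    if not token:
        raise PermissionError("missing Service Access Token")
    # Full token validation (signature, expiry, audience) would happen here.

    # Reject payloads that are not valid JSON before contacting the provider.
    json.loads(body)

    # The caller's token is deliberately not forwarded; the gateway
    # authenticates to the provider with its own key (assumed header name).
    return urllib.request.Request(
        PROVIDER_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": PROVIDER_API_KEY,
        },
    )
```

The returned request object could then be sent with `urllib.request.urlopen` and the provider's response relayed back to gitlab-rails unchanged.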
This work will consist of three (main) parts:
- Anthropic on the ai gateway (this issue)
- Providing a Service Access Token from cdot (issue)
- Gitlab-rails calls the anthropic endpoint on the ai gateway (issue)
This will also simplify developer access to Anthropic / AI features: developers will simply call the model gateway with their PAT instead of having to create an API key for the service.
Gitlab-rails will call the API with an IJWT from cdot, which we have to validate in the AI Gateway. This has been done before with the code suggestions token.
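To make the validation step concrete, here is a minimal sketch of the checks a gateway typically performs on a JWT: signature, expiry, and audience. It assumes an HS256 token signed with a shared secret purely so the example runs on the standard library; the token issued by cdot may well use an asymmetric algorithm and published keys instead. The function names, the `aud` value, and the secret are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(part: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (stands in for the issuer in this sketch)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def validate_token(token: str, secret: bytes, audience: str,
                   now: Optional[float] = None) -> dict:
    """Return the claims if the token is valid, else raise PermissionError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise PermissionError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise PermissionError("malformed token")

    # Pin the algorithm rather than trusting whatever the header claims.
    if header.get("alg") != "HS256":
        raise PermissionError("unexpected algorithm")

    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("bad signature")

    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= current:
        raise PermissionError("token expired")
    if claims.get("aud") != audience:
        raise PermissionError("wrong audience")
    return claims
```

For example, `validate_token(sign_token({"aud": "ai-gateway", "exp": time.time() + 60}, b"s"), b"s", "ai-gateway")` returns the claims, while a token signed with a different secret is rejected.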
Links / references
https://docs.gitlab.com/ee/architecture/blueprints/ai_gateway/index.html#exposing-ai-providers
https://docs.gitlab.com/ee/user/project/repository/code_suggestions/self_managed.html