Running Chat / Anthropic behind a proxy API in the AI Gateway

Problem to solve

Duo Chat is currently only available on GitLab.com. To make it available for self-managed users, we want to move it behind the AI Gateway.

Proposal

In a first iteration, we could build endpoints in the AI Gateway that proxy requests to the AI provider. The self-managed gitlab-rails instance would send its request to the AI Gateway, along with a Service Access Token it gets from cdot. This token is similar to the one used for code suggestions (see gitlab-org&11036 (closed)).
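A minimal sketch of what such a proxy endpoint could do with an incoming request, in plain Python with no web framework. Everything named here is an assumption for illustration: the `build_upstream_request` helper, the `UpstreamRequest` type, the upstream URL and the header layout are hypothetical, not the gateway's actual code. The point is only that the caller's Service Access Token stays at the gateway and the provider API key is attached there.

```python
from dataclasses import dataclass

# Hypothetical upstream endpoint, used here for illustration only.
PROVIDER_URL = "https://api.anthropic.com/v1/complete"


@dataclass
class UpstreamRequest:
    url: str
    headers: dict
    body: bytes


def build_upstream_request(incoming_headers: dict, body: bytes,
                           provider_api_key: str) -> UpstreamRequest:
    """Turn a request from gitlab-rails into a request for the AI provider."""
    # Header names are case-insensitive, so normalise before looking them up.
    normalised = {name.lower(): value for name, value in incoming_headers.items()}
    authorization = normalised.get("authorization", "")
    if not authorization.startswith("Bearer "):
        raise PermissionError("missing Service Access Token")

    # The token itself would be validated here before anything is forwarded.

    # Build a fresh header set: the caller's token is never sent upstream,
    # and the provider key lives only in the gateway.
    headers = {
        "content-type": normalised.get("content-type", "application/json"),
        "x-api-key": provider_api_key,
    }
    return UpstreamRequest(url=PROVIDER_URL, headers=headers, body=body)
```

With this shape, self-managed instances never see the provider API key, and the body is passed through unchanged, which keeps the first iteration a plain proxy.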

This work will consist of three (main) parts:

  • Anthropic on the AI Gateway (this issue)
  • Providing a Service Access Token from cdot (issue)
  • gitlab-rails calls the Anthropic endpoint on the AI Gateway (issue)

This will also simplify developer access to Anthropic / AI features, as developers will simply call the model gateway with their PAT instead of having to create an API key for the service.

gitlab-rails will call the API with an IJWT from cdot, which we have to validate in the AI Gateway. This has been done before with the code suggestions token.
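To make the validation step concrete, here is a rough sketch of checking a JWT's signature, expiry and audience using only the Python standard library. It is not the code suggestions implementation. It assumes an HS256 token signed with a shared secret so that the example stays self-contained, whereas a token issued by cdot would more plausibly be verified against a public key. The `sign_token` and `validate_token` helpers, the `InvalidToken` exception and the audience value passed in are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


class InvalidToken(Exception):
    """Raised when a token fails any validation step."""


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(part: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Issue an HS256 token (stands in for the issuer in this sketch)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(signature)}"


def validate_token(token: str, secret: bytes, expected_audience: str, now=None) -> dict:
    """Return the claims if the token is authentic, unexpired and meant for us."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as error:
        raise InvalidToken("malformed token") from error

    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise InvalidToken("malformed token")
    # Pin the algorithm instead of trusting whatever the header claims.
    if header.get("alg") != "HS256":
        raise InvalidToken("unexpected algorithm")

    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise InvalidToken("bad signature")

    current_time = time.time() if now is None else now
    expiry = claims.get("exp")
    if not isinstance(expiry, (int, float)) or expiry <= current_time:
        raise InvalidToken("token expired")
    if claims.get("aud") != expected_audience:
        raise InvalidToken("wrong audience")
    return claims
```

In practice a maintained JWT library and asymmetric keys would be preferable to hand-rolled verification; the sketch only shows which checks the gateway has to perform before proxying a request.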

Links / references

https://docs.gitlab.com/ee/architecture/blueprints/ai_gateway/index.html#exposing-ai-providers

https://docs.gitlab.com/ee/user/project/repository/code_suggestions/self_managed.html

gitlab-org&11036 (closed)

Edited by Roy Zwambag