Add connection to Model-Gateway to Code Completions Endpoint on Monolith
Problem
With https://gitlab.com/gitlab-org/gitlab/-/issues/415581+ we created an API endpoint that connects IDE Extension -> GitLab Monolith -> Google Vertex (code-gecko). According to the Unified AI Gateway Architecture (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist#161 - closed), this is not the desired long-term architecture, especially since we do not want code completion for self-managed instances to connect to gitlab.com. Instead, we need to put the gateway between the Monolith and the AI provider, which here is Google's code-gecko.
Desired Outcome
Any incoming request to the code_completions endpoint on the Monolith can be routed either directly to code-gecko or to the model-gateway, depending on a feature flag (see gitlab-org/modelops/applied-ml/code-suggestions/ai-assist#161 (comment 1443135228)).
Proposed Solution
- Connect to the gateway API defined in gitlab-org/modelops/applied-ml/code-suggestions/ai-assist#161 (closed).
- Configure a feature flag to be able to switch between Gateway connection and direct code-gecko connection.
- Make sure to pass on Telemetry headers to the Gateway.
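
The routing described in the list above could look roughly like the following. This is a minimal, language-agnostic sketch written in Python for illustration only; it is not the Monolith's actual implementation. The flag name `code_completions_via_model_gateway`, the header prefix `x-gitlab-cs-`, and the `Route`, `telemetry_headers`, and `choose_route` names are all hypothetical and not taken from this issue.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Mapping

# Hypothetical feature flag name; the real flag is not specified in this issue.
GATEWAY_FLAG = "code_completions_via_model_gateway"

# Hypothetical prefix used to recognize telemetry headers sent by the IDE extension.
TELEMETRY_HEADER_PREFIX = "x-gitlab-cs-"


@dataclass(frozen=True)
class Route:
    """Where a code_completions request is sent and which headers go with it."""
    target: str              # "model-gateway" or "code-gecko"
    headers: Dict[str, str]  # headers forwarded upstream


def telemetry_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Keep only headers whose name starts with the telemetry prefix (case-insensitive)."""
    return {
        name: value
        for name, value in incoming.items()
        if name.lower().startswith(TELEMETRY_HEADER_PREFIX)
    }


def choose_route(
    incoming_headers: Mapping[str, str],
    flag_enabled: Callable[[str], bool],
) -> Route:
    """Route to the gateway when the flag is on, otherwise directly to code-gecko."""
    if flag_enabled(GATEWAY_FLAG):
        # Gateway path: pass the telemetry headers on to the gateway.
        return Route("model-gateway", telemetry_headers(incoming_headers))
    # Direct path: existing behaviour, no telemetry headers forwarded.
    return Route("code-gecko", {})
```

Passing the flag check in as a callable keeps the decision testable without a real feature flag backend, for example `choose_route(request_headers, lambda name: True)` to force the gateway path.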