AI Gateway Discussion - Do We Need One for Self-Managed / Air-Gapped Code Suggestions?

Background

In preparation for an MVP of a self-managed, air-gapped Code Suggestions feature, we need to determine the minimum architecture strictly required to implement the feature in the lightest way possible for experimentation.

Currently, no ready-made self-managed AI Gateway exists.

In our connected version of Code Suggestions, the AI Gateway serves a few main functions:

  • managing calls to our third-party model vendors (Anthropic and Vertex AI)
  • post-processing code generations
    • cleaning up to the end of the block
    • cleaning up repeated groups of lines
  • pre- and post-processing code completions
    • trimming the prefix and suffix
    • almost the same post-processing as for code generations
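
To make the post-processing steps above concrete, here is a minimal sketch of what "cleaning up to the end of the block" and "cleaning repeated groups of lines" could look like. This is not the AI Gateway's actual implementation. The function names, the space-based indentation heuristic, and the `max_group` limit are all assumptions made for illustration.

```python
def truncate_at_block_end(completion: str, base_indent: int) -> str:
    """Keep lines until one is indented less than the block being completed.

    Assumption: indentation uses spaces, and the first line always belongs to
    the block because it continues the line the cursor is on.
    """
    kept: list[str] = []
    for line in completion.split("\n"):
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        # A non-blank line indented less than the block marks the block's end.
        if stripped and indent < base_indent and kept:
            break
        kept.append(line)
    return "\n".join(kept).rstrip()


def drop_repeated_line_groups(text: str, max_group: int = 5) -> str:
    """Collapse immediately repeated groups of up to max_group lines to one copy."""
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        for size in range(max_group, 0, -1):
            group = lines[i:i + size]
            if len(group) < size:
                continue
            # Leave runs of blank lines alone so intentional spacing survives.
            if not any(line.strip() for line in group):
                continue
            while lines[i + size:i + 2 * size] == group:
                del lines[i + size:i + 2 * size]
        i += 1
    return "\n".join(lines)
```

Both helpers are pure string functions with no dependencies, which suggests that this kind of logic could in principle live either in a gateway or in an IDE extension. A real implementation would also need to handle tabs and legitimately repeated lines.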

Questions

With the goal of expediting a self-managed air-gapped MVC, the following questions arise:

  1. Do we strictly NEED a self-managed version of the AI Gateway?
  2. Assuming one self-hosted model/API, do we need the AI Gateway to manage the call to the model?
  3. What authentication and authorization checks currently happen at the AI Gateway level? Are these applicable in a self-hosted setting?
  4. Can we shift the current pre- and post-processing steps to the IDE?
    1. What effect could that have on latency? On compute?
    2. What is the minimal logic we would need in the IDE to make the CS feature work at some minimum threshold?
  5. What would be the losses and gains in terms of effort and time if the end goal is to craft a self-managed AI Gateway? Put another way, is the cost/time of building the logic in the IDE for SM customers lower than the cost of locally deploying a Gateway?
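
As a rough way to reason about question 4, the following sketch shows what a minimal client-side pipeline could look like if the IDE called a single self-hosted model directly. Everything here is hypothetical: `trim_context`, `suggest`, the character-based budget, and the 75/25 prefix/suffix split are assumptions for illustration, and the model call is injected as a plain callable so that no particular model API is implied.

```python
from typing import Callable


def trim_context(prefix: str, suffix: str, max_chars: int = 2000,
                 prefix_share: float = 0.75) -> tuple[str, str]:
    """Trim the text before and after the cursor to a fixed character budget.

    Assumption: a character budget stands in for a token budget, and the
    prefix gets the larger share because it is closest to the user's intent.
    """
    prefix_budget = int(max_chars * prefix_share)
    suffix_budget = max_chars - prefix_budget
    # Keep the END of the prefix and the START of the suffix (nearest the cursor).
    trimmed_prefix = prefix[-prefix_budget:] if prefix_budget > 0 else ""
    trimmed_suffix = suffix[:suffix_budget]
    return trimmed_prefix, trimmed_suffix


def suggest(prefix: str, suffix: str, call_model: Callable[[str, str], str],
            max_chars: int = 2000) -> str:
    """Pre-process, call the (injected) self-hosted model, and post-process."""
    trimmed_prefix, trimmed_suffix = trim_context(prefix, suffix, max_chars)
    raw = call_model(trimmed_prefix, trimmed_suffix)
    # Minimal post-processing: drop trailing whitespace from the suggestion.
    return raw.rstrip()
```

In this shape the only gateway responsibilities left unaddressed are vendor routing and authentication/authorization, which is where questions 2 and 3 would need answers before deciding.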

Related Issues

https://gitlab.com/gitlab-org/gitlab/-/issues/450206

&13024

#434063 (closed)

Edited by Susie Bitters