Update AI Gateway Blueprint
@@ -526,12 +526,11 @@ Alternative solutions were discussed in
Our aim is for the AI Gateway to become the primary method for the monolith to access models. Here are some of the key features we need to implement:
- Allow self-managed installations to have their own AI Gateway, with Runway likely being the deployment method.
@@ -539,7 +538,8 @@ The AI Gateway, a standalone service, is the sole access point for all communica
This strategy significantly simplifies enterprise management and abstracts machine learning away from the monolith. With future expansions including telemetry, embeddings API, and multi-region/customer-specific deployments, our goal is to provide a scalable, comprehensive AI solution for all GitLab users, regardless of their installation type.
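To make the "sole access point" idea concrete, here is a minimal sketch of the routing layer it implies: the monolith calls a single gateway entry point, and the gateway decides which model provider backs each feature, so provider details never leak into the monolith. The provider classes, feature names, and routing table below are illustrative assumptions, not the actual AI Gateway code.

```python
# Illustrative sketch of the AI Gateway as a single access point.
# Provider classes, feature names, and the routing table are assumptions.
from dataclasses import dataclass
from typing import Protocol


class ModelProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


@dataclass
class VertexProvider:
    model_name: str

    def complete(self, prompt: str) -> str:
        # A real provider would call the Vertex AI SDK here.
        return f"[{self.model_name}] completion for: {prompt}"


@dataclass
class AnthropicProvider:
    model_name: str

    def complete(self, prompt: str) -> str:
        # A real provider would call the Anthropic API here.
        return f"[{self.model_name}] completion for: {prompt}"


# Hypothetical mapping of GitLab AI features to backing providers.
FEATURE_ROUTES: dict[str, ModelProvider] = {
    "code_suggestions": VertexProvider(model_name="code-model"),
    "duo_chat": AnthropicProvider(model_name="chat-model"),
}


def handle_request(feature: str, prompt: str) -> str:
    """Single entry point the monolith would call (for example, over HTTP)."""
    provider = FEATURE_ROUTES[feature]
    return provider.complete(prompt)


if __name__ == "__main__":
    print(handle_request("duo_chat", "Summarize this merge request"))
```

Because the monolith only ever sees the gateway's interface, swapping or adding a provider is a gateway-side change and does not require a monolith release.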
[Model registry](https://docs.gitlab.com/ee/user/project/ml/model_registry/) is a feature that allows users to manage machine learning models in GitLab. It is not solely focused on large language models; it currently targets smaller models, which can be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. Even for these user-deployed models, the ability to auto-configure an API that is accessible through the AI Gateway could be a significant feature.
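As a rough illustration of what such an auto-configured API could look like, the sketch below proxies requests through the gateway to wherever the user deployed their model. It assumes a FastAPI-style service; the route path, payload shape, and in-memory registry lookup are placeholders, not an agreed design.

```python
# Sketch of a gateway route that forwards requests to a user-deployed model.
# The route path, payload shape, and endpoint registry are assumptions.
import httpx
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Hypothetical endpoints registered through the model registry, keyed by model name.
USER_MODEL_ENDPOINTS = {
    "fraud-detector": "http://ml-serving.internal:8080/predict",
}


class PredictRequest(BaseModel):
    inputs: dict


@app.post("/v1/user-models/{model_name}/predict")
async def predict(model_name: str, request: PredictRequest):
    """Forward the request to the user's deployment, wherever it runs."""
    endpoint = USER_MODEL_ENDPOINTS.get(model_name)
    if endpoint is None:
        raise HTTPException(status_code=404, detail="Unknown model")
    async with httpx.AsyncClient() as client:
        upstream = await client.post(endpoint, json=request.inputs)
    upstream.raise_for_status()
    return upstream.json()
```

The benefit of routing through the gateway is that clients only need the gateway URL, while authentication, rate limiting, and telemetry stay in one place.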
- [AI Gateway as the Sole Access Point for Monolith to Access Models](https://gitlab.com/groups/gitlab-org/-/epics/13024)
@@ -568,4 +568,9 @@ Self-managed instances should have their own AI Gateway, with Runway likely bein
- Find a way to replicate these agents into a self-managed installation (e.g., organization-level agents where we prepopulate with a few agents).
- To avoid hitting the monolith on every call to AI agents, we will need a method to propagate AI Agent prompts into the AI Gateway (a rough sketch of one possible approach follows this list).
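One possible shape for that propagation, sketched below under stated assumptions: the gateway keeps a TTL cache of agent prompt definitions and refetches them from the monolith only when the cache is stale, so serving an agent request does not require a monolith round-trip. The endpoint URL, payload fields, and cache policy are illustrative, not a committed design.

```python
# Illustrative prompt cache for AI Agent definitions in the AI Gateway.
# The monolith endpoint, payload fields, and TTL are assumptions.
import time

import httpx

MONOLITH_PROMPTS_URL = "https://gitlab.example.com/api/v4/ai/agent_prompts"  # assumed endpoint
CACHE_TTL_SECONDS = 300

_cache: dict[str, str] = {}
_cache_fetched_at = 0.0


def _refresh_cache() -> None:
    """Pull all agent prompts from the monolith in a single call."""
    global _cache, _cache_fetched_at
    response = httpx.get(MONOLITH_PROMPTS_URL, timeout=10)
    response.raise_for_status()
    _cache = {item["agent_id"]: item["prompt"] for item in response.json()}
    _cache_fetched_at = time.monotonic()


def get_agent_prompt(agent_id: str) -> str:
    """Return an agent's prompt, refreshing the cache only when it is stale."""
    if not _cache or time.monotonic() - _cache_fetched_at > CACHE_TTL_SECONDS:
        _refresh_cache()
    return _cache[agent_id]
```

A push-based mechanism (the monolith notifying the gateway when a prompt changes) would be another option; the cache above is just the simplest pull-based variant.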
[AI Agents](https://gitlab.com/groups/gitlab-org/-/epics/12330) is a feature that allows users to implement and manage their own chats and AI features, including their prompts, models, and tools. Development is currently in its early stages. Once the feature matures, we intend to move GitLab features to agents, but there are blockers that currently prevent us from doing so: