From 0f675854b3d490d78825d010199d8cfb8b38deb0 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Wed, 20 Mar 2024 16:56:16 -0700
Subject: [PATCH 1/8] Update AI Gateway Blueprint

---
 .../blueprints/ai_gateway/index.md            | 46 +++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index d970094d26929d..94ff3db365cd43 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -558,3 +558,49 @@ Alternative solutions were discussed in
 ## Decisions
 
 - [ADR-001: Allow direct connections](decisions/001_direct_connections.md)
+
+## Future work
+
+Our aim is for the AI Gateway to become the primary method for the monolith to access models. Here are some of the key features we need to implement:
+
+- Expose a list of available models.
+- Allow self-managed installations to have their own AI Gateway, with Runway likely being the deployment method.
+
+These goals are broken down into 4 distinct categories below.
+
+### Centralized Access Through AI Gateway
+
+The AI Gateway, a standalone service, is the sole access point for all communication between GitLab installations and third-party AI models. It is designed to centralize and manage access to all GitLab features, whether they are in-app functionalities or code suggestions, irrespective of their deployment methods. 
+
+This strategy significantly simplifies enterprise management and abstracts machine learning away from the monolith. With future expansions including telemetry, embeddings API, and multi-region/customer-specific deployments, our goal is to provide a scalable, comprehensive AI solution for all GitLab users, regardless of their installation type.
+
+When it comes to user-deployed models, the model registry is not solely for large language models; it's currently more targeted at smaller model applications. These smaller models could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. However, even for these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
+
+- [AI Gateway as the Sole Access Point for Monolith to Access Models](https://gitlab.com/groups/gitlab-org/-/epics/13024)
+
+### Unit Primitives
+
+Unit Primitives are a fundamental part of our strategy for managing access to AI features through the AI Gateway. They represent the smallest unit of functionality that can be accessed and managed through the Gateway. 
+
+In the initial iteration, we will support two primitives: Code Suggestions and Chat. The latter will encompass all Chat features in one primitive. 
+
+In the next iteration, we plan to decompose the Chat primitive into multiple primitives based on top-level tools. This work depends on completing the task to move classification into the AI Gateway.
+
+The introduction of Unit Primitives will simplify the management of AI features and provide more granular control over the functionality exposed through the AI Gateway. This will also pave the way for future work on supporting user-deployed models and locally hosted models.
+
+For more details, refer to the [Initial Set of Unit Primitives](https://gitlab.com/gitlab-org/gitlab/-/issues/444934) issue.
+
+- [Unit Primitives for Accessing CC Features](https://gitlab.com/groups/gitlab-org/-/epics/12556)
+
+### Self Managed AI Gateway
+
+Self-managed instances should have their own AI Gateway, with Runway likely being the deployment method. This means part of our work will be to ensure that the AI Gateway can be deployed in a self-managed environment. This work will go hand-in-hand with the work to support locally hosted models (local inference) in support of GitLab AI features.
+
+- [Self Managed AI Gateway](https://gitlab.com/groups/gitlab-org/-/epics/13162)
+
+### AI Agents
+
+As AI Agents evolve, we plan to reimplement our AI features as agents. To achieve this, we need to:
+- Implement prompt templating.
+- Find a way to replicate these into a self-managed installation (e.g., organization-level agents where we prepopulate with a few agents).
+- To avoid hitting the monolith on every call to AI agents, we will need a method to propagate AI Agents prompts into the AI Gateway.
-- 
GitLab


From 7955ab9c06f1218dbc1771f08c39de6f1f8d37a3 Mon Sep 17 00:00:00 2001
From: Eduardo Bonet <ebonet@gitlab.com>
Date: Fri, 22 Mar 2024 22:13:32 +0000
Subject: [PATCH 2/8] Apply 3 suggestion(s) to 1 file(s)

---
 .../blueprints/ai_gateway/index.md            | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index 94ff3db365cd43..a1070234d07701 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -561,12 +561,11 @@ Alternative solutions were discussed in
 
 ## Future work
 
-Our aim is for the AI Gateway to become the primary method for the monolith to access models. Here are some of the key features we need to implement:
+The AI Gateway aims to become the primary method for the monolith to **access** machine learning models across all usages of GitLab, and to create a consistent user journey when developing AI-backed features. To do so, this goal is split into three categories:
 
-- Expose a list of available models.
-- Allow self-managed installations to have their own AI Gateway, with Runway likely being the deployment method.
-
-These goals are broken down into 4 distinct categories below.
+- Centralized Access Through AI Gateway
+- Self Managed AI Gateway
+- Unit Primitives
 
 ### Centralized Access Through AI Gateway
 
@@ -574,7 +573,8 @@ The AI Gateway, a standalone service, is the sole access point for all communica
 
 This strategy significantly simplifies enterprise management and abstracts machine learning away from the monolith. With future expansions including telemetry, embeddings API, and multi-region/customer-specific deployments, our goal is to provide a scalable, comprehensive AI solution for all GitLab users, regardless of their installation type.
 
-When it comes to user-deployed models, the model registry is not solely for large language models; it's currently more targeted at smaller model applications. These smaller models could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. However, even for these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
+
+[Model registry](https://docs.gitlab.com/ee/user/project/ml/model_registry/) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
 
 - [AI Gateway as the Sole Access Point for Monolith to Access Models](https://gitlab.com/groups/gitlab-org/-/epics/13024)
 
@@ -603,4 +603,9 @@ Self-managed instances should have their own AI Gateway, with Runway likely bein
 As AI Agents evolve, we plan to reimplement our AI features as agents. To achieve this, we need to:
 - Implement prompt templating.
 - Find a way to replicate these into a self-managed installation (e.g., organization-level agents where we prepopulate with a few agents).
-- To avoid hitting the monolith on every call to AI agents, we will need a method to propagate AI Agents prompts into the AI Gateway.
+
+[AI Agents](https://gitlab.com/groups/gitlab-org/-/epics/12330) is a feature that allows users to implement and manage their own chats and AI features, managing prompts, models and tools. Development is currently in its early stages. Once mature, we intend to move GitLab feature to agents, but there are blockers that currently prevent us from doing so: 
+
+- [Lack of prompt templating](https://gitlab.com/gitlab-org/gitlab/-/issues/441081).
+- Lack of replication of user-defined prompts into the AI Gateway.
+- Lack of replication of GitLab-defined prompts into self-managed installations (for example, organization-level agents that we prepopulate with a few agents).
-- 
GitLab


From 8de2725d1f17dfaa3a2d31e19e75cb1a52161fd2 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Fri, 22 Mar 2024 22:28:12 +0000
Subject: [PATCH 3/8] update docs

---
 doc/architecture/blueprints/ai_gateway/index.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index a1070234d07701..d18105669d9fb0 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -598,11 +598,18 @@ Self-managed instances should have their own AI Gateway, with Runway likely bein
 
 - [Self Managed AI Gateway](https://gitlab.com/groups/gitlab-org/-/epics/13162)
 
-### AI Agents
+## Other components in the AI stack
+
+While AI Gateway centralizes _access_ to AI features and models, it interacts with other components to help users achieve their goals:
+
+- AI Agents: create and manage agents and prompts
+- Model registry: manage and deploy machine learning models
+
+### Model registry
 
-As AI Agents evolve, we plan to reimplement our AI features as agents. To achieve this, we need to:
-- Implement prompt templating.
-- Find a way to replicate these into a self-managed installation (e.g., organization-level agents where we prepopulate with a few agents).
+[Model registry](https://docs.gitlab.com/ee/user/project/ml/model_registry/) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
+
+### AI Agents
 
 [AI Agents](https://gitlab.com/groups/gitlab-org/-/epics/12330) is a feature that allows users to implement and manage their own chats and AI features, managing prompts, models and tools. Development is currently in its early stages. Once mature, we intend to move GitLab feature to agents, but there are blockers that currently prevent us from doing so: 
 
-- 
GitLab


From f1f665f019185644ffb7df2421d83b69dd61a1c0 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Fri, 22 Mar 2024 22:34:15 +0000
Subject: [PATCH 4/8] fix linting errors

---
 doc/architecture/blueprints/ai_gateway/index.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index d18105669d9fb0..46e2903eeeb87e 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -574,7 +574,7 @@ The AI Gateway, a standalone service, is the sole access point for all communica
 This strategy significantly simplifies enterprise management and abstracts machine learning away from the monolith. With future expansions including telemetry, embeddings API, and multi-region/customer-specific deployments, our goal is to provide a scalable, comprehensive AI solution for all GitLab users, regardless of their installation type.
 
 
-[Model registry](https://docs.gitlab.com/ee/user/project/ml/model_registry/) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
+[Model registry](../../../user/project/ml/model_registry/index.md) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
 
 - [AI Gateway as the Sole Access Point for Monolith to Access Models](https://gitlab.com/groups/gitlab-org/-/epics/13024)
 
@@ -607,7 +607,7 @@ While AI Gateway centralizes _access_ to AI features and models, it interacts wi
 
 ### Model registry
 
-[Model registry](https://docs.gitlab.com/ee/user/project/ml/model_registry/) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
+[Model registry](../../../user/project/ml/model_registry/index.md) is a feature that allows users to manage their machine learning models in GitLab. It is not solely focused on large language models, and is currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
 
 ### AI Agents
 
-- 
GitLab


From 331ed4ebb6ce1f3fb84972790873287d54bc0b35 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Fri, 22 Mar 2024 22:53:02 +0000
Subject: [PATCH 5/8] fix linting

---
 doc/architecture/blueprints/ai_gateway/index.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index 46e2903eeeb87e..ea3ee55fb1ff1b 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -573,7 +573,6 @@ The AI Gateway, a standalone service, is the sole access point for all communica
 
 This strategy significantly simplifies enterprise management and abstracts machine learning away from the monolith. With future expansions including telemetry, embeddings API, and multi-region/customer-specific deployments, our goal is to provide a scalable, comprehensive AI solution for all GitLab users, regardless of their installation type.
 
-
 [Model registry](../../../user/project/ml/model_registry/index.md) is a feature that allows users to use GitLab to manage the machine learning models. While not solely focused on large language models, and currently more targeted at smaller model applications, which could be deployed in various ways: as a standalone library, a service, a pod, a cloud deployment, and so forth. For these user-deployed models, the ability to auto-configure an API that's accessible through the AI Gateway could be a significant feature.
 
 - [AI Gateway as the Sole Access Point for Monolith to Access Models](https://gitlab.com/groups/gitlab-org/-/epics/13024)
-- 
GitLab


From a7491abacf4171051977364af0fc98ff18e634a2 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Thu, 28 Mar 2024 14:51:03 +0000
Subject: [PATCH 6/8] Apply 1 suggestion(s) to 1 file(s)

---
 doc/architecture/blueprints/ai_gateway/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index ea3ee55fb1ff1b..f2164a281bf89a 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -593,7 +593,7 @@ For more details, refer to the [Initial Set of Unit Primitives](https://gitlab.c
 
 ### Self Managed AI Gateway
 
-Self-managed instances should have their own AI Gateway, with Runway likely being the deployment method. This means part of our work will be to ensure that the AI Gateway can be deployed in a self-managed environment. This work will go hand-in-hand with the work to support locally hosted models (local inference) in support of GitLab AI features.
+Self-managed instances can either use centralized AI Gateway or have their own AI Gateway if they want to use self-deployed models, with Runway likely being the deployment method. This means part of our work will be to ensure that the AI Gateway can be deployed in a self-managed environment. This work will go hand-in-hand with the work to support locally hosted models (local inference) in support of GitLab AI features.
 
 - [Self Managed AI Gateway](https://gitlab.com/groups/gitlab-org/-/epics/13162)
 
-- 
GitLab


From 530127e5424def8d26dbea89072fa1aa2f4d1727 Mon Sep 17 00:00:00 2001
From: Shinya Maeda <shinya@gitlab.com>
Date: Fri, 29 Mar 2024 08:09:54 +0000
Subject: [PATCH 7/8] Apply 1 suggestion(s) to 1 file(s)

---
 doc/architecture/blueprints/ai_gateway/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index f2164a281bf89a..1be7ba9ea768b4 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -593,7 +593,7 @@ For more details, refer to the [Initial Set of Unit Primitives](https://gitlab.c
 
 ### Self Managed AI Gateway
 
-Self-managed instances can either use centralized AI Gateway or have their own AI Gateway if they want to use self-deployed models, with Runway likely being the deployment method. This means part of our work will be to ensure that the AI Gateway can be deployed in a self-managed environment. This work will go hand-in-hand with the work to support locally hosted models (local inference) in support of GitLab AI features.
+Self-managed instances can either use the GitLab-hosted AI Gateway or, if they want to use self-deployed models, have their own AI Gateway, with Runway likely being the deployment method. This means part of our work will be to ensure that the AI Gateway can be deployed in a self-managed environment. This work will go hand-in-hand with the work to support locally hosted models (local inference) in support of GitLab AI features.
 
 - [Self Managed AI Gateway](https://gitlab.com/groups/gitlab-org/-/epics/13162)
 
-- 
GitLab


From 5db7dff192863563d8e775da06119310b3d14903 Mon Sep 17 00:00:00 2001
From: David O'Regan <doregan@gitlab.com>
Date: Thu, 4 Apr 2024 13:10:31 +0200
Subject: [PATCH 8/8] Update AI Gateway Blueprint

---
 doc/architecture/blueprints/ai_gateway/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/architecture/blueprints/ai_gateway/index.md b/doc/architecture/blueprints/ai_gateway/index.md
index 1be7ba9ea768b4..c883a6746942ea 100644
--- a/doc/architecture/blueprints/ai_gateway/index.md
+++ b/doc/architecture/blueprints/ai_gateway/index.md
@@ -579,7 +579,7 @@ This strategy significantly simplifies enterprise management and abstracts machi
 
 ### Unit Primitives
 
-Unit Primitives are a fundamental part of our strategy for managing access to AI features through the AI Gateway. They represent the smallest unit of functionality that can be accessed and managed through the Gateway. 
+Unit Primitives are a fundamental part of our strategy for managing access to AI features through the AI Gateway. They represent the smallest unit of functionality that can be accessed and managed through the Gateway. This approach provides more granular control over the functionality exposed through the AI Gateway and simplifies the management of AI features. It also paves the way for future work on supporting user-deployed models and locally hosted models. From a business perspective, Unit Primitives are the smallest pieces that can be moved between tiers or packaging models, providing flexibility and adaptability in our offerings.
 
 In the initial iteration, we will support two primitives: Code Suggestions and Chat. The latter will encompass all Chat features in one primitive. 
 
-- 
GitLab