Update AI Gateway Blueprint
As per https://gitlab.slack.com/archives/C051K31F30R/p1710972249450739?thread_ts=1710431031.385889&cid=C051K31F30R, we want to outline our work/goals for the AI Gateway as part of the current blueprint, this will help consolidate understanding and drive our work in AI Framework HQ (&11403) • Unassigned
Merge request reports
Activity
changed milestone to %16.11
added documentation groupai framework typemaintenance workflowin dev labels
added devopsai-powered sectiondata-science labels
Just spinning this up as an initial brain dump, lets pair on this and then we can pull in the needed stakeholders to finish it out
- A deleted user
added Architecture Evolution Blueprint label
5 Warnings 331ed4eb: The commit subject must contain at least 3 words. For more information, take a look at our Commit message guidelines. 331ed4eb: The commit subject must start with a capital letter. For more information, take a look at our Commit message guidelines. f1f665f0: The commit subject must start with a capital letter. For more information, take a look at our Commit message guidelines. 8de2725d: The commit subject must contain at least 3 words. For more information, take a look at our Commit message guidelines. 8de2725d: The commit subject must start with a capital letter. For more information, take a look at our Commit message guidelines. 2 Messages This merge request might require a review from a Coach Engineer. This MR contains docs in the /doc/architecture directory, but any Maintainer (other than the author) can merge. You do not need tech writer review. Architecture Evolution Review
This merge request might require a review from a Coach Engineer.
The following files, which might require the additional review, have been changed:
doc/architecture/blueprints/ai_gateway/index.md
If needed, you can retry the
danger-review
job that generated this comment.Generated by
Dangeradded docs-only label
Thanks for this @oregand; I will push some updates to this branch tomorrow
requested review from @shekharpatnaik, @eduardobonet, and @dmishunov
- Resolved by David O'Regan
- Resolved by David O'Regan
- Resolved by David O'Regan
- Resolved by David O'Regan
- Resolved by David O'Regan
- Resolved by David O'Regan
Thanks @oregand and @jessieay for this update. Added some comments for consideration. This versions is already great, but in a future iteration it would be great to split the general vision for infrastructure (including overview of AI Agents, AI Gateway, Custom Models, Model registry, and how they interact) into its own blueprint, and keep this as a deep dive into AI Gateway exclusively. I think this would facilitate understanding of the higher level vision
mentioned in issue #451740
removed review request for @dmishunov
- Resolved by Jessie Young
@jessieay @shekharpatnaik I have a question about the programming language that is being used for AI gateway. My understanding is that Python was originally chosen for
model-gateway
gitlab-com/gl-infra/readiness!166 (82c575bf) project because the project directly used a pre-trained model to handle code suggestions. However, that option is no longer supported: (for example, Remove references to Triton model serving (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!417 - merged)) and now AI Gateway project is used to convert a general request into a specific one and propagate it to the right model. @tle_gitlab could you please verify my reasoning here?WDYT about revisiting the decision regarding which programming language must be used and either:
- Extend the blueprint with the reasons/arguments for proceeding with Python
- Consider more suitable options for the current use cases and future development before we start implementing more complex features. Estimate the feasibility and provide the strategy for gradually moving from one implementation to another.
For example, we can compare Python and Golang for our use case and here are the pros listed for both options:
Python
- Current functionality and Evolution: if the following code https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/a62ffb16273f2bcd2deb5c06ad1496c81c0c5d4e/ai_gateway/tokenizer.py#L10 is still in use or there are arguments to introduce similar functionality, then it makes sense to stick to Python to use the ML-related libraries. The document mentions the advantage of having access to
PyTorch and Tensorflow
, but do we plan to use these libraries within AI Gateway or are we going to implement the support for native GitLab models in a separate project? @tle_gitlab are you aware of any plans? - Maintenance: AI Gateway is closely related to ML topics and is expected to get contributions from ML engineers who are more likely familiar with Python.
- Existing code: some features/endpoints are already implemented and additional effort will be required to rewrite the existing functionality into another language
Golang
- Golang is generally faster than Python and a Golang server can handle more requests (my quick benchmarks showed 10x difference but it’s hard to get exact numbers for our specific use case)
- HTTP/2 or gRPC support for bidirectional streaming
- Maintenance: at GitLab, Golang is frequently chosen for a standalone service (Gitlab Shell, Workhorse, Gitaly), especially when scalability and performance are important:
- Gitlab Shell was originally written in Ruby, rewritten in Golang
- Gitaly was extracted from GitLab Rails and gradually rewritten into Golang
- Deployment: since self-managed customers are going to install AI Gateway themselves, it makes sense to limit the number of technologies they have to deal with
Both options have advantages, so I just wanted to make sure that we deliberately decided to move forward with any of them. Could you please have a look?
removed review request for @eduardobonet
- Resolved by Shinya Maeda
added 1134 commits
-
070d0a4d...cbf5d1f8 - 1129 commits from branch
master
- db97579b - Update AI Gateway Blueprint
- 47de6731 - Apply 3 suggestion(s) to 1 file(s)
- 569c5a23 - update docs
- a8b8cda0 - fix linting errors
- cbd67255 - fix linting
Toggle commit list-
070d0a4d...cbf5d1f8 - 1129 commits from branch
mentioned in epic &12973 (closed)
mentioned in merge request !144767 (merged)
- Resolved by David O'Regan
Breaking out into a new thread for clarity!
from wording "Self-managed instances should have their own AI Gateway" do I understand correctly that AI Gateway should be deployed for all self-managed instances (which want to use AI)?
Yes, your understanding is correct. The goal is to provide flexibility for self-managed instances. They can choose to deploy their own AI Gateway to access local models, which would be beneficial for those who want to keep their data and processing within their own infrastructure.
Alternatively, they can use the centralized AI Gateway provided by GitLab (cloud.gitlab.com) to access third-party AI providers. This would be a good option for those who prefer not to manage their own AI infrastructure or who want to leverage a wider range of AI models and services.
Should we update the wording to be explicit about this in your opinion?
- Resolved by David O'Regan
- Resolved by David O'Regan
- Resolved by Jessie Young
added 215 commits
-
df9c1c24...6f78799e - 208 commits from branch
master
- 8c5dd58f - Update AI Gateway Blueprint
- 19a6779e - Apply 3 suggestion(s) to 1 file(s)
- f7d0b5a6 - update docs
- fa2b1ab6 - fix linting errors
- ed7208b1 - fix linting
- 33d4be31 - Apply 1 suggestion(s) to 1 file(s)
- db7b15c8 - Apply 1 suggestion(s) to 1 file(s)
Toggle commit list-
df9c1c24...6f78799e - 208 commits from branch
- Resolved by Jessie Young
- Resolved by David O'Regan
mentioned in epic &13383
added 571 commits
-
db7b15c8...e0a0dbd2 - 564 commits from branch
master
- 5fee2055 - Update AI Gateway Blueprint
- 14641940 - Apply 3 suggestion(s) to 1 file(s)
- ec8288db - update docs
- 6168991e - fix linting errors
- 7815d2d1 - fix linting
- a1b91e1c - Apply 1 suggestion(s) to 1 file(s)
- 8b7302a1 - Apply 1 suggestion(s) to 1 file(s)
Toggle commit list-
db7b15c8...e0a0dbd2 - 564 commits from branch
mentioned in incident gitlab-org/quality/engineering-productivity/approved-mr-pipeline-incidents#47 (closed)
mentioned in incident gitlab-org/quality/engineering-productivity/approved-mr-pipeline-incidents#50 (closed)
added 22 commits
-
b6d54d4a...119f5a3b - 14 commits from branch
master
- 84573de7 - Update AI Gateway Blueprint
- 7e089073 - Apply 3 suggestion(s) to 1 file(s)
- 69ff7ba7 - update docs
- d30e3212 - fix linting errors
- f5b70804 - fix linting
- 000fb66a - Apply 1 suggestion(s) to 1 file(s)
- 95d700d4 - Apply 1 suggestion(s) to 1 file(s)
- 980a28af - Update AI Gateway Blueprint
Toggle commit list-
b6d54d4a...119f5a3b - 14 commits from branch
mentioned in incident gitlab-org/quality/engineering-productivity/approved-mr-pipeline-incidents#52 (closed)
mentioned in epic &13393
added 29 commits
-
980a28af...07c00f10 - 21 commits from branch
master
- 9ec612e1 - Update AI Gateway Blueprint
- 0a5a7067 - Apply 3 suggestion(s) to 1 file(s)
- f4add2e9 - update docs
- 48f81698 - fix linting errors
- 6dc09801 - fix linting
- 109e2c71 - Apply 1 suggestion(s) to 1 file(s)
- ff0591e2 - Apply 1 suggestion(s) to 1 file(s)
- e75950e5 - Update AI Gateway Blueprint
Toggle commit list-
980a28af...07c00f10 - 21 commits from branch
added 143 commits
-
e75950e5...51f19208 - 135 commits from branch
master
- b0d5a68b - Update AI Gateway Blueprint
- e91b8180 - Apply 3 suggestion(s) to 1 file(s)
- f4a75269 - update docs
- a9be192d - fix linting errors
- 1947d1c4 - fix linting
- 7cab5270 - Apply 1 suggestion(s) to 1 file(s)
- 430bc7d1 - Apply 1 suggestion(s) to 1 file(s)
- b9a243a6 - Update AI Gateway Blueprint
Toggle commit list-
e75950e5...51f19208 - 135 commits from branch
added workflowin review label and removed workflowin dev label
added 168 commits
-
b9a243a6...89fdedcc - 160 commits from branch
master
- f1af1d59 - Update AI Gateway Blueprint
- 14fc1178 - Apply 3 suggestion(s) to 1 file(s)
- 9170b224 - update docs
- 9bc25aa8 - fix linting errors
- 0c5cab0d - fix linting
- abd3c856 - Apply 1 suggestion(s) to 1 file(s)
- 104277de - Apply 1 suggestion(s) to 1 file(s)
- e6b05b31 - Update AI Gateway Blueprint
Toggle commit list-
b9a243a6...89fdedcc - 160 commits from branch
added 203 commits
-
e6b05b31...c70dee6d - 195 commits from branch
master
- 0f675854 - Update AI Gateway Blueprint
- 7955ab9c - Apply 3 suggestion(s) to 1 file(s)
- 8de2725d - update docs
- f1f665f0 - fix linting errors
- 331ed4eb - fix linting
- a7491aba - Apply 1 suggestion(s) to 1 file(s)
- 530127e5 - Apply 1 suggestion(s) to 1 file(s)
- 5db7dff1 - Update AI Gateway Blueprint
Toggle commit list-
e6b05b31...c70dee6d - 195 commits from branch
I have pulled in the changes from the other Blueprint update that was merged this week as an FYI
unassigned @jessieay
requested review from @jessieay and removed review request for @shekharpatnaik
LGTM @oregand - let's merge and iterate from here in a new MR
enabled an automatic merge when the pipeline for 74fbab84 succeeds
mentioned in commit fa08c704
mentioned in incident gitlab-org/quality/engineering-productivity/review-apps-broken-incidents#1577 (closed)
added workflowstaging-canary label and removed workflowin review label
added workflowcanary label and removed workflowstaging-canary label
added workflowstaging label and removed workflowcanary label
added workflowproduction label and removed workflowstaging label
added workflowpost-deploy-db-production label and removed workflowproduction label
added releasedcandidate label
added releasedpublished label and removed releasedcandidate label
mentioned in issue #493853 (closed)
mentioned in epic gitlab-org#13393