Skip to content
Snippets Groups Projects

Update AI Gateway Blueprint

Merged David O'Regan requested to merge ai_gateway_blueprint_update into master
All threads resolved!

As per https://gitlab.slack.com/archives/C051K31F30R/p1710972249450739?thread_ts=1710431031.385889&cid=C051K31F30R, we want to outline our work/goals for the AI Gateway as part of the current blueprint, this will help consolidate understanding and drive our work in AI Framework HQ (&11403) • Unassigned

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
    • Resolved by David O'Regan

      Thanks @oregand and @jessieay for this update. Added some comments for consideration. This versions is already great, but in a future iteration it would be great to split the general vision for infrastructure (including overview of AI Agents, AI Gateway, Custom Models, Model registry, and how they interact) into its own blueprint, and keep this as a deep dive into AI Gateway exclusively. I think this would facilitate understanding of the higher level vision

  • Eduardo Bonet approved this merge request

    approved this merge request

  • added 1 commit

    • dbd63ffd - Apply 3 suggestion(s) to 1 file(s)

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • David O'Regan resolved all threads

    resolved all threads

  • added 1 commit

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • Jessie Young mentioned in issue #451740

    mentioned in issue #451740

  • Denys Mishunov approved this merge request

    approved this merge request

  • Denys Mishunov removed review request for @dmishunov

    removed review request for @dmishunov

    • Resolved by Jessie Young

      @jessieay @shekharpatnaik I have a question about the programming language that is being used for AI gateway. My understanding is that Python was originally chosen for model-gateway gitlab-com/gl-infra/readiness!166 (82c575bf) project because the project directly used a pre-trained model to handle code suggestions. However, that option is no longer supported: (for example, Remove references to Triton model serving (gitlab-org/modelops/applied-ml/code-suggestions/ai-assist!417 - merged)) and now AI Gateway project is used to convert a general request into a specific one and propagate it to the right model. @tle_gitlab could you please verify my reasoning here?

      WDYT about revisiting the decision regarding which programming language must be used and either:

      • Extend the blueprint with the reasons/arguments for proceeding with Python
      • Consider more suitable options for the current use cases and future development before we start implementing more complex features. Estimate the feasibility and provide the strategy for gradually moving from one implementation to another.

      For example, we can compare Python and Golang for our use case and here are the pros listed for both options:

      Python

      • Current functionality and Evolution: if the following code https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/blob/a62ffb16273f2bcd2deb5c06ad1496c81c0c5d4e/ai_gateway/tokenizer.py#L10 is still in use or there are arguments to introduce similar functionality, then it makes sense to stick to Python to use the ML-related libraries. The document mentions the advantage of having access to PyTorch and Tensorflow, but do we plan to use these libraries within AI Gateway or are we going to implement the support for native GitLab models in a separate project? @tle_gitlab are you aware of any plans?
      • Maintenance: AI Gateway is closely related to ML topics and is expected to get contributions from ML engineers who are more likely familiar with Python.
      • Existing code: some features/endpoints are already implemented and additional effort will be required to rewrite the existing functionality into another language

      Golang

      • Golang is generally faster than Python and a Golang server can handle more requests (my quick benchmarks showed 10x difference but it’s hard to get exact numbers for our specific use case)
      • HTTP/2 or gRPC support for bidirectional streaming
      • Maintenance: at GitLab, Golang is frequently chosen for a standalone service (Gitlab Shell, Workhorse, Gitaly), especially when scalability and performance are important:
        • Gitlab Shell was originally written in Ruby, rewritten in Golang
        • Gitaly was extracted from GitLab Rails and gradually rewritten into Golang
      • Deployment: since self-managed customers are going to install AI Gateway themselves, it makes sense to limit the number of technologies they have to deal with

      Both options have advantages, so I just wanted to make sure that we deliberately decided to move forward with any of them. Could you please have a look? :pray:

  • Eduardo Bonet removed review request for @eduardobonet

    removed review request for @eduardobonet

  • David O'Regan added 1134 commits

    added 1134 commits

    Compare with previous version

  • mentioned in epic &12973 (closed)

  • mentioned in merge request !144767 (merged)

    • Author Maintainer
      Resolved by David O'Regan

      @jprovaznik

      Breaking out into a new thread for clarity!

      from wording "Self-managed instances should have their own AI Gateway" do I understand correctly that AI Gateway should be deployed for all self-managed instances (which want to use AI)?

      Yes, your understanding is correct. The goal is to provide flexibility for self-managed instances. They can choose to deploy their own AI Gateway to access local models, which would be beneficial for those who want to keep their data and processing within their own infrastructure.

      Alternatively, they can use the centralized AI Gateway provided by GitLab (cloud.gitlab.com) to access third-party AI providers. This would be a good option for those who prefer not to manage their own AI infrastructure or who want to leverage a wider range of AI models and services.

      Should we update the wording to be explicit about this in your opinion?

      /cc @shinya.maeda @mkaeppler

  • David O'Regan
  • added 1 commit

    • f73e769e - Apply 1 suggestion(s) to 1 file(s)

    Compare with previous version

  • Shekhar Patnaik approved this merge request

    approved this merge request

  • Shinya Maeda
  • added 1 commit

    • df9c1c24 - Apply 1 suggestion(s) to 1 file(s)

    Compare with previous version

  • David O'Regan added 215 commits

    added 215 commits

    Compare with previous version

  • Jessie Young
  • Jessie Young
  • Jessie Young mentioned in epic &13383

    mentioned in epic &13383

  • David O'Regan added 571 commits

    added 571 commits

    Compare with previous version

  • added 1 commit

    Compare with previous version

  • David O'Regan added 22 commits

    added 22 commits

    Compare with previous version

  • mentioned in epic &13393

  • David O'Regan added 29 commits

    added 29 commits

    Compare with previous version

  • David O'Regan added 143 commits

    added 143 commits

    Compare with previous version

  • added workflowin review label and removed workflowin dev label

  • David O'Regan added 168 commits

    added 168 commits

    Compare with previous version

  • David O'Regan added 203 commits

    added 203 commits

    Compare with previous version

  • Author Maintainer

    @jessieay

    I have pulled in the changes from the other Blueprint update that was merged this week as an FYI :smile:

  • David O'Regan requested review from @jessieay and removed review request for @shekharpatnaik

    requested review from @jessieay and removed review request for @shekharpatnaik

  • Jessie Young approved this merge request

    approved this merge request

  • LGTM @oregand - let's merge and iterate from here in a new MR

  • Jessie Young resolved all threads

    resolved all threads

  • Jessie Young enabled an automatic merge when the pipeline for 74fbab84 succeeds

    enabled an automatic merge when the pipeline for 74fbab84 succeeds

  • Jessie Young mentioned in commit fa08c704

    mentioned in commit fa08c704

  • added workflowstaging label and removed workflowcanary label

  • mentioned in issue #493853 (closed)

  • mentioned in epic gitlab-org#13393

  • Please register or sign in to reply
    Loading