UX Design for Model Routing

Overview

For self-managed GitLab Duo, some sophisticated customers host multiple models across a combination of enterprise infrastructure and private clouds. Self-hosted and semi-air-gapped customers want to route LLM inputs based not only on the Duo feature in use, but also on the input itself or on parameters of the individual user. A customer may prefer to route a given request to a different LLM depending on:

  • Cost
  • Information security
  • User usage budgets

Proposal

  1. Allow a self-managed customer to host more than one model for a given Duo feature.
  2. Allow customers to define and configure rules, based on the available parameters, that determine when an input is routed to one model or another.
  3. Show users which model their input will be routed to (this can be a high-sec or low-sec visualization).
  4. Give the user the option to manually override routing so that the input goes to the on-prem OS model (high-sec), but do not allow an override that sends it to the private cloud hosted model (low-sec).
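
Points 2 and 4 can be sketched as an ordered rule list plus a one-way override. This is a minimal illustration, not a proposed implementation: `Target`, `Request`, `Rule`, `route`, and `apply_override` are hypothetical names, and first-match-wins evaluation with an on-prem default is an assumption rather than something this proposal specifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Sequence


class Target(Enum):
    ON_PREM = "on-prem (high-sec)"
    PRIVATE_CLOUD = "private cloud (low-sec)"


@dataclass(frozen=True)
class Request:
    feature: str  # hypothetical feature key, e.g. "duo_chat"
    user_id: int
    text: str


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Request], bool]
    target: Target


def route(request: Request, rules: Sequence[Rule],
          default: Target = Target.ON_PREM) -> Target:
    """Return the target of the first matching rule (assumed semantics)."""
    for rule in rules:
        if rule.matches(request):
            return rule.target
    # Assumption: fall back to the high-sec target when no rule matches.
    return default


def apply_override(automatic: Target, requested: Optional[Target]) -> Target:
    """Honor a manual override only when it moves the input to high-sec."""
    if requested is Target.ON_PREM:
        return Target.ON_PREM
    # A request for the low-sec target is ignored; the automatic result stands.
    return automatic
```

With these definitions, `apply_override(Target.ON_PREM, Target.PRIVATE_CLOUD)` still returns `Target.ON_PREM`, which is the asymmetry point 4 asks for. The value returned by `route` is also what a high-sec or low-sec indicator (point 3) could display before the input is sent.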

Proposed Parameters

Document-based model routing

  • Project membership
  • Project supergroups
  • CODEOWNERS
  • Data security classification level of the user’s current file, repo, or group
  • GitLab Control Framework
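
One way to act on a classification level is to take the most restrictive label among the file, repo, and group, then compare it to a customer-chosen threshold. The sketch below assumes a four-level scheme (`PUBLIC` through `RESTRICTED`) and treats unlabeled content as most restrictive; both are illustrative assumptions, and all names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Classification(IntEnum):
    # Hypothetical levels; a customer's actual scheme may differ.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def effective_classification(
    file: Optional[Classification] = None,
    repo: Optional[Classification] = None,
    group: Optional[Classification] = None,
    fallback: Classification = Classification.RESTRICTED,
) -> Classification:
    """Most restrictive label wins; unlabeled content uses the fallback."""
    labels = [c for c in (file, repo, group) if c is not None]
    return max(labels) if labels else fallback


def requires_on_prem(
    level: Classification,
    threshold: Classification = Classification.CONFIDENTIAL,
) -> bool:
    """True when the level meets or exceeds the configured threshold."""
    return level >= threshold
```

For example, a public file inside a confidential repo resolves to `CONFIDENTIAL` and would be routed on-prem under the default threshold.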

User-based model routing

  • Simple list of user IDs
  • User roles
  • User group and supergroup membership
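
The three user-based parameters could be combined into a single policy check. This sketch is an assumption-laden illustration: `User`, `UserPolicy`, and `user_matches` are hypothetical, role names are plain strings, and groups are represented as slash-separated paths so that a supergroup is any path prefix ending on a segment boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    role: str
    groups: frozenset[str] = frozenset()  # full group paths, e.g. "acme/security/red-team"


@dataclass(frozen=True)
class UserPolicy:
    user_ids: frozenset[int] = frozenset()
    roles: frozenset[str] = frozenset()
    groups: frozenset[str] = frozenset()  # matches the group and all of its subgroups


def ancestors_and_self(path: str) -> set[str]:
    """Return the group path plus every supergroup path above it."""
    parts = path.split("/")
    return {"/".join(parts[:i]) for i in range(1, len(parts) + 1)}


def user_matches(user: User, policy: UserPolicy) -> bool:
    """True if the user is listed by ID, by role, or by (super)group membership."""
    if user.id in policy.user_ids:
        return True
    if user.role in policy.roles:
        return True
    return any(ancestors_and_self(g) & policy.groups for g in user.groups)
```

Splitting on `/` rather than using a string-prefix test keeps a group such as `acme/securityx` from matching a policy on `acme/security`.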

End user controlled model routing

  • The end user can select which model to route an input to; the user classifies their own input

Context-based model routing

  • Text within the input
    • Example: PII detection
    • Example: a zero-shot model determines that the text is highly sensitive
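
Both examples can be sketched as a cheap pattern check followed by an optional model-based score. The two regular expressions below (email address and US Social Security number format) are deliberately simplistic placeholders, not a complete PII detector, and `classifier` stands in for whatever zero-shot model a customer might plug in; its signature and the 0.5 threshold are assumptions.

```python
import re
from typing import Callable, Optional

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def contains_pii(text: str) -> bool:
    """True if any of the illustrative patterns appears in the text."""
    return any(p.search(text) for p in (EMAIL, US_SSN))


def is_sensitive(
    text: str,
    classifier: Optional[Callable[[str], float]] = None,
    threshold: float = 0.5,
) -> bool:
    """Flag text as sensitive via pattern match, then via an optional model score.

    `classifier` is a hypothetical callable returning a sensitivity
    score in [0, 1], for example a wrapper around a zero-shot model.
    """
    if contains_pii(text):
        return True
    if classifier is not None:
        return classifier(text) >= threshold
    return False
```

Running the pattern check first means the model is only consulted for inputs that pass it, which may matter when cost is one of the routing criteria.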

Definition of Done

When a user engages with Duo Chat or Code Suggestions:

  • their input is automatically routed to either the on-prem OS model or a private cloud hosted model
  • the rules by which this automation happens are configurable by the customer
  • the user has the option to manually override the input to go to the on-prem OS model, but cannot override to send it to the private cloud hosted model
Edited by Susie Bitters