
Model Routing within Duo Features

Overview

For self-managed GitLab Duo, some sophisticated customers are able to host multiple models across a combination of enterprise infrastructure and private clouds. Self-hosted and semi-air-gapped customers want the ability to switch the routing of LLM inputs based not just on the Duo feature, but also on the input itself or on attributes of the individual user. It may be preferable to route a given request to different LLMs depending on:

  • Cost
  • Information security
  • User usage budgets

Proposal

  1. Allow a self-managed customer to host more than one model for a given Duo feature.
  2. Allow the customer to define rules that route requests to alternative models based on document or user attributes; the elements captured in customer-hosted logging will determine which signals are available to inform model routing. A sketch of one possible rule format follows this list.
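
This proposal does not fix a rule format. The sketch below is one possible shape, assuming rules are evaluated top to bottom with the first match winning; every field name and model identifier in it is hypothetical rather than an existing GitLab configuration surface.

  # Hypothetical, customer-authored routing rules: the first matching rule wins.
  # All keys and model identifiers are illustrative, not a GitLab API.
  ROUTING_RULES = [
      {
          # Highly classified content never leaves enterprise infrastructure.
          "match": {"data_classification": ["restricted", "confidential"]},
          "route_to": "on_prem_open_source_model",
      },
      {
          # Users on a capped usage budget stay on the cheaper model.
          "match": {"user_budget_tier": ["capped"]},
          "route_to": "on_prem_open_source_model",
      },
      {
          # Everything else may use the private cloud hosted model.
          "match": {},
          "route_to": "private_cloud_model",
      },
  ]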

Document-based model routing

  • Project membership
  • Project supergroups
  • CODEOWNERS
  • Data security classification level of the user’s current file, repository, or group
  • GitLab Control Framework

User-based model routing

  • Simple list of user IDs
  • User roles
  • User group and supergroup membership (a combined evaluation sketch follows this list)
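
The document-based and user-based attributes above could feed a single rule evaluator. The following sketch assumes each request arrives with a small dictionary of such attributes; the keys shown are hypothetical, as is the fallback model name, and the rule format is the one sketched under the Proposal section.

  def select_model(request_attributes, rules):
      """Return the model named by the first rule whose conditions all match.

      request_attributes is a dict of document- and user-based signals, for
      example {"data_classification": "restricted", "user_role": "developer",
      "user_id": 42, "group": "payments"} (all keys hypothetical).
      """
      for rule in rules:
          if all(request_attributes.get(key) in allowed
                 for key, allowed in rule["match"].items()):
              return rule["route_to"]
      # No rule matched: fall back to the most restrictive option.
      return "on_prem_open_source_model"

With the rule set sketched earlier, a request tagged with data_classification "restricted" would reach the on-prem model regardless of the user's other attributes.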

End-user-controlled model routing

  • The end user is able to select which model an input is routed to; in effect, the user classifies their own input

Context-based model routing

  • Text within the input
    • Example: PII detection
    • Example: a zero-shot model determines the text to be of high sensitivity (see the sketch after this list)
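
Context-based routing requires inspecting the input text itself. The sketch below pairs a deliberately trivial regex PII check with a placeholder zero-shot sensitivity classifier; the patterns, the classifier interface, and the model names are illustrative assumptions, not a proposed implementation.

  import re

  # Illustrative PII patterns only; a real deployment would rely on a proper
  # PII detection library or service.
  PII_PATTERNS = [
      re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like number
      re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
  ]

  def route_by_context(text, zero_shot_classifier):
      """Route high-sensitivity inputs to enterprise infrastructure.

      zero_shot_classifier is assumed to be any callable mapping text to a
      label such as "high sensitivity" or "low sensitivity" (hypothetical).
      """
      if any(pattern.search(text) for pattern in PII_PATTERNS):
          return "on_prem_open_source_model"
      if zero_shot_classifier(text) == "high sensitivity":
          return "on_prem_open_source_model"
      return "private_cloud_model"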

Definition of Done

When a user engages with Duo Chat or Code Suggestions:

  • Their input is automatically routed to either the on-prem open-source model or a private cloud hosted model
  • The rules by which this automation happens are configurable by the customer
  • The user has the option to manually override routing so that the input goes to the on-prem open-source model, but cannot override routing to send it to the private cloud hosted model (see the sketch after this list)
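
The override in the last bullet is asymmetric: a user may force a request down to the on-prem open-source model, but never up to the private cloud hosted model. A minimal sketch of that check, reusing the hypothetical model names from the earlier sketches:

  def apply_user_override(automatic_choice, user_requested_model):
      """Honor a manual override only when it targets the on-prem model."""
      if user_requested_model is None:
          return automatic_choice
      if user_requested_model == "on_prem_open_source_model":
          # Overriding toward the on-prem open-source model is always allowed.
          return user_requested_model
      # Overriding toward the private cloud hosted model is never allowed.
      return automatic_choice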

