[UX] Model Routing per Duo Feature
# Overview

For self-managed GitLab Duo, some sophisticated customers are able to host multiple models, which could be a combination of enterprise infrastructure and private clouds. Self-hosted and semi-airgapped customers want the ability to switch the routing of LLM inputs based not just on Duo features, but also on the LLM input itself or an individual user's parameters. It may be preferable to route a given request to different LLMs depending on:

* Cost
* Information security
* User usage budgets

# Proposal

1. Allow a self-managed customer to host more than one model for a given Duo feature.
2. Allow customers to define and configure rules, based on the available parameters, for when inputs should be routed to one model or another.
3. Users are easily able to discern visually which model their input will be routed to (this can be a high-sec or low-sec visualization).
4. The user has the option to easily and manually override the input to go to the on-prem OS model (high-sec), but cannot override it to go to the private cloud hosted model (low-sec).

## Proposed Parameters

### Document-based model routing

* Project membership
* Project supergroups
* CODEOWNERS
* Data security classification level of the user's current file, repo, or group
* [GitLab Control Framework](https://handbook.gitlab.com/handbook/security/security-assurance/security-compliance/sec-controls/)

### User-based model routing

* Simple list of user IDs
* User roles
* User group and supergroup membership

### End-user-controlled model routing

* The end user is able to select which model to route an input to; the user classifies their own input.

### Context-based model routing

* Text within the input
  * Example: [PII detection](https://private-ai.com/)
  * Example: a zero-shot model determines the text to be of high sensitivity

# Definition of Done

When a user engages with Duo Chat or Code Suggestions:

* Their input is automatically routed to either the on-prem OS model or a private cloud hosted model.
* The rules by which this automation happens are configurable by the customer.
* The user has the option to manually override the input to go to the on-prem OS model, but cannot override it to go to the private cloud hosted model.
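To make the proposal concrete, customer-configurable routing rules could be modeled as an ordered list of predicates evaluated first-match-wins against the request's parameters. This is only a minimal sketch: the model names, the `RequestContext` fields, and every rule below are illustrative assumptions, not an actual GitLab configuration schema.

```python
from dataclasses import dataclass

# Hypothetical model identifiers (not real GitLab names)
ON_PREM = "on-prem-os-model"            # high-sec target
PRIVATE_CLOUD = "private-cloud-model"   # low-sec target

@dataclass
class RequestContext:
    """Illustrative parameters a routing rule might inspect."""
    user_id: int
    user_roles: set
    project_path: str           # e.g. "supergroup/group/project"
    data_classification: str    # e.g. "public", "internal", "restricted"
    input_text: str

# Customer-defined rules, evaluated top to bottom; first match wins.
# Each entry is (predicate, target model).
RULES = [
    # Document-based: restricted content never leaves the premises
    (lambda ctx: ctx.data_classification == "restricted", ON_PREM),
    # User-based: a simple allow-list of user IDs pinned to on-prem
    (lambda ctx: ctx.user_id in {101, 102}, ON_PREM),
    # Context-based: naive keyword screen standing in for real PII detection
    (lambda ctx: "password" in ctx.input_text.lower(), ON_PREM),
]

DEFAULT_MODEL = PRIVATE_CLOUD

def route(ctx: RequestContext) -> str:
    """Return the model an input should be routed to under RULES."""
    for predicate, target in RULES:
        if predicate(ctx):
            return target
    return DEFAULT_MODEL
```

First-match-wins keeps the customer's configuration auditable: the most restrictive rules sit at the top, and anything unmatched falls through to the default model.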
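The one-way override in the Definition of Done (a user may tighten routing to the on-prem model but never loosen it toward the private cloud) amounts to a small guard applied after the automatic routing decision. The function and model names below are hypothetical, sketched only to pin down the intended semantics.

```python
from typing import Optional

# Hypothetical model identifiers (not real GitLab names)
ON_PREM = "on-prem-os-model"            # high-sec target
PRIVATE_CLOUD = "private-cloud-model"   # low-sec target

def apply_override(routed_model: str, requested_model: Optional[str]) -> str:
    """Apply a user's manual override to the automatically routed model.

    Tightening security (forcing on-prem) is always honored; any attempt
    to loosen routing toward the private cloud is ignored.
    """
    if requested_model == ON_PREM:
        return ON_PREM
    # No override requested, or an attempt to loosen routing: keep the
    # automatic decision.
    return routed_model
```

Because the guard only ever moves a request toward the high-sec model, a misconfigured client can never cause classified input to reach the low-sec private cloud model.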