# Model Routing within Duo Features

## Overview
For self-managed GitLab Duo, some sophisticated customers can host multiple models across a combination of enterprise infrastructure and private clouds. Self-hosted and semi-airgapped customers want the ability to route LLM inputs based not only on the Duo feature, but also on the input itself or on attributes of the individual user. It may be preferable to route a given request to different LLMs depending on:
- Cost
- Information security
- User usage budgets

## Proposal
- Allow a self-managed customer to host more than one model for a given Duo feature
- The customer defines rules that route requests to alternative models based on document or user permissions; the elements we capture in customer-hosted logging will determine how we can inform model routing. Illustrative sketches follow each routing category below.

### Document-based model routing
- Project membership
- Project supergroups
- CODEOWNERS
- Data security classification level of the user’s current file, repo, or group
- GitLab Control Framework
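
A minimal sketch of how document-based rules might be expressed, assuming hypothetical classification labels, endpoint names, and a document-context structure; none of these identifiers exist in GitLab today.

```python
from dataclasses import dataclass, field

# Hypothetical endpoint names; placeholders for whatever the customer hosts.
ON_PREM_MODEL = "on-prem-open-source"
PRIVATE_CLOUD_MODEL = "private-cloud"

@dataclass
class DocumentContext:
    """Attributes of the document a request originates from (assumed to be available)."""
    project_path: str                     # e.g. "supergroup/subgroup/project"
    classification: str                   # e.g. "public", "internal", "restricted"
    codeowners: list[str] = field(default_factory=list)  # resolved CODEOWNERS entries

def route_by_document(doc: DocumentContext) -> str:
    """Keep restricted or CODEOWNERS-protected content on the on-prem model."""
    if doc.classification == "restricted":
        return ON_PREM_MODEL
    if doc.codeowners:
        return ON_PREM_MODEL
    return PRIVATE_CLOUD_MODEL
```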

### User-based model routing
- Simple list of user IDs
- User roles
- User group and supergroup membership
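
A comparable sketch for user-based rules; the user IDs, role names, and group paths below are placeholders, not real configuration.

```python
# Placeholder rule data; a real deployment would load this from customer configuration.
RESTRICTED_USER_IDS = {42, 1138}          # simple list of user IDs
RESTRICTED_GROUPS = {"acme/security"}     # group or supergroup membership

def route_by_user(user_id: int, role: str, groups: set[str]) -> str:
    """Keep requests from restricted users, roles, or groups on the on-prem model."""
    if user_id in RESTRICTED_USER_IDS or role in {"guest", "external"}:
        return "on-prem-open-source"
    if groups & RESTRICTED_GROUPS:
        return "on-prem-open-source"
    return "private-cloud"
```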

### End-user-controlled model routing
- The end user can select which model an input is routed to; the user classifies their own input

### Context-based model routing
- Text within the input
  - Example: PII detection
  - Example: a zero-shot model determines the text to be of high sensitivity
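
A sketch of context-based routing in which a naive regex check stands in for a real PII detector or zero-shot sensitivity classifier.

```python
import re

# Naive PII patterns, purely to illustrate the routing decision;
# a production detector or zero-shot model would replace them.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),   # email addresses
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),     # US SSN-like numbers
]

def route_by_context(prompt: str) -> str:
    """Send inputs containing apparent PII to the on-prem model."""
    if any(pattern.search(prompt) for pattern in PII_PATTERNS):
        return "on-prem-open-source"
    return "private-cloud"
```

In practice the customer would decide the order in which document, user, and context predicates are evaluated, with the first (or most restrictive) match winning.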

## Definition of Done
When a user engages with Duo Chat or Code Suggestions:
- their input is automatically routed to either the on-prem OS model or a private cloud hosted model
- the rules by which this automation happens are configurable by the customer
- the user has the option to manually override the input to go to the on-prem OS model, but cannot override it to go to the private cloud hosted model (see the sketch below)
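
A minimal sketch of the override constraint above, using the same placeholder endpoint names as the earlier sketches: an override toward the on-prem open-source model is honoured, while an override toward the private cloud model is ignored.

```python
from typing import Optional

def resolve_model(auto_choice: str, user_override: Optional[str]) -> str:
    """Apply the automatic routing decision, honouring only overrides to on-prem."""
    if user_override == "on-prem-open-source":
        return user_override       # the user may always pull a request back on-prem
    # No override, or an attempted override to the private cloud model:
    # keep the automatically selected endpoint.
    return auto_choice
```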