Direct IDE Access with Model switching

Problem

With the introduction of Model switching, the model selected for a namespace is stored on GitLab. To reduce latency, however, IDE code completion can send requests directly to AI Gateway (AIGW). Because these direct requests do not pass through GitLab, we are not able to inject the right model information into them. In this scenario, AIGW uses the default completion model rather than the model selected by the customer.

We need a mechanism where IDEs can eagerly retrieve information about the model, so that it can be sent along with direct requests to AI Gateway.
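One way to picture such a mechanism is a small client-side cache that the IDE fills ahead of time and reads from when building a direct request. The sketch below is illustrative only: `ModelSelection`, `DirectCompletionRequest`, `ModelSelectionCache`, `refreshSelection`, `buildDirectRequest`, the field names, and the TTL value are all hypothetical and are not taken from the GitLab or AI Gateway APIs.

```typescript
// Hypothetical shape of the namespace-level model selection stored on GitLab.
interface ModelSelection {
  namespaceId: number;
  modelIdentifier: string;
}

// Hypothetical payload for a direct completion request to AI Gateway.
interface DirectCompletionRequest {
  prompt: string;
  // Left undefined when nothing is cached, so AI Gateway uses its default model.
  modelIdentifier?: string;
}

// Keeps the last known selection so the completion path never waits on GitLab.
class ModelSelectionCache {
  private readonly ttlMs: number;
  private selection?: ModelSelection;
  private fetchedAt = 0;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  store(selection: ModelSelection, now: number = Date.now()): void {
    this.selection = selection;
    this.fetchedAt = now;
  }

  // Returns the last known selection, even if it is stale.
  get(): ModelSelection | undefined {
    return this.selection;
  }

  // True when nothing is cached yet or the cached value is older than the TTL.
  isStale(now: number = Date.now()): boolean {
    return this.selection === undefined || now - this.fetchedAt > this.ttlMs;
  }
}

// Eagerly refreshes the cache. `fetchSelection` stands in for whatever call
// the IDE would make to GitLab; it is an assumption, not a real endpoint.
async function refreshSelection(
  cache: ModelSelectionCache,
  fetchSelection: () => Promise<ModelSelection>,
  now: number = Date.now(),
): Promise<void> {
  if (cache.isStale(now)) {
    cache.store(await fetchSelection(), now);
  }
}

// Builds the direct request, attaching the cached model when one is known.
function buildDirectRequest(
  prompt: string,
  cache: ModelSelectionCache,
): DirectCompletionRequest {
  const selection = cache.get();
  return selection
    ? { prompt, modelIdentifier: selection.modelIdentifier }
    : { prompt };
}
```

In this sketch the refresh would run in the background (for example on IDE startup and on a timer), while `buildDirectRequest` only reads from memory, so the completion path itself adds no round trip to GitLab. Serving the last known selection while a refresh is pending is a design choice made here for illustration; a real implementation might prefer a different staleness policy.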

Definition of Done

Customers are able to set their preferred model from the available options in the GitLab UI at the namespace level on GitLab.com
Users should experience their selected AI models consistently across all GitLab interfaces, including IDEs; model preferences set at the namespace level should be automatically respected in IDE environments
Changes to model configurations should apply to IDE experiences without requiring user intervention; IDE users should be shielded from technical complexities related to model switching
Namespace owners must be confident that their model selections are consistently enforced across all environments; users should have visibility into which AI models are powering their experiences and be able to verify which models are active in their current environment
Maintain the low-latency experience developers expect from Duo in the IDE, regardless of the model selection mechanism

Edited Jun 09, 2025 by 🤖 GitLab Bot 🤖