Self-Hosted Models MVP - Code Generation with Mixtral 8x7B and Mistral 7B
This MVP adds support for customer self-deployments of a Mistral LLM as a backend for GitLab Duo features, as an alternative to the default Vertex AI or Anthropic models. The initiative supports both internet-connected and air-gapped Self-Managed GitLab deployments.
Mistral open-source LLMs are available under the [Apache License 2.0](https://mistral.ai/technology/#models) and are [rated silver](https://blueoakcouncil.org/list), and are therefore pre-approved. Other models will be supported as a follow-up to this MVP.
### Customers
Early-adopter customers are tracked in https://gitlab.com/groups/gitlab-org/-/epics/13700+
### Scope
See the [Blueprint](https://gitlab.com/gitlab-org/gitlab/-/tree/master/doc/architecture/blueprints/custom_models) for more information.
The scope of this MVP will be support for:
- [Mixtral-8x7B-instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) for IDE Code Generation
- [Mistral 7B v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) for IDE Code Generation
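For the instruct variant in scope (Mixtral-8x7B-Instruct), code-generation instructions are wrapped in Mistral's `[INST] ... [/INST]` chat markers before being sent to the model. A minimal sketch (the helper name is hypothetical; real deployments should apply the model tokenizer's chat template rather than formatting strings by hand):

```python
def build_codegen_prompt(instruction: str) -> str:
    """Wrap a code-generation instruction in Mistral's instruct format.

    Applies to instruct-tuned variants such as Mixtral-8x7B-Instruct;
    the base Mistral 7B v0.1 model has no chat template.
    """
    return f"<s>[INST] {instruction.strip()} [/INST]"

prompt = build_codegen_prompt("Write a Python function that reverses a string.")
print(prompt)
```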
### Next Steps
The next steps after this MVP are:
- https://gitlab.com/gitlab-org/gitlab/-/issues/459876+
- Additional GitLab Duo Features
### Preparation
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/448567+s
- [x] https://gitlab.com/gitlab-org/gitlab-development-kit/-/issues/2025+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455309+s
### Development Steps
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/460438+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/444216+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455590+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/454323+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455311+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455858+s
- [x] Baseline open-source models for comparison
  * [x] [Validate and baseline Mixtral 8x7B](https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/issues/187 "Adding Mistral OS Mixtral Models to Prompt Library")
    * For Code Generation, Mixtral 8x7B performs very well, outperforming Claude 2.1
* [x] [Validate and baseline Code Gemma](https://gitlab.com/groups/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/-/epics/4 "Evaluate Code Gemma")
    * For Code Completion, Code Gemma performs well
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455303+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/460068+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455315+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/463760+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/452489+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/463821+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/464361+s
### PoC Work
- [MR](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/merge_requests/602 "Draft: feat: Support for custom models") and [recorded video](https://www.youtube.com/watch?v=DOHBXrwEd-s) discussing architectural options for updating AI gateway to use custom hosted models.
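In this architecture, the AI gateway forwards code-generation requests to the customer-hosted model server. As one illustration only (not the gateway's actual implementation, and the endpoint URL is an assumption), a request body for an OpenAI-compatible completions endpoint, such as the one a vLLM deployment exposes, could be assembled like this:

```python
import json

# Hypothetical serving endpoint: the actual URL depends on how the
# customer deploys the model (e.g. vLLM's OpenAI-compatible server).
ENDPOINT = "http://localhost:8000/v1/completions"

payload = {
    "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "prompt": "<s>[INST] Write a Python function that reverses a string. [/INST]",
    "max_tokens": 256,
    "temperature": 0.2,
}

# The gateway would POST this JSON body to the self-hosted endpoint.
body = json.dumps(payload)
print(f"POST {ENDPOINT}")
print(body)
```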
### Resources
1. Glossary, self-hosted: https://docs.gitlab.com/ee/development/ai_features/glossary.html