Self-Hosted Models MVP - Code Generation with Mixtral 8x7B and Mistral 7B
Support for customer self-deployments of a Mistral LLM as a backend for Duo features, as an alternative to the default Vertex or Anthropic models. This initiative supports both internet-connected and air-gapped Self-Managed GitLab deployments. Mistral open-source LLMs are available under the [Apache License 2.0](https://mistral.ai/technology/#models), which is [rated silver](https://blueoakcouncil.org/list), and are therefore pre-approved. Other models will be supported as a follow-up to this MVP.

### Customers

Early-adopter customers are tracked in https://gitlab.com/groups/gitlab-org/-/epics/13700+

### Scope

See the [Blueprint](https://gitlab.com/gitlab-org/gitlab/-/tree/master/doc/architecture/blueprints/custom_models) for more information.

The scope of this MVP is support for:

- [Mixtral-8x7B-instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) for IDE Code Generation
- [Mistral 7B v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) for IDE Code Generation

### Next Steps

The next steps after this MVP are:

- https://gitlab.com/gitlab-org/gitlab/-/issues/459876+
- Additional GitLab Duo features

### Preparation

- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/448567+s
- [x] https://gitlab.com/gitlab-org/gitlab-development-kit/-/issues/2025+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455309+s

### Development Steps

- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/460438+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/444216+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455590+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/454323+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455311+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455858+s
- [x] Baseline open-source models for comparison
  * [x] [Validate and baseline Mixtral 8x7B](https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/issues/187 "Adding Mistral OS Mixtral Models to Prompt Library"): for Code Generation, Mixtral 8x7B performs very well, outperforming Claude 2.1
  * [x] [Validate and baseline Code Gemma](https://gitlab.com/groups/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/-/epics/4 "Evaluate Code Gemma"): for Code Completion, Gemma performs well
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455303+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/460068+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/455315+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/463760+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/452489+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/463821+s
- [x] https://gitlab.com/gitlab-org/gitlab/-/issues/464361+s

### PoC Work

- [MR](https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/merge_requests/602 "Draft: feat: Support for custom models") and [recorded video](https://www.youtube.com/watch?v=DOHBXrwEd-s) discussing architectural options for updating the AI gateway to use custom hosted models.

### Resources

1. Glossary, self-hosted: https://docs.gitlab.com/ee/development/ai_features/glossary.html
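For illustration, a customer self-deployment of the kind this MVP targets could be served behind an OpenAI-compatible API. The sketch below assumes vLLM as the serving layer (one common option, not the only supported path); the port, flags, and prompt are illustrative only, and the host needs a GPU large enough for the chosen model:

```shell
# Sketch only: assumes vLLM is installed and a sufficiently large GPU
# is available. Not the officially supported deployment procedure.

# Serve Mistral 7B Instruct behind an OpenAI-compatible API on port 8000.
python -m vllm.entrypoints.openai.api_server \
  --model mistralai/Mistral-7B-Instruct-v0.1 \
  --port 8000

# Smoke-test the endpoint with a code-generation style prompt.
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistralai/Mistral-7B-Instruct-v0.1",
        "prompt": "# Python function that reverses a string\n",
        "max_tokens": 64
      }'
```

In an air-gapped deployment, the model weights would need to be downloaded in advance and referenced by local path rather than by Hugging Face model ID.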