Self Hosted Duo Chat GA
This epic tracks the tasks required to take Self-Hosted Models to GA. The top-level self-hosted Chat epic can be found [here](https://gitlab.com/groups/gitlab-org/-/epics/13760).
[This working spreadsheet](https://docs.google.com/spreadsheets/d/1YfoF_QNcOrOsMALndDCBZy8hkPOSoptl3Ijyf_9anHo/edit?gid=0#gid=0) is a matrix of the models targeted for GA, with their deployment and feature status.
Based on [user data for Duo Chat](https://10az.online.tableau.com/#/site/gitlab/views/DuoCategoriesofQuestions/DuoCategory?:iid=2), the GA for Chat will focus heavily on enabling tools that write, improve, or explain code. We will also focus on delivering the ability to answer questions about GitLab documentation. Chat GA will therefore include all of the functionality currently carried out by the [agent](https://gitlab.com/groups/gitlab-org/-/epics/13760#agentic), as well as [slash commands](https://gitlab.com/groups/gitlab-org/-/epics/13760#single-action-executor) for /explain, /refactor, and /test. Note that there are currently no datasets to test the performance of /refactor and /test.
The supported feature list is:
* [**Zero Shot Agent**](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/llm/chain/agents/zero_shot/executor.rb)
* [**EpicReader**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/epic_reader)
* [**IssueReader**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/issue_reader)
* [**GitlabDocumentation**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/gitlab_documentation)
* [**Explain Code**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/explain_code) - **/explain**
* [**RefactorCode**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/refactor_code) - **/refactor**
* [**FixCode**](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/llm/chain/tools/fix_code/executor.rb) - **/fix**
* [**WriteTests**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/write_tests) - **/test**
The following features will **NOT** be supported for self-hosted Chat GA, but will follow in subsequent milestones:
* [**Explain Vulnerability**](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/llm/chain/tools/explain_vulnerability/executor.rb) - **/vulnerability_explain**
* [**Troubleshoot_Job**](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/llm/chain/tools/troubleshoot_job/executor.rb) - **/troubleshoot**
* [**MergeRequestReader**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/merge_request_reader)
* [**CIEditorAssistant**](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/llm/chain/tools/ci_editor_assistant/executor.rb)
* [**SummarizeComments**](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/lib/gitlab/llm/chain/tools/summarize_comments)

The core models supporting self-hosted Chat GA will include the following, which are also supported for both sub-features of Code Suggestions.
**Supported Models**
* [Mistral 7B-it](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
* [Mixtral 8x7B-it](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
* [Mixtral 8x22B-it](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1)
* Anthropic Claude 3.5 Sonnet
* GPT 4o
* GPT 4 Turbo
* GPT 4o-mini

**Supported model platforms**
* [AWS Bedrock](https://aws.amazon.com/bedrock/) (Mistral, Anthropic)
* [vLLM](https://blog.vllm.ai/2023/06/20/vllm.html) (Mistral)
* [Azure AI](https://azure.microsoft.com/en-us/products/ai-services) (GPTs)
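
The model and platform lists above amount to a small compatibility matrix. As a rough sketch only, the pairing could be written as a lookup like the one below. The names `SUPPORTED_PLATFORMS`, `MODEL_FAMILIES`, and `platforms_for` are hypothetical and do not come from the GitLab codebase, the family labels are an assumption made for illustration, and only a subset of the listed models is included.

```python
# Platform -> model families it serves, transcribed from the platform list above.
SUPPORTED_PLATFORMS = {
    "AWS Bedrock": {"Mistral", "Anthropic"},
    "vLLM": {"Mistral"},
    "Azure AI": {"GPT"},
}

# Model -> family. The family labels are an assumption for this sketch,
# and this is only a subset of the models listed above.
MODEL_FAMILIES = {
    "Mistral 7B-it": "Mistral",
    "Mixtral 8x7B-it": "Mistral",
    "Mixtral 8x22B-it": "Mistral",
    "Claude 3.5 Sonnet": "Anthropic",
    "GPT 4o": "GPT",
    "GPT 4o-mini": "GPT",
}


def platforms_for(model: str) -> list[str]:
    """Return the platforms (sorted by name) whose families include the model's family."""
    family = MODEL_FAMILIES[model]
    return sorted(
        platform
        for platform, families in SUPPORTED_PLATFORMS.items()
        if family in families
    )
```

For example, `platforms_for("Mixtral 8x7B-it")` lists both AWS Bedrock and vLLM, while the GPT models map only to Azure AI.
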
<table>
<tr>
<th>Theme</th>
<th>Epic</th>
<th>Issue</th>
<th>Notes</th>
<th>Status</th>
<th>Milestone</th>
<th>Team/DRI</th>
</tr>
<tr>
<td rowspan="12">Model Support</td>
<td rowspan="7">
https://gitlab.com/groups/gitlab-org/-/epics/14671+s
</td>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/462592+s
</td>
<td>
included use cases:
* zero-shot
* code generation
* GL doc
* issue/epic
</td>
<td>
:white_check_mark:
</td>
<td>17.2</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/465722+s
</td>
<td>
included use cases:
* zero-shot
* code generation
* GL doc
* issue/epic
</td>
<td>
:white_check_mark:
</td>
<td>17.2</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/465721+s
</td>
<td>
included use cases:
* zero-shot
* code generation
* GL doc
* issue/epic
</td>
<td>
:white_check_mark:
</td>
<td>17.2</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/groups/gitlab-org/-/epics/15347+s
</td>
<td>
/test
/refactor
/explain
/fix
</td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/492077+s
</td>
<td>Issue/epic performance was below 80% 'good'; further iteration is needed.</td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/495062+s
</td>
<td></td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/issues/509+s
</td>
<td></td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td rowspan="4">
[Anthropic for Duo Chat](https://gitlab.com/groups/gitlab-org/-/epics/15121)
</td>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/493373+s
</td>
<td>
* issue/epic
</td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/493372+s
</td>
<td>
* zero-shot
* code generation
* GL doc
</td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/groups/gitlab-org/-/epics/15346+s
</td>
<td>
/test
/refactor
/explain
/fix
</td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/495057+s
</td>
<td></td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/groups/gitlab-org/-/epics/15730+s
</td>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/493400+s
</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td rowspan="3">GitLab Index</td>
<td rowspan="3">
https://gitlab.com/groups/gitlab-org/-/epics/15520+s
</td>
<td>
https://gitlab.com/gitlab-org/modelops/applied-ml/code-suggestions/ai-assist/-/issues/522+s
</td>
<td></td>
<td>
:white_check_mark:
</td>
<td>17.2</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/481749+s
</td>
<td></td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/471910+s
</td>
<td></td>
<td></td>
<td>17.6</td>
<td>Custom Models</td>
</tr>
<tr>
<td rowspan="2">Testing</td>
<td rowspan="2">
https://gitlab.com/groups/gitlab-org/-/epics/15285+s
</td>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/497283+s
</td>
<td></td>
<td></td>
<td>17.5</td>
<td>Custom Models</td>
</tr>
<tr>
<td>
https://gitlab.com/gitlab-org/gitlab/-/issues/497284+s
</td>
<td></td>
<td></td>
<td>17.5</td>
<td>Custom Models</td>
</tr>
<tr>
<td>Documentation</td>
<td>
https://gitlab.com/groups/gitlab-org/-/epics/15326+s
</td>
<td></td>
<td></td>
<td></td>
<td>17.5 / 17.6</td>
<td>Custom Models</td>
</tr>
</table>