Functional Testing for Self-Hosted Models (#15523) · Epics · GitLab.org

Functional Testing for Self-Hosted Models

The goal of this epic is to track the progress of our manual testing effort. Take the issues as they are created and follow the instructions below on how to perform the test.

### Current status: 12th Nov, 2024

Testing can be started on vLLM on GCP :thumbsup:. All issues have been weighted and labeled ~"workflow::ready for development".

### Testing Matrix

Legend:

- :white_check_mark: - Test is completed by the engineer
- :x: - Test is not yet completed by the engineer

# GCP / vLLM

| Cloud provider | Status | Model family | Model | Code suggestions | GitLab Duo Chat |
|----------------|--------|--------------|-------|------------------|-----------------|
| On GCP, running on vLLM | :x: | Mistral | [Codestral 22B](https://huggingface.co/mistralai/Codestral-22B-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/501175 | \- |
| On GCP, running on vLLM | :white_check_mark: @manojmj @jpcyiza | Mistral | [Mistral 7B-it](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) | https://gitlab.com/gitlab-org/gitlab/-/issues/501176 | https://gitlab.com/gitlab-org/gitlab/-/issues/500452 |
| On GCP, running on vLLM | :x: | Mistral | [Mixtral 8x7B-it](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/501178 | https://gitlab.com/gitlab-org/gitlab/-/issues/500456 |
| On GCP, running on vLLM | :x: | Mistral | [Mixtral 8x22B-it](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/501181 | https://gitlab.com/gitlab-org/gitlab/-/issues/501184 |

# [Azure AI model catalog](https://azure.microsoft.com/en-us/products/ai-model-catalog)

| Cloud provider | Assigned engineer | Status | Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|----------------|-------------------|--------|--------------|-------|-----------------|-----------------|-----------------|
| On Azure AI model catalog | \- | :x: | Mistral | [~~Codestral 22B~~](https://huggingface.co/mistralai/Codestral-22B-v0.1) Not available on Azure model catalog | https://gitlab.com/gitlab-org/gitlab/-/issues/501174 | https://gitlab.com/gitlab-org/gitlab/-/issues/501174 | \- |
| On Azure AI model catalog | \- | :x: | Mistral | [Mistral 7B](https://huggingface.co/mistralai/Mistral-7B-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/500440 | \- | \- |
| On Azure AI model catalog | \- | :x: | Mistral | [~~Mistral 7B-it~~](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) Not available on Azure model catalog | https://gitlab.com/gitlab-org/gitlab/-/issues/501177 | https://gitlab.com/gitlab-org/gitlab/-/issues/501177 | https://gitlab.com/gitlab-org/gitlab/-/issues/500454 |
| On Azure AI model catalog | | :x: | Mistral | [Mixtral 8x7B-it](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/501179 | https://gitlab.com/gitlab-org/gitlab/-/issues/501179 | https://gitlab.com/gitlab-org/gitlab/-/issues/500464 |
| On Azure AI model catalog | | :x: | Mistral | [Mixtral 8x22B-it](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/501180 | https://gitlab.com/gitlab-org/gitlab/-/issues/501180 | https://gitlab.com/gitlab-org/gitlab/-/issues/501185 |

# Bedrock

Note: Testing was completed once by Manoj and resulted in multiple bugs being registered. We should revisit Bedrock testing once vLLM on GCP testing is done. If vLLM testing registers any new bugs, they should be solved before Bedrock is attempted again.

| Cloud provider | Assigned engineer | Status | Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|----------------|-------------------|--------|--------------|-------|-----------------|-----------------|-----------------|
| Bedrock | Manoj | :white_check_mark: | Claude 3.5 Sonnet | [Claude 3.5 Sonnet](https://www.anthropic.com/news/claude-3-5-sonnet) | https://gitlab.com/gitlab-org/gitlab/-/issues/497190 | https://gitlab.com/gitlab-org/gitlab/-/issues/497190 | https://gitlab.com/gitlab-org/gitlab/-/issues/497190 |
| Bedrock | Manoj | :white_check_mark: | Mistral | [Mixtral 8x7B-it](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) | https://gitlab.com/gitlab-org/gitlab/-/issues/500442 | https://gitlab.com/gitlab-org/gitlab/-/issues/500442 | https://gitlab.com/gitlab-org/gitlab/-/issues/495062 |

## GPT on Azure AI model catalog (Stretch Goal)

| Cloud provider | Assigned engineer | Status | Model family | Model | Code completion | Code generation | GitLab Duo Chat |
|----------------|-------------------|--------|--------------|-------|-----------------|-----------------|-----------------|
| On Azure AI model catalog | \- | :x: | GPT | [GPT-3.5-Turbo](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#gpt-35) | https://gitlab.com/gitlab-org/gitlab/-/issues/497851 | https://gitlab.com/gitlab-org/gitlab/-/issues/497851 | \- |
| On Azure AI model catalog | \- | :x: | GPT | [GPT-4](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#gpt-4) | Issue link, TBD | Issue link, TBD | \- |
| On Azure AI model catalog | \- | :x: | GPT | [GPT-4 Turbo](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#gpt-4) | Issue link, TBD | Issue link, TBD | \- |
| On Azure AI model catalog | \- | :x: | GPT | [GPT-4o](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#gpt-4o-and-gpt-4-turbo) | Issue link, TBD | Issue link, TBD | \- |
| On Azure AI model catalog | \- | :x: | GPT | [GPT-4o-mini](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#gpt-4o-and-gpt-4-turbo) | Issue link, TBD | Issue link, TBD | \- |

# How to perform the test

## Code Suggestions

### Prepare the test

1. Go to the [staging-ref test resources repo](https://staging-ref.gitlab.com/self-hosted/test-resources). Log in or sign up if you have to. You should already be invited as [a member of the test group](https://staging-ref.gitlab.com/groups/self-hosted/-/group_members).
   - If you don't have access to [the repo's group](https://staging-ref.gitlab.com/self-hosted), contact @jpcyiza or @sean_carroll to add you as a group member.
2. Create a new branch from `main` with the following name template: `[issue number]-code-suggestions-[model tested]-[model provider]`
3. Follow the instructions in [the form](https://forms.gle/DmrFqwsBqLPzgpes9).

### Reporting

1. Push the changes generated by the LLM and create an MR.
2. **Comment the link to the MR in the issue you are currently working on.**

See [an example MR](https://staging-ref.gitlab.com/self-hosted/test-resources/-/merge_requests/4).

**Additional actions if you encounter any bugs:**

1. Create an issue in `.com` within [this epic](https://gitlab.com/groups/gitlab-org/-/epics/15523 "Functional Testing for Self-Hosted Models") and %17.6.
   - Use a screenshot of the related MR if needed.
2. Ping @sean_carroll to make sure it is fixed promptly.

Here is an [example](https://gitlab.com/gitlab-org/gitlab/-/issues/499485 "Bug: including the suffix code in Code Suggestions self-hosted models promps").

## Duo Chat

### Prepare the test

1. Go to the [staging-ref test resources repo](https://staging-ref.gitlab.com/self-hosted/test-resources). Log in or sign up if you have to. You should already be invited as [a member of the test group](https://staging-ref.gitlab.com/groups/self-hosted/-/group_members).
   - If you don't have access to [the repo's group](https://staging-ref.gitlab.com/self-hosted), contact @jpcyiza or @sean_carroll to add you as a group member.
2. Clone the repo to your local machine so that you can test the results of slash commands.
   - In your local repo, in the `./chat_test` folder, run `asdf install` or `mise install`
   - In `chat_test/ruby` run `bundle install`
   - In `chat_test/nodejs` run `yarn install`
3. Create a new branch from `main` with the following name template: `[issue number]-duo-chat-[model tested]-[model provider]`
4. Follow the instructions in [the form](https://forms.gle/M3Pf3Si8udT5AedQ8).

### Reporting

1. Push the changes generated by the LLM through the slash commands and create an MR.
2. **Comment the link to the MR in the issue you are currently working on.**

**Additional actions if you encounter any bugs:**

1. Create an issue in `.com` within [this epic for Self-Hosted Models Beta Bugs](https://gitlab.com/groups/gitlab-org/-/epics/15592 "Self-Hosted Models Beta Bugs") and %17.6.
   - Provide the reproduction steps and a video recording of the bug.
   - If slash command changes are involved, push them to the related MR.
2. Ping @sean_carroll to make sure it is fixed promptly.
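
The branch name templates in the preparation steps can be filled in by hand, but a small helper keeps names consistent across testers. The following is a minimal sketch only: the function names `slugify` and `branch_name` are hypothetical, and lowercasing and hyphenating the model and provider names is an assumption, since the templates do not say how those parts should be formatted.

```python
import re

# The two feature segments that appear in the branch name templates.
FEATURES = ("code-suggestions", "duo-chat")


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens.

    Assumption: the epic does not prescribe a format for model or provider
    names, so this normalization is only one reasonable choice.
    """
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def branch_name(issue_number: int, feature: str, model: str, provider: str) -> str:
    """Build `[issue number]-[feature]-[model tested]-[model provider]`."""
    if feature not in FEATURES:
        raise ValueError(f"feature must be one of {FEATURES}, got {feature!r}")
    return f"{issue_number}-{feature}-{slugify(model)}-{slugify(provider)}"


if __name__ == "__main__":
    # Issue 501176 tracks Code Suggestions for Mistral 7B-it on vLLM.
    print(branch_name(501176, "code-suggestions", "Mistral 7B-it", "vLLM"))
    # 501176-code-suggestions-mistral-7b-it-vllm
```

The resulting string can then be passed to `git switch -c` when creating the branch from `main`.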
