Baseline Evaluation of DeepSeekCoder Model Series on CEF Code Suggestion Dataset
This issue captures the work and results of the baseline validation assessment for all 8 DeepSeekCoder models. The goal is to determine which models in the series we will support for Duo Chat Code Suggestion functions (code generation and code completion).
For each variant, we will host the model in the local GDK and run it against the complete [Code Suggestions pipelines](https://gitlab.com/groups/gitlab-org/modelops/ai-model-validation-and-research/-/epics/6#data-sets--use-cases "Models / Datasets / Metrics / Systems") for Code Generation (MBPP and code_generation_v2 (development)) and Code Completion ([dataset_v2](https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1sdev-ai-research-0e2f8974!2scode_suggestion!3sdataset_v2)), establishing performance baselines by following the process outlined [here](https://gitlab.com/gitlab-org/gitlab/-/issues/468933).
* [deepseek-ai/deepseek-coder-33b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-33b-instruct)
* [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)
* [deepseek-ai/deepseek-coder-7b-instruct-v1.5](https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5)
* [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct)
* [deepseek-ai/deepseek-coder-6.7b-base](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base)
* [deepseek-ai/deepseek-coder-7b-base-v1.5](https://huggingface.co/deepseek-ai/deepseek-coder-7b-base-v1.5)
* [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base)
* [deepseek-ai/deepseek-coder-33b-base](https://huggingface.co/deepseek-ai/deepseek-coder-33b-base)
#### Definition of Done
Each model's performance is documented in this issue, with baseline scores reported as cosine similarity to the ground truth.
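
For reference, the snippet below is a minimal sketch of how a cosine-similarity score between a model output and its ground truth could be computed and averaged over a dataset. It is not the CEF pipeline's implementation. The helper names (`token_count_vectors`, `mean_similarity`) are hypothetical, and the whitespace token-count vectors are only a stand-in for whatever embedding model the pipeline actually uses, which this issue does not specify.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector scores 0.
        return 0.0
    return dot / (norm_a * norm_b)


def token_count_vectors(text_a, text_b):
    """Hypothetical stand-in for an embedding model: whitespace token counts
    over the shared vocabulary of the two texts."""
    counts_a = Counter(text_a.split())
    counts_b = Counter(text_b.split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[t] for t in vocab], [counts_b[t] for t in vocab]


def mean_similarity(pairs):
    """Average cosine similarity over (model_output, ground_truth) pairs."""
    if not pairs:
        raise ValueError("at least one pair is required")
    scores = [
        cosine_similarity(*token_count_vectors(output, truth))
        for output, truth in pairs
    ]
    return sum(scores) / len(scores)
```

A per-model baseline would then be a single call such as `mean_similarity(pairs)`, where `pairs` holds that model's completions alongside the dataset's expected answers.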