Merge all evaluation script efforts, prompts, and testing results into main branch
Scope of MR
The full scope of the project can be found here, and the epic here. This MR focuses on creating, implementing, and testing a model evaluation script.
This MR includes five main components:
- evaluation.py - This is the script that performs the model testing. The input is a txt file where each line is a separate prompt. The output is code.
- prompts - This folder contains all the prompts used in testing. The generation prompts are split by language for organization, but they are mostly the same plain English apart from a few word changes (for example, C does not have a standard dictionary). The completion prompts differ by language because they are written in the language being tested.
- generation results - This folder includes all the results for code generations.
- completion results - This folder includes all the results for code completions.
- documentation - The documentation includes a brief overview of the efforts and how to run make targets.
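As an illustration of the input format described for evaluation.py (a txt file with one prompt per line, code as output), here is a minimal sketch. It is not the script in this MR: the names `load_prompts`, `evaluate`, and `fake_model` are hypothetical, the model call is replaced by a stand-in, and skipping blank lines is an assumption.

```python
from pathlib import Path
from typing import Callable, Iterable


def load_prompts(path: Path) -> list[str]:
    """Return one prompt per line of the file.

    Assumption: blank lines are skipped and surrounding whitespace is trimmed.
    """
    with path.open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def evaluate(
    prompts: Iterable[str], generate: Callable[[str], str]
) -> list[dict[str, str]]:
    """Send each prompt to `generate` and pair it with the code it returns."""
    return [{"prompt": prompt, "output": generate(prompt)} for prompt in prompts]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns placeholder code."""
    return f"# code for: {prompt}"
```

Calling `evaluate(load_prompts(Path("prompts.txt")), fake_model)` would return one prompt/output pair per line of the file. Passing the model call in as a function keeps the file handling separate from whichever model is under test.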
Associated Issues
Test runner: https://gitlab.com/gitlab-org/gitlab/-/issues/415774+
JavaScript - completion: https://gitlab.com/gitlab-org/gitlab/-/issues/415783+
JavaScript - generation: https://gitlab.com/gitlab-org/gitlab/-/issues/415782+
Python - completion: https://gitlab.com/gitlab-org/gitlab/-/issues/415780+
Python - generation: https://gitlab.com/gitlab-org/gitlab/-/issues/415772+
C - completion: https://gitlab.com/gitlab-org/gitlab/-/issues/415788+
C - generation: https://gitlab.com/gitlab-org/gitlab/-/issues/415787+
Golang - completion: https://gitlab.com/gitlab-org/gitlab/-/issues/415786+
Golang - generation: https://gitlab.com/gitlab-org/gitlab/-/issues/415785+
Follow-up(s)
As with any work at GitLab, some follow-up items are already mapped out for the next iteration of the model evaluation scripts and of the overall testing method. The issues below are some of those items:
https://gitlab.com/gitlab-org/gitlab/-/issues/416330+
https://gitlab.com/gitlab-org/gitlab/-/issues/416329+
cc @sean_carroll @mray2020 @srayner @allison.browne @andrei.zubov
Closes https://gitlab.com/gitlab-org/gitlab/-/issues/415774+