Add code suggestions evaluate script
What does this MR do and why?
Resolves gitlab-org/gitlab#465551
This MR adds a script for evaluating code suggestions: it runs the code suggestions endpoint against an initial test dataset in LangSmith. The dataset is added directly in LangSmith itself. This will be helpful for testing changes while developing code suggestions locally on GDK.
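For anyone who wants a feel for the flow before opening the linked docs, here is a rough sketch. It is not the script added in this MR. The GDK URL, the `GDK_URL` and `GITLAB_TOKEN` environment variables, the dataset field names (`file_name`, `content_above_cursor`, `content_below_cursor`, `completion`), the request and response shapes, and the exact-match evaluator are all assumptions made for illustration. It uses the `evaluate` helper from the `langsmith` Python SDK and expects a LangSmith API key to be configured in the environment.

```python
import json
import os
import urllib.request

# Assumed local GDK address; adjust to match your setup.
GDK_URL = os.environ.get("GDK_URL", "http://gdk.test:3000")
COMPLETIONS_PATH = "/api/v4/code_suggestions/completions"


def get_completion(inputs: dict) -> dict:
    """Target function: send one dataset example to the completions endpoint."""
    # Assumed payload shape, built from hypothetical dataset field names.
    payload = {
        "current_file": {
            "file_name": inputs["file_name"],
            "content_above_cursor": inputs["content_above_cursor"],
            "content_below_cursor": inputs.get("content_below_cursor", ""),
        }
    }
    request = urllib.request.Request(
        GDK_URL + COMPLETIONS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + os.environ.get("GITLAB_TOKEN", ""),
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # Assumes the response carries a "choices" list with a "text" field.
    choices = body.get("choices") or [{}]
    return {"completion": choices[0].get("text", "")}


def exact_match(run, example) -> dict:
    """Hypothetical evaluator: score 1 if the completion equals the expected output."""
    predicted = (run.outputs or {}).get("completion", "").strip()
    expected = (example.outputs or {}).get("completion", "").strip()
    return {"key": "exact_match", "score": int(predicted == expected)}


def run_evaluation(dataset_name: str):
    """Run every example in the named LangSmith dataset through the endpoint."""
    # Imported lazily so the helpers above can be used without the SDK installed.
    from langsmith.evaluation import evaluate

    return evaluate(
        get_completion,
        data=dataset_name,
        evaluators=[exact_match],
        experiment_prefix="code-suggestions-local",
    )
```

Calling `run_evaluation("<your dataset name>")` would then record one experiment in LangSmith, with a score per example. Exact match is a deliberately strict stand-in; a real evaluation would likely want a more forgiving evaluator.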
Helpful resources:
- Step-by-Step Guide for Conducting Evaluations using LangSmith at GitLab
- Creating and uploading a dataset
How to set up and validate locally
Follow the steps in these docs on Running the Evaluation Scripts!
Some random helpful hints:
- If you get an "Error running target function" error when running the evaluate script, double-check that you can successfully hit the `api/v4/code_suggestions/completions` endpoint using something like curl or Postman:
![](/-/project/58448012/uploads/715f6d7ee9235bff5079a55e511d0a32/Screenshot_2024-06-06_at_6.58.10_PM.png)
- If you're getting `401 Unauthorized`, you can bypass it locally by temporarily commenting out the endpoint's authentication checks here: https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/api/code_suggestions.rb#L22-27
- You can edit a dataset input/output pair on the fly in LangSmith by going to "Datasets & Testing" > your dataset (for example, `missy-testing-code-suggestions`) > "Examples" tab > click an entry > "Edit"
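If you'd rather script the endpoint check from the first hint than use curl or Postman, something like the following could work. It is only a sketch: the sample payload, the `Bearer` token header, and the helper names `build_request` and `check_endpoint` are assumptions for illustration, and you would pass in your own GDK URL and token.

```python
import json
import urllib.error
import urllib.request

COMPLETIONS_PATH = "/api/v4/code_suggestions/completions"


def build_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a sample POST request for the completions endpoint."""
    # Assumed minimal payload; swap in a real file name and content as needed.
    payload = {
        "current_file": {
            "file_name": "hello.rb",
            "content_above_cursor": "def hello_world\n  ",
            "content_below_cursor": "",
        }
    }
    return urllib.request.Request(
        base_url.rstrip("/") + COMPLETIONS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def check_endpoint(base_url: str, token: str) -> int:
    """Send the sample request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_request(base_url, token), timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # A 401 here is the "401 Unauthorized" case from the hints above.
        return error.code


# Example (requires a running GDK; URL and token are placeholders):
# print(check_endpoint("http://gdk.test:3000", "<your token>"))
```

A `200` suggests the endpoint is reachable and the evaluate script's target function should be able to call it; a `401` points at the authentication hint above.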