feat: update code-suggestions evaluation scripts
What does this merge request do and why?
- update the script for the ai_gateway source
- use ai_gateway to be consistent with other sources
- update the params for the ai_gateway and gitlab sources
- change the vertexai eval to vertexai_codegecko
- add an evaluation for vertexai_codestral
- make sure vertex authentication is not done on import
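To illustrate the last point, here is a minimal sketch of deferring authentication until first use instead of running it at import time. It is not the code in this merge request: `_authenticate`, `get_client`, `evaluate`, `AUTH_CALLS`, and the `example-project` name are all hypothetical stand-ins for whatever client setup the real source performs.

```python
import functools

# Hypothetical record of authentication attempts, used only to show
# that nothing happens at import time.
AUTH_CALLS = []


def _authenticate(project: str) -> dict:
    """Stand-in for the real (expensive, credential-dependent) auth step."""
    AUTH_CALLS.append(project)
    return {"project": project, "authenticated": True}


@functools.lru_cache(maxsize=None)
def get_client(project: str = "example-project") -> dict:
    """Authenticate on the first call, then reuse the cached client."""
    return _authenticate(project)


def evaluate(prompt: str) -> str:
    """Only this call path triggers authentication."""
    client = get_client()
    return f"{client['project']}: {prompt}"
```

With this shape, importing the module has no side effects, so commands that use a different source do not need Vertex credentials just to load the package.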
How to set up and validate locally
For the newly introduced vertexai_codestral source:
Run the following command and verify that there are no errors.
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=vertexai_codestral \
--experiment-prefix=vertexai-codestral-latency \
--split=latency_test \
--limit=10 \
--evaluate-with-llm
For the other code-suggestions sources:
Run the following commands and verify that the updates did not introduce errors:
# AI Gateway
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=ai_gateway \
--experiment-prefix=aigw-codegecko-latency \
--model-name=code-gecko@002 \
--model-provider=vertex-ai \
--split=latency_test \
--limit=10 \
--intent=completion \
--evaluate-with-llm
# GitLab - completion
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=gitlab \
--split=latency_test \
--limit=10 \
--experiment-prefix=gitlab-exp \
--intent=completion \
--evaluate-with-llm
# GitLab - generation
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=gitlab \
--split=latency_test \
--limit=10 \
--experiment-prefix=gitlab-exp-generation \
--intent=generation \
--evaluate-with-llm
# VertexAI Code Gecko
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=vertexai_codegecko \
--experiment-prefix=vertexai-codegecko-latency \
--split=latency_test \
--limit=10 \
--evaluate-with-llm
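If running the four commands by hand is tedious, they can be driven from a small script. This is a rough sketch only: the flags are copied from the commands above, while `BASE`, `RUNS`, `build_command`, and `run_all` are hypothetical helper names, and it assumes the CLI accepts its options in any order.

```python
import subprocess

# Flags shared by every command above.
BASE = [
    "poetry", "run", "eli5", "code-suggestions", "evaluate",
    "--dataset=code-suggestions-input-testcases-v1",
    "--split=latency_test",
    "--limit=10",
    "--evaluate-with-llm",
]

# Per-run flags, copied from the commands above.
RUNS = {
    "AI Gateway": [
        "--source=ai_gateway",
        "--experiment-prefix=aigw-codegecko-latency",
        "--model-name=code-gecko@002",
        "--model-provider=vertex-ai",
        "--intent=completion",
    ],
    "GitLab - completion": [
        "--source=gitlab",
        "--experiment-prefix=gitlab-exp",
        "--intent=completion",
    ],
    "GitLab - generation": [
        "--source=gitlab",
        "--experiment-prefix=gitlab-exp-generation",
        "--intent=generation",
    ],
    "VertexAI Code Gecko": [
        "--source=vertexai_codegecko",
        "--experiment-prefix=vertexai-codegecko-latency",
    ],
}


def build_command(name: str) -> list:
    """Return the full argument list for one named run."""
    return BASE + RUNS[name]


def run_all(runner=subprocess.run) -> list:
    """Run every command and return the names of runs that exited non-zero."""
    failures = []
    for name in RUNS:
        result = runner(build_command(name))
        if result.returncode != 0:
            failures.append(name)
    return failures
```

Calling `run_all()` from the repository root returns an empty list when every run exits cleanly. The `runner` parameter exists so the loop can be exercised without invoking the CLI.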
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Pam Artiaga