
feat: update code-suggestions evaluation scripts

Pam Artiaga requested to merge pam/add-codesuggestions-eval-scripts into main

What does this merge request do and why?

  • update the script for the ai_gateway source (use ai_gateway, for consistency with the other sources)
  • update the params for the ai_gateway and gitlab sources
  • change the vertexai eval to vertexai_codegecko
  • add an evaluation for vertexai_codestral
  • make sure Vertex authentication does not run on import
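
The last item, keeping Vertex authentication out of import time, can be sketched roughly as follows. This is a minimal illustration and not the code in this merge request: `fake_vertex_auth`, `get_vertex_client`, and the `"my-project"` value are hypothetical names, and the real scripts presumably call the Vertex AI SDK where the stand-in is used here.

```python
from functools import lru_cache

# Hypothetical stand-in for the real Vertex AI authentication call.
# It records each invocation so the deferred behaviour is observable.
AUTH_CALLS = []


def fake_vertex_auth(project: str) -> dict:
    AUTH_CALLS.append(project)
    return {"project": project, "authenticated": True}


# Anti-pattern (shown only as a comment): a module-level call such as
#     CLIENT = fake_vertex_auth("my-project")
# runs as soon as the module is imported, even by commands that never
# touch Vertex AI.


@lru_cache(maxsize=None)
def get_vertex_client(project: str = "my-project") -> dict:
    """Authenticate on first use and reuse the cached result afterwards."""
    return fake_vertex_auth(project)


if __name__ == "__main__":
    print(len(AUTH_CALLS))  # nothing has authenticated yet at this point
    get_vertex_client()
    get_vertex_client()
    print(len(AUTH_CALLS))  # the second call reused the cached client
```

With this shape, importing the module has no side effects, so sources such as gitlab or ai_gateway can be evaluated without Vertex credentials being present.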

How to set up and validate locally

For the newly-introduced vertexai_codestral source:

Run the following command and verify that there are no errors.

poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=vertexai_codestral \
--experiment-prefix=vertexai-codestral-latency \
--split=latency_test \
--limit=10 \
--evaluate-with-llm

For the other code-suggestions sources:

Run the following commands and verify that the updates did not introduce errors:

# AI Gateway
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=ai_gateway \
--experiment-prefix=aigw-codegecko-latency \
--model-name=code-gecko@002 \
--model-provider=vertex-ai \
--split=latency_test \
--limit=10 \
--intent=completion \
--evaluate-with-llm

# GitLab - completion
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=gitlab \
--split=latency_test \
--limit=10 \
--experiment-prefix=gitlab-exp \
--intent=completion \
--evaluate-with-llm

# GitLab - generation
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=gitlab \
--split=latency_test \
--limit=10 \
--experiment-prefix=gitlab-exp-generation \
--intent=generation \
--evaluate-with-llm

# VertexAI Code Gecko
poetry run eli5 code-suggestions evaluate \
--dataset="code-suggestions-input-testcases-v1" \
--source=vertexai_codegecko \
--experiment-prefix=vertexai-codegecko-latency \
--split=latency_test \
--limit=10 \
--evaluate-with-llm
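
To run the four commands above in one pass and stop at the first failure, a small wrapper along these lines may help. It is only a sketch: the `RUNS` table, `build_command`, and `run_all` are hypothetical helpers, not part of `eli5`; the flags and values are copied from the commands above.

```python
import shlex
import subprocess

BASE = ["poetry", "run", "eli5", "code-suggestions", "evaluate"]

# Options shared by every command listed above.
COMMON = {
    "dataset": "code-suggestions-input-testcases-v1",
    "split": "latency_test",
    "limit": "10",
}

# Per-run options, copied from the commands above (hypothetical table).
RUNS = {
    "AI Gateway": {
        "source": "ai_gateway",
        "experiment-prefix": "aigw-codegecko-latency",
        "model-name": "code-gecko@002",
        "model-provider": "vertex-ai",
        "intent": "completion",
    },
    "GitLab - completion": {
        "source": "gitlab",
        "experiment-prefix": "gitlab-exp",
        "intent": "completion",
    },
    "GitLab - generation": {
        "source": "gitlab",
        "experiment-prefix": "gitlab-exp-generation",
        "intent": "generation",
    },
    "VertexAI Code Gecko": {
        "source": "vertexai_codegecko",
        "experiment-prefix": "vertexai-codegecko-latency",
    },
}


def build_command(options: dict) -> list:
    """Build the argument list for one evaluation run."""
    args = list(BASE)
    for key, value in {**COMMON, **options}.items():
        args.append(f"--{key}={value}")
    args.append("--evaluate-with-llm")
    return args


def run_all(dry_run: bool = True) -> None:
    """Print each command; execute it too unless dry_run is set."""
    for name, options in RUNS.items():
        cmd = build_command(options)
        print(f"[{name}] {shlex.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` executes the commands for real; a non-zero exit code from any of them raises `subprocess.CalledProcessError`, which surfaces the failing source immediately.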

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Pam Artiaga
