Usability updates to code-suggestions eval script

Pam Artiaga requested to merge pam/code-suggestions-eval-usability into main

What does this merge request do and why?

  • Added an offset parameter to the evaluation script so that we can start from a later example in the dataset instead of always testing the first ones
  • Renamed num_tests to limit, to match the naming used by the langchain client
  • For the gitlab client, skip certificate verification when the base URL is https://gdk.test. That host always uses a self-signed certificate, so requests fail with an error unless verify is set to False.
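
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this merge request: the names `SELF_SIGNED_HOSTS`, `should_verify_tls`, and `select_examples` are hypothetical, and the sketch assumes that offset counts skipped examples from zero.

```python
from itertools import islice
from urllib.parse import urlparse

# Assumption: hosts known to present a self-signed certificate.
# Only gdk.test is mentioned in this merge request.
SELF_SIGNED_HOSTS = {"gdk.test"}


def should_verify_tls(base_url: str) -> bool:
    """Return False when the base URL points at a self-signed host."""
    host = urlparse(base_url).hostname
    return host not in SELF_SIGNED_HOSTS


def select_examples(examples, offset=0, limit=None):
    """Skip `offset` examples, then return at most `limit` of the rest."""
    stop = None if limit is None else offset + limit
    return list(islice(examples, offset, stop))
```

Under these assumptions, offset=100 with limit=10 selects the examples at zero-based positions 100 through 109, and the boolean from `should_verify_tls` could be passed as the verify argument of an HTTP client such as requests.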

How to set up and validate locally

Run the following command:

poetry run eli5 code-suggestions evaluate \
  --dataset="code-suggestions-input-testcases-v1" \
  --source=gitlab \
  --offset=100 \
  --limit=10 \
  --experiment-prefix=experiment-123

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.

Related issue: Evaluation script improvements (#16 - closed)

Edited by Pam Artiaga