Support local input to avoid calling GitLab API

Alexander Chueshev requested to merge ac/support-local-input into main

What does this merge request do and why?

This MR updates the existing Prompt Library pipeline to evaluate Duo Chat without relying on GitLab API calls. The pipeline reads the resource dataset and Duo Chat completions from local files.
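Because the completions now come from a local file (a `.jsonl` file in the setup steps below), it can help to sanity-check that file before a run. The following is a minimal sketch, not part of this MR: `check_jsonl` is a hypothetical helper, and it only checks that each non-empty line looks like one JSON object. It does not parse the JSON or know which fields the pipeline expects.

```shell
# Hypothetical helper (not part of Prompt Library): crude shape check for a
# JSON Lines file. Prints the number of records, or fails on the first line
# that does not start with "{" and end with "}".
check_jsonl() {
  local file="$1" lineno=0 records=0 line
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 2
  fi
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    lineno=$((lineno + 1))
    # Blank lines are skipped rather than treated as errors.
    [ -z "$line" ] && continue
    case "$line" in
      "{"*"}") records=$((records + 1)) ;;
      *)
        echo "line $lineno does not look like a JSON object" >&2
        return 1
        ;;
    esac
  done < "$file"
  echo "$records"
}

# Usage: check_jsonl 28dcafcce682028a32e9ae070b73d789.jsonl
```

A record count lower than the `--sample-size` passed to the pipeline would be worth checking before running the evaluation.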

Close #168 (closed)

Duo Chat Completions Graph

[Screenshot: Screenshot_2024-02-27_at_10.42.15_am]

(generated with --dry-run option)

How to set up and validate locally

  1. Clone the dataset repo:
git clone git@gitlab.com:gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/datasets.git
  2. Get the Duo Chat completions by running the Rake task, or download the example file 28dcafcce682028a32e9ae070b73d789.jsonl.

  3. Set up the environment variables:

export GITLAB_TOKEN=<still required by the config> # TODO: make it optional and dependent on the input adapter
export ANTHROPIC_API_KEY=<your anthropic key goes here>
export GOOGLE_APPLICATION_CREDENTIALS=<path to the json key>
  4. Create the config file from the example:
cp data/config/duochat_eval_local_config.json.example data/config/duochat_eval_local_config.json
# Then update the `input_data` section of the copied file with appropriate values
  5. Run the evaluation pipeline locally:
poetry run promptlib duo-chat eval --config-file=data/config/duochat_eval_local_config.json --test-run --sample-size=5

Example output:

  • dev-ai-research-0e2f8974.duo_chat_external_results.achueshev_2__independent_llm_judge
  • dev-ai-research-0e2f8974.duo_chat_external_results.achueshev_2__similarity_score
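
Since the config still requires `GITLAB_TOKEN` even for local input, a run can fail late if one of the exported variables from the setup steps is missing. As a rough pre-flight sketch (an assumption of how one might check this, not something this MR adds), the hypothetical `missing_env` helper below lists any variable that is unset or empty in the environment:

```shell
# Hypothetical helper (not part of Prompt Library): print the name of every
# given environment variable that is unset or empty, one per line.
# Returns non-zero if at least one is missing.
missing_env() {
  local rc=0 name
  for name in "$@"; do
    # printenv only sees exported variables, which is what child
    # processes such as `poetry run` will inherit.
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Usage: pass the variable names from the setup steps, for example
# missing_env GITLAB_TOKEN ANTHROPIC_API_KEY || echo "set the variables above first"
```

The helper takes the names as arguments rather than hard-coding them, so it keeps working if `GITLAB_TOKEN` becomes optional as the TODO in the setup steps suggests.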

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Tan Le
