feat: integrate eli5 and add eval command
What does this merge request do and why?
Closes gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library#622
How to set up and validate locally
- Add the following to your .env:
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=[my-api-key]
- Create a dataset in your LangChain account from this jsonl (see docs at https://docs.smith.langchain.com/old/evaluation/faq/manage-datasets#upload-a-csv) and call it gen-desc-ds.
- Run an evaluation for a given prompt+version against an existing dataset:
poetry run python eval generate_description 1.0.0 gen-desc-ds
The output should include a link to see the evaluation results. Open it and verify you get stats for each example in the dataset.
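If the evaluation fails before it produces a results link, a quick preflight check of the .env step above may help. The following is a minimal sketch, not part of this merge request: it assumes the .env file sits in the current working directory and uses simple KEY=VALUE lines, and the helper names `parse_env` and `missing_keys` are hypothetical.

```python
from pathlib import Path

# The three variables listed in the setup steps of this merge request.
REQUIRED_KEYS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_ENDPOINT", "LANGCHAIN_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in LANGCHAIN_ENDPOINT="...".
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    # Assumption: .env lives in the directory the command is run from.
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    if missing:
        print("Missing in .env:", ", ".join(missing))
    else:
        print("All tracing variables are set.")
```

This only checks that the variables are present and non-empty; it does not confirm that the API key is valid or that the gen-desc-ds dataset exists.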
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Alejandro Rodríguez