List PL dataset creation pipelines not covered in ELI5
## Problem to solve
The Prompt Library contains logic for creating the datasets used to run evaluations. This logic relies on BigQuery and Apache Beam. Since we now rely on LangSmith, and given our evaluation consolidation efforts, this dataset logic is no longer well maintained and needs to be moved to ELI5.
## Proposal
Move the following Prompt Library dataset creation pipelines to ELI5:
| PL dataset creation pipeline name | Link to the code file | Dataset used by the eval runner |
|---|---|---|
| duo_chat code explanation dataset | promptlib/duo_chat/make_dataset_code_explanation.py | duo_chat.explain_code.1 |
| vulnerability resolution dataset | promptlib/etv/extract/extract_data.py | will be created by #669 (closed) |
| code_suggestions testcase generation | promptlib/code_suggestions/generate_testcases.py | code-suggestions-input-testcases-v1 |
| root_cause_analysis dataset | promptlib/root_cause_analysis/extract_data.py | duo_chat.slash_troubleshoot.1 |
## Further details
A recent example of a dataset creation pipeline that has not been moved to ELI5 but is already used by the eval runner can be found in this merge request: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/evaluation-runner/-/merge_requests/109
For comparison, here is an example of a dataset creation pipeline that is used by the Eval Runner and has already been moved to ELI5: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/blob/main/eli5/eli5/duochat/data/collectors/qa_docs.py?ref_type=heads
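To make the migration target concrete, here is a minimal sketch of the kind of collector the migrated pipelines would become: a function that turns raw extracted rows into LangSmith-style examples with `inputs`/`outputs` dicts, the shape accepted by the LangSmith client when populating a dataset. The `SourceRow` shape and the function name are hypothetical illustrations, not the actual ELI5 API; the real pipelines read their rows from BigQuery or a Beam job rather than an in-memory list.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical record shape: the real pipelines extract rows
# from BigQuery or an Apache Beam job.
@dataclass
class SourceRow:
    question: str
    expected_answer: str

def to_langsmith_examples(rows: Iterable[SourceRow]) -> list[dict]:
    """Convert raw rows into LangSmith-style examples, i.e. dicts with
    separate `inputs` and `outputs` keys, ready to be uploaded to a
    dataset that the Eval Runner can reference by name."""
    return [
        {
            "inputs": {"question": row.question},
            "outputs": {"expected_answer": row.expected_answer},
        }
        for row in rows
    ]

if __name__ == "__main__":
    sample = [SourceRow("Explain this function", "It sorts a list in place.")]
    print(to_langsmith_examples(sample))
```

Keeping the extraction step separate from a small, pure conversion function like this makes each collector easy to unit-test before ownership is handed to feature teams.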
We need to ensure all dataset creation pipelines are in a manageable state before transferring ownership to feature teams. Based on the collected list, we can schedule follow-up issues to plan the migration work.
## Additional links
The Custom Models team tracks all validation datasets (both added and planned) used by the Evaluation runner in this epic: gitlab-org&16626.