Create initial /tests dataset in LangSmith and backup in Datasets repo

Context

We need an initial dataset to use in our evaluation of the Duo Chat Slash command /tests. This first iteration will allow us to develop the evaluator (#515921 (closed)) and refine the dataset in parallel.

For the first iteration, we will exclude additional context from the evaluation, including context provided by the user (e.g. /tests using Jest) or by Repository X-Ray.

Proposal

Create a dataset named duo_chat.code-tests.1 in LangSmith (70-120 examples). Reference the dataset creation guide.

Possible resources for obtaining examples:

GitLab-owned codebases (e.g. GitLab for Ruby, AI Gateway for Python, etc.)
Existing datasets from reputable online sources
AI generated

More guidance may be found in the response to &16634 (comment 2321517554). You may also find it helpful to reference the /refactor counterpart dataset duo_chat.code-refactor.1.
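
As a rough illustration of the creation step, the sketch below checks the example count against the 70-120 target and then uploads through the LangSmith Python SDK. It is not taken from the dataset creation guide: the `{"inputs": ..., "outputs": ...}` shape of each example and the helper names `check_size` and `upload` are assumptions, and the `create_examples` keyword arguments should be checked against the installed SDK version.

```python
from typing import Any

DATASET_NAME = "duo_chat.code-tests.1"
MIN_EXAMPLES, MAX_EXAMPLES = 70, 120  # target range from the proposal


def check_size(examples: list[dict[str, Any]]) -> None:
    """Raise if the example count falls outside the 70-120 target."""
    if not MIN_EXAMPLES <= len(examples) <= MAX_EXAMPLES:
        raise ValueError(
            f"expected {MIN_EXAMPLES}-{MAX_EXAMPLES} examples, got {len(examples)}"
        )


def upload(examples: list[dict[str, Any]]) -> None:
    """Create the dataset in LangSmith and add the examples to it.

    Assumes each example is a dict with an "inputs" key and an optional
    "outputs" key (hypothetical shape, not confirmed by the guide).
    """
    # Imported lazily so the size check can run without the SDK installed.
    # The client needs LangSmith credentials in the environment.
    from langsmith import Client

    check_size(examples)
    client = Client()
    dataset = client.create_dataset(dataset_name=DATASET_NAME)
    client.create_examples(
        inputs=[e["inputs"] for e in examples],
        outputs=[e.get("outputs", {}) for e in examples],
        dataset_id=dataset.id,
    )
```

Running `check_size` on the candidate examples before calling `upload` catches an undersized or oversized set without touching LangSmith.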

Create a backup of the dataset. Follow the registration guide to add it to the Datasets repo.
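
One possible way to produce the backup file is sketched below. The JSON Lines format and the function names `write_backup` and `read_backup` are assumptions for illustration; the registration guide remains the authority on the format and location the Datasets repo expects.

```python
import json
from pathlib import Path
from typing import Any


def write_backup(examples: list[dict[str, Any]], path: Path) -> None:
    """Write one JSON object per line, with sorted keys so diffs stay stable."""
    with path.open("w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example, sort_keys=True, ensure_ascii=False) + "\n")


def read_backup(path: Path) -> list[dict[str, Any]]:
    """Load a backup written by write_backup, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Reading the file back and comparing it with the uploaded examples is a cheap check that the backup is complete before registering it.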

Additional info / Considerations

It is advised to first verify the full prompt format and text by testing this slash command locally, and then identify the inputs and structure required for the dataset examples.

The current user prompt and system prompt for /tests include the following inputs:

file_content --> I believe this is additional context, so this is out of scope (needs to be confirmed)
selected_text
input --> Additional user input is out of scope, so this text should be static (needs to be confirmed)
language_info
libraries --> Out of scope
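
To make the scoping above concrete, here is a minimal sketch of how one example's inputs might be assembled, assuming the unconfirmed items resolve as currently believed (file_content and libraries out of scope, input held static). The helper names, the empty default for the static input, and the plain-string form of language_info are all hypothetical; the real values should come from the prompt verified locally.

```python
# Fields believed to be out of scope for the first iteration (to be confirmed).
OUT_OF_SCOPE = ("file_content", "libraries")


def build_inputs(
    selected_text: str, language_info: str, static_input: str = ""
) -> dict[str, str]:
    """Build the inputs for one example from the in-scope fields only.

    static_input stands in for the fixed `input` text; its real value is
    not known here and defaults to an empty string as a placeholder.
    """
    if not selected_text.strip():
        raise ValueError("selected_text must not be empty")
    return {
        "selected_text": selected_text,
        "language_info": language_info,
        "input": static_input,
    }


def out_of_scope_fields(inputs: dict[str, str]) -> list[str]:
    """Return the out-of-scope fields that carry a non-empty value."""
    return [name for name in OUT_OF_SCOPE if inputs.get(name)]
```

A check like `out_of_scope_fields` can be run over every example before upload so that additional context does not slip into the first iteration by accident.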

Edited Jan 30, 2025 by Leaminn Ma