Create initial /tests dataset in LangSmith and backup in Datasets repo

Context

We need an initial dataset for evaluating the Duo Chat slash command /tests. This first iteration will allow us to develop the evaluator (#515921 (closed)) and refine the dataset in parallel.

For the first iteration, we will exclude additional context from the evaluation, including context provided by the user (e.g. /tests using Jest) or by Repository X-Ray.

Proposal

  1. Create a dataset named duo_chat.code-tests.1 in LangSmith (70-120 examples). Reference the dataset creation guide.

Possible resources for obtaining examples:

  • GitLab-owned codebases (e.g. GitLab for Ruby, AI Gateway for Python, etc.)
  • Existing datasets from reputable online sources
  • AI generated

More guidance may be found in the response to &16634 (comment 2321517554). You may also find it helpful to reference the /refactor counterpart dataset duo_chat.code-refactor.1.
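
The dataset creation step could be sketched roughly as below. This is a minimal sketch, not the procedure from the dataset creation guide: it assumes the `langsmith` Python SDK is installed and `LANGSMITH_API_KEY` is set, that each example is a dict with an `inputs` key (a hypothetical shape), and that no reference outputs are uploaded in this first iteration. Check the `create_dataset` and `create_examples` calls against the installed SDK version before relying on them.

```python
from typing import Any

DATASET_NAME = "duo_chat.code-tests.1"
MIN_EXAMPLES, MAX_EXAMPLES = 70, 120  # size range from the proposal


def check_size(examples: list[dict[str, Any]]) -> None:
    """Raise ValueError if the example count is outside the proposed range."""
    n = len(examples)
    if not MIN_EXAMPLES <= n <= MAX_EXAMPLES:
        raise ValueError(
            f"expected {MIN_EXAMPLES}-{MAX_EXAMPLES} examples, got {n}"
        )


def upload(examples: list[dict[str, Any]]) -> None:
    """Create the dataset in LangSmith and upload the example inputs."""
    # Imported lazily so check_size can be used without the SDK installed.
    from langsmith import Client

    check_size(examples)
    client = Client()  # reads credentials from the environment
    dataset = client.create_dataset(
        dataset_name=DATASET_NAME,
        description="Initial /tests dataset (first iteration)",
    )
    # Assumption: inputs only, no reference outputs in this iteration.
    client.create_examples(
        inputs=[example["inputs"] for example in examples],
        dataset_id=dataset.id,
    )
```

Running `check_size` before any network call keeps a dataset that is too small or too large from being created by accident.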

  2. Create a backup of the dataset. Follow the registration guide to add it to the Datasets repo.
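
For the backup, one option is to serialize the examples to a JSON Lines file before registering it. The sketch below is only an illustration: the JSONL format, the function names, and the example shape are assumptions, and the registration guide remains the authority on what the Datasets repo actually expects.

```python
import json
from pathlib import Path
from typing import Any


def write_backup(examples: list[dict[str, Any]], path: Path) -> int:
    """Write one JSON object per line and return the number of examples written."""
    with path.open("w", encoding="utf-8") as fh:
        for example in examples:
            # sort_keys keeps diffs stable when the backup is regenerated.
            fh.write(json.dumps(example, ensure_ascii=False, sort_keys=True) + "\n")
    return len(examples)


def read_backup(path: Path) -> list[dict[str, Any]]:
    """Load the examples back, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Reading the file back and comparing it with the in-memory examples is a cheap way to confirm the backup is complete before committing it.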

Additional info / Considerations

Before building examples, verify the full prompt format and text by running this slash command locally, then identify the inputs and structure that the dataset examples require.

The current user prompt and system prompt for /tests include the following:

  • file_content --> I believe this is additional context, so this is out of scope (needs to be confirmed)
  • selected_text
  • input --> Additional user input is out of scope, so this text should be static (needs to be confirmed)
  • language_info
  • libraries --> Out of scope
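
As a rough illustration of the list above, a single example's inputs might be built as follows. This is a sketch under assumptions: the field names come from the list, but the value formats (in particular `language_info`), the empty placeholders for out-of-scope fields, and the `STATIC_INPUT` value are hypothetical and still need to be confirmed against the real prompt.

```python
from typing import TypedDict


class SlashTestsInputs(TypedDict):
    file_content: str   # treated as additional context: out of scope (to be confirmed)
    selected_text: str  # the code for which tests are requested
    input: str          # static text, since extra user input is out of scope
    language_info: str  # format is an assumption; verify locally
    libraries: str      # out of scope


# Hypothetical placeholder: the actual static text must be taken from the real prompt.
STATIC_INPUT = ""


def build_inputs(selected_text: str, language_info: str) -> SlashTestsInputs:
    """Build one example's inputs, leaving out-of-scope fields empty."""
    if not selected_text.strip():
        raise ValueError("selected_text must not be empty")
    return {
        "file_content": "",
        "selected_text": selected_text,
        "input": STATIC_INPUT,
        "language_info": language_info,
        "libraries": "",
    }
```

Keeping the out-of-scope fields present but empty means the example structure would not need to change if a later iteration brings them into scope.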
Edited by Leaminn Ma