Code suggestions eval: wrap code snippets with LLM instructions consistently

Problem to solve

We need to support latency testing of AI providers in a way that's consistent with how we did latency testing with the Prompt Library in https://gitlab.com/gitlab-org/quality/ai-model-latency-tester.

promptlib let us inject code snippets from the dataset into a template that was used as the model's prompt, matching the prompt that GitLab Rails constructs.

This allowed us to test different providers/models using the same prompts for all.
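The injection step above can be sketched roughly as follows. This is a minimal illustration of injecting a dataset snippet into a fixed prompt template, not promptlib's actual API; the template text, placeholder names, and `build_prompt` helper are all assumptions.

```python
from string import Template

# Hypothetical prompt template: a fixed instruction frame with
# placeholders for the code before and after the completion point.
DEFAULT_TEMPLATE = Template(
    "Complete the following code.\n"
    "<code>\n$prefix<CURSOR>$suffix\n</code>"
)

def build_prompt(prefix: str, suffix: str,
                 template: Template = DEFAULT_TEMPLATE) -> str:
    """Inject a dataset code snippet into the template to form the model prompt."""
    return template.substitute(prefix=prefix, suffix=suffix)

prompt = build_prompt("def add(a, b):\n    ", "\n")
print(prompt)
```

Because the template, not the client, fixes the instruction text, every provider/model under test receives an identical prompt for the same dataset snippet.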

We'll need to implement similar templates for CEF. Currently, each existing client has its own hardcoded system prompt, and these prompts differ between clients.

Proposal

  • Add the existing Prompt Library templates
  • Add a CLI option to specify a template that will be used instead of the default (hardcoded) template
  • Modify the code suggestions clients to use the template when specified
  • When a template is not specified, all clients should use the same default template
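A rough sketch of the proposed CLI behavior, assuming an argparse-based interface: the flag name `--prompt-template` and the `load_template` helper are illustrative placeholders, not CEF's real CLI.

```python
import argparse
from pathlib import Path

# Shared default used by every client when no template is specified,
# so all clients produce the same prompt out of the box.
DEFAULT_TEMPLATE = "Complete the code:\n{snippet}"

def load_template(path: str) -> str:
    """Return the user-specified template file contents, or the shared default."""
    if path is None:
        return DEFAULT_TEMPLATE
    return Path(path).read_text()

parser = argparse.ArgumentParser()
parser.add_argument(
    "--prompt-template",
    default=None,
    help="Path to a prompt template file overriding the hardcoded default",
)

# No flag given -> all clients fall back to the same default template.
args = parser.parse_args([])
template = load_template(args.prompt_template)
print(template.format(snippet="def f():"))
```

Each code suggestions client would then consume `template` instead of its own hardcoded system prompt, which keeps cross-provider comparisons apples-to-apples.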

Links / references