How to review prompt changes

Problem to solve

With prompts moving from Rails to AI Gateway, I would like to clarify what role AI Gateway maintainers should play when reviewing changes to those prompts.

Context:

A prompt change would normally require running a set of evaluations to exercise the new prompt and compare the results against the previous one. Prompt changes can occur for various reasons, for example to fix a very specific problem or to try a different technique (zero-shot vs. n-shot).
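As a rough illustration of what "compare the results against the previous one" could look like, here is a minimal sketch. It assumes each evaluation run yields one numeric score per test case, keyed by a case id; the function name `compare_prompt_scores`, the `tolerance` parameter, and the score format are all hypothetical and not part of any existing AI Gateway tooling.

```python
from statistics import mean


def compare_prompt_scores(old_scores, new_scores, tolerance=0.0):
    """Compare per-case evaluation scores of the previous and the new prompt.

    Both arguments are hypothetical dicts mapping a case id to a numeric
    score (higher is better), produced by running the same evaluation set
    against each prompt version.
    """
    if not old_scores:
        raise ValueError("at least one evaluation case is required")
    if old_scores.keys() != new_scores.keys():
        raise ValueError("both prompts must be evaluated on the same cases")

    old_mean = mean(old_scores.values())
    new_mean = mean(new_scores.values())

    # Cases where the new prompt scored strictly lower than the previous one.
    regressed = sorted(c for c in old_scores if new_scores[c] < old_scores[c])

    return {
        "old_mean": old_mean,
        "new_mean": new_mean,
        "regressed_cases": regressed,
        # Acceptable if the average did not drop by more than the tolerance.
        "acceptable": new_mean >= old_mean - tolerance,
    }


if __name__ == "__main__":
    summary = compare_prompt_scores(
        {"case-1": 0.5, "case-2": 1.0},
        {"case-1": 1.0, "case-2": 0.5},
    )
    print(summary)
```

A summary like this could be attached to the MR description, so that whoever ends up responsible for validation has the same numbers in front of them.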

Some options

  1. Let the MR author be responsible for prompt validation.
    • Maintainers would not need to know how each prompt is used, and would not be held responsible for prompt changes that might worsen the performance of a use-case.
  2. Leave the prompt validation as a responsibility of the maintainer.
    • This would require every maintainer to know each use-case and how to validate every prompt, possibly lengthening the time it takes to review the MR.

Further details

Links / references

cc @sean_carroll