How to review prompt changes
Problem to solve
With prompts moving from Rails to AI Gateway, I would like to clarify what role AI Gateway maintainers should play when reviewing prompt changes.
Context:
A prompt change would normally require running a set of evaluations to exercise the new prompt and compare the results against the previous one. Prompt changes can occur for various reasons, for example to fix a very specific problem or to try a different technique (zero-shot vs. n-shot).
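To make the comparison concrete, here is a minimal sketch of what "run the evaluations and compare against the previous prompt" could look like. Everything in it is an assumption for illustration: `evaluate`, `compare_prompts`, `exact_match`, the `{input}` placeholder, and the stubbed `fake_generate` model are hypothetical names, not part of AI Gateway or of any existing evaluation tooling.

```python
from statistics import mean
from typing import Callable

# Hypothetical evaluation case: (input text, expected answer).
EvalCase = tuple[str, str]


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output equals the expected answer, ignoring case and whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def evaluate(
    prompt_template: str,
    cases: list[EvalCase],
    generate: Callable[[str], str],
    score: Callable[[str, str], float] = exact_match,
) -> float:
    """Run every case through the model with the given prompt and return the mean score."""
    scores = [
        score(generate(prompt_template.format(input=text)), expected)
        for text, expected in cases
    ]
    return mean(scores)


def compare_prompts(
    old_prompt: str,
    new_prompt: str,
    cases: list[EvalCase],
    generate: Callable[[str], str],
) -> dict[str, float]:
    """Evaluate both prompt versions on the same cases and report the difference."""
    old_score = evaluate(old_prompt, cases, generate)
    new_score = evaluate(new_prompt, cases, generate)
    return {"old": old_score, "new": new_score, "delta": new_score - old_score}


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call (assumption): only answers tersely when asked to."""
    label = "positive" if "great" in prompt else "negative"
    if "one word" in prompt.lower():
        return label
    return f"The sentiment is {label}."


OLD_PROMPT = "Classify the sentiment: {input}"
NEW_PROMPT = "Classify the sentiment. Answer with one word: {input}"
CASES: list[EvalCase] = [
    ("This is great", "positive"),
    ("This is awful", "negative"),
]

result = compare_prompts(OLD_PROMPT, NEW_PROMPT, CASES, fake_generate)
print(result)
```

Whoever ends up owning validation, attaching a summary like `result` to the MR would give the reviewer something to check without needing to know the use case in depth. A real setup would replace `fake_generate` with an actual model call and `exact_match` with a metric suited to the use case.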
Some options
- Let the MR author be responsible for prompt validation.
  - This wouldn't require every maintainer to know how each prompt is used, and maintainers would not be held responsible for prompt changes that worsen the performance of a use case.
- Leave prompt validation as a responsibility of the maintainer.
  - This would require every maintainer to know each use case and how to validate every prompt, possibly lengthening the time it takes to review the MR.
Further details
- This issue came up after I was pinged to review a prompt change in !1080 (comment 1996995896).