Analytics agent automated testing plan
For DAP GA we want the Foundational Agents to have been minimally evaluated against the problems they were designed to solve (&19500 (comment 2889886842)).
We're currently using a test automation tool we developed to validate responses across multiple requests: https://gitlab.com/gitlab-org/analytics-section/platform-insights/duo-analytics-agent-prompt/-/tree/main/prompt-test-automator. The tool does not run automatically; it has to be run locally against a GDK instance (with Duo enabled and configured).
There is a bigger initiative for automated agent testing, but I don't think it will happen before GA: https://gitlab.com/gitlab-org/gitlab/-/issues/580874+
There is also a separate initiative for prompt validation that we might want to look at and possibly integrate with: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library
As part of this task we should:
- Add support for tool execution and validation to our prompt-test-automator
- Add more test cases to our prompt-test-automator
- Look into CES and determine whether it's something we want to integrate with
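To make the first item concrete, here is a minimal sketch of what a tool-execution assertion could look like. This is illustrative only: the `AgentResponse`/`ToolCall` shapes, the field names, and the example tool (`get_merge_requests`) are assumptions, not the actual prompt-test-automator API or the agent's real tool set.

```python
from dataclasses import dataclass, field

# Hypothetical response model; the real prompt-test-automator may
# capture tool calls in a different shape.
@dataclass
class ToolCall:
    name: str
    arguments: dict

@dataclass
class AgentResponse:
    text: str
    tool_calls: list = field(default_factory=list)

def validate_tool_calls(response, expected):
    """Check the agent invoked the expected tools, in order,
    with the expected argument keys (values are left unchecked
    since they often vary between runs)."""
    actual = [(c.name, set(c.arguments)) for c in response.tool_calls]
    wanted = [(name, set(args)) for name, args in expected]
    return actual == wanted

# Example: assert that a throughput question triggered one
# (hypothetical) data-fetching tool with the right parameters.
resp = AgentResponse(
    text="MR throughput for the last 30 days is ...",
    tool_calls=[ToolCall("get_merge_requests",
                         {"project_id": 1, "created_after": "30d"})],
)
assert validate_tool_calls(
    resp, [("get_merge_requests", ["project_id", "created_after"])]
)
```

A check like this, run over multiple requests per test case, would let us validate not just the final answer text but also that the agent chose the right tools along the way.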