Investigate extending YAML evaluation framework to support Agentic Duo Chat Code Generation slash commands
Context
The AI Framework group recently implemented a YAML-based configuration system for the Centralized Evaluation Framework (CEF) that streamlines evaluations for the Agent Platform. The framework provides a reusable architecture for running evaluations through standardized YAML configs (see gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library!1646 (merged)).
Currently, this framework is only implemented for Agentic Duo Chat flows. Code Suggestions evaluations do not yet leverage this infrastructure. This investigation aims to determine the feasibility and requirements for extending the YAML-based evaluation framework to support the Duo Chat slash commands with Code Suggestions (code generation) elements including /explain, /refactor, /tests, and /fix.
References and Resources
- Original implementation MR: gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library!1646 (merged)
- Related Epic: &18721
- Docs: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/blob/main/doc/eval_scenarios/agent_platform/agentic_duo_chat.md?ref_type=heads
Proposal
Investigate and document the requirements for extending the Agent Platform YAML evaluation framework to support Duo Chat slash commands with a code generation element (/explain, /refactor, /tests, and /fix).
At the time of creating this issue, Agentic Chat does not support slash commands, so this issue is a placeholder and a reminder to revisit the work if and when that support is added.
Definition of Done / Acceptance Criteria
- Technical feasibility assessment documented with clear yes/no recommendation on proceeding
- Key files to update identified, along with whether and how evaluators from regular chat can be reused
- YAML config structure for Code Suggestions determined