feat: unify Agent Platform response collection using YAML (!1641)

What does this merge request do and why?

This MR unifies how evaluations are run against features implemented in DAP, including Agentic Duo Chat. See the example below or the updated documentation for details. The main goal is to have a single CLI command and to use YAML configs to set up the different aspects of the evaluation pipelines.

Closes #781 (closed)

Related to gitlab-org&18721

Blocked by !1629 (merged)

How to set up and validate locally

version: 1
env:
  executor_type: "go"
  params:
    bin_path: "duo-workflow-executor/bin/duo-workflow-executor"
langsmith:
  dataset: "duo_chat.cot_qa_docs.1"
  split: "base"
  offset: 0
  limit: 1
flow:
  flow_type: "duo_chat"
  params:
    target_project: "gitlab-duo/test"
    agent_privileges: [1,2,4]  # READ_WRITE_FILES, READ_ONLY_GITLAB, RUN_COMMANDS
    # Ref: https://gitlab.com/gitlab-org/gitlab/blob/13461f57b9f087055e23651eead75cdc716c1cbb/ee/app/models/ai/duo_workflows/workflow.rb#L34
    pre_approved_agent_privileges: [1,2,4]
pipe:
  inference: true

Then run the evaluation, passing the path to the YAML config:

poetry run cef agent-platform evaluate .gitlab/agent_platform_templates/duo_chat.yaml
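
Before running the command, it can help to sanity-check the config. The sketch below is a hypothetical helper, not part of CEF: it mirrors the YAML above as a plain Python dict (so it needs no YAML library), decodes the privilege IDs using the names from the inline comment, and flags two things that are assumptions rather than documented CEF rules: that all five top-level sections are required, and that every pre-approved privilege should also appear in agent_privileges.

```python
# Hypothetical pre-flight check for the config shown above; none of these
# names come from CEF itself.

# ID-to-name mapping copied from the inline comment in the config. The
# authoritative list lives in the referenced workflow.rb.
PRIVILEGE_NAMES = {
    1: "READ_WRITE_FILES",
    2: "READ_ONLY_GITLAB",
    4: "RUN_COMMANDS",
}

# Assumption: every top-level section used in the example is required.
REQUIRED_SECTIONS = ("version", "env", "langsmith", "flow", "pipe")

# The example config, written out as the dict a YAML loader would produce.
config = {
    "version": 1,
    "env": {
        "executor_type": "go",
        "params": {"bin_path": "duo-workflow-executor/bin/duo-workflow-executor"},
    },
    "langsmith": {
        "dataset": "duo_chat.cot_qa_docs.1",
        "split": "base",
        "offset": 0,
        "limit": 1,
    },
    "flow": {
        "flow_type": "duo_chat",
        "params": {
            "target_project": "gitlab-duo/test",
            "agent_privileges": [1, 2, 4],
            "pre_approved_agent_privileges": [1, 2, 4],
        },
    },
    "pipe": {"inference": True},
}


def describe_privileges(ids):
    """Translate privilege IDs into names, marking IDs this sketch does not know."""
    return [PRIVILEGE_NAMES.get(i, f"UNKNOWN({i})") for i in ids]


def check_config(cfg):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in cfg]

    params = cfg.get("flow", {}).get("params", {})
    granted = set(params.get("agent_privileges", []))
    pre_approved = set(params.get("pre_approved_agent_privileges", []))

    # Assumption: a privilege cannot be pre-approved unless it is also granted.
    for extra in sorted(pre_approved - granted):
        problems.append(f"pre-approved privilege {extra} is not in agent_privileges")

    return problems


if __name__ == "__main__":
    print(describe_privileges(config["flow"]["params"]["agent_privileges"]))
    print(check_config(config) or "no problems found")
```

Keeping the check separate from the CLI means it can run in a pre-commit hook or a unit test without invoking the executor.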

Merge request checklist

I've run the affected pipeline(s) to validate that nothing is broken.
Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.

Edited Sep 12, 2025 by Alexander Chueshev
