feat: add support for evaluators in the Agent Platform YAML config

What does this merge request do and why?

This MR adds support for evaluators in the Agent Platform YAML config. See the docs for how the logic works: https://gitlab.com/gitlab-org/modelops/ai-model-validation-and-research/ai-evaluation/prompt-library/-/blob/a6e3d10f2850752be716ac3e5270ee3b19a404e2/doc/eval_scenarios/agent_platform/agentic_duo_chat.md

Related to gitlab-org&18721

BLOCKED by !1641 (merged)

How to set up and validate locally

---
version: 1
env:
  executor_type: "go"
  params:
    bin_path: "duo-workflow-executor/bin/duo-workflow-executor"
langsmith:
  dataset: "duo_chat.cot_qa_docs.1"
  split: "base"
  offset: 0
  limit: 1
flow:
  flow_type: "duo_chat"
  params:
    target_project: "gitlab-duo/test"
    agent_privileges: [1,2,4]  # READ_WRITE_FILES, READ_ONLY_GITLAB, RUN_COMMANDS
    # Ref: https://gitlab.com/gitlab-org/gitlab/blob/13461f57b9f087055e23651eead75cdc716c1cbb/ee/app/models/ai/duo_workflows/workflow.rb#L34
    pre_approved_agent_privileges: [1,2,4]
pipe:
  inference: true
  assessment:
    evaluators:
      - name: llm_ragas_factual_correctness
        params:
          mode: precision
      - name: llm_ragas_answer_accuracy

Run the evaluation:

poetry run cef agent-platform evaluate .gitlab/agent_platform_templates/duo_chat.yaml
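
For reviewers who want to see the shape of the new `pipe.assessment.evaluators` section, here is a minimal sketch of how such a list could be read once the YAML is loaded into a dict. `EvaluatorSpec` and `parse_evaluators` are hypothetical names used only for illustration. They are not taken from prompt-library, and the actual parsing in this MR may differ. The sketch assumes that `params` is optional and defaults to an empty mapping, as the second evaluator in the config suggests.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EvaluatorSpec:
    """Hypothetical container for one entry under pipe.assessment.evaluators."""

    name: str
    params: dict[str, Any] = field(default_factory=dict)


def parse_evaluators(config: dict[str, Any]) -> list[EvaluatorSpec]:
    """Collect evaluator entries; a missing `params` key becomes an empty dict."""
    assessment = config.get("pipe", {}).get("assessment") or {}
    specs = []
    for entry in assessment.get("evaluators") or []:
        if "name" not in entry:
            raise ValueError(f"evaluator entry without a name: {entry!r}")
        specs.append(EvaluatorSpec(entry["name"], dict(entry.get("params") or {})))
    return specs


# The `pipe` section of the config above, as it would look after YAML loading.
config = {
    "pipe": {
        "inference": True,
        "assessment": {
            "evaluators": [
                {"name": "llm_ragas_factual_correctness", "params": {"mode": "precision"}},
                {"name": "llm_ragas_answer_accuracy"},
            ]
        },
    }
}

for spec in parse_evaluators(config):
    print(spec.name, spec.params)
```

Under these assumptions, an evaluator without `params` is still valid, while an entry without `name` is rejected early instead of failing later during assessment.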

Merge request checklist

  • I've run the affected pipeline(s) to validate that nothing is broken.
  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Tan Le
