Add evaluation pipeline for SAST False Positive Detection using CEF's flow evaluations

Problem to solve

The Sec AI team has developed a new feature that performs agentic SAST false positive detection using the flow registry. Its interface is as follows:

Input: ID of the vulnerability
Output:
- false_positive_likelihood: number between 0 and 100
- explanation: string
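
To make the output contract above concrete, here is a minimal Python sketch of a validator for a single result. The names `FalsePositiveResult` and `parse_result` are hypothetical and are not part of the feature or of CEF. The sketch also assumes the payload arrives as a dictionary keyed by the two field names listed above, and that the 0 to 100 range is inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FalsePositiveResult:
    """Hypothetical container for the two output fields listed above."""
    false_positive_likelihood: float  # number between 0 and 100 (assumed inclusive)
    explanation: str


def parse_result(payload: dict) -> FalsePositiveResult:
    """Validate a raw output payload; raise ValueError if it breaks the contract."""
    likelihood = payload.get("false_positive_likelihood")
    explanation = payload.get("explanation")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(likelihood, bool) or not isinstance(likelihood, (int, float)):
        raise ValueError("false_positive_likelihood must be a number")
    # NaN fails this comparison as well, so it is rejected here.
    if not 0 <= likelihood <= 100:
        raise ValueError("false_positive_likelihood must be between 0 and 100")
    if not isinstance(explanation, str):
        raise ValueError("explanation must be a string")
    return FalsePositiveResult(float(likelihood), explanation)
```

Validating every response before scoring keeps malformed outputs from silently skewing the evaluation metrics.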

We need to build a new evaluation pipeline for this feature using CEF's flow evaluations, which do not require setting up the GitLab runner.
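
As a rough illustration of what such a pipeline might compute, the Python sketch below scores likelihood outputs against human-labeled ground truth. It does not use CEF's API. The function name `score`, the labeled dataset, the decision threshold of 50, and the choice of precision, recall, and accuracy as metrics are all assumptions made for illustration.

```python
def score(
    predictions: dict[int, float],
    labels: dict[int, bool],
    threshold: float = 50.0,
) -> dict[str, float]:
    """Compare likelihoods to ground truth (hypothetical helper, not CEF's API).

    predictions: vulnerability ID -> false_positive_likelihood (0 to 100)
    labels:      vulnerability ID -> True if the finding is a real false positive
    threshold:   assumed cut-off at or above which a finding is flagged
    """
    tp = fp = fn = tn = 0
    for vuln_id, is_false_positive in labels.items():
        flagged = predictions[vuln_id] >= threshold
        if flagged and is_false_positive:
            tp += 1
        elif flagged and not is_false_positive:
            fp += 1
        elif not flagged and is_false_positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        # Guard against empty denominators instead of raising ZeroDivisionError.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Made-up example data: four vulnerabilities with labeled ground truth.
metrics = score(
    predictions={1: 90, 2: 20, 3: 70, 4: 40},
    labels={1: True, 2: False, 3: False, 4: True},
)
print(metrics)  # {'precision': 0.5, 'recall': 0.5, 'accuracy': 0.5}
```

Sweeping `threshold` over several values would show the trade-off between missing real false positives and wrongly dismissing true vulnerabilities.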

Proposal

Use the existing MR for agentic vulnerability resolution as a reference point.

Links / references

New agentic VR overview
API for false positive detection

Edited Sep 29, 2025 by Fabrizio J. Piva