feat: optimize duo code review flow (!3448) · Merge requests · GitLab.org / ModelOps / AI Assisted (formerly Applied ML) / Code Suggestions / AI Gateway

What does this merge request do and why?

This MR addresses critical performance issues in the Duo Agent Platform (DAP) code review workflow by eliminating the LLM-based data formatting bottleneck.

Problem: The current 5-step workflow took 4-7 minutes for large MRs and often failed outright. The code_review_format step was the primary bottleneck, taking 4-5 minutes on its own. It used an LLM to transform raw GitLab API responses into XML-formatted diffs, which is inefficient for large changes because the LLM has to reproduce thousands of lines token by token.

Solution:

Replaces LLM formatting with deterministic code: New BuildReviewMergeRequestContext tool handles data fetching and formatting programmatically
Eliminates XML diff formatting: Passes raw git diffs directly, following Anthropic's recommendation that LLMs handle raw diffs better than heavily formatted XML
Consolidates workflow steps: Reduces the flow from 5 steps to 4 by replacing fetch_mr_data and code_review_format with a single build_review_context step
Maintains functionality: Preserves custom instructions support and file content context
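
To illustrate the idea of deterministic formatting, here is a minimal sketch. It is not the actual BuildReviewMergeRequestContext implementation: the `FileDiff` fields, the `format_review_context` name, and the output layout are all hypothetical. It only shows that assembling the context from raw diffs needs plain string operations and no LLM call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileDiff:
    """One changed file. Field names are hypothetical, not the GitLab API schema."""
    old_path: str
    new_path: str
    diff: str  # raw unified diff text, passed through untouched


def format_review_context(
    title: str,
    diffs: list[FileDiff],
    custom_instructions: Optional[str] = None,
) -> str:
    """Assemble review context with plain string operations (no LLM involved)."""
    parts = [f"Merge request: {title}"]
    if custom_instructions:
        parts.append(f"Custom instructions:\n{custom_instructions}")
    for d in diffs:
        # Git-style file header followed by the raw diff, with no XML wrapping.
        parts.append(f"--- a/{d.old_path}\n+++ b/{d.new_path}\n{d.diff.rstrip()}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    example = FileDiff("app.py", "app.py", "@@ -1 +1 @@\n-x = 1\n+x = 2\n")
    print(format_review_context("Fix default value", [example], "Focus on correctness"))
```

Because the function is pure string assembly, its cost grows with the size of the diff rather than with the number of tokens a model has to generate.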

Performance improvement:

Large MRs: ~5-7 minutes → <60 seconds (about 6x faster, and close to the current Duo Code Review)
Data formatting: 4-5 minutes → <1 second
Reliability: Eliminates LLM formatting failures
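
As a quick check of the figures above, the speedup can be computed from the stated bounds. Since the new timings are upper bounds ("<60 seconds", "<1 second"), the ratios below are lower bounds. The `speedup` helper is only for illustration.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old duration to the new duration."""
    return before_seconds / after_seconds


# Large MRs: 5-7 minutes before, under 60 seconds after.
low = speedup(5 * 60, 60)
high = speedup(7 * 60, 60)

# Data formatting: 4-5 minutes before, under 1 second after.
formatting_low = speedup(4 * 60, 1)

if __name__ == "__main__":
    print(f"end-to-end: at least {low:.0f}x to {high:.0f}x")
    print(f"formatting step: at least {formatting_low:.0f}x")
```

The quoted "6x" sits at the midpoint of the 5x to 7x range.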

Architecture change:

First Iteration (5 steps, unreliable):

graph LR
    A[fetch_mr_data<br/>~3-4 min ❌] --> B[code_review_format<br/>4-5 min ❌]
    B --> C[prescan<br/>~20s] --> D[review<br/>~30s] --> E[publish<br/>~1s]
    style A fill:#ffcccc
    style B fill:#ffcccc

Second Iteration (4 steps, reliable):

graph LR
    A[build_review_context<br/><1s ✅] --> B[prescan<br/>~20s] --> C[review<br/>~30s] --> D[publish<br/>~1s]
    style A fill:#ccffcc

This follows the principle: "Use LLMs for reasoning, deterministic code for data processing."
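
The 4-step flow can be pictured as a simple sequential pipeline. The sketch below is an assumption-laden illustration, not the Duo Agent Platform flow definition: the step names come from the diagram above, but the function bodies are placeholders, the `run_flow` helper is hypothetical, and which steps call an LLM is assumed rather than stated in this MR.

```python
import time
from typing import Any, Callable

State = dict[str, Any]
Step = tuple[str, Callable[[State], State]]


def build_review_context(state: State) -> State:
    # Deterministic step: data fetching and formatting in plain code.
    return {**state, "context": f"raw diff for MR {state['mr_iid']}"}


def prescan(state: State) -> State:
    # Placeholder for an LLM call (assumption: this step uses a model).
    return {**state, "prescan": "files worth a closer look"}


def review(state: State) -> State:
    # Placeholder for the LLM review itself.
    return {**state, "comments": ["example comment"]}


def publish(state: State) -> State:
    # Placeholder for posting the comments back to the merge request.
    return {**state, "published": len(state["comments"])}


def run_flow(steps: list[Step], state: State) -> tuple[State, dict[str, float]]:
    """Run each step in order and record how long it took."""
    timings: dict[str, float] = {}
    for name, fn in steps:
        start = time.perf_counter()
        state = fn(state)
        timings[name] = time.perf_counter() - start
    return state, timings


FLOW: list[Step] = [
    ("build_review_context", build_review_context),
    ("prescan", prescan),
    ("review", review),
    ("publish", publish),
]

if __name__ == "__main__":
    final_state, timings = run_flow(FLOW, {"mr_iid": 1})
    print(final_state["published"], list(timings))
```

Recording per-step timings like this is one way to spot a bottleneck such as the former code_review_format step.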

Link to discussion: gitlab-org/gitlab#572251 (comment 2779392956)

How to set up and validate locally

I've tested the approach locally with some large MRs, and it worked fine.
LangSmith tests: https://smith.langchain.com/o/477de7ad-583e-47b6-a1c4-c4a0300e7aca/projects/p/767bfdff-3740-4b9b-b354-8570b22286c7?timeModel=%7B%22duration%22%3A%227d%22%7D (all tests execute within 60 seconds)

Local Steps:

Make sure the AI features are enabled locally using this guide
Enable duo_code_review_on_agent_platform feature flag
Assign GitLabDuo as a reviewer, either by commenting /assign_reviewer @GitLabDuo or by selecting GitLabDuo from the reviewers dropdown.
This should trigger the new code_review flow, and you should be able to see the logs in LangSmith Traces.
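
If you prefer to script the reviewer assignment, the quick action can be posted as a merge request note through the GitLab REST API. This is a sketch under assumptions: the base URL, project path, MR IID, and token below are placeholders for your local instance, and `build_assign_reviewer_request` is a hypothetical helper. It only builds the request; sending it is left commented out.

```python
import json
import urllib.parse
import urllib.request


def build_assign_reviewer_request(
    base_url: str, project: str, mr_iid: int, token: str
) -> urllib.request.Request:
    """Build a POST to the MR notes endpoint carrying the quick action as the note body."""
    # Project paths must be URL-encoded when used in place of a numeric ID.
    project_id = urllib.parse.quote(str(project), safe="")
    url = (
        f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
        f"/merge_requests/{mr_iid}/notes"
    )
    body = json.dumps({"body": "/assign_reviewer @GitLabDuo"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # All four values are placeholders; substitute your own.
    req = build_assign_reviewer_request(
        "http://gdk.test:3000", "my-group/my-project", 1, "<your-token>"
    )
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req)  # uncomment to actually send the note
```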

Merge request checklist

Tests added for new functionality. If not, please raise an issue to follow up.
Documentation added/updated, if needed.
If this change requires executor implementation: verified that issues/MRs exist for both Go executor and Node executor or confirmed that changes are backward-compatible and don't break existing executor functionality.

Edited Sep 29, 2025 by Kinshuk Singh
