feat: optimize duo code review flow

What does this merge request do and why?

This MR addresses critical performance issues in the Duo Agent Platform (DAP) code review workflow by eliminating the LLM-based data formatting bottleneck.

Problem: The current 5-step workflow took 4-7 minutes for large MRs and often failed outright. The code_review_format step was the primary bottleneck, taking 4-5 minutes on its own. This step used an LLM to transform raw GitLab API responses into XML-formatted diffs, which is inefficient for large MRs because the LLM has to process thousands of lines token by token.

Solution:

  • Replaces LLM formatting with deterministic code: New BuildReviewMergeRequestContext tool handles data fetching and formatting programmatically
  • Eliminates XML diff formatting: Passes raw git diffs directly, following Anthropic's recommendation that LLMs handle raw diffs better than heavily formatted XML
  • Consolidates workflow steps: Reduces the workflow from 5 steps to 4 by replacing fetch_mr_data and code_review_format with a single build_review_context step
  • Maintains functionality: Preserves custom instructions support and file content context
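
As a rough illustration of the deterministic approach described above, here is a minimal Python sketch. It is not the actual BuildReviewMergeRequestContext implementation: the function name `build_review_context`, the input shape (a list of dicts with `new_path` and `diff` keys, loosely modeled on GitLab's merge request diff entries), and the output layout are all assumptions made for illustration.

```python
from typing import Optional


def build_review_context(
    title: str,
    changes: list[dict],
    custom_instructions: Optional[str] = None,
) -> str:
    """Assemble a plain-text review context from raw diffs.

    Hypothetical helper: no LLM call is involved, so the output is
    identical for identical input and the cost is linear in input size.
    """
    parts = [f"Merge request: {title}"]

    # Custom instructions are optional and passed through verbatim.
    if custom_instructions:
        parts.append(f"Custom instructions:\n{custom_instructions}")

    for change in changes:
        # Pass the raw git diff through untouched instead of re-encoding it as XML.
        parts.append(f"File: {change['new_path']}\n{change['diff']}")

    return "\n\n".join(parts)


# Example input with a single changed file (made-up data).
example_changes = [
    {"new_path": "app/models/user.rb", "diff": "@@ -1,2 +1,2 @@\n-old\n+new"},
]
context = build_review_context("Fix user model", example_changes)
print(context)
```

Because the function only concatenates strings, its runtime depends on the size of the diffs rather than on model latency, which is the property the Solution list relies on.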

Performance improvement:

  • Large MRs: ~5-7 minutes → <60 seconds (roughly 6x faster, and close to the latency of current Duo code reviews)
  • Data formatting: 4-5 minutes → <1 second
  • Reliability: Eliminates LLM formatting failures

Architecture change:

First Iteration (5 steps, unreliable):

graph LR
    A[fetch_mr_data<br/>~3-4 min ❌] --> B[code_review_format<br/>4-5 min ❌]
    B --> C[prescan<br/>~20s] --> D[review<br/>~30s] --> E[publish<br/>~1s]
    style A fill:#ffcccc
    style B fill:#ffcccc

Second Iteration (4 steps, reliable):

graph LR
    A[build_review_context<br/><1s ✅] --> B[prescan<br/>~20s] --> C[review<br/>~30s] --> D[publish<br/>~1s]
    style A fill:#ccffcc

This follows the principle: "Use LLMs for reasoning, deterministic code for data processing."
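
One way to picture that split is the hedged Python sketch below. It assumes the model is reachable as a plain callable (`LLM`), and the function name `run_code_review` and the prompts are placeholders rather than the real flow definition; what the prescan and review steps actually ask the model is not described here.

```python
from typing import Callable

# Hypothetical model interface: prompt in, completion out.
LLM = Callable[[str], str]


def run_code_review(raw_diff: str, llm: LLM, publish: Callable[[str], None]) -> str:
    # Step 1 (deterministic): build the context with plain string handling.
    context = f"Review this diff:\n{raw_diff}"

    # Step 2 (LLM): prescan, a reasoning task. Placeholder prompt.
    notes = llm(f"List the areas that need attention.\n{context}")

    # Step 3 (LLM): review, also a reasoning task. Placeholder prompt.
    review = llm(f"Write review comments.\nNotes: {notes}\n{context}")

    # Step 4 (deterministic): publish the result without a model call.
    publish(review)
    return review


# Usage with a fake model that records every prompt it receives.
calls: list[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response {len(calls)}"


published: list[str] = []
result = run_code_review("+new line", fake_llm, published.append)
```

In this sketch the model is invoked exactly twice, once per reasoning step, while context building and publishing never touch it.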

Link to discussion: gitlab-org/gitlab#572251 (comment 2779392956)

How to set up and validate locally

Local Steps:

  • Make sure the AI features are enabled locally using this guide
  • Enable duo_code_review_on_agent_platform feature flag
  • Assign GitLabDuo as a reviewer, either by commenting /assign_reviewer @GitLabDuo or by selecting GitLabDuo from the reviewers dropdown
  • This should trigger the new code_review flow, and the logs should be visible in LangSmith Traces

Screenshots

Before: Screenshot_2025-09-25_at_8.36.29_AM
After: Screenshot_2025-09-26_at_2.27.56_PM

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
  • If this change requires executor implementation: verified that issues/MRs exist for both Go executor and Node executor or confirmed that changes are backward-compatible and don't break existing executor functionality.

Related to Improve performance of code_review_fetch_data a... (gitlab-org/gitlab#572251 - closed)

Edited by Kinshuk Singh
