Intelligent merge request reviewer selection
## Problem to Solve

Knowing who should review your merge request is hard. With https://gitlab.com/groups/gitlab-org/-/epics/12878 we're introducing reviewer assignment that focuses reviewer selection on users who will satisfy the merge request's approval rules. Identifying users who satisfy an approval rule is valuable, but it doesn't tell us anything else about those users, such as:

1. Is the user available?
2. Is the user in my timezone?
3. Can the user merge?

## Proposal

We should improve the list of users returned to account for other valuable information that can aid users in selecting the right reviewer. By adding criteria to the list we return and modifying how it is displayed, we aim to improve the selection process.

Things built in to the product that we could initially include:

1. Busy Status Indicator - Incorporating this would allow us to sort reviewers by availability and prioritize those who are available over users who might be unavailable or already at capacity.
1. Time zone - Incorporating this would allow us to suggest reviewers who are in a similar time zone to the user creating the merge request. This allows for more rapid collaboration on MRs to improve time to merge.

More advanced additions could be incorporated in the future, such as:

1. Commit graph of a repository - Beyond CODEOWNERS paths, we can see who commits to those paths to identify who might have more familiarity and be a better reviewer.
1. Current review workload - By looking at the state and number of reviews currently assigned to a user, we can determine their available capacity to take on new reviews in a timely manner.
1. Working Hours/Calendar Integrations - Beyond GitLab's native free/busy status, we can enhance this capability by integrating with common calendaring software to better understand users' working hours and availability.

### How should we measure this?

We should track the position in the returned list of the user who is selected as the reviewer.
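
As a rough illustration of the position tracking described above, the sketch below records the 1-based position of the chosen reviewer and computes how often selections fall within the top `n` positions. The function names, the usernames, and the choice to record `None` when the chosen reviewer was not in the suggested list are all assumptions made for illustration. This is not GitLab's telemetry implementation.

```python
from typing import List, Optional, Sequence


def selected_position(suggested: Sequence[str], selected: str) -> Optional[int]:
    """Return the 1-based position of `selected` in `suggested`.

    Returns None when the chosen reviewer was not in the suggested list
    (an assumed convention, not something the epic specifies).
    """
    try:
        return list(suggested).index(selected) + 1
    except ValueError:
        return None


def top_n_rate(positions: Sequence[Optional[int]], n: int) -> float:
    """Fraction of recorded selections that came from the first `n` positions."""
    if not positions:
        return 0.0
    hits = sum(1 for p in positions if p is not None and p <= n)
    return hits / len(positions)


# Hypothetical usage: four merge requests that were shown the same suggestions.
suggested = ["user_a", "user_b", "user_c", "user_d"]
chosen = ["user_a", "user_c", "user_a", "user_z"]
positions: List[Optional[int]] = [selected_position(suggested, u) for u in chosen]

print(positions)                 # [1, 3, 1, None]
print(top_n_rate(positions, 2))  # 0.5
```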
For example, suppose we returned an improved list of users like this:

1. User A
2. User B
3. User C
4. User D

If `User A` is selected, we'd record that the first position was used. If `User C` is selected instead, we'd record that it was the third position.

The goal of measuring this is to ensure that users select from the top positions more often than not.

## Progress & Phases

### ✅ Phase 1: Investigation (Completed - 18.1)

**Issue**: gitlab-org/gitlab#536557

We conducted a backtesting spike to validate the intelligent reviewer assignment approach using real GitLab.com MR data. Key findings:

**What Works:**

- Approval rules matching + cohort analysis + favorite reviewers = ~35% accuracy
- Potential for 60%+ accuracy with enhanced data (file-level authorship)
- Algorithm: `+5 per approval rule match, +1 cohort, +1 favorite reviewer`

**Key Insights:**

- **File-level commit history** is the biggest missing piece (+40% accuracy potential)
- **Cohort detection** (who reviews for whom) is valuable
- **BUSY/LOAD status** is useful for forward-looking suggestions
- **Timezone/geo** is less important than initially expected (GitLab is an outlier)

**Technical Validation:**

- Backtesting tool: https://gitlab.com/thomasrandolph/best-reviewer-tool
- Algorithm can suggest the correct reviewer in the top 4 positions 60%+ of the time
- Progressive enhancement strategy validated (Premium/Ultimate base + Duo AI enhancement)

### 🚧 Phase 2: Architecture & Planning (FY27 Q1)

**Epic**: TBD - Architecture Blueprint: Intelligent Reviewer Assignment System

**Goals:**

- Create comprehensive architecture blueprint
- Design file-level authorship tracking system
- Define algorithm architecture (pluggable, testable)
- Create all implementation issues for development
- Validate technical feasibility

**Key Decisions Needed:**

- Data model for authorship tracking
- Where to compute suggestions (backend/frontend/service)
- Algorithm versioning and A/B testing approach
- Integration with CODEOWNERS and approval rules
- Telemetry and measurement framework

### 📋 Phase 3: Implementation (FY27 Q2+)

**Planned Approach:**

1. **Basic algorithm** - Approval rules + cohorts (no new data required)
2. **File-level authorship** - Track who touches which files/lines
3. **Enhanced availability** - Better BUSY/LOAD signals
4. **AI enhancement** - Knowledge graph integration for Duo (see &17948)

**Progressive Enhancement:**

- **Premium/Ultimate**: Rule-based algorithm
- **Duo**: LLM-enhanced with knowledge graph

## Related Work

- **Investigation**: gitlab-org/gitlab#536557 - Backtesting and validation
- **Architecture Epic**: TBD - FY27 Q1 blueprint work
- **AI Integration**: &17948 - Deep Research Agent / Knowledge Graph
- **Related**: gitlab-org/gitlab#547908, gitlab-org/gitlab#530190

<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
> [!important]
> This page may contain information related to upcoming products, features and functionality.
> It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes.
> Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
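
For reference, the Phase 1 scoring rule (`+5 per approval rule match, +1 cohort, +1 favorite reviewer`) could be sketched roughly as below. Only the three weights come from this epic. The `Candidate` type, its field names, treating cohort and favorite-reviewer membership as booleans, and breaking ties by username are assumptions made for illustration, and the backtesting tool may be implemented differently.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Weights as stated in the Phase 1 findings.
APPROVAL_RULE_WEIGHT = 5
COHORT_WEIGHT = 1
FAVORITE_WEIGHT = 1


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-candidate signals; field names are illustrative only."""
    username: str
    approval_rule_matches: int = 0  # number of approval rules this user satisfies
    in_cohort: bool = False         # assumed: user has reviewed for this author before
    is_favorite: bool = False       # assumed: user is a favorite reviewer of the author


def score(candidate: Candidate) -> int:
    """Apply +5 per approval rule match, +1 cohort, +1 favorite reviewer."""
    return (
        APPROVAL_RULE_WEIGHT * candidate.approval_rule_matches
        + COHORT_WEIGHT * int(candidate.in_cohort)
        + FAVORITE_WEIGHT * int(candidate.is_favorite)
    )


def rank(candidates: Sequence[Candidate]) -> List[str]:
    """Usernames by descending score; ties broken by username (an assumption)."""
    ordered = sorted(candidates, key=lambda c: (-score(c), c.username))
    return [c.username for c in ordered]


# Hypothetical candidates for a single merge request.
candidates = [
    Candidate("user_a", approval_rule_matches=1),                  # score 5
    Candidate("user_b", approval_rule_matches=2, in_cohort=True),  # score 11
    Candidate("user_c", in_cohort=True, is_favorite=True),         # score 2
    Candidate("user_d", approval_rule_matches=1, is_favorite=True),  # score 6
]

print(rank(candidates))  # ['user_b', 'user_d', 'user_a', 'user_c']
```

Keeping the weights as named constants is one way to make the rule easy to version and A/B test, which is among the Phase 2 decisions listed above.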
epic