Problem: Latency of Contribution Review
Problem/Opportunity Statement
We have a great team doing great work right now. I want to use their time and cognitive appetites effectively. To this end, I've noticed that code review latency is bottlenecking development progress.
This dashboard shows that MRs routinely spend 3+ weeks open prior to merge. MRs remain open for several reasons, some of which are unavoidable, but a subset of these are a consequence of project policy combined with maintainer habits. The policy: MRs require approval from two maintainers prior to merge. The habits: MRs sometimes wait several days for review from a first maintainer, and they routinely wait another week or more for review from a second maintainer.
Delayed review has compounding costs (like compounding interest) for the following reasons. While an MR waits for review:
- It incurs debt of merge conflicts as the upstream project moves forward. Someone must re-pay this debt prior to merge, which is time-consuming and not intrinsically fun.
- The context around the MR fades from the author's and reviewers' minds, which requires involved parties to re-read and re-load the context into their working memory when the MR has an update in a week or two.
- (The excitement of achieving the change, such as it is, also fades.)
- The above two factors implicitly discourage small, low-stakes MRs. If each contribution has high fixed costs, contributors will pile more changes into each MR. Larger MRs are harder to review, which is likely to delay their review further. (Compounding cost!)
- Specifically, it also discourages tidying / refactoring work. Often, when implementing an app behavior change (or during code review), we notice existing messy code and want to refactor it. There is a tradeoff between refactoring now (which further delays and grows the MR), and waiting to refactor afterward (which keeps diffs and reviews more focused, but requires a follow-up MR). We're less likely to make that follow-up MR if we've already achieved the behavior change, and it will be a drag due to its own protracted /review/re-orient/respond/rebase cycle.
- It delays the progress of other queued-up work that must build on the work pending review. (This is a contruction management issue playing out in software. Delaying the foundation pour delays every subsequent job: framing, roofing, plumbing, etc.)
Final bit of context: I know both current maintainers want to be responsive and deeply engaged with the project, but we each have many duties beyond Exosphere, and that's unlikely to change. So, I'm not saying "we should just be more responsive" or "we should just work harder".
What would success / a fix look like?
First, I (@cmart) will cite some personal influences behind this thinking.
-
Tidy First? by Kent Beck
- Advocates for less/no code review required for pure structure changes (as opposed to behavior changes), if the team's culture supports it.
-
The Effective Engineer by Edmond Lau (I have this on a dead tree, happy to lend it out locally)
- Specific chapters:
-
- Invest in Iteration Speed
-
- Measure What You Want to Improve (specifically, we can measure review latency)
-
- Balance Quality with Pragmatism
-
- Specific chapters:
-
Google Engineering Practices Documentation
- Specifically, Speed of Code Reviews
- Reading the Excalidraw Blog
- Specifically, Reflections on Excalidraw, written 2 weeks and 1600 GitHub stars into the project's life (what a wild ride), section "Keeping People Engaged"
An incremental change to project policy that I think we could enact today: pure refactoring MRs can only require approval from one maintainer. Or, more formally:
- We define a pure structure change (PSC, sorry) as a code refactoring that causes no behavior changes in the app.
- Any maintainer can verify that an open MR is a PSC.
- In this case, that maintainer can, after reviewing and approving the MR, shortcut the approval process and merge it without seeking further review.
- (This also means that maintainers can self-merge their own PSC if they wish.)
- In this case, that maintainer can, after reviewing and approving the MR, shortcut the approval process and merge it without seeking further review.
- A contributor can optionally signal their MR is a PSC.
- Perhaps with a "Refactor:" prefix in the MR title, or a "PSC" label.
- Accordingly, they can expect an expedited review process.
Other things to consider:
- We could define a "core contributor" role that has a lighter set of requirements than becoming a maintainer, such that some existing contributors already meet the bar.
- Core contributors could also review and merge pure structure changes without seeking further review.
- (Unsure if this is straightforward to implement in GitLab permissions)
- Core contributors could also review and merge pure structure changes without seeking further review.
- We could consider relaxing the policy to one maintainer approval for all MRs (not only structure changes).
- We might still caution maintainers to seek further review for a particularly large or consequential behavior change, or for any change they expect to be controversial.
- To de-risk a relaxation of our review policy, we could make it super obvious how to revert the production sites to a previous state if a bad merge accidentally breaks them.
- You go here, find the last good pipeline, and re-run the deploy jobs.
- We can document this and give more people the ability to do it.
- Operationalize our review latency metrics.
- We could make these metrics more prominent, fire alerts in chat each day that something awaits review, and so forth.
- ??? too many ideas to list. Others, feel free to add your own as a comment.
Related to #938. Slightly related to #956 (closed).
CC @julianpistorius, @LordParsley, @kageurufu, @mpmwhite, @dmsnell .. basically anyone who has lived this recently.