Propose Reviewer Roulette experiment for Chart and Omnibus
What is this about?
This is a proposal for us to run an experiment: add the GitLab Reviewer Roulette to some of our projects and make the roulette the primary method for requesting reviews, instead of our current workflow with the `workflow::ready for review` label.
Why
To support new reviewers from outside of `group::distribution` on the GitLab Chart and GitLab Omnibus projects. We have a few more reviewers ramping up on these projects who might struggle to consistently work from the existing Merge Request Report, and automating suggestions in MR comments may improve their onboarding and accountability.
While it is possible for us to onboard external reviewers to the Merge Request Report, there could be challenges:
- Ownership: Similar to what we've discussed for external maintainership, it could be challenging to ensure that external volunteer reviewers have the same sense of ownership over timeliness and “breaching” MRs that the current Distribution team has. By having the roulette suggest the reviewer directly, we change from a pull-based to a push-based review approach. This means that we don't depend on reviewers watching the queue of MRs; instead, reviewers get MRs assigned to them directly.
- Selectiveness/Bias: As external volunteers, reviewers might feel tempted to pick only reviews directly related to their component. But we want to promote reviews horizontally across the code base, for better knowledge sharing and comprehension of the project architecture. The roulette would suggest any reviewer for any code change, thus distributing the work without bias.
  - In a future iteration, we plan to integrate code categorization into the Reviewer Roulette. But we've been preferring a horizontal categorization over a vertical one, which means everyone would still be reviewing MRs across different components.
- Visibility and Accountability: The breaching-MRs alert is very convenient for `group::distribution`, as it is posted in `#g_distribution`. As we add more and more external reviewers, it's less convenient for them to track every message on this channel. If we use the roulette, accountability is clearer: reviewers are responsible for the MRs assigned to them, and if they can't get to one, they need to specifically unassign themselves and let the author know.
- Workflow adaptation: Many groups are already used to the Reviewer Roulette from other projects. For them, it will be a very natural move, as they already expect to get review requests from the roulette.
- Workload distribution: Identifying which project maintainer has the most time capacity at a specific moment is sometimes challenging, even within the Distribution group. The roulette will try to randomize the suggestions evenly, with the same weight for each reviewer/maintainer, unless they've configured themselves not to be suggested anymore, to only be suggested up to a certain limit, or to tweak their likelihood for personal reasons.
In summary, while we expect that having more external reviewers will help with the queue of MRs we currently handle, this proposal is targeted at matching these new reviewers with the MRs ready for review more efficiently.
Known issues
The roulette will also suggest a maintainer for each MR. As we haven't yet implemented code categorization, the roulette might suggest documentation maintainers for code changes, and vice versa. So, in this first iteration, reviewers should be mindful of this and disregard the roulette suggestion if needed.
Future iteration
- Constantly collect feedback.
- Introduce categorization.
- Roll out to other projects, if well accepted.
FAQ
What would I need to do differently or keep doing?
For projects besides GitLab Chart and GitLab Omnibus, keep following the current workflow through the Merge Request Report.
For GitLab Chart and GitLab Omnibus:
- Keep an eye on your emails, GitLab to-dos, and/or your GitLab Review Requests page — however you prefer to acknowledge review requests.
- Once you get a review request, think about our SLO (the same SLO as in the current workflow) and consider whether you'll be able to honour it and start the review on time:
- If you'll have the time capacity: start the review within the SLO window.
- If you won't have the time capacity (busy, OOO, etc.): let the author know ASAP, so that they can look for another reviewer. Ideally, you could also try to help them find one. One way to do that is to set your GitLab status to `busy` and rerun the Danger bot, so that the Reviewer Roulette will try to suggest someone else. Also be mindful when you're getting too many reviews: you have the option to set a limit on your buffer of reviews, or to completely block the roulette from suggesting you.
Do the review as you normally would and assign a maintainer once you’re happy with the MR.
Reviewers (as normal) are responsible for making sure the MR is in an acceptable state, with the checklist from the MR description adhered to. Consistency saves both the maintainers' and the authors' time.
Must I always assign to the suggested reviewer?
No. The roulette is just a suggestion. Don't hesitate to ask someone else to replace you, or to ask for a second pair of eyes on your review whenever you see fit. Use your best judgement.
What if this makes our lives harder?
We stop doing it.
Didn’t we look into this before and rule it out?
Yes. We did, 3 years ago: Add engineers of Distribution to Review Roulett... (#634 - closed).
The concerns at the time seem to be solved or mitigated, and the scenario might be different now, so we're raising the question again to reevaluate this option. Some concerns we shared back then were:
- Regarding the effort to introduce the roulette in every project and its maintainability burden: at the time, one had to introduce around 400 lines of code to integrate it in a project. Nowadays, it's about 10 lines of code, with no dependencies stored in the project (see the sketch after this list). We also had to manage project tokens, whereas now a shared `gitlab-org` token is used, so no token management is needed.
- Regarding the possible abuse of review requests and the consequent mental load of dealing with it: at the time, the roulette had no inbound-request or reviewer-availability controls. It has evolved considerably on this point, allowing each reviewer fine-grained control over the number of review requests they're willing to take. Additionally, post-suggestion availability detection was implemented, so the author can ask for another suggestion if needed.
- Regarding being locked into the roulette (Danger bot) defaults: the roulette was indeed initially envisioned to serve `gitlab-org/gitlab`, but in the last two years it has become highly customizable. It has the concept of plugins and rules that one can easily opt in to or out of when configuring what to use from the roulette; in fact, the Roulette itself is a plugin. All of this is managed by the GitLab Danger bot through the `gitlab-dangerfiles` gem. It's also possible to reuse the custom Danger rules that we already have for the GitLab Chart (metadata, changelog, reviewers message), for instance.
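As a rough illustration of how light the integration has become, here is a minimal sketch of the CI side. The shared template's project and file path are assumptions for illustration; check the `gitlab-dangerfiles` documentation for the current include.

```yaml
# .gitlab-ci.yml — hypothetical minimal wiring for the Danger review job.
# The shared template (project/file path assumed here) defines the job
# and uses the shared gitlab-org token, so no dependencies or tokens
# need to be stored in the project itself.
include:
  - project: 'gitlab-org/quality/pipeline-common'
    file: '/ci/danger-review.yml'
```

The remaining handful of lines live in the project's `Dangerfile`, which imports the plugins and rules (including the Roulette) from the gem.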
At the time, the overall consensus among those who had joined the conversation was that the roulette was heavier than needed for our use. But since then the integration with the tool has become much lighter, and with the addition of external reviewers, we have some motivation to discuss this alternative once more.
Will maintainers' processes change?
No. Maintainers will still be assigned by a reviewer once the reviewer is happy with the MR.
Why are we proposing to start with the GitLab Chart and GitLab Omnibus?
If we prefer, we can start with a less active project like the GitLab Operator.
The idea behind starting with these projects is precisely to get the most impactful return from adding the roulette, as in less active projects it could be harder to observe its benefits and drawbacks. Also, the external reviewers are mostly interested in reviewing these projects.
What do I do if I feel overloaded with this push-based review process?
There are a few options:
- Set your review limit using a number as your GitLab status: the emojis 1️⃣, 2️⃣, ... (`:one:`, `:two:`, ...) define your review buffer size.
- Put 🔴 (`:red_circle:`) as your status and the roulette will no longer suggest you at all.
- Use 🔶 (`:large_orange_diamond:`) to decrease the likelihood of getting suggested.
- At any time you can ask the author to assign a new reviewer because “you're at capacity” or “about to go on leave” or any other reason. You can also just assign another reviewer yourself, and they can then do the same if they don't have capacity.
The key is just to avoid leaving things assigned to yourself for a long time if you don't actually have time to get to them.
If you have a review buffer limit set, then after you've approved and assigned your MR to a maintainer, consider removing yourself as the reviewer, so that your buffer frees up a spot for the next reviews.
Where can I learn more about these emoji statuses for availability control?
There are also “on leave”, “sick”, “focus time”, and other statuses that you can use for different purposes. See:
https://docs.gitlab.com/ee/development/code_review.html#reviewer-roulette
Does roulette run on a scheduled worker?
It runs as part of the danger CI/CD job.
There's a scheduled worker which validates the roulette suggestion from time to time and marks the suggested reviewer as no longer valid if they went OOO or busy after the suggestion. We don't need to maintain this scheduled pipeline; it comes with the tool.
If this happens and the author wants another suggestion, they can get one by rerunning the danger CI/CD job.
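For context, the job the author reruns is an ordinary Danger CI/CD job. A hypothetical minimal version is sketched below; the job name, image, and flags are illustrative, as the shared template defines the real one.

```yaml
# Hypothetical minimal Danger job; in practice the shared template
# provides the real definition, including the roulette comment on the MR.
danger-review:
  stage: test
  image: ruby:3.2
  rules:
    - if: $CI_MERGE_REQUEST_IID   # only run on merge request pipelines
  script:
    - bundle install
    - bundle exec danger --fail-on-errors=true
```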
What if the danger CI/CD job fails to run?
Then the author doesn't get the reviewer suggestion. But this is also a problem for the current approach, as the Danger bot is also responsible for posting the message which tells the author how to ask for reviews. If Danger fails, the author will see the failed CI job and need to action it anyway.
What if there's no one available for the Roulette to suggest?
This can happen if everyone has set themselves as busy, OOO, sick, etc. This is similar to the case when we breach MRs today. One option could be to tell authors to ping `@gitlab-org/distribution`; another is to fall back to the current workflow with the `workflow::ready for review` label and the Merge Request report. If this happens, we should track the experience to know when the process is failing. It might happen in the beginning, but as we add more and more reviewers, the expectation is that it won't happen anymore. I've honestly never seen it happen on `gitlab-org/gitlab`, and with 15 reviewers already for `gitlab-chart`, we expect this is unlikely.
How does the roulette identify reviewers and maintainers?
It uses our team member YAML config to define who is a `maintainer`, `reviewer`, `trainee_maintainer`, `maintainer docs`, or any other custom category.
So if one wants to add themselves as a reviewer, they just need to update their file. For example:
```yaml
# ...
projects:
  gitlab-chart: reviewer
  omnibus-gitlab: trainee_maintainer
```
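For context, a fuller (hypothetical) entry could declare different roles across several projects. The `gitlab-docs` key and its role below are illustrative examples of the custom categories the roulette understands, not real assignments.

```yaml
# Hypothetical fuller `projects` map combining the standard categories.
projects:
  gitlab-chart: reviewer          # regular reviewer weight
  omnibus-gitlab: trainee_maintainer
  gitlab-docs: maintainer docs    # illustrative custom category
```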
Is there a difference between being a `reviewer` or a `trainee_maintainer`?
Yes. In relation to a `reviewer`, `trainee_maintainer`s are 3 times as likely to get picked for that project's MRs. For example, in a pool of three reviewers and one trainee maintainer, the trainee holds 3 of the 6 weight units, so they'd be suggested about half the time. This holds unless they're using an emoji status that reduces their likelihood of being suggested, or have blocked themselves with one of the blocking emoji statuses.
What is the responsibility of a reviewer?
This should be the same as the current process. As follows:
- Make sure the Author completed their MR template checklist.
- Complete the Reviewer MR template checklist.
- Identify possible flaws/risks in the implementation.
- Test the solution in accordance with the definition of done and the indicated test scenarios.
- Suggest improvements.
- When ready, assign a maintainer.
How much effort will it be to implement the roulette?
We’ve already implemented it for the GitLab Chart in this open MR: Upgrade danger and add roulette [EXPERIMENT] (gitlab-org/charts/gitlab!3252 - merged).
It's a small MR. Once merged, the process gets "activated" for the whole project.
What do we do about the existing “ready for review” label?
We have two options:
1. Stop using it, and communicate clearly in the Danger bot message that the author should assign the reviewer suggested by the Reviewer Roulette. We can still link to our SLO guidelines, so if the reviewer doesn't get there in time, the author follows the same call-for-help process.
2. Keep it, and ask the author to run the command `@gitlab-bot ready @reviewer_username`. This command assigns the reviewer and automatically adds the `workflow::ready for review` label.

There's discussion about deprecating `workflow::ready for review`, and ideally, in the future, the only distinction would be draft vs. non-draft. So the initial suggestion is to experiment with living without this label, as this could be valuable.
We've also opened an MR to add a section to our Distribution Merge Requests doc (gitlab-com/www-gitlab-com!126266 (merged)) clarifying this workflow experiment.
Update: We decided to go with option 2, keeping the `workflow::ready for review` label via `@gitlab-bot ready @reviewer_username`, as the maintainers use it to identify high-priority MRs that are waiting for review.
Do we still have an SLO for these projects?
Yes, our existing SLO should still apply. Ideally, when you are assigned an MR for review, you'll respond within the same SLO expectations that we have for the review dashboard queue.
Will this increase the workload on maintainers?
Technically, the number of MRs a maintainer gets should not increase, as the total number of MRs created is outside this team's control. But it may change the latency between an MR being created and when it reaches a maintainer. We hope this process will also reduce the number of times an MR reaches a maintainer after a breached SLO deadline without a reviewer having looked at it first. Over time, this should improve the quality of the MRs, which might mean fewer problems for the maintainer to pick up on, and fewer breaching MRs (internal).
How do we track the success/failure of this experiment?
Feel free to add feedback as comments on this very issue, as suggested in #1304 (comment 1576589857).