Scaling the database review process
Each merge request that contains database-related changes has to be reviewed and approved by @gl-database (which translates to @abrandl). The review discussion usually concentrates around query performance, data migrations, database design and correctness.
This process does not scale. There are a lot of engineers submitting merge requests and only a few (one or two) persons doing database reviews. This is a bottleneck that is best felt in the week before code freeze where pressure to review and approve significantly rises and the queue of pending reviews is long (as of today, 24 MRs long with currently 12 MRs - note this is from roughly 4 days worth of pings, with only a couple MRs older than that).
I would like to propose a new process that eventually gives us the following benefits:
- Transport database knowledge to engineering teams
- Remove the database review bottleneck
- Cultural shift towards engineering taking more responsibility for database
Proposal
We have a process in place how to become a GitLab maintainer. An engineer starts out as a first reviewer who always assigns to a maintainer for final review and merge. Eventually, once the reviewer is deemed qualified he/she is being promoted to maintainer.
I'd like to propose a similar process for database maintainers. The idea is to follow these steps:
- Engineers interested in database work volunteer as database reviewers and act as a first line of defense by conducting a first review and database discussion with the MR author. This already lowers the pressure and reduces the amount of review work for @gl-database.
- Once a database reviewer is satisfied, she pings @gl-database to conduct a final database review and to approve the MR. Database approvals should be limited to database maintainers.
- After a while, the reviewer is encouraged to ask for maintainership and - if deemed qualified - is promoted as such.
How does this relate to the above benefits?
- Transport deep database knowledge to engineering teams
Once @gl-database is pinged on the MR, there are at least three persons involved: The MR author, a database reviewer and a database maintainer. In case there is still a database concern in the MR after the reviewer deemed it ready, this is a good opportunity to "educate" all participants to understand this concern better. It is specific, concrete and fresh in everybody's minds. Personally, I deem that a much better opportunity to learn than semi-artificially crafting "good practices" and running training sessions.
Becoming a database maintainer and thereby sharpening the database skill is also an interesting career opportunity for a software engineer to become more specialized.
- Remove the database review bottleneck
From day 1 - there are database reviewers in place that only ping @gl-database if they're either satisfied or not sure about something. This already reduces the workload for @gl-database because oftentimes it's a matter of directing the MR author to a piece of documentation or similar. With reviewers getting more routine, the workload is going to drastically reduce.
Once reviewers are ready to be promoted maintainer, the bottleneck will be gone.
- Cultural shift towards engineering taking more responsibility for database
Today, engineering implements awesome new features every day and @gl-database (in infrastructure!) approves the database-related code change. This means, infrastructure takes a good portion of responsibility over database-related changes and made responsible if GitLab.com falls over because of a database-related change.
A database-related change in this context is a code change, not an infrastructural change. Hence it seems unnatural for an infrastructure team to take a major part of responsibility for it. DBREs would still participate in code reviews (when necessary) and advice on database design and queries and so on (personally, I quite enjoy doing this and also being able to work in software engineering as part of my role). However, I feel strongly that the approval process and hence the ownership and responsibility for database-related code changes ultimately belongs to the engineering side and I would like to see us shifting towards this.
We used to have a database engineering team (in infrastructure) where this review process felt more natural because we were focused on application changes at that time. However, this team has dispersed and DBREs now operate from the SRE team with a broader responsibility over databases infrastructure - hence reviewing database changes is sometimes deemed as a blocker for database-related infrastructure work.
/cc @glopezfernandez @Finotto @gitlab-com/gl-infra
Pings to engineering via @marin @DouweM @smcgivern @rnienaber @lmcandrew @meks @ayufan @yorickpeterse @tommy.morgan @dhavens