Team-specific database and performance specialists
Currently the database team consists of two members: me and @_stark, and we have different areas and levels of experience (e.g. I'm more of an application development guy than an infrastructure guy).
One of the recurring problems I have run into over the past 2 years is that I have to review a lot of changes from different teams, and often the bigger picture/context is not clear to me. For example, in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/14879 the CI team is reworking their clustering setup. While I can review the database changes in isolation, I lack the context to understand what else may need to be done and how the changes fit into the bigger picture.
It's unreasonable to expect a small number of people (3 as of January, hopefully a few more later in 2018) to understand everything that goes on in the CI, Platform, Discussion, Edge, and Frontend teams. This applies to both database work and performance work in general.
As such I would propose the following setup:
Instead of having DB specialists act as flying goalies, we have a few "generic" ones and a handful of team-specific specialists. In this setup the generic DB specialists work on cross-team problems and architecture decisions, while the team-specific ones focus on the problems of their teams. For example, a DB specialist in the CI team would specifically help the CI team. These members would still be part of the DB team and report to the same people; they would just focus on specific teams.
This setup would make my life a lot easier, as I would no longer have to review every merge request out there. Instead, these team-specific members could take care of that and refer to me (or somebody else) for additional guidance/info if necessary.
This is also where performance specialists come in. @sytses in the past brought up the idea of reintroducing a performance team, but I never liked this for a simple reason: having a generic performance team doesn't work. We tried this in 2016, and it's too hard for a small team (fewer than 5 people at the time) to try to fix all problems. I do think, however, that this could work if we introduce team-specific performance specialists. These members would focus on generic performance problems (e.g. Ruby code, memory usage, Gitaly N+1 problems) while the database specialists focus on database-specific issues.
This does mean, however, that we either need to hire or train quite a lot of people over time. Per team I think it's best to have at least 2 database specialists and 2 performance specialists. This way, if somebody gets hit by a bus, the team doesn't immediately lose its specialist. With the five teams mentioned above (CI, Platform, Discussion, Edge, and Frontend), 4 specialists per team means we would have to hire 20 additional developers, which is very hard considering it took us over a year to hire just 2 additional database specialists (one of whom hasn't started yet).
Instead of hiring 20 developers right away, I think we should start with hiring/training 1 database specialist for the CI team. This team often has the largest/most complex database changes, so having an additional person on that team alone would help a lot. After the CI team, I think focusing on Discussion is best, since they have to take care of two of the most crucial parts of GitLab: issues and merge requests. @smcgivern has been working on large migrations for MR diffs, and I think having somebody to help him out with similar changes in the future could be really beneficial.
We may want to extend this to the SRE team @pcarranza is building as well. That way the SRE team won't have to go through the exact same problems.
@sytses @edjdev @pcarranza any thoughts on this?