Proposal of updated Reference Architectures including Gitaly Cluster and Postgres 12
With Gitaly Cluster and Postgres 12 recently approved for the Reference Architectures in 13.8, I'm proposing updated Reference Architectures that include them.
In this issue I'll detail the newly proposed Architectures, each in its own comment thread below, based on analysis and testing of the new and updated components.
There are also a few notes on the new Architectures that should be called out:
- Only the HA Reference Architectures (3k+) will see notable changes. Non-HA Reference Architectures will remain unchanged, as they will still use Gitaly Sharded and don't need any further changes.
- For all Reference Architectures moving to Gitaly Cluster, the number of Gitaly nodes will now be standardised to 3, as recommended. This will be an increase for some and a decrease for others. Performance was found to be good with this change.
- As mentioned above, Praefect's database options vary depending on user needs (for example, whether HA is a top concern) and will be laid out in the docs. In the hardware recommendation table it will be called out as its own item, which thankfully now needs very modest specs given the great performance improvements detailed in this issue.
- For load balancing calls to the Praefect nodes, the same internal load balancer can be used without any change in specs.
- Architectures from 5k up will have their Postgres machine sizes bumped up to the next size. This is to give us more CPU headroom for the future: on these larger architectures, Postgres CPU usage under lab conditions has for some time been higher than we'd like, leaving less margin than we're comfortable with for real-life conditions. With the big change to Cluster, this feels like the best time to do it to minimise disruption, especially as we hope in the near future to be able to recommend moving the Praefect database over to the same cluster as GitLab's main one.
  - This change will give us plenty of headroom moving forward, so I don't expect any further changes to Postgres specs in the long term.
  - Note that this is not specifically due to any performance issues with Postgres 12 or Patroni. Both of these components were found to perform in line with their predecessors (Postgres 11 and RepMgr).
  - Customers happy with the previous database specs can stick with those if desired, as this is mostly a proactive headroom move, but they may see performance degradation in the future with newer versions of GitLab or a change in their workload shape.
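
To make the scope of the notes above easier to check at a glance, here is a minimal Python sketch of which changes apply at which user count. It is illustrative only and not part of any GitLab tooling: `proposed_changes`, `ProposedChanges` and the constant names are hypothetical, and it assumes "3k+" means 3,000 users or more and "5k up" means 5,000 users or more. It deliberately doesn't model actual machine sizes, as those are detailed per architecture in the comment threads.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the notes above; the names are hypothetical.
HA_THRESHOLD_USERS = 3_000             # "3k+" HA Reference Architectures
POSTGRES_BUMP_THRESHOLD_USERS = 5_000  # "5k up" get the next Postgres size
CLUSTER_GITALY_NODES = 3               # standardised Gitaly node count


@dataclass(frozen=True)
class ProposedChanges:
    uses_gitaly_cluster: bool    # False means Gitaly Sharded, unchanged
    gitaly_nodes: Optional[int]  # None means the node count is unchanged
    postgres_size_bumped: bool   # True means move to the next machine size


def proposed_changes(users: int) -> ProposedChanges:
    """Summarise the proposed changes for an architecture of `users` users."""
    is_ha = users >= HA_THRESHOLD_USERS
    return ProposedChanges(
        uses_gitaly_cluster=is_ha,
        gitaly_nodes=CLUSTER_GITALY_NODES if is_ha else None,
        postgres_size_bumped=users >= POSTGRES_BUMP_THRESHOLD_USERS,
    )


if __name__ == "__main__":
    for users in (1_000, 2_000, 3_000, 5_000, 10_000):
        print(users, proposed_changes(users))
```

Under these assumptions a 3k architecture would move to Gitaly Cluster with 3 Gitaly nodes but keep its current Postgres size, while anything from 5k up would get both changes.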
Finally, the architectures are of course still subject to change. I'll update this issue once they're verified.
Edited by Grant Young