Purposely fail over the primary to an up-to-date secondary
There is currently no mechanism for an admin to purposely choose the primary or trigger a failover away from the primary. This would be useful in several cases:
- Scheduled maintenance: before terminating a Gitaly Cluster node that acts as primary for certain repositories, we could re-elect a secondary as primary, so that we are sure the terminated node doesn't hold any latest data that hasn't been replicated to the secondaries.
- Load balancing: we don't want one node to act as primary for too many repositories; we want primary election and write operations to be evenly distributed across nodes.
- We currently operate 3 Gitaly Cluster nodes (say, gitaly-1, 2, 3) with the default failover election strategy (`per_repository`), strong consistency, and replication to all nodes. Theoretically, each node is assigned as primary for 1/3 of all repositories in this virtual storage. If there is a node failure or a node rollout on gitaly-1, Praefect re-elects either gitaly-2 or gitaly-3 as primary for each affected repository; assuming an even distribution, each remaining node is then primary for 1/2 of all repositories. Then gitaly-1 comes back online. However, IIUC, Praefect won't re-elect gitaly-1 as primary for the repositories that previously used it, so at this point gitaly-1 is primary for no repository while gitaly-2 and gitaly-3 each remain primary for 1/2 of all repositories. Now suppose there is yet another node failure, this time on gitaly-2: will Praefect re-elect gitaly-1 to take over everything from gitaly-2, or does it again distribute evenly between gitaly-1 and gitaly-3, i.e. gitaly-1 primary for 1/4 and gitaly-3 for 3/4?
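The scenario in the last bullet can be sketched as a toy simulation. This is not Praefect's actual code; it assumes a hypothetical model where a failed primary's repositories are re-elected uniformly at random among the remaining healthy nodes, and where a recovered node gets nothing back until another failover happens. Repository counts and node names are made up for illustration.

```python
import random

def reelect(primaries, failed_node, healthy_nodes):
    """Re-elect a new primary for every repository whose primary failed.

    Hypothetical model: the new primary is chosen uniformly at random
    among the remaining healthy nodes; repositories whose primary is
    still healthy keep it (no rebalancing back to a recovered node).
    """
    return {
        repo: (random.choice(healthy_nodes) if node == failed_node else node)
        for repo, node in primaries.items()
    }

random.seed(0)
nodes = ["gitaly-1", "gitaly-2", "gitaly-3"]
# 9000 repositories, initially spread evenly: ~1/3 per node.
primaries = {f"repo-{i}": nodes[i % 3] for i in range(9000)}

# gitaly-1 fails: its repositories split between gitaly-2 and gitaly-3,
# leaving each of them primary for ~1/2 of all repositories.
primaries = reelect(primaries, "gitaly-1", ["gitaly-2", "gitaly-3"])

# gitaly-1 recovers but is primary for nothing; then gitaly-2 fails.
# Under this model gitaly-2's repositories split between gitaly-1 and
# gitaly-3, ending at roughly 1/4 for gitaly-1 and 3/4 for gitaly-3.
primaries = reelect(primaries, "gitaly-2", ["gitaly-1", "gitaly-3"])

counts = {n: sum(1 for p in primaries.values() if p == n) for n in nodes}
print(counts)
```

Under this assumed model the answer to the question would be "evenly between gitaly-1 and gitaly-3" (roughly 1/4 vs 3/4); whether Praefect actually behaves this way is exactly what the bullet above is asking.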
Edited by Masa Yoshida