Improve Vault failure tolerance and failover downtime
- 5 replicas means a quorum of 3, so we can tolerate 2 node failures, instead of 1 node failure with 3 or 4 replicas. Vault uses few resources, so we can afford it, see https://www.vaultproject.io/docs/internals/integrated-storage#deployment-table
- `performance_multiplier = 1` allows the fastest failure detection possible at the expense of higher CPU and network usage, reducing downtime when a node fails, see https://www.vaultproject.io/docs/configuration/storage/raft#performance_multiplier
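
For reference, the replica-count reasoning follows the usual Raft majority rule. A minimal sketch of that arithmetic, assuming quorum is a strict majority of voting nodes (the function names here are illustrative, not part of Vault):

```python
def quorum(replicas: int) -> int:
    """Smallest strict majority of the voting nodes."""
    return replicas // 2 + 1


def failure_tolerance(replicas: int) -> int:
    """Nodes that can fail while a quorum is still reachable."""
    return replicas - quorum(replicas)


# Compare the replica counts discussed above.
for n in (3, 4, 5):
    print(f"{n} replicas: quorum {quorum(n)}, tolerates {failure_tolerance(n)} failure(s)")
```

Going from 3 to 4 replicas raises the quorum without raising the failure tolerance, which is why the jump is straight to 5.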
Part of https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/15449