Improve PostgreSQL replication documentation/runbooks
Our guides need to state:
- How to set up replication (both a primary and N secondaries)
- How to perform a failover
- That pg_basebackup might hang for 10 minutes without producing any output (while waiting for the primary to respond)
- What files need to be placed where
- Where the credentials for the gitlab_replicator user are stored (1Password, Chef vaults, etc.)
- What graphs/monitoring we have to help out engineers (e.g. links to individual dashboards in Grafana)
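As a possible starting point for the setup section, here is a minimal sketch of how a secondary could be seeded with pg_basebackup. It only builds the argument list. The host name and data directory are hypothetical placeholders, and the choice of flags is an assumption rather than what our current procedure uses. Requesting a fast checkpoint is one common way to shorten the silent wait at the start of a base backup, though it may not be the cause of the hang described above.

```python
def basebackup_command(primary_host, data_dir, user="gitlab_replicator"):
    """Build a pg_basebackup argument list for seeding a secondary.

    primary_host and data_dir are supplied by the caller; the values used
    in the example below are hypothetical.
    """
    return [
        "pg_basebackup",
        "-h", primary_host,  # primary to copy from
        "-U", user,          # replication user
        "-D", data_dir,      # empty target data directory on the secondary
        "-X", "stream",      # stream WAL while the base backup runs
        "-c", "fast",        # ask for an immediate checkpoint (assumption: acceptable on the primary)
        "-P",                # report progress once the copy starts
        "-R",                # write the standby recovery settings
    ]


if __name__ == "__main__":
    # Hypothetical host and path, for illustration only.
    cmd = basebackup_command("primary.example.com", "/path/to/empty/data_dir")
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on the secondary once the target directory is empty and the credentials are in place.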
@jnijhof I'm assigning this to you since you know the most about this aspect. Feel free to re-assign if necessary.
For PostgreSQL replication, please consider using replication slots (available since 9.4): if all of your hot standbys use replication slots on the master, the master will keep all the WAL segments they need to get back in sync (even if they have lagged behind for days). The only drawback is that the master can run out of disk space if one of the replicas is broken for long enough, so you need to monitor this carefully.
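To illustrate the monitoring part, here is a rough sketch of how the WAL retained by each slot could be checked. It assumes the rows come from `pg_replication_slots` (columns `slot_name`, `active`, `restart_lsn`) and that the current WAL position is fetched separately. The function names and the limit are hypothetical and not part of any existing tooling.

```python
# Rows are expected from: SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
# The current WAL position comes from pg_current_xlog_location() on 9.x
# (renamed pg_current_wal_lsn() in PostgreSQL 10).


def lsn_to_int(lsn):
    """Convert an LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def retained_wal_bytes(current_lsn, restart_lsn):
    """Bytes of WAL the master must keep for a slot at restart_lsn."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def slots_over_limit(current_lsn, slots, limit_bytes):
    """Return (slot_name, active, retained_bytes) for slots above limit_bytes.

    slots is an iterable of (slot_name, active, restart_lsn) tuples.
    """
    flagged = []
    for name, active, restart_lsn in slots:
        if restart_lsn is None:
            # The slot has never been used, so it holds no WAL yet.
            continue
        retained = retained_wal_bytes(current_lsn, restart_lsn)
        if retained > limit_bytes:
            flagged.append((name, active, retained))
    return flagged
```

An alert could fire whenever `slots_over_limit` returns a non-empty list, with the limit chosen well below the free space on the master's WAL volume. An inactive slot in that list usually means a replica is down and still pinning WAL.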