Add section to DR documentation describing upgrade impact on secondary
I have a question regarding instructions for "disabling" a secondary Geo node during the upgrade of the primary. If a secondary node is set up as a DR solution, should we disable PostgreSQL replication and/or other syncing while the primary is upgraded?
Imagine a situation in which the primary node (which could be an HA setup) fails to upgrade and a secondary should be promoted to primary. What is the right process?
Possible scenarios:
1. Disable DR node
   - "Disconnect" the secondary DR node (its state is now fixed at a point in time)
   - Upgrade the primary and the remaining secondaries
   - If successful, upgrade the DR node and re-sync; else promote the DR secondary
2. All nodes are enabled, including DR node
   - Upgrade the primary and all secondaries without "disconnecting" the DR node
   - If the primary upgrade succeeds, you are done; else promote a secondary
The question is really: Is there any way that a failure in the primary upgrade can propagate to the secondary that is designated for DR purposes?
This is not so important if Geo nodes are used purely for geo replication, but it is highly relevant for DR applications.
Proposal
- Create a documentation entry in the DR section that describes the risks listed in the ticket
- Create instructions for stopping replication on the DR secondary, e.g.:
sudo gitlab-ctl stop postgresql
sudo gitlab-ctl stop repmgrd # if HA
# ...
sudo gitlab-ctl start repmgrd # if HA
sudo gitlab-ctl start postgresql
- Crosslink to upgrade instructions.
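
As a starting point for those instructions, the stop/start commands proposed above could be wrapped in a small script that runs on the DR secondary. This is only a sketch: the function names `pause_replication` and `resume_replication`, the `run` helper, and the `DRY_RUN` and `HA` variables are hypothetical, and whether stopping PostgreSQL is the right way to "disconnect" a DR node is exactly the question this issue raises. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: pause and resume replication on a DR secondary around a
# primary upgrade, using the gitlab-ctl commands proposed in this issue.
# DRY_RUN and HA are hypothetical knobs introduced for this example.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them
HA="${HA:-0}"             # 1 = repmgrd is in use on this node

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Call before upgrading the primary: the DR node stays at its current state.
pause_replication() {
  run sudo gitlab-ctl stop postgresql
  if [ "$HA" = "1" ]; then
    run sudo gitlab-ctl stop repmgrd
  fi
}

# Call after a successful primary upgrade, in the reverse order.
resume_replication() {
  if [ "$HA" = "1" ]; then
    run sudo gitlab-ctl start repmgrd
  fi
  run sudo gitlab-ctl start postgresql
}
```

Sourcing the file and calling `pause_replication` prints the commands; setting `DRY_RUN=0` executes them. If the primary upgrade fails, `resume_replication` would be skipped and the documented promotion procedure followed instead.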
Edited by Fabian Zimmer