Document that Geo replication continues during Maintenance Mode despite the UI showing Unhealthy
Problem to solve
When Maintenance Mode is engaged, the Geo Nodes UI shows an outdated status for all Geo secondaries, and eventually they become Unhealthy. This leads to confusion about the true state of the system during upgrades and similar operations.
Documenting this on the Maintenance Mode page would prevent that confusion and save users troubleshooting time.
Other links/references
A Premium customer spent a great deal of time trying to fix what they thought was a broken Geo secondary because this information was missing from the docs.
- https://gitlab.zendesk.com/agent/tickets/230376 (internal link)
- https://gitlab.my.salesforce.com/0014M00001lbBxrQAE (internal link)
A related issue, #292983 (closed), which allows node status writes during Maintenance Mode so that the UI reports the true replication status, was fixed by !70010 (merged) in 14.4. We still need to update the docs for the affected versions, GitLab 13.9 through 14.3.
Proposal
- Add a section in https://docs.gitlab.com/ee/administration/geo/replication/troubleshooting.html for "Geo node status unhealthy after enabling maintenance mode".
- Mention that it's fixed in 14.4, and affected versions are 13.9 through 14.3.
- Show how to get the actual status of Geo secondary sites during maintenance mode.
- Modify https://gitlab.com/gitlab-org/gitlab/-/blob/a59a3917bad7ba4dd1245670fdbddf04e09c87d5/doc/administration/geo/disaster_recovery/planned_failover.md#finish-replicating-and-verifying-all-data.
- Say that 13.9 through 14.3 are affected by a bug and link to the Troubleshooting section.
- Mention and link to the Troubleshooting section in https://docs.gitlab.com/ee/administration/geo/replication/version_specific_updates.html for affected versions
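As a starting point for the "actual status" item above, here is a minimal sketch of what the Troubleshooting section could show. The helper names `is_affected` and `check_geo_status` are hypothetical and exist only for illustration. The sketch assumes an Omnibus installation and assumes that `sudo gitlab-rake geo:status`, run on the secondary site, computes the status directly rather than reading the stale record shown in the UI; that should be confirmed on an affected version before it goes into the docs.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GitLab.

# is_affected VERSION
# Exit status 0 if VERSION (for example 14.2.5) falls in the range this
# issue names as affected: GitLab 13.9 through 14.3.
is_affected() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # minor version number
  if [ "$major" -eq 13 ] && [ "$minor" -ge 9 ]; then return 0; fi
  if [ "$major" -eq 14 ] && [ "$minor" -le 3 ]; then return 0; fi
  return 1
}

# check_geo_status
# Assumption: run on the Geo secondary site of an Omnibus installation.
# Prints the Geo status from the command line instead of the Admin Area UI.
check_geo_status() {
  sudo gitlab-rake geo:status
}
```

For example, `is_affected 14.2.5 && check_geo_status` would run the command-line check only on a version inside the affected range.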
Edited by Michael Kozono