Fix zero-downtime upgrade process/instructions for multi-node Geo deployments

We have observed downtime while following the zero-downtime upgrade instructions for multi-node Geo deployments. This issue covers identifying and fixing the blockers to zero-downtime upgrades, and updating the zero-downtime upgrade instructions accordingly.

  1. Identify at which step(s) downtime occurs during an upgrade. This might involve using HAProxy dashboards, real-time server logs, and/or other means of getting live feedback. End-to-end tests are not well suited to this, because their built-in waits delay the feedback.

  2. Fix any blockers to zero-downtime upgrades.

  3. Test the revised zero-downtime upgrade process on the current and previous versions of GitLab (the versions with version-specific instructions available on docs.gitlab.com).

  4. Revise the zero-downtime instructions for the current GitLab version, and update the instructions for the previous GitLab versions tested in the previous step. For each of those versions, either correct the zero-downtime upgrade instructions or remove them if zero-downtime is not possible.
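
As one possible way to get the live feedback described in step 1, a small availability probe could run alongside the upgrade and record when requests start and stop failing. The sketch below is only an illustration, not part of the existing tooling: the URL, the polling interval, and the function names `probe` and `downtime_windows` are all assumptions, and the endpoint should be replaced with whatever the load balancer actually serves in the deployment under test.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Iterable, List, Tuple


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx/3xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # HTTP 4xx/5xx, refused connections and timeouts all count as "down".
        return False


def downtime_windows(
    samples: Iterable[Tuple[float, bool]]
) -> List[Tuple[float, float]]:
    """Collapse (timestamp, ok) samples into (start, end) windows of failures.

    A window starts at the first failed sample and ends at the first
    successful sample after it, or at the last failed sample if the
    service never recovered while sampling.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, last_fail))
    return windows


if __name__ == "__main__":
    # Hypothetical values: substitute the real external URL and interval.
    URL = "https://gitlab.example.com/users/sign_in"
    INTERVAL_SECONDS = 1.0

    samples: List[Tuple[float, bool]] = []
    try:
        while True:
            ts = time.time()
            ok = probe(URL)
            samples.append((ts, ok))
            stamp = time.strftime("%H:%M:%S", time.localtime(ts))
            print(stamp, "up" if ok else "DOWN", flush=True)
            time.sleep(INTERVAL_SECONDS)
    except KeyboardInterrupt:
        # On Ctrl+C, summarize the failure windows seen during the upgrade.
        for start, end in downtime_windows(samples):
            print(f"down from {start:.0f} to {end:.0f} ({end - start:.1f}s)")
```

Starting the script before the first upgrade step and stopping it with Ctrl+C afterwards prints wall-clock timestamps that can be lined up against the time each upgrade step was run. The resolution is limited by the polling interval and the request timeout, so very short interruptions may be missed.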