Upgrade Geo multi-server installation from 13.0.10 to latest 13.1 version

Versions

Current: 13.0.10
Target: 13.1.x

Upgrade checklist

Preflight - upgrade

  • Schedule a (recorded) Zoom meeting
  • Check if the Geo HA update instructions contain version-specific changes
  • Check if PostgreSQL is already the latest shipped version. If not, follow the PostgreSQL upgrade instructions
  • Check if any upgrade warnings exist
  • Verify that the Geo cluster is healthy pre-upgrade by visiting the Admin-Geo Nodes dashboard
  • Remove deploy nodes from the load balancers and stop Sidekiq, then run the looping pipeline to confirm the tests pass
  • Find the latest packaged version of GitLab that can be used for zero downtime upgrades
  • Set up the looping test pipeline to run during the upgrade procedure
  • Open the HAProxy stats dashboard for each site, to monitor health checks
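  • Have the readiness check failure logger script ready (readiness_logger.sh):
#!/bin/bash
# Poll the readiness endpoint of both sites and log every failed check.
# Replace primary-url and secondary-url with each site's external URL.

while :
do
  # --max-time keeps a hung request from stalling the loop; it is logged as a failure
  curl --fail --max-time 5 primary-url/-/readiness &> /dev/null || echo "$(date) Primary failed" | tee -a failure_log.txt
  curl --fail --max-time 5 secondary-url/-/readiness &> /dev/null || echo "$(date) Secondary failed" | tee -a failure_log.txt
  sleep 1  # short pause between rounds so the loop does not flood the endpoints
done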
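
Once the upgrade is over, the failure log can be condensed into a per-site count. The following is a minimal sketch, not part of the original procedure: the function name `summarize_failures` is hypothetical, and it assumes every logged line ends with "Primary failed" or "Secondary failed", as written by the logger script above.

```shell
#!/bin/bash
# Hypothetical helper: count the readiness failures recorded per site.
# Assumes the line format written by readiness_logger.sh:
#   "<date output> Primary failed" or "<date output> Secondary failed"

summarize_failures() {
  local log_file="${1:-failure_log.txt}"
  local primary=0 secondary=0
  # The logger only creates the file on the first failure, so a missing
  # file is treated as zero recorded failures.
  if [ -f "$log_file" ]; then
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the printed count without propagating that exit status.
    primary=$(grep -c 'Primary failed$' "$log_file" || true)
    secondary=$(grep -c 'Secondary failed$' "$log_file" || true)
  fi
  echo "primary=${primary} secondary=${secondary}"
}
```

To use it, source the file and call `summarize_failures failure_log.txt` from the directory where the logger ran. Note that a count of zero only means no failures were recorded; it does not prove the logger was running for the whole upgrade.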

Upgrade

  • Retrieve a beverage of choice within a drinkable temperature range
  • Join the Zoom meeting, wait for the other participants to arrive, and start the recording
  • Manually trigger the looping test pipeline before upgrading the primary site, and again before upgrading the secondary site if failures occurred during the primary site upgrade. Wait a few minutes for the tests to begin
  • Start logger: ./readiness_logger.sh
  • Perform the upgrade steps described in the latest documentation
  • During the upgrade process, monitor the readiness_logger output, the HAProxy stats dashboards, and the looping test pipeline for any failures
  • Record any issues encountered during the upgrade
  • Verify cluster health post upgrade
  • Verify PostgreSQL version is correct

Postflight

  • Record the upgrade outcome as one of: SUCCESS (upgraded with zero downtime), PARTIAL SUCCESS (upgraded, but with downtime or unconfirmed downtime), or FAILED
  • Open new issues and inform @nhxnguyen and @fzimmer to confirm next steps
  • Create a new issue for the next upgrade demo (for the next versions) and assign it to @nhxnguyen and @fzimmer
  • Update the Geo validation tests docs page (doc/administration/geo/replication/geo_validation_tests.md)
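
The three outcome labels above can be derived mechanically from the readiness failure log. This is a rough sketch under stated assumptions rather than an established part of the process: the function name `classify_outcome` and its "yes"/"no" argument are hypothetical, and it looks only at failure_log.txt, not at the HAProxy dashboards or the looping test pipeline.

```shell
#!/bin/bash
# Hypothetical helper: map the upgrade result and the readiness failure log
# to one of the outcome labels used in the postflight checklist.
#   $1: "yes" if the upgrade itself completed, anything else otherwise
#   $2: path to the failure log (defaults to failure_log.txt)

classify_outcome() {
  local upgrade_ok="$1"
  local log_file="${2:-failure_log.txt}"
  if [ "$upgrade_ok" != "yes" ]; then
    echo "FAILED"
  elif [ -s "$log_file" ]; then
    # -s is true when the file exists and is non-empty, meaning at least
    # one readiness check failed during the upgrade.
    echo "PARTIAL SUCCESS"
  else
    echo "SUCCESS"
  fi
}
```

For example, `classify_outcome yes failure_log.txt` prints `SUCCESS` when the log is empty or absent. The sketch cannot tell whether the logger actually ran, so if monitoring was interrupted the downtime is unconfirmed and the outcome should be recorded manually as PARTIAL SUCCESS.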