Test and validate PostgreSQL 10.0 upgrade for Geo
With the %12.0 release, we shipped PostgreSQL 10.0, and existing installations will be upgraded to it automatically. This automatic upgrade doesn’t handle Geo secondary nodes (this is a limitation of how the PostgreSQL upgrade process works). We need to validate that our documentation is correct and covers the whole process. Ideally, we should instruct our users to disable the automatic PostgreSQL 10.0 upgrade so they can plan ahead when and how to roll it out when Geo is enabled. A simplified walkthrough of the upgrade to 12.0 with Geo would be:
- Disable PostgreSQL automatic upgrade
- Upgrade GitLab to 12.0
- Schedule downtime on secondary nodes
- Put secondary in maintenance mode (halt access)
- Upgrade primary
- Trigger PostgreSQL upgrade on secondary
- Backup/restore database from primary
- Enable access again
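The walkthrough above can be sketched as an ordered, dry-run checklist. This is a minimal illustration only: the `Step` class and `geo_upgrade_plan` function are hypothetical helpers, and the commands shown (the `/etc/gitlab/disable-postgresql-upgrade` marker file, `gitlab-ctl pg-upgrade`, and `gitlab-ctl replicate-geo-database` with placeholder arguments) are assumptions that should be checked against the Omnibus and Geo documentation before use. Steps without a known command are left as manual.

```python
from dataclasses import dataclass
from typing import List

# Assumption: marker file that makes Omnibus skip the automatic
# PostgreSQL upgrade. Verify against the Omnibus documentation.
DISABLE_PG_UPGRADE_FLAG = "/etc/gitlab/disable-postgresql-upgrade"


@dataclass(frozen=True)
class Step:
    node: str         # "all", "primary", or "secondary"
    description: str  # step text taken from the walkthrough
    command: str      # illustrative command, or "" for a manual step


def geo_upgrade_plan() -> List[Step]:
    """Return the walkthrough as an ordered list; nothing is executed."""
    return [
        Step("all", "Disable PostgreSQL automatic upgrade",
             f"sudo touch {DISABLE_PG_UPGRADE_FLAG}"),
        Step("all", "Upgrade GitLab to 12.0", ""),
        Step("secondary", "Schedule downtime on secondary nodes", ""),
        Step("secondary", "Put secondary in maintenance mode (halt access)", ""),
        Step("primary", "Upgrade primary", ""),
        # Assumption: the manual PostgreSQL upgrade is triggered this way.
        Step("secondary", "Trigger PostgreSQL upgrade on secondary",
             "sudo gitlab-ctl pg-upgrade"),
        # Placeholders in angle brackets must be filled in by the operator.
        Step("secondary", "Backup/restore database from primary",
             "sudo gitlab-ctl replicate-geo-database "
             "--slot-name=<slot_name> --host=<primary_host>"),
        Step("secondary", "Enable access again", ""),
    ]


def render(plan: List[Step]) -> str:
    """Format the plan as a numbered checklist for review."""
    lines = []
    for number, step in enumerate(plan, start=1):
        action = step.command if step.command else "(manual step)"
        lines.append(f"{number}. [{step.node}] {step.description}: {action}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(geo_upgrade_plan()))
```

Keeping the plan as data rather than executing it makes it easy to review the ordering, in particular that the secondary's database is re-replicated from the primary only after the PostgreSQL upgrade has been triggered there.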
Update
The 12.0 release will not impact Geo users as the auto upgrade will be skipped for both primary and secondary nodes. When the related issue omnibus-gitlab#4309 (comment 181022415) is complete, we will need to verify and document the upgrade process.
Update (September 2019)
We have performed some upgrades ourselves that did not succeed the first time, and we have a few reports from customers who have struggled with these releases. We need to take a closer look at whether there is a problem with this series of upgrades:
- Install 11.11 in an HA configuration
- Upgrade to 12.0
- Upgrade to 12.1
Update from 13 September 2019
The current state of the investigation is:
- There are issues when upgrading from 11.11.5 through to 12.1.8 using Geo for zero-downtime upgrades
- There are issues when upgrading from 11.11.5 through to 12.1.8 using downtime upgrades
- We don't know yet how HA upgrades behave, but @mkozono will perform an upgrade soon.
The individual problems may be different bugs, and the first step is to try to pin those down.
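One way to pin the problems down is to test each upgrade hop separately for each upgrade strategy, so that a failure can be attributed to a single hop. The sketch below only enumerates that matrix. The strategy labels and the `scenario_matrix` helper are hypothetical, and the intermediate version is written as `12.0` because the exact patch release used in the investigation is not stated.

```python
from itertools import product
from typing import Dict, List, Tuple

# Strategy labels are illustrative names for the three cases under investigation.
STRATEGIES = ["geo-zero-downtime", "downtime", "ha"]

# Upgrade path from the investigation; the 12.0 patch level is not specified.
VERSION_PATH = ["11.11.5", "12.0", "12.1.8"]


def upgrade_hops(path: List[str]) -> List[Tuple[str, str]]:
    """Split a version path into consecutive (from, to) hops."""
    return list(zip(path, path[1:]))


def scenario_matrix(strategies: List[str], path: List[str]) -> List[Dict[str, str]]:
    """Build one record per (strategy, hop), initially marked untested."""
    return [
        {"strategy": strategy, "from": old, "to": new, "result": "untested"}
        for strategy, (old, new) in product(strategies, upgrade_hops(path))
    ]


if __name__ == "__main__":
    for row in scenario_matrix(STRATEGIES, VERSION_PATH):
        print(f'{row["strategy"]}: {row["from"]} -> {row["to"]} ({row["result"]})')
```

Each record can then be updated with the observed outcome and linked to a separate bug report if the failures turn out to be unrelated.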