Add guidance for Geo customers upgrading to glibc-2.28 or later
Problem to solve
Currently, our Requirements for running Geo document states that you must complete the steps to Check OS locale data compatibility. However, we do not provide guidance on how to upgrade to an OS that includes glibc-2.28 or later, or how to upgrade the package itself.
We already provide upgrade guidance to customers who are not running Geo. We want to provide similar guidance for Geo customers who are upgrading from a glibc version earlier than 2.28.
Further details
As customers begin to upgrade older OSs, I expect we'll see this issue more frequently. Using the backup and restore procedure to restore to a newer OS is viable (we use pg_dump, which produces a logical dump, so indexes are rebuilt on restore and are not affected), but this might be too complicated or time-consuming in complex environments.
GitLab is deprecating CentOS 7, which ships glibc-2.17. The deprecation will encourage customers to upgrade to newer operating systems, which run glibc-2.28 or later. A major update to locale data in glibc-2.28 changed collation (sort) order, so Postgres indexes created with earlier versions of glibc can become corrupted after the upgrade.
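To make the risk condition concrete, here is a minimal Python sketch that flags an upgrade crossing the glibc-2.28 boundary. The function names are hypothetical and not part of any GitLab tooling. The sketch assumes version strings of the form `glibc 2.x`, and it only compares version numbers; it does not inspect any Postgres index.

```python
import os
import re

# glibc release that introduced the major locale data update (from this issue).
THRESHOLD = (2, 28)


def parse_glibc_version(text):
    """Extract (major, minor) from strings such as 'glibc 2.31' or '2.28'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no glibc version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def crosses_locale_change(old, new):
    """True when an upgrade moves from before glibc 2.28 to 2.28 or later."""
    return parse_glibc_version(old) < THRESHOLD <= parse_glibc_version(new)


def local_glibc_version():
    """Return this host's glibc version string, or None if it is unavailable."""
    try:
        # Available on Linux hosts that use glibc; other platforms raise.
        return os.confstr("CS_GNU_LIBC_VERSION")
    except (ValueError, OSError, AttributeError):
        return None
```

Running `local_glibc_version()` on each Geo site before and after the OS upgrade, and passing both results to `crosses_locale_change`, would indicate whether the index guidance applies to that site.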
Proposal
We need to provide guidance for Geo customers who would like to upgrade their OS.
- We could list all of the options as noted in https://about.gitlab.com/blog/2022/08/12/upgrading-database-os/
- We should not recommend people attempt the same procedure that GitLab.com did, because it requires DBREs to be involved. We should also not recommend Logical Replication due to complexity and lack of validation testing.
- Add Geo subsection to Reindex all indexes during the scheduled downtime window section.
- Add Geo subsection to https://docs.gitlab.com/ee/administration/postgresql/upgrading_os.html#backup-and-restore. E.g.:
  - Stop GitLab on all sites
  - Backup Postgres on the primary site
  - Upgrade OS on all sites
  - Restore Postgres on the primary site
  - Set up Postgres replication to the secondary site again
  - Start GitLab on all sites
- Mention in https://docs.gitlab.com/ee/administration/postgresql/upgrading_os.html#replication-and-failover that when you install the new OS on a new server, you might need to reinstall the GitLab package targeting that OS. Link to https://docs.gitlab.com/ee/administration/package_information/supported_os.html#update-gitlab-package-sources-after-upgrading-the-os.
- Add a section to https://docs.gitlab.com/ee/administration/package_information/supported_os.html called e.g. Corrupted Postgres indexes after upgrading the OS. Mention that as part of upgrading the OS, if your glibc version changes, then you must follow https://docs.gitlab.com/ee/administration/postgresql/upgrading_os.html to avoid corrupted indexes.
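
The example backup and restore sequence in the proposal could be sketched as an ordered plan across sites. This is only an illustration of the ordering, assuming one primary and any number of secondary sites. The `Step` type and `backup_restore_plan` function are hypothetical, and nothing here runs a real command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    site: str
    action: str


def backup_restore_plan(primary, secondaries):
    """Return the proposed steps in order; does not execute anything."""
    sites = [primary, *secondaries]
    plan = [Step(site, "stop GitLab") for site in sites]
    plan.append(Step(primary, "back up Postgres"))
    plan += [Step(site, "upgrade OS") for site in sites]
    plan.append(Step(primary, "restore Postgres"))
    # Replication is set up again only after the primary has been restored.
    plan += [Step(site, "set up Postgres replication") for site in secondaries]
    plan += [Step(site, "start GitLab") for site in sites]
    return plan


if __name__ == "__main__":
    for number, step in enumerate(backup_restore_plan("primary", ["secondary"]), 1):
        print(f"{number}. [{step.site}] {step.action}")
```

The point of the ordering is that the Postgres backup happens before any OS upgrade, and replication to the secondary sites is set up again only after the restore on the primary.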
Who can address the issue
The database and Geo teams
Other links/references
- #433310 (closed)
- &8573 (and its issues)
- https://docs.gitlab.com/ee/administration/postgresql/upgrading_os.html
- https://www.cybertec-postgresql.com/en/icu-collations-against-postgresql-data-corruption/
- https://postgresql.verite.pro/blog/2018/08/27/glibc-upgrade.html
- https://docs.gitlab.com/ee/administration/geo/replication/troubleshooting/common.html#check-os-locale-data-compatibility
- https://dba.stackexchange.com/questions/240930/postgresql-difference-between-collations-c-and-c-utf-8
- https://docs.gitlab.com/ee/administration/postgresql/replication_and_failover.html#near-zero-downtime-upgrade-of-postgresql-in-a-patroni-cluster
- https://peter.eisentraut.org/blog/2023/03/14/how-collation-works
- https://pganalyze.com/blog/5mins-postgres-17-builtin-c-utf8-locale
- https://pganalyze.com/blog/5mins-postgres-collations