GSTG - Fix archive and delayed DR replicas (failed after the OS upgrade / rollback)
During the first dry run of the gstg OS upgrade rollout (from 16.04 to 20.04) and rollback (from 20.04 to 16.04), all the archive and delayed DR replicas stopped applying WALs and fell out of sync.
Affected servers:
- postgres-ci-dr-archive-2004-01-db-gstg.c.gitlab-staging-1.internal
- postgres-ci-dr-delayed-2004-01-db-gstg.c.gitlab-staging-1.internal
- postgres-dr-archive-01-db-gstg.c.gitlab-staging-1.internal
- postgres-dr-archive-2004-01-db-gstg.c.gitlab-staging-1.internal
- postgres-dr-delayed-01-db-gstg.c.gitlab-staging-1.internal
- postgres-dr-delayed-2004-01-db-gstg.c.gitlab-staging-1.internal
- postgres-registry-dr-archive-01-db-gstg.c.gitlab-staging-1.internal
- postgres-registry-dr-delayed-01-db-gstg.c.gitlab-staging-1.internal
Acceptance Criteria:
- Identify the root cause of the DR replicas going out of sync
- Implement the fix
- Synchronize all the DR replicas
- Revalidate the fix after the next gstg OS upgrade dry run
Edited by Biren Shah