
pg-upgrade-logical: Improvements for multiple slots mode and Reverse logical replication

Vitaliy Kukharik requested to merge switchover-multiple-slots into master

Multiple slots mode (if pg_publication_count > 1)

What's new

  1. Added support for a logical replication configuration with multiple publications, slots, and subscriptions in the switchover.yml playbook.
  2. In the task "Wait until the logical replication lag is 0 bytes", the 10-second limit has been removed: the playbook must wait for zero replication lag to reach a consistent database state (important when there are multiple publications/subscriptions).
    • skip if force_mode is true
  3. switchover_rollback.yml: wait until the logical replication lag is 0 bytes before stopping traffic to the new cluster leader and starting traffic to the old cluster leader.
    • skip if force_mode is true
  4. Fixed the loop variable name in the drop_subscription.yml and drop_publication.yml playbooks.
  5. Monitor locks and terminate any backend that blocks the CREATE PUBLICATION query for more than 15 seconds.
  6. Added reverse logical replication.
    • set up if 'enable_reverse_logical_replication' is 'true'
    • enabled by default
  7. Added a pre-check to prepare a password file (.pgpass) on the Source cluster for reverse logical replication.
  8. Added a new playbook, stop_reverse_replication.yml, to stop reverse logical replication after the upgrade.
    • it is run once the decision has been made that a rollback to the old cluster is no longer necessary.
  9. After the switchover, drop the publications that were replicated from the source cluster.
  10. When using multiple publications/slots, switch the R/W traffic (primary) first and then the R/O traffic (replicas).
    • This ensures data consistency at the moment traffic is switched to the new cluster.
    • Note: added the switchover_leader.yml and switchover_replica.yml playbooks, which are run in the appropriate order based on the value of the pg_publication_count variable.
    • Additionally, some minor code refactoring was performed.
  11. Optimized the "logical replication" tasks to reduce pgbouncer pause time.
  12. Updated README
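
For reviewers, items 2 and 5 above can be pictured roughly as the two Ansible tasks below. This is only a sketch under stated assumptions, not the code from this merge request: the psql invocation, the registered variable names, the retry counts, and the SQL itself are illustrative, and connection options (host, port, user, database) are left out. Only the task title in the first task, the force_mode variable, and the 15-second threshold come from the description.

```yaml
# Hypothetical sketch; the real tasks in switchover.yml may differ.
- name: Wait until the logical replication lag is 0 bytes
  ansible.builtin.command:
    argv:
      - psql
      - "-tAXc"
      # max() over all logical slots: every slot must reach zero lag.
      # Returns 0 if there are no logical slots at all.
      - >-
        select coalesce(max(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)), 0)::bigint
        from pg_replication_slots
        where slot_type = 'logical'
  register: lag_result            # assumed variable name
  until: lag_result.stdout | int == 0
  retries: 300                    # assumed upper bound; with_items-style tuning is up to the operator
  delay: 2
  changed_when: false
  when: not force_mode | bool     # skipped in force mode, as described in item 2

- name: Terminate backends blocking CREATE PUBLICATION for more than 15 seconds
  ansible.builtin.command:
    argv:
      - psql
      - "-tAXc"
      # Finds sessions that hold locks a waiting CREATE PUBLICATION needs
      # and terminates them once the wait exceeds 15 seconds.
      - >-
        select pg_terminate_backend(blocker.pid)
        from pg_stat_activity as waiting
        cross join lateral unnest(pg_blocking_pids(waiting.pid)) as blocker(pid)
        where waiting.query ilike 'create publication%'
        and waiting.wait_event_type = 'Lock'
        and clock_timestamp() - waiting.query_start > interval '15 seconds'
  register: terminate_result      # assumed variable name
  changed_when: terminate_result.stdout | length > 0
```

The first task has to run on the publisher, since pg_replication_slots and confirmed_flush_lsn describe the slots from the publishing side. The second task is a one-shot check, so a monitor like the one in item 5 would need to repeat it (for example in a loop or an async task) while the publication is being created.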
Edited by Vitaliy Kukharik
