Repmgr standby setup forces Postgres to create a socket with the primary Postgres node IP
Summary
Repmgr standby setup forces Postgres to create a socket with the primary Postgres node IP.
Steps to reproduce
This issue was found in versions 12.1.4 and above. To reproduce you need:
- a GitLab HA deployment with Consul;
- the primary Postgres node (with repmgr) up and running;
- a secondary Postgres node up and running.

Then try to set up the secondary Postgres node as a standby with the following command:
gitlab-ctl repmgr standby setup ${master_ip} -w
What is the current bug behavior?
Running gitlab-ctl repmgr standby setup ${master_ip} -w produces the following output:
Stopping the database
Removing the data
Cloning the data
Starting the database
uninitialized constant Timeout::TimeoutError
There is no repmgr command standby
Available repmgr commands:
master register -- Register the current node as a master node in the repmgr cluster
standby
clone MASTER -- Clone the data from node MASTER to set this node up as a standby server
register -- Register the node as a standby node in the cluster. Assumes clone has been done
setup MASTER -- Performs all steps necessary to setup the current node as a standby for MASTER
follow MASTER -- Follow the new master node MASTER
unregister --node=X -- Removes the node with id X from the cluster. Without --node removes the current node.
promote -- Promote the current node to be the master node
cluster show -- Displays the current membership status of the cluster
The timeout occurs because Postgres is no longer running: standby setup clones the primary's configuration, which contains the primary node's IP, so Postgres on the standby tries to create a listen socket bound to the primary's address. This can be verified in the Postgres logs on the secondary node, where 10.10.1.30 is the primary's IP:
2019-09-25_15:38:41.56520 LOG: could not bind IPv4 address "10.10.1.30": Cannot assign requested address
2019-09-25_15:38:41.56522 HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
2019-09-25_15:38:41.56522 WARNING: could not create listen socket for "10.10.1.30"
2019-09-25_15:38:41.56522 FATAL: could not create any TCP/IP sockets
2019-09-25_15:38:41.56523 LOG: database system is shut down
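The mechanism can be checked directly: the `listen_addresses` value inside the cloned `postgresql.conf` still carries the primary's IP. A minimal sketch below, using a temporary file to stand in for the real config (on an Omnibus install the actual file lives under the Postgres data directory, typically `/var/opt/gitlab/postgresql/data/postgresql.conf`):

```shell
# Simulate the postgresql.conf that `standby setup` clones over from the
# primary; the IPs match the ones in this report.
conf=$(mktemp)
cat > "$conf" <<'EOF'
listen_addresses = '10.10.1.30'
port = 5432
EOF

# The standby's own address is 10.10.1.31, but the cloned config still
# points at the primary, so Postgres tries to bind a foreign IP and exits.
grep listen_addresses "$conf"
# prints: listen_addresses = '10.10.1.30'

rm -f "$conf"
```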
What is the expected correct behavior?
The expected behavior is for the secondary node to be set up as a standby, with Postgres restarting successfully after the primary's data has been cloned over.
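As a possible workaround (my own assumption, not a documented fix): since `gitlab-ctl reconfigure` regenerates `postgresql.conf` from `/etc/gitlab/gitlab.rb`, re-running it on the standby should restore the node's own `listen_address` (10.10.1.31 in this report) and let Postgres start again:

```shell
# Hypothetical workaround on the standby node; assumes an Omnibus install
# where postgresql['listen_address'] in /etc/gitlab/gitlab.rb is already
# set to the standby's own IP (as in the configuration below).
sudo gitlab-ctl reconfigure          # rewrite postgresql.conf from gitlab.rb
sudo gitlab-ctl restart postgresql   # restart with the corrected listen address
```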
Details of package version
Provide the package version installation details
gitlab-ee-12.1.4-ee.0.el7.x86_64
Note: this was also reproduced on version 12.3.0 with package gitlab-ee-12.3.0-ee.0.el7.x86_64.
Environment details
- Operating System: CentOS 7
- Installation Target: VM (VirtualBox)
- Installation Type: New Installation
- Is this a single or multiple node installation? Multi-node
- Resources
  - CPU: 1 CPU
  - Memory total: 2 GB
Configuration details
Provide the relevant sections of `/etc/gitlab/gitlab.rb`
roles ['postgres_role']
postgresql['port'] = 5432
postgresql['listen_address'] = '10.10.1.31'
postgresql['hot_standby'] = 'on'
postgresql['wal_level'] = 'replica'
postgresql['shared_preload_libraries'] = 'repmgr_funcs'
gitlab_rails['auto_migrate'] = false
consul['services'] = %w(postgresql)
postgresql['pgbouncer_user_password'] = 'xxxxx'
postgresql['sql_user_password'] = 'xxxxxxx'
postgresql['max_wal_senders'] = 4
postgresql['max_replication_slots'] = 4
postgresql['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.10.1.30/32 10.10.1.31/32 10.10.1.32/32 10.10.1.33/32 10.10.1.38/32)
repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 10.10.1.30/32 10.10.1.31/32 10.10.1.32/32)
consul['monitoring_service_discovery'] = true
node_exporter['listen_address'] = '10.10.1.31:9100'
postgres_exporter['listen_address'] = '10.10.1.31:9187'
postgres_exporter['env']['DATA_SOURCE_NAME'] = "user=gitlab password='xxxxxx' host=10.10.1.31 database=postgres sslmode=disable"
consul['configuration'] = {
  bind_addr: '10.10.1.31',
  retry_join: %w(10.10.1.34 10.10.1.35 10.10.1.36)
}
repmgr['master_on_initialization'] = false