PG HA: Temporary replication - all replication slots are in use
Running the GitLab provisioner fails at the task that registers each standby with the expected master.
The logs show that pg_basebackup cannot create its temporary replication slot on the master because all replication slots are already in use.
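To make the slot accounting concrete, here is a minimal sketch of the arithmetic, assuming each `pg_basebackup -X stream` clone takes one temporary replication slot on the master for the duration of the backup (which is what the error in the log below suggests). The function name and the numbers are hypothetical and are not taken from the failing environment.

```python
def slots_short_by(max_replication_slots, slots_in_use, concurrent_clones):
    """Return how many additional slots would be needed for all clones
    to start at the same time (0 if there are already enough free slots)."""
    free = max_replication_slots - slots_in_use
    return max(0, concurrent_clones - free)


# Hypothetical values for illustration only: every slot is already taken
# and two standbys try to clone at the same moment.
print(slots_short_by(max_replication_slots=2, slots_in_use=2, concurrent_clones=2))
```

On a real master, the inputs would come from `SHOW max_replication_slots;` and the row count of the `pg_replication_slots` view.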
TASK [database : Register with the expected master] ****************************
skipping: [35.225.148.78]
fatal: [35.226.181.32]: FAILED! => {"changed": true, "cmd": ["gitlab-ctl", "repmgr", "standby", "setup", "ci-pipeline-74573910-database-0", "-w"], "delta": "0:00:14.112302", "end": "2019-08-04 02:05:10.804859", "msg": "non-zero return code", "rc": 1, "start": "2019-08-04 02:04:56.692557", "stderr": "Error running command: /opt/gitlab/embedded/bin/repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h ci-pipeline-74573910-database-0 -U gitlab_repmgr -d gitlab_repmgr -D /var/opt/gitlab/postgresql/data standby clone\nERROR: NOTICE: destination directory '/var/opt/gitlab/postgresql/data' provided\nINFO: connecting to upstream node\nINFO: Successfully connected to upstream node. Current installation size is 36 MB\nINFO: creating directory \"/var/opt/gitlab/postgresql/data\"...\nNOTICE: starting backup (using pg_basebackup)...\nHINT: this may take some time; consider using the -c/--fast-checkpoint option\nINFO: executing: '/opt/gitlab/embedded/bin/pg_basebackup -l \"repmgr base backup\" -D /var/opt/gitlab/postgresql/data -h ci-pipeline-74573910-database-0 -p 5432 -U gitlab_repmgr -X stream '\npg_basebackup: could not create temporary replication slot \"pg_basebackup_15263\": ERROR: all replication slots are in use\nHINT: Free one or increase max_replication_slots.\npg_basebackup: child process exited with error 1\npg_basebackup: removing contents of data directory \"/var/opt/gitlab/postgresql/data\"\nWARNING: standby clone: base backup failed\nERROR: unable to take a base backup of the master server\nWARNING: destination directory (/var/opt/gitlab/postgresql/data) may need to be cleaned up manually", "stderr_lines": ["Error running command: /opt/gitlab/embedded/bin/repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h ci-pipeline-74573910-database-0 -U gitlab_repmgr -d gitlab_repmgr -D /var/opt/gitlab/postgresql/data standby clone", "ERROR: NOTICE: destination directory '/var/opt/gitlab/postgresql/data' provided", "INFO: connecting to upstream node", "INFO: Successfully 
connected to upstream node. Current installation size is 36 MB", "INFO: creating directory \"/var/opt/gitlab/postgresql/data\"...", "NOTICE: starting backup (using pg_basebackup)...", "HINT: this may take some time; consider using the -c/--fast-checkpoint option", "INFO: executing: '/opt/gitlab/embedded/bin/pg_basebackup -l \"repmgr base backup\" -D /var/opt/gitlab/postgresql/data -h ci-pipeline-74573910-database-0 -p 5432 -U gitlab_repmgr -X stream '", "pg_basebackup: could not create temporary replication slot \"pg_basebackup_15263\": ERROR: all replication slots are in use", "HINT: Free one or increase max_replication_slots.", "pg_basebackup: child process exited with error 1", "pg_basebackup: removing contents of data directory \"/var/opt/gitlab/postgresql/data\"", "WARNING: standby clone: base backup failed", "ERROR: unable to take a base backup of the master server", "WARNING: destination directory (/var/opt/gitlab/postgresql/data) may need to be cleaned up manually"], "stdout": "Stopping the database\nRemoving the data\nCloning the data", "stdout_lines": ["Stopping the database", "Removing the data", "Cloning the data"]}
fatal: [35.222.6.164]: FAILED! => {"changed": true, "cmd": ["gitlab-ctl", "repmgr", "standby", "setup", "ci-pipeline-74573910-database-0", "-w"], "delta": "0:00:14.230953", "end": "2019-08-04 02:05:10.949441", "msg": "non-zero return code", "rc": 1, "start": "2019-08-04 02:04:56.718488", "stderr": "Error running command: /opt/gitlab/embedded/bin/repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h ci-pipeline-74573910-database-0 -U gitlab_repmgr -d gitlab_repmgr -D /var/opt/gitlab/postgresql/data standby clone\nERROR: NOTICE: destination directory '/var/opt/gitlab/postgresql/data' provided\nINFO: connecting to upstream node\nINFO: Successfully connected to upstream node. Current installation size is 36 MB\nINFO: creating directory \"/var/opt/gitlab/postgresql/data\"...\nNOTICE: starting backup (using pg_basebackup)...\nHINT: this may take some time; consider using the -c/--fast-checkpoint option\nINFO: executing: '/opt/gitlab/embedded/bin/pg_basebackup -l \"repmgr base backup\" -D /var/opt/gitlab/postgresql/data -h ci-pipeline-74573910-database-0 -p 5432 -U gitlab_repmgr -X stream '\npg_basebackup: could not create temporary replication slot \"pg_basebackup_15264\": ERROR: all replication slots are in use\nHINT: Free one or increase max_replication_slots.\npg_basebackup: child process exited with error 1\npg_basebackup: removing contents of data directory \"/var/opt/gitlab/postgresql/data\"\nWARNING: standby clone: base backup failed\nERROR: unable to take a base backup of the master server\nWARNING: destination directory (/var/opt/gitlab/postgresql/data) may need to be cleaned up manually", "stderr_lines": ["Error running command: /opt/gitlab/embedded/bin/repmgr -f /var/opt/gitlab/postgresql/repmgr.conf -h ci-pipeline-74573910-database-0 -U gitlab_repmgr -d gitlab_repmgr -D /var/opt/gitlab/postgresql/data standby clone", "ERROR: NOTICE: destination directory '/var/opt/gitlab/postgresql/data' provided", "INFO: connecting to upstream node", "INFO: Successfully 
connected to upstream node. Current installation size is 36 MB", "INFO: creating directory \"/var/opt/gitlab/postgresql/data\"...", "NOTICE: starting backup (using pg_basebackup)...", "HINT: this may take some time; consider using the -c/--fast-checkpoint option", "INFO: executing: '/opt/gitlab/embedded/bin/pg_basebackup -l \"repmgr base backup\" -D /var/opt/gitlab/postgresql/data -h ci-pipeline-74573910-database-0 -p 5432 -U gitlab_repmgr -X stream '", "pg_basebackup: could not create temporary replication slot \"pg_basebackup_15264\": ERROR: all replication slots are in use", "HINT: Free one or increase max_replication_slots.", "pg_basebackup: child process exited with error 1", "pg_basebackup: removing contents of data directory \"/var/opt/gitlab/postgresql/data\"", "WARNING: standby clone: base backup failed", "ERROR: unable to take a base backup of the master server", "WARNING: destination directory (/var/opt/gitlab/postgresql/data) may need to be cleaned up manually"], "stdout": "Stopping the database\nRemoving the data\nCloning the data", "stdout_lines": ["Stopping the database", "Removing the data", "Cloning the data"]}
Logging this separately because it is unrelated to the other errors seen in the PG_HA job.
Specific error: ERROR: all replication slots are in use
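Because the relevant message is buried in a long Ansible failure record, a small helper can pull it out of the `stderr_lines` array. This is only a sketch: the function name `summarize_clone_failure` is hypothetical, and the regular expression is based solely on the message format seen in the log above.

```python
import re

# Matches the pg_basebackup line that reports the failed slot creation.
SLOT_ERROR = re.compile(
    r'could not create temporary replication slot "(?P<slot>[^"]+)": '
    r"ERROR:\s+(?P<error>.+)"
)


def summarize_clone_failure(stderr_lines):
    """Extract the slot name, error text, and hint from pg_basebackup stderr lines."""
    summary = {"slot": None, "error": None, "hint": None}
    for i, line in enumerate(stderr_lines):
        match = SLOT_ERROR.search(line)
        if not match:
            continue
        summary["slot"] = match.group("slot")
        summary["error"] = match.group("error").strip()
        # In the log above, the HINT appears on the line right after the error.
        if i + 1 < len(stderr_lines) and stderr_lines[i + 1].startswith("HINT:"):
            summary["hint"] = stderr_lines[i + 1][len("HINT:"):].strip()
        break
    return summary


# Lines copied from the "stderr_lines" field of the first failure above.
sample = [
    'pg_basebackup: could not create temporary replication slot "pg_basebackup_15263": ERROR: all replication slots are in use',
    "HINT: Free one or increase max_replication_slots.",
    "pg_basebackup: child process exited with error 1",
]
print(summarize_clone_failure(sample))
```

The same function can be applied to the second host's record, where only the slot name (`pg_basebackup_15264`) differs.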
Edited by Robert Marshall