Possible infinite loop with Patroni bootstrapping a follower in the standby cluster and pg_basebackup
Summary
During my tests with #224653 (closed), one attempt to bootstrap a follower in the standby cluster hit an issue with providing the password to the correct files. I got into a state where pg_basebackup was issued against the leader without the password. Because of that, it kept prompting for the password and logging each prompt. I'm not sure whether the process was being respawned or whether that is the behavior of pg_basebackup itself.
Steps to reproduce
I haven't tried to reproduce the issue yet, so I only have a few ideas about the required state (during the tests I tried multiple changes to get authentication working reliably).
A few things we could try:
- Add the host to pg_hba with MD5 only (no trust)
- Remove the replication password from the configuration file and run gitlab-ctl reconfigure before trying to bootstrap the cluster
- Set a wrong replication password
What is the current bug behavior?
The follower would not bootstrap, and the logs showed the Password: prompt being printed indefinitely:
2020-09-04_17:12:20.47387 2020-09-04 17:12:20,472 INFO: no action. i am a secondary and i am following a standby leader
2020-09-04_17:12:20.47937 Password:
2020-09-04_17:12:20.48794 Password:
...
When looking at top/htop, you should see a stuck pg_basebackup process:
gitlab-+ 240924 27.9 0.0 7412 6360 ? R 16:58 5:52 /opt/gitlab/embedded/bin/pg_basebackup --pgdata=/var/opt/gitlab/postgresql/data -X stream --dbname=dbname=postgres
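To help answer whether the process is being respawned or a single pg_basebackup is looping, one option is to sample its PID a few times. The sketch below is only an illustration: sample_pids is a hypothetical helper name, and it assumes pgrep is available on the node.

```shell
# Hypothetical helper: print the PIDs matching an exact process name
# several times, so that a stable PID can be told apart from a
# process that keeps being restarted.
sample_pids() {
  name="$1"            # process name to look for, e.g. pg_basebackup
  samples="${2:-5}"    # how many samples to take
  interval="${3:-2}"   # seconds to wait between samples

  i=0
  while [ "$i" -lt "$samples" ]; do
    # pgrep -x matches the process name exactly; join PIDs on one line.
    pids=$(pgrep -x "$name" | tr '\n' ' ') || true
    echo "sample $i: ${pids:-none}"
    i=$((i + 1))
    sleep "$interval"
  done
}
```

For example, sample_pids pg_basebackup 5 2 prints five samples two seconds apart. The same PID in every sample would suggest one long-running pg_basebackup, while changing PIDs would suggest that it is being respawned.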
What is the expected correct behavior?
If it's an authentication issue, pg_basebackup should fail cleanly instead of being left running.
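One way to check the "fail cleanly" behavior by hand is pg_basebackup's --no-password (-w) option, which never issues a password prompt and makes the connection attempt fail if no password is available by other means. The sketch below is an assumption about how one might test this manually, not what Patroni runs: try_basebackup is a hypothetical helper, and the leader host, replication user and target directory are placeholders to be supplied by whoever runs it.

```shell
# Defaults to the binary path seen in the process listing; override if needed.
PG_BASEBACKUP="${PG_BASEBACKUP:-/opt/gitlab/embedded/bin/pg_basebackup}"

# Hypothetical helper: run pg_basebackup without allowing a password prompt
# and report how it exited.
try_basebackup() {
  leader_host="$1"   # placeholder: address of the leader
  repl_user="$2"     # placeholder: replication role name
  target_dir="$3"    # an empty scratch directory, NOT the live data directory

  # --no-password: fail instead of prompting when no password is available.
  if "$PG_BASEBACKUP" --pgdata="$target_dir" -X stream --no-password \
       --dbname="dbname=postgres host=$leader_host user=$repl_user"; then
    echo "pg_basebackup succeeded"
  else
    echo "pg_basebackup exited with status $?"
  fi
}
```

With the password missing, a call such as try_basebackup followed by the leader address, the replication user and a scratch directory would be expected to return promptly with a non-zero status rather than printing Password: repeatedly.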
Results of GitLab environment info
System information
System: Ubuntu 20.04
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.6.6p146
Gem Version: 2.7.10
Bundler Version:1.17.3
Rake Version: 12.3.3
Redis Version: 5.0.9
Git Version: 2.28.0
Sidekiq Version:5.2.9
Go Version: unknown
rake aborted!
PG::ConnectionBad: fe_sendauth: no password supplied
/opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/info.rake:48:in `block (3 levels) in '
/opt/gitlab/embedded/bin/bundle:23:in `load'
/opt/gitlab/embedded/bin/bundle:23:in `'
Tasks: TOP => gitlab:env:info
(See full trace by running task with --trace)