Error running reconfigure on standby instances
Coming from gitlab-com/gl-infra/infrastructure#10175
This happens in the `postgresql::enable` recipe. The `pg_helper.is_slave?` method bases its result on the output of `SELECT pg_is_in_recovery()`. If that command fails, `is_slave?` returns false, and `reconfigure` tries to create the user.
In some instances the database might not be ready (e.g. a standby that is still initializing), and a `psql` command will exit non-zero with a database-not-ready error. `reconfigure` reads this as the database not being a standby instance and attempts to create the user, which also fails.
I think we have a few options:

- Leave `is_slave?` as is, and either:
  - Use `pg_isready` to wait for the system to come online. This could take a while, and in this particular case we could wait a long time only to determine that we don't need to do anything.
  - Use `pg_isready` to skip steps that require a ready database, possibly warning the user that we are doing so. Running `reconfigure` later, when the database is ready, should do the right thing.
Of these, I prefer the latter.
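A rough sketch of what the "skip" variant could look like, to make the idea concrete. Everything here is hypothetical and not existing omnibus-gitlab code: the method names, the warning text, and the assumption that `pg_isready` is on the `PATH` with default connection parameters. The exit codes are the ones `pg_isready` documents.

```ruby
# Hypothetical mapping of pg_isready exit codes to symbols.
PG_ISREADY_STATES = {
  0 => :accepting,    # server is accepting connections
  1 => :rejecting,    # server is rejecting connections (for example, during startup)
  2 => :no_response,  # no response to the connection attempt
  3 => :no_attempt    # no attempt was made (for example, invalid parameters)
}.freeze

# Default runner: shells out to pg_isready and returns its exit status.
# Assumes pg_isready is on the PATH (an assumption, not how the package is laid out).
DEFAULT_RUNNER = lambda do
  system('pg_isready', '-q')
  $?.exitstatus
end

# The runner is injectable so the logic can be exercised without a real server.
def database_state(runner = DEFAULT_RUNNER)
  PG_ISREADY_STATES.fetch(runner.call, :unknown)
end

# Skip database-dependent steps (and say so) when the server is not ready.
def create_users_if_ready(runner = DEFAULT_RUNNER)
  unless database_state(runner) == :accepting
    warn 'PostgreSQL is not ready; skipping steps that need a database. ' \
         'Run reconfigure again once it is up.'
    return :skipped
  end

  # The real user-creation steps would go here.
  :ran
end
```

With this shape, a standby that is still initializing produces a warning and a skipped step instead of a failed user creation.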
We could check for a `recovery.conf` file in `data_dir` as part of `is_slave?`. But this would only work temporarily, as `recovery.conf` is going away in PostgreSQL 12.
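For reference, the file check itself would be small. This is only a sketch: `recovery_conf_present?` is a made-up name, and `data_dir` is passed in as a plain path rather than read from the node attributes.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: true when a recovery.conf exists in the given data directory.
# Only meaningful for PostgreSQL versions that still use recovery.conf.
def recovery_conf_present?(data_dir)
  File.exist?(File.join(data_dir, 'recovery.conf'))
end

# Usage against a throwaway directory standing in for data_dir.
Dir.mktmpdir do |dir|
  puts recovery_conf_present?(dir) # prints "false"
  FileUtils.touch(File.join(dir, 'recovery.conf'))
  puts recovery_conf_present?(dir) # prints "true"
end
```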
We may also consider making `psql_cmd` return more than a boolean, or adding another method for executing psql commands, so we can differentiate between exit codes.
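As an illustration of the richer return value, here is a minimal sketch. `CommandResult`, `run_command`, and `standby?` are hypothetical names, not the current `psql_cmd` API. It assumes the query is run with `psql -t -A`, so a boolean comes back as a bare `t` or `f`.

```ruby
require 'open3'

# Hypothetical result type: keeps output and exit status instead of a bare boolean.
CommandResult = Struct.new(:stdout, :stderr, :exitstatus) do
  def success?
    exitstatus == 0
  end
end

# Runs a command and captures stdout, stderr, and the exit status.
def run_command(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  CommandResult.new(stdout, stderr, status.exitstatus)
end

# Tri-state standby check: true or false when the query ran,
# nil when psql itself failed and we cannot tell either way.
def standby?(result)
  return nil unless result.success?

  result.stdout.strip == 't'
end

# Example call (assumes psql is on the PATH and can reach the server):
#   result = run_command('psql', '-t', '-A', '-c', 'SELECT pg_is_in_recovery()')
#   standby?(result)
```

The point of the `nil` case is that callers can treat "could not ask" differently from "asked, and it is not a standby", which is the distinction `is_slave?` currently loses.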