
pg_dump failing on GitLab instances with multiple PG read-replicas

Summary

GitLab backup uses the pg_dump command to take a backup of the PostgreSQL database. Since 15.11, a number of customers running Linux package (Omnibus) PostgreSQL with multiple read replicas have reported that this command is failing.

There have been two types of failure:

  1. PQsocket() error ActiveRecord::StatementInvalid: PG::ConnectionBad: PQsocket() can't get socket descriptor

  2. PQconsumeInput() error ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly

Steps to reproduce

  1. Create an environment on GitLab 15.11.13 using HA PostgreSQL
  2. Ensure the database is around 5 GB in size
    1. The issue does not seem to be tied to a specific database size, but it occurs intermittently, so the more data in the database, the easier it is to reproduce.
  3. Run the backup tool from a Rails node
    1. Ensure the node is configured to connect directly to the Patroni leader and not through PgBouncer.
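
The checks behind these steps can be sketched as shell helpers. This is only an outline, assuming a Linux package installation where `gitlab-ctl`, `gitlab-psql` and `gitlab-backup` are available on the relevant nodes. The function names, the `DRY_RUN` switch and the database name `gitlabhq_production` (the package default) are assumptions for illustration and are not taken from the report.

```shell
# Sketch only: set DRY_RUN=1 to print each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Assumed database name (Linux package default); adjust if yours differs.
GITLAB_DB="${GITLAB_DB:-gitlabhq_production}"

# On a Patroni node: list the cluster members and see which one is the leader.
show_patroni_members() {
  run sudo gitlab-ctl patroni members
}

# On a database node: report the size of the GitLab database (step 2).
show_db_size() {
  run sudo gitlab-psql -d "$GITLAB_DB" -c "SELECT pg_size_pretty(pg_database_size(current_database()));"
}

# On a Rails node: run the backup tool (step 3).
run_backup() {
  run sudo gitlab-backup create
}
```

Each helper is meant to be run on the node named in its comment. Calling them with `DRY_RUN=1` set first shows the underlying commands without touching the instance.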

Example Project

What is the current bug behavior?

The backup fails with one of the following errors:

  1. PQsocket() error ActiveRecord::StatementInvalid: PG::ConnectionBad: PQsocket() can't get socket descriptor

  2. PQconsumeInput() error ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly

In testing on 15.11.13, the error ActiveRecord::StatementInvalid: PG::ConnectionBad: PQsocket() can't get socket descriptor appeared intermittently: the backup would pass at first, then began failing more consistently. After upgrading to the latest nightly (16.2.4+rnightly.295461.af8a91b9-0), the error consistently changed to ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() server closed the connection unexpectedly

What is the expected correct behavior?

The pg_dump portion of the backup completes successfully without errors.

Relevant logs and/or screenshots

Output of checks

Results of GitLab environment info


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

Possible fixes

Workaround

Use pg_dump directly
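
A minimal sketch of what running pg_dump directly might look like, since the workaround does not spell out an invocation. Everything specific here is an assumption: the host `patroni-leader.example.com` is a placeholder, the user `gitlab` and database `gitlabhq_production` are the Linux package defaults, the binary path is where the package usually bundles the PostgreSQL client tools, and the custom archive format is just one reasonable choice. The `pg_dump_direct` function name and the `DRY_RUN` switch are hypothetical.

```shell
# Hypothetical connection settings; replace with the values for your instance.
PG_LEADER_HOST="${PG_LEADER_HOST:-patroni-leader.example.com}"
PG_PORT="${PG_PORT:-5432}"
PG_USER="${PG_USER:-gitlab}"
PG_DB="${PG_DB:-gitlabhq_production}"
# Assumed location of the bundled client binary on a Linux package install.
PG_DUMP_BIN="${PG_DUMP_BIN:-/opt/gitlab/embedded/bin/pg_dump}"

# Dump the database straight from the leader into a custom-format archive.
# Usage: pg_dump_direct /path/to/output.dump
pg_dump_direct() {
  out_file="$1"
  # Build the argument list once so it can be printed or executed.
  set -- "$PG_DUMP_BIN" \
    --host="$PG_LEADER_HOST" --port="$PG_PORT" \
    --username="$PG_USER" --dbname="$PG_DB" \
    --format=custom --file="$out_file"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"
  fi
}
```

pg_dump prompts for the password unless one is supplied through `PGPASSWORD` or a `.pgpass` file. A custom-format archive is restored with pg_restore rather than psql.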

Edited by Rutger Wessels