postgres-exporter v0.14.0 (GitLab 16.5.0) leaking (and exhausting) client connections to PostgreSQL
Summary
A customer reported errors from their GitLab 16.5 instance:
PG::ConnectionBad: connection to server on socket "/var/opt/gitlab/postgresql/.s.PGSQL.5432" failed:
FATAL: sorry, too many clients already
It was resolved by running gitlab-ctl stop postgres-exporter. The customer reported:
Before [..] we had around 979 postgres processes, after the command [..] 60
postgres-exporter versions
Reviewing the upstream change log, I noticed this entry, which rang alarm bells:
0.14.0 / 2023-09-11
[CHANGE] Change database connections to one per scrape #882 #902
I suspect this change is shaking out connection leak bugs, since 0.14 also includes a bug fix for a connection leak issue that cited #882.
I've also found https://github.com/prometheus-community/postgres_exporter/issues/921 which states there's a leak in 0.14
The fix for that is in the change log for 0.15
0.15.0 / 2023-10-27
[BUGFIX] Adjust collector to use separate connection per scrape #936
Workarounds
gitlab.rb modifications:
- Set prometheus_monitoring['enable'] = false to disable all exporters and Prometheus (if you don't use it)
- Or, set postgres_exporter['enable'] = false to disable just the PostgreSQL exporter
Apply with gitlab-ctl reconfigure
Steps to reproduce
Running postgres-exporter on a single node GitLab install is sufficient.
Diagnose postgres-exporter as the root cause with:
- Run on the database console (sudo gitlab-psql): select count(*) from pg_stat_activity;
- Stop the exporter: sudo gitlab-ctl stop postgres-exporter
- Re-run the select count(*) query from the first step. The main sources of database connections should be sidekiq and puma, so a reduction in connections of 10 or higher would be strongly suggestive of a problem with postgres-exporter.
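The before/after comparison in the steps above could be scripted roughly as follows. This is a minimal sketch, not a GitLab-provided tool: the function names count_connections and leak_suspected are made up for illustration, the default threshold of 10 is taken from the step above, and it assumes gitlab-psql passes the -t, -A and -c options through to psql.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of GitLab or postgres-exporter.

# Print the current number of rows in pg_stat_activity.
# Assumption: gitlab-psql forwards -t (tuples only), -A (unaligned) and -c to psql.
count_connections() {
  sudo gitlab-psql -t -A -c 'select count(*) from pg_stat_activity;'
}

# Succeed (exit status 0) when the drop from $1 (before) to $2 (after)
# is at least $3 connections. $3 defaults to 10, the figure used in the steps above.
leak_suspected() {
  before=$1
  after=$2
  threshold=${3:-10}
  [ $(( before - after )) -ge "$threshold" ]
}

# Usage on a single node install (left commented out so that sourcing
# this file never stops a service by accident):
# before=$(count_connections)
# sudo gitlab-ctl stop postgres-exporter
# after=$(count_connections)
# if leak_suspected "$before" "$after"; then
#   echo "connections dropped from $before to $after: postgres-exporter is a likely culprit"
# fi
```

With the customer's figures (around 979 processes before, 60 after), leak_suspected 979 60 succeeds. A small drop only tells you the exporter was holding a few connections at that moment, so it is the size of the drop that matters.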
What is the current bug behavior?
available PostgreSQL connections get drained
What is the expected correct behavior?
postgres-exporter doesn't leak database connections