Update GitLab Rails and exporters to differentiate separate pgbouncer pools

With #1682 (closed), we now run two connection pools on pgbouncer-sidekiq:

  • gitlabhq_production_sidekiq_urgent for all urgent-* sidekiq shards
  • gitlabhq_production_sidekiq for all other sidekiq shards


However, metrics like pg_stat_activity_marginalia_sampler_active_count do not indicate which pgbouncer connection pool the endpoint is using. Example labels:

{__name__="pg_stat_activity_marginalia_sampler_active_count",
 __tenant_id__="gitlab-gstg",
 application="sidekiq",
 cluster="gstg-gitlab-gke",
 command="BEGIN",
 endpoint="Ci::ClickHouse::FinishedPipelinesSyncCronWorker",
 env="gstg",
 environment="gstg",
 fqdn="patroni-ci-v16-03-db-gstg.c.gitlab-staging-1.internal",
 instance="patroni-ci-v16-03-db-gstg",
 job="scrapeConfig/monitoring/prometheus-agent-postgres",
 machine_type="n2-standard-8",
 monitor="default",
 port="80",
 prometheus="monitoring/gitlab-rw-prometheus",
 provider="gcp",
 region="us-east1",
 server="127.0.0.1:5432",
 service="postgres",
 shard="default",
 stage="main",
 state="idle in transaction",
 tier="db",
 type="patroni-ci",
 usename="gitlab",
 wait_event="ClientRead",
 wait_event_type="Client",
 zone="us-east1-b"}

This means we could end up in a situation where:

  • gitlabhq_production_sidekiq_urgent pgbouncer database is fully saturated
  • gitlabhq_production_sidekiq pgbouncer database is barely used

In that case, aggregating on the patroni fqdn would show only 50% of the non-idle connections in use, even though one pool is clearly saturated.
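To make the arithmetic concrete, here is a rough sketch of how the aggregate view hides the saturated pool. The pool sizes (50 each) and the active counts are made-up numbers for illustration only, not the real configuration, and the lightly used pool is treated as fully idle for simplicity.

```python
# Hypothetical numbers: both pool sizes and active counts are assumptions
# chosen for illustration, not taken from the real pgbouncer config.
pools = {
    "gitlabhq_production_sidekiq_urgent": {"active": 50, "size": 50},
    "gitlabhq_production_sidekiq": {"active": 0, "size": 50},
}


def utilization(active: int, size: int) -> float:
    """Fraction of the pool's connections that are in use."""
    return active / size


# What we see today: everything summed per patroni fqdn.
aggregate = utilization(
    sum(p["active"] for p in pools.values()),
    sum(p["size"] for p in pools.values()),
)

# What we want to see: utilization broken down per pgbouncer database.
per_pool = {name: utilization(p["active"], p["size"]) for name, p in pools.items()}

print(f"aggregate: {aggregate:.0%}")
for name, value in per_pool.items():
    print(f"{name}: {value:.0%}")
```

The aggregate comes out at half, while the per-pool breakdown shows the urgent pool at its limit, which is the signal we are currently missing.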

Approach

  1. Update Rails to add db_config_database to marginalia: gitlab-org/gitlab!168272 (merged)
  2. Update the query in gitlab-exporters' queries.yaml to capture db_config_database:(\w+): gitlab-cookbooks/gitlab-exporters!344 (merged)
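
As a rough illustration of what the capture in step 2 does, the sketch below applies the same db_config_database:(\w+) pattern to a few sample query strings and counts them per pool. The comment layout in the samples is an assumption about how the marginalia annotation looks, and pool_of is a hypothetical helper. The real exporter query runs in SQL against pg_stat_activity rather than in Python.

```python
import re
from collections import Counter

# Same capture pattern as mentioned in the approach above.
DB_CONFIG_RE = re.compile(r"db_config_database:(\w+)")


def pool_of(query: str) -> str:
    """Return the pgbouncer database named in the query comment, if any."""
    match = DB_CONFIG_RE.search(query)
    return match.group(1) if match else "unknown"


# Hypothetical samples: the exact marginalia comment format is assumed.
sample_queries = [
    "/*application:sidekiq,db_config_database:gitlabhq_production_sidekiq_urgent*/ BEGIN",
    "/*application:sidekiq,db_config_database:gitlabhq_production_sidekiq*/ SELECT 1",
    "/*application:sidekiq,db_config_database:gitlabhq_production_sidekiq_urgent*/ SELECT 1",
    "SELECT 1",  # no annotation, e.g. a query issued before the Rails change
]

# Count sampled queries per pool, which is the breakdown the new label enables.
counts = Counter(pool_of(q) for q in sample_queries)
print(counts)
```

Since \w+ stops at the first non-word character, the capture ends cleanly at a following comma or at the closing */ of the comment.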
Edited by Sylvester Chin