
Evaluate the need for WalReceiverSaturation metric for BBM

Overview

Some time ago, !150544 (merged) introduced a new BBM health status indicator based on the WAL receiver saturation metric.

Currently, the SLI threshold is set to 90% saturation. I've been seeing this indicator halt BBMs for most on-peak hours, which leaves room to execute BBMs mostly on weekends.
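To make the behaviour concrete, here is a minimal sketch of how a threshold-based indicator of this kind could decide to halt a BBM. It is not the real Gitlab::Database::HealthStatus code: the Signal struct, the SaturationIndicator class, the enabled: argument (standing in for the FF check) and the fetch_saturation callable are all hypothetical names for illustration. Treating "strictly above 90%" as the stop condition is also an assumption.

```ruby
# Hypothetical, simplified model of a saturation-based health indicator.
# None of these names mirror the real GitLab classes.
Signal = Struct.new(:stop, :reason) do
  def stop?
    stop
  end
end

class SaturationIndicator
  # 90% saturation, the value mentioned in this issue.
  THRESHOLD = 0.90

  # enabled:          stands in for the feature flag check.
  # fetch_saturation: callable returning the current saturation as a
  #                   Float between 0 and 1, or nil when no metric is available.
  def initialize(enabled:, fetch_saturation:)
    @enabled = enabled
    @fetch_saturation = fetch_saturation
  end

  def evaluate
    # With the flag off, the indicator never halts a migration.
    return Signal.new(false, 'indicator disabled') unless @enabled

    saturation = @fetch_saturation.call
    # Assumption: a missing metric does not halt the migration.
    return Signal.new(false, 'no metric available') if saturation.nil?

    if saturation > THRESHOLD
      Signal.new(true, 'saturation above threshold')
    else
      Signal.new(false, 'saturation within threshold')
    end
  end
end

# Usage: during a busy period the indicator asks the migration to pause.
peak = SaturationIndicator.new(enabled: true, fetch_saturation: -> { 0.95 })
puts peak.evaluate.stop?
```

In this model, disabling the flag makes the indicator a no-op without touching the metric source, which is why the flag can be turned off first and the code removed later.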

Deliverables

  • Disable the Gitlab::Database::HealthStatus::Indicators::WalReceiverSaturation indicator, which is controlled by the db_health_check_wal_receiver_saturation FF;
  • Monitor the performance of BBMs and the DB health;
  • Remove the Gitlab::Database::HealthStatus::Indicators::WalReceiverSaturation indicator from the codebase;
  • Open a new CR issue to remove the metrics from the application config (example: !150544 (merged));
  • Clean up db_health_check_using_mimir_client. We're already fetching metrics through Mimir, so we can remove the Prometheus client.
Edited by Leonardo da Rosa