feat: add sync WAL lag Prometheus metrics for physical mode

Summary

Add Prometheus metrics to monitor WAL replay lag for the sync instance in physical mode.

Details

  • dblab_sync_wal_lag_seconds - WAL replay lag in seconds
  • dblab_sync_status - Status of the sync instance
  • dblab_sync_uptime_seconds - Uptime of the sync instance
  • dblab_sync_last_replayed_timestamp - Unix timestamp of last replayed transaction

Why

This is critical for customers who use DBLab instances to extract data for analytics: they need to monitor how far the sync instance is behind the primary.


Acceptance Criteria

  • dblab_sync_wal_lag_seconds, dblab_sync_status, dblab_sync_uptime_seconds, and dblab_sync_last_replayed_timestamp metrics are available on the Prometheus exporter endpoint
  • The WAL lag metric accurately measures the delay between the primary and the sync instance in seconds
  • Metrics update correctly as the sync instance replays WAL records
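
The first criterion could be checked mechanically against a scraped response body. The Go sketch below is one possible way to do that: it scans text in the Prometheus exposition format and reports which of the four metric names have no sample line. The names `expectedSyncMetrics` and `missingMetrics` are hypothetical helpers, and how the body is fetched from the exporter endpoint is left out, since the endpoint address is not stated here.

```go
package main

import "strings"

// expectedSyncMetrics lists the four metric names from the acceptance criteria.
var expectedSyncMetrics = []string{
	"dblab_sync_wal_lag_seconds",
	"dblab_sync_status",
	"dblab_sync_uptime_seconds",
	"dblab_sync_last_replayed_timestamp",
}

// missingMetrics returns the expected names that have no sample line in body,
// where body is text in the Prometheus exposition format. The result keeps
// the order of expected.
func missingMetrics(body string, expected []string) []string {
	seen := make(map[string]bool)
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		// Skip blank lines and "# HELP" / "# TYPE" comment lines.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the first label brace or whitespace.
		name := line
		if i := strings.IndexAny(line, "{ \t"); i >= 0 {
			name = line[:i]
		}
		seen[name] = true
	}

	var missing []string
	for _, name := range expected {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	return missing
}
```

An empty result means all four metrics are exposed. This only confirms presence, so the accuracy and update criteria still need a test against a live physical-mode instance.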

Definition of Done

  • Metrics implementation merged and deployed
  • Tested with a physical mode DBLab instance showing accurate WAL lag values
  • Prometheus can scrape and visualize all four metrics
Edited by Nikolay Samokhvalov