Add metrics on searching for caught up replicas [RUN ALL RSPEC] [RUN AS-IF-FOSS]

What does this MR do?

Add new metrics that track how often we search for caught-up replicas. (In the current implementation, all replicas must be caught up before we unstick.) This lets us see how often we perform that operation and how its results are distributed.
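To illustrate the idea, here is a minimal sketch of counting the outcome of each search under a `result` label. It is not the code in this MR: `LabeledCounter`, `Replica`, `SEARCH_COUNTER`, and the integer write locations are hypothetical stand-ins, and only the `all_caught_up?` name comes from the description.

```ruby
# Hypothetical in-memory stand-in for a labelled Prometheus counter.
class LabeledCounter
  def initialize
    @values = Hash.new(0)
  end

  # Adds one to the series identified by the given label hash.
  def increment(labels)
    @values[labels] += 1
  end

  # Returns the current value for the given label hash (0 if never incremented).
  def get(labels)
    @values[labels]
  end
end

# Hypothetical replica: caught up once it has replayed up to the write location.
# Locations are plain integers here purely for illustration.
Replica = Struct.new(:replayed_location) do
  def caught_up?(write_location)
    replayed_location >= write_location
  end
end

SEARCH_COUNTER = LabeledCounter.new

# Returns true only when every replica has caught up, and records the
# outcome of the search under the `result` label.
def all_caught_up?(replicas, write_location)
  result = replicas.all? { |replica| replica.caught_up?(write_location) }
  SEARCH_COUNTER.increment({ result: result })
  result
end

# One search where all replicas are caught up, one where a replica lags.
all_caught_up?([Replica.new(120), Replica.new(130)], 100)
all_caught_up?([Replica.new(120), Replica.new(90)], 100)
```

With a counter shaped like this, the ratio of `result: false` to `result: true` increments would indicate how often requests stay stuck to the primary.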

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Test plan:

  1. Configure replication on GDK: https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/database_load_balancing.md
  2. Check that the metrics appear on /-/metrics (a write is required first, e.g. create an issue)
  3. You can also simulate replication delay - link - in which case you should see this metric with the false label (meaning all_caught_up? returned false and we picked the primary)
  4. Additionally, check the structured logging output
  5. Also check with load balancing both enabled and disabled, and verify that the change to the middleware loading order didn't cause any issues
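For steps 2 and 3, one way to inspect the output is to save the response from /-/metrics and total the samples per label value. The sketch below assumes the Prometheus text exposition format. The metric name `example_caught_up_replica_searches_total` and the `result` label name are placeholders, since the description does not state the real ones, and the parsing is deliberately simplistic (it does not handle `}` or escaped quotes inside label values).

```ruby
# Sums the samples of one metric from Prometheus text-format output,
# grouped by the value of a single label.
def samples_by_label(metrics_text, metric_name, label)
  totals = Hash.new(0.0)
  metrics_text.each_line do |line|
    line = line.strip
    # Skip blank lines and HELP/TYPE comments.
    next if line.empty? || line.start_with?("#")

    # Match `name{labels} value`, ignoring any trailing timestamp.
    match = line.match(/\A#{Regexp.escape(metric_name)}\{([^}]*)\}\s+(\S+)/)
    next unless match

    label_match = match[1].match(/(?:\A|,)\s*#{Regexp.escape(label)}="([^"]*)"/)
    next unless label_match

    totals[label_match[1]] += Float(match[2])
  end
  totals
end

# Placeholder sample; the metric and label names are hypothetical.
SAMPLE = <<~TEXT
  # TYPE example_caught_up_replica_searches_total counter
  example_caught_up_replica_searches_total{result="true"} 7
  example_caught_up_replica_searches_total{result="false"} 2
TEXT

totals = samples_by_label(SAMPLE, "example_caught_up_replica_searches_total", "result")
puts totals.inspect
```

A non-zero total for the false label after simulating replication delay would be consistent with step 3.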

Security

N/A

Related to #326125 (closed)

Edited by Aleksei Lipniagov