Geo: Extend Unhealthy status threshold from 10 minutes to 1 hour
What does this MR do and why?
Partially mitigates #381354. A status that is 10 minutes old is not a critical failure, so the Geo site should not show a red "Unhealthy" in that case.
Large customer sites can take 10 minutes to generate a status. In that scenario, the status flaps between Unhealthy and Healthy.
A status that is 1 hour old is much more likely to indicate a real problem.
This MR therefore increases the threshold from 10 minutes to 1 hour.
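
To illustrate the effect of the change, here is a minimal sketch of a staleness check. The constant and method names are hypothetical and this is not the actual GitLab implementation. It also assumes the status becomes unhealthy only once its age strictly exceeds the threshold, which this description does not specify.

```ruby
# Hypothetical names for illustration only; not GitLab's actual code.
OLD_THRESHOLD_SECONDS = 10 * 60 # threshold before this MR: 10 minutes
NEW_THRESHOLD_SECONDS = 60 * 60 # threshold after this MR: 1 hour

# Classifies a site purely by the age of its last reported status.
# Assumption: a status exactly at the threshold still counts as healthy.
def health_for(last_status_at, now, threshold_seconds)
  age_seconds = now - last_status_at # Time - Time yields seconds as a Float
  age_seconds > threshold_seconds ? :unhealthy : :healthy
end

now = Time.utc(2023, 1, 1, 12, 0, 0) # arbitrary example timestamp

# A large site whose latest status is 12 minutes old.
slow_site_status_at = now - 12 * 60
puts health_for(slow_site_status_at, now, OLD_THRESHOLD_SECONDS)
puts health_for(slow_site_status_at, now, NEW_THRESHOLD_SECONDS)

# A site whose status has not been updated for 61 minutes.
stuck_site_status_at = now - 61 * 60
puts health_for(stuck_site_status_at, now, NEW_THRESHOLD_SECONDS)
```

With the 10 minute threshold, the slow but working site is reported as unhealthy. With the 1 hour threshold it stays healthy, while a status that has been stale for over an hour is still flagged.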
References
How to set up and validate locally
- Set up Geo: https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/geo.md#easy-installation
- Visit Admin > Geo > Sites
- Wait for the secondary site status to be `Healthy`
- In the secondary GDK, run `gdk stop rails-background-jobs`
- Wait 50 minutes
- Visit Admin > Geo > Sites
- The secondary site status should still be `Healthy`
- Wait 11 more minutes (61 minutes in total, past the 1 hour threshold)
- The secondary site status should be `Unhealthy`
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Michael Kozono