fix(patroni): PatroniXminAgeTooLarge non paging
What
- Delete the PatroniXminAgeTooLargeWarning alert.
- Increase time for PatroniXminAgeTooLargeError to 45min.
- Make PatroniXminAgeTooLargeError non-paging.
Why
- PatroniXminAgeTooLargeWarning is an s4 alert, so no one looks at it and it's cluttering our code and Alertmanager.
- Looking at past examples (gitlab-com/gl-infra/production#8568, https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/17529, gitlab-com/gl-infra/production#8460), this cleared up on its own after 45min and we didn't see any performance degradation.
- At the moment, because of https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/17529, we can't ignore REINDEX and FK validations. This is causing pages for the on-call for jobs that we run on the weekend, and those pages are not actionable.
- The alert might still be important to look at, so create an issue to look at it async. If we keep seeing this firing with no user-impacting problem, we might need to delete this alert.
Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/17529
Signed-off-by: Steve Azzopardi <sazzopardi@gitlab.com>