Skip to content

fix(alerts): ai-gateway igress error ratio

Steve Xuereb requested to merge fix/alert-ai-gateway into master

What

Reduce the error ratio from 0.999 to 0.98 for the runway_ingress SLI for the ai-gateway service.

Before After
Screenshot_2024-01-23_at_10.53.36
source
Screenshot_2024-01-23_at_10.53.08
source

Why

Screenshot_2024-01-23_at_10.55.06

This alerts paged the on-call 25 times in the last 14 days (1-2 times every weekday).

This alert is not actionable at the moment until we fix https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17366.

Reference: https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17366

Merge request reports