fix(alerts): lower camoproxy error ratio

Steve Xuereb requested to merge fix/camoproxy-error-ratio into master

What

Lower the camoproxy error ratio threshold from 0.99 to 0.8.

[Before and after screenshots: Screenshot_2022-06-22_at_10.25.12, Screenshot_2022-06-22_at_10.25.34, Screenshot_2022-06-22_at_10.29.52, Screenshot_2022-06-22_at_10.30.11]

Why

We have little control over camoproxy because it depends on the origin of the content we are serving. If that origin is down, camoproxy serves a 500 error as well.

Looking at past pages, we see that most of them auto-resolve within 10 minutes.

We could lower the severity of this service to s3, but then we would never page for it. Instead, lower the error ratio so that we still page on large error ratios, as seen in https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/15863#note_1000390851

Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/15863

Edited by Steve Xuereb
