feat: web-pages dependsOn api
What
Don't page on web-pages when the api service is already firing
alerts.
This creates the following inhibit rules:
inhibit rules
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
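The rules above are the cross product of four api components and four web-pages components, 16 rules in total. As a rough illustration of that expansion (not the actual generator used in the repository), the following Python sketch rebuilds the same list. The function name `build_inhibit_rules` and the constants are hypothetical; only the component names, types, and `equal` labels are taken from the listing.

```python
from itertools import product

# Component names and labels copied from the rules listed above.
API_COMPONENTS = ["loadbalancer", "nginx_ingress", "workhorse", "rails_requests"]
WEB_PAGES_COMPONENTS = ["loadbalancer", "loadbalancer_https", "server", "server_headers"]
EQUAL_LABELS = ["env", "environment", "pager"]


def build_inhibit_rules(source_type, source_components, target_type, target_components):
    """Return one inhibit rule per (target component, source component) pair.

    Hypothetical helper: it only mirrors the shape of the listing above.
    """
    rules = []
    # Targets vary slowest, sources fastest, matching the order of the listing.
    for target, source in product(target_components, source_components):
        rules.append({
            "equal": list(EQUAL_LABELS),
            "source_matchers": [f'component="{source}"', f'type="{source_type}"'],
            "target_matchers": [f'component="{target}"', f'type="{target_type}"'],
        })
    return rules


rules = build_inhibit_rules("api", API_COMPONENTS, "web-pages", WEB_PAGES_COMPONENTS)
```

Every additional component on either side adds a full row or column of rules, which is why the generated list grows quickly.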
Why
On 2022-03-31
we saw service degradation on the api service, and as a result
web-pages also violated its SLO. This is because web-pages
depends heavily on api for domain information and any other data
it retrieves from the monolith.
In
https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16030#note_1024703191
we validated that service chains work as expected. The chain we have
here is patroni -> api -> web-pages. If patroni is firing an
alert, neither api nor web-pages will fire an alert, even if they are
violating their SLOs.
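To make the effect of a single rule concrete, here is a simplified Python model of how an inhibit rule is evaluated: a target alert is muted when some firing alert satisfies `source_matchers` and both alerts carry the same value for every label in `equal`. This is a sketch, not Alertmanager's implementation. It handles only `=` matchers, treats a missing label as an empty value, and does not model the case of one alert matching both sides of a rule. The helper names and the label values `gprd`, `gstg`, and `pagerduty` are assumptions for illustration.

```python
def parse_matchers(matchers):
    """Turn ['component="x"', 'type="y"'] into {'component': 'x', 'type': 'y'}.

    Only equality matchers are supported in this sketch.
    """
    parsed = {}
    for matcher in matchers:
        name, value = matcher.split("=", 1)
        parsed[name] = value.strip('"')
    return parsed


def matches(labels, matchers):
    """True if the alert labels satisfy every equality matcher."""
    return all(labels.get(k, "") == v for k, v in parse_matchers(matchers).items())


def is_inhibited(target, firing_alerts, rule):
    """True if `rule` mutes `target` because of some alert in `firing_alerts`."""
    if not matches(target, rule["target_matchers"]):
        return False
    for source in firing_alerts:
        same_labels = all(source.get(l, "") == target.get(l, "") for l in rule["equal"])
        if matches(source, rule["source_matchers"]) and same_labels:
            return True
    return False


# One of the rules from the listing in the "What" section.
rule = {
    "equal": ["env", "environment", "pager"],
    "source_matchers": ['component="workhorse"', 'type="api"'],
    "target_matchers": ['component="server"', 'type="web-pages"'],
}

# Hypothetical alerts; the label values are assumptions.
api_alert = {"type": "api", "component": "workhorse",
             "env": "gprd", "environment": "gprd", "pager": "pagerduty"}
pages_alert = {"type": "web-pages", "component": "server",
               "env": "gprd", "environment": "gprd", "pager": "pagerduty"}
pages_alert_other_env = dict(pages_alert, env="gstg", environment="gstg")

muted_same_env = is_inhibited(pages_alert, [api_alert], rule)
muted_other_env = is_inhibited(pages_alert_other_env, [api_alert], rule)
```

In this model `muted_same_env` is true and `muted_other_env` is false: the `equal` labels keep an api alert in one environment from silencing web-pages in another.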
Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16030