feat: web-pages dependsOn api
What
Don't page on web-pages when the api service is already firing alerts.
This creates the following inhibit rules:
inhibit rules
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="loadbalancer"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="loadbalancer_https"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="server"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="loadbalancer"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="nginx_ingress"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="workhorse"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
- equal:
  - env
  - environment
  - pager
  source_matchers:
  - component="rails_requests"
  - type="api"
  target_matchers:
  - component="server_headers"
  - type="web-pages"
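The sixteen rules above are the cross product of four api components and four web-pages components. A minimal Python sketch of that expansion follows, assuming the rules are generated rather than written by hand. `build_inhibit_rules` is a hypothetical helper, not the generator this repository actually uses. The component names and `equal` labels are copied from the list above.

```python
from itertools import product

# Component names and labels copied from the rules listed above.
API_COMPONENTS = ["loadbalancer", "nginx_ingress", "workhorse", "rails_requests"]
WEB_PAGES_COMPONENTS = ["loadbalancer", "loadbalancer_https", "server", "server_headers"]
EQUAL_LABELS = ["env", "environment", "pager"]


def build_inhibit_rules(source_type, source_components, target_type, target_components):
    """Hypothetical helper: one inhibit rule per (target, source) component pair."""
    rules = []
    # Target is the outer loop so the order matches the listing above.
    for target, source in product(target_components, source_components):
        rules.append({
            "equal": list(EQUAL_LABELS),
            "source_matchers": [f'component="{source}"', f'type="{source_type}"'],
            "target_matchers": [f'component="{target}"', f'type="{target_type}"'],
        })
    return rules


rules = build_inhibit_rules("api", API_COMPONENTS, "web-pages", WEB_PAGES_COMPONENTS)
print(len(rules))
```

Adding a component on either side grows the rule count multiplicatively, which is why the list is long even for a single dependency.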
Why
On 2022-03-31 we saw service degradation on the api service, and as a
result web-pages was also violating its SLO. This is because web-pages
depends heavily on api for domain information and any other data
retrieval that lives in the monolith.
In https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16030#note_1024703191
we validated that service chains work as expected. The chain we have
here is patroni -> api -> web-pages. If patroni is firing an alert,
neither api nor web-pages will fire an alert, even if they are
violating SLOs.
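The chain behaviour can be illustrated with a simplified Python model of inhibition. This is a sketch under stated assumptions, not Alertmanager's implementation. The rules here match on `type` only (the real rules also match on `component`), the patroni -> api rule is assumed to have the same shape as the api -> web-pages rules, and the label values are made up for illustration. A missing `equal` label is treated as an empty value, so two alerts that both lack it still count as equal.

```python
EQUAL = ("env", "environment", "pager")

# Simplified, hypothetical rules: the real ones also match on component.
RULES = [
    {"source": {"type": "patroni"}, "target": {"type": "api"}, "equal": EQUAL},
    {"source": {"type": "api"}, "target": {"type": "web-pages"}, "equal": EQUAL},
]


def matches(alert, matchers):
    """True if the alert carries every label=value pair in matchers."""
    return all(alert.get(name) == value for name, value in matchers.items())


def is_inhibited(target, firing, rules=RULES):
    """True if any other firing alert inhibits target under any rule."""
    for rule in rules:
        if not matches(target, rule["target"]):
            continue
        for source in firing:
            if source is target or not matches(source, rule["source"]):
                continue
            # Missing labels compare as empty strings.
            if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
                return True
    return False


# Illustrative label values, not taken from production.
patroni = {"type": "patroni", "env": "gprd", "pager": "pagerduty"}
api = {"type": "api", "env": "gprd", "pager": "pagerduty"}
web_pages = {"type": "web-pages", "env": "gprd", "pager": "pagerduty"}
firing = [patroni, api, web_pages]

for alert in firing:
    print(alert["type"], "inhibited" if is_inhibited(alert, firing) else "pages")
```

In this model web-pages is muted because api is firing, and api is muted because patroni is firing, so only the alert at the root of the chain pages.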
Reference: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16030