Create self monitoring dashboard for our Omnibus users
Problem to solve
Omnibus users cannot see data in the default self monitoring dashboard.
Intended users
Further details
The current default self monitoring dashboard relies on recording rules that are present on GitLab.com but not on Omnibus. As a result, Omnibus customers do not see any data in their charts.
This issue is to discuss building a default self monitoring dashboard using the default metrics that are available in the core GitLab product.
Metrics used in the current self monitoring dashboard:
- max(max_over_time(gitlab_service_errors:ratio{environment="{{ci_environment_slug}}", type="web", stage="main"}[1m])) by (type) * 100
  - Uses gitlab_service_errors:ratio, which is defined in autogenerated-error-ratios.yml#L12 as
    sum by (environment, tier, type, stage) (gitlab_component_errors:rate >= 0) / sum by (environment, tier, type, stage) (gitlab_component_ops:rate > 0)
    - The above definition in turn uses gitlab_component_errors:rate, which is defined in -
    - The definition for gitlab_service_errors:ratio also uses recording rules for gitlab_component_ops:rate, which is defined in
      -
      -
- avg(slo:max:gitlab_service_errors:ratio{environment="{{ci_environment_slug}}", type="web", stage="main"}) or avg(slo:max:gitlab_service_errors:ratio{type="web"}) * 100
  - Uses slo:max:gitlab_service_errors:ratio{type="web"}, which is defined in autogenerated-service-slos.yml#L173.
- avg(slo:max:gitlab_service_errors:ratio{environment="{{ci_environment_slug}}", type="api", stage="main"}) or avg(slo:max:gitlab_service_errors:ratio{type="web"}) * 100
  - Uses slo:max:gitlab_service_errors:ratio{type="api"}, which is defined in autogenerated-service-slos.yml#L12.
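For reference, the gitlab_service_errors:ratio definition quoted above would look roughly like this as an entry in a Prometheus rule file. This is only a sketch: the group name is hypothetical, and the two gitlab_component_*:rate series are themselves recording rules whose definitions are not reproduced in this issue, so this rule alone would still return no data on Omnibus.

```yaml
groups:
  - name: gitlab-self-monitoring.rules   # hypothetical group name
    rules:
      # Expression copied from the definition quoted in this issue.
      # It depends on gitlab_component_errors:rate and
      # gitlab_component_ops:rate, which would also need to be defined.
      - record: gitlab_service_errors:ratio
        expr: |
          sum by (environment, tier, type, stage) (gitlab_component_errors:rate >= 0)
          /
          sum by (environment, tier, type, stage) (gitlab_component_ops:rate > 0)
```

The chain of dependencies is the main cost of this route: every rule referenced by the dashboard queries, directly or indirectly, has to exist in the Omnibus Prometheus before any chart shows data.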
Proposal
In order to ensure that the default self monitoring dashboard works in Omnibus, we can create a self monitoring dashboard using base metrics that exist in the GitLab product, or we can add the required recording rules to the Omnibus Prometheus.
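To make the first option concrete, here is a rough sketch of a dashboard panel that computes an error percentage directly from a raw counter instead of a recording rule. It assumes that the instance exposes an http_requests_total counter with a status label, and that the panel layout follows the custom dashboard YAML format; the dashboard name, titles, metric id, and the 5m window are all hypothetical and would need to be checked against the metrics an Omnibus installation actually exports.

```yaml
dashboard: 'Self monitoring (base metrics sketch)'   # hypothetical name
panel_groups:
  - group: 'Web service'
    panels:
      - title: 'HTTP error rate'
        type: area-chart
        y_label: '% of requests'
        metrics:
          - id: http_error_rate_base   # hypothetical id
            # Assumption: http_requests_total with a "status" label is
            # available on the instance; no recording rules are needed.
            query_range: >-
              sum(rate(http_requests_total{status=~"5.."}[5m]))
              /
              sum(rate(http_requests_total[5m])) * 100
            label: 'HTTP 5xx'
            unit: '%'
```

A query like this trades the precomputed, per-component breakdown of the GitLab.com rules for something that works without extra Prometheus configuration.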
Documentation
https://docs.gitlab.com/ee/administration/monitoring/gitlab_self_monitoring_project/index.html
Availability & Testing
What does success look like, and how can we measure that?
The default self monitoring dashboard shows data on an Omnibus installation out of the box.