doc: Debugging Gitaly
@@ -42,7 +42,7 @@ A Gitaly dashboard could be either auto-generated or manually drafted. We use Js
-Such dashboards usually include two parts. The second half contains panels of custom metrics collected from Gitaly. The first half is more complicated. It contains GitLab-wide indicators telling if Gitaly is "healthy" and node-level resource metrics. The aggregation and calculation are sophisticated. In summary, those dashboards tell us if Gitaly performs well according to predefined [thresholds](https://gitlab.com/gitlab-com/runbooks/-/blob/d0a5ff2f0ae23984679e0cf6e3361c6d4e71550b/metrics-catalog/services/gitaly.jsonnet), . We could contact [Scalability:Observability Team](../../../infrastructure/team/scalability/observability/) for any questions.
+Such dashboards usually include two parts. The second half contains panels of custom metrics collected from Gitaly. The first half is more complicated. It contains GitLab-wide indicators telling if Gitaly is "healthy" and node-level resource metrics. The aggregation and calculation are sophisticated. In summary, those dashboards tell us if Gitaly performs well according to predefined [thresholds](https://gitlab.com/gitlab-com/runbooks/-/blob/master/metrics-catalog/services/gitaly.jsonnet), . We could contact [Scalability:Observability Team](../../../team/scalability/observability/) for any questions.
@@ -54,7 +54,7 @@ Some examples of using built-in dashboards to investigate production issues, fro
-A panel in a dashboard is a visualization of the aggregated version of underlying metrics. We use [Prometheus](https://prometheus.io/docs/introduction/overview/) to collect metrics. To simplify, the Gitaly server exposes an HTTP server ([code](https://gitlab.com/gitlab-org/gitaly/-/blob/3b872218f78151d681011e5ef2bbc22b3721f6a2/internal/cli/gitaly/serve.go#L514)) that allows Prometheus instances to fetch metrics periodically.
+A panel in a dashboard is a visualization of the aggregated version of underlying metrics. We use [Prometheus](https://prometheus.io/docs/introduction/overview/) to collect metrics. To simplify, the Gitaly server exposes an HTTP server ([code](https://gitlab.com/gitlab-org/gitaly/-/blob/master/internal/cli/gitaly/serve.go#L514)) that allows Prometheus instances to fetch metrics periodically.
@@ -64,7 +64,7 @@ Unfortunately, we don't have a curated list of all Gitaly metrics as well as the
- Aggregated metrics, such as combining different metrics or downsizing metrics due to high cardinality issues. The list of Gitaly's aggregated metrics is listed [in this file](https://gitlab.com/gitlab-com/runbooks/-/blob/e1d0ad78d24d51c36ea7dea28765ba16fd588d42/mimir-rules/gitlab-gprd/gitaly/gitaly.yml).
@@ -103,6 +103,25 @@ The query uses [PromQL](https://prometheus.io/docs/prometheus/latest/querying/ba
@@ -112,4 +131,8 @@ Kibana (Elastic) Dashboards