Further documentation about stage group dashboards

In #665 (closed), we aim to provide basic introductory documentation about the stage group dashboards.

This issue is to extend that documentation with other useful topics. Further items will be added as we communicate with the stage groups.

From #665 (comment 471948549) (Done, already included in the introductory documentation)

Overview and summary of the components inside a dashboard, with attention to details such as filters (PROMETHEUS_DS, environment, deploy, canary-deploy, feature-flags), aggregation period, and time interval.
Meaning of each panel and metric shown in the stage group dashboard. This may be verbose, but it is helpful, especially for less experienced engineers. For example, in the request rate per action web panel of the Monitor group, it's hard to tell what a value of 1 for ProjectsController#Show actually means: whether it is 1 request per minute or 1 request per 30 seconds, why the dashboard shows 1.2 rather than 1, what Request rate per action git means, and so on.
How to use the metrics for debugging?
- Drill down, filters
- Explore further with PromQL and the Explore feature of Grafana.
- A real example of how to debug a production issue
How to customize and expand the metrics dashboards
How do we record and process the metrics? The pull mechanism of Prometheus and the metric aggregation may affect user assumptions about accuracy and precision over a time period.
Further links and documents
Roadmap and future milestones
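
To illustrate the kind of explanation the panel-meaning and metric-processing topics could include, here is a minimal Python sketch of how a fractional value such as 1.2 can come out of periodically scraped counter samples. It is an assumption-laden simplification, not the query the dashboards actually use: the sample values, the 15-second scrape interval, and the helper name simple_rate are all hypothetical, and unlike Prometheus's rate() it does not extrapolate to the window boundaries.

```python
from typing import List, Tuple

# (timestamp in seconds, cumulative counter value) as seen at one scrape
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase of a counter between the first and last sample.

    Hypothetical helper for illustration only. A drop in the counter is
    treated as a reset to zero, so the new value counts as the increase.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")

    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # Counter went down: assume the process restarted and counted from zero.
        increase += cur - prev if cur >= prev else cur

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return increase / elapsed


# Hypothetical scrapes taken every 15 seconds over a one-minute window.
samples = [(0, 100), (15, 118), (30, 136), (45, 154), (60, 172)]
rate = simple_rate(samples)
print(f"{rate:.2f} requests per second")
```

In this sketch the counter grows by 72 requests in 60 seconds, so the result is an average of 1.2 requests per second rather than a whole number. Because the value is averaged between scrapes, short bursts inside a scrape interval are smoothed out, which is one reason the precision over a time period can differ from what a reader assumes.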

Edited Dec 30, 2020 by Quang-Minh Nguyen