Create baseline monitoring dashboard and runbook to help us troubleshoot issues during dogfooding
Overview
As we plan to kick off dogfooding of the Pre-receive Secret Detection feature on GitLab.com soon, it's essential for us to have:
- A dashboard that engineers will look at if/when there is an incident/problem.
- A runbook that explicitly guides how to look at this dashboard and how to improve it.
Resources
Below is a list of available resources that could be used for achieving both tasks.
Components
While the feature, in its current form, doesn't have any external components and is entirely encapsulated within the application server as a dependency, it does interact with a number of components as outlined in this push event sequence diagram. Those components are:
- Workhorse
  - git-receive-pack
- Gitaly
  - PostReceivePack
  - PreReceiveHook
  - ListAllBlobs() RPC
  - ListBlobs() RPC
  - GetTreeEntries() RPC
- Rails
  - /internal/allowed Endpoint
Tools
- Kibana (Logs)
- Grafana (Metrics)
- Sentry (Error Tracking)
Existing Dashboards and Projects
- Grafana
- Sentry
Assistance Required
Some assistance may be required from the Gitaly group (group::gitaly) or @gitlab-com/gl-infra when creating the dashboard.
Proposal
- Create a dashboard that provides an overview of the components discussed above.
- Create a runbook that explains how to read the dashboard and how to improve it.
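
As a starting point for the dashboard, the component list above could be turned into a simple outline with one row per component and one panel per call or endpoint. The sketch below is only an illustration in Python: the `Row` type, the `build_outline` and `render` helpers, and the grouping of the RPCs under Gitaly are assumptions made for this example, not an existing dashboard definition or a Grafana API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Row:
    """One dashboard row: a component and the signals to chart for it (hypothetical type)."""
    component: str
    signals: List[str] = field(default_factory=list)


# Components and calls copied from the Components section.
# Grouping every RPC under Gitaly is an assumption of this sketch.
COMPONENTS: Dict[str, List[str]] = {
    "Workhorse": ["git-receive-pack"],
    "Gitaly": [
        "PostReceivePack",
        "PreReceiveHook",
        "ListAllBlobs()",
        "ListBlobs()",
        "GetTreeEntries()",
    ],
    "Rails": ["/internal/allowed"],
}


def build_outline(components: Dict[str, List[str]]) -> List[Row]:
    """Turn the component map into an ordered list of dashboard rows."""
    return [Row(component=name, signals=list(signals))
            for name, signals in components.items()]


def render(rows: List[Row]) -> str:
    """Render the outline as plain text, e.g. for pasting into the runbook."""
    lines: List[str] = []
    for row in rows:
        lines.append(f"Row: {row.component}")
        lines.extend(f"  Panel: {signal}" for signal in row.signals)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(build_outline(COMPONENTS)))
```

The same outline could double as the table of contents for the runbook, so that each dashboard row has a matching section explaining how to read it.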