Let's figure out how monitoring works at GitLab

We have a lot of documentation:

But none of this describes how monitoring is wired up and configured for GitLab.com. Examples of what is missing:

Thanos and its configuration
Network peering, and the fact that our dashboards talk to ONE Thanos server that reaches out to the appropriate servers
How alerting is managed across our various environments
What Trickster does
Nothing monitors the monitor: at times our dashboards disappear, and nothing tells us that has happened until someone looks at a blank dashboard
I've personally struggled to connect all the dots by reading through our various Chef roles and cookbooks, so it would be wise to figure out how things are configured and which components live where

Use this issue to:

Edited Jun 17, 2019 by John Skarbek