infrastructure

Keep GitLab running and make it scale!

GitLab Infrastructure

These are not the runbooks you are looking for

I'm here to see how the GitLab.com infrastructure is made

You should read the production architecture then.

Where and how to look for data

General System Health

Blackbox Monitoring

  • GitLab Web Status: front-end perspective of GitLab. Useful to understand how GitLab.com looks from the user's perspective. Use this graph to quickly troubleshoot which part of GitLab is slow.
  • GitLab Git Status: front-end perspective of GitLab SSH access.
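To illustrate what "front-end perspective" means here, the following is a minimal sketch of a blackbox probe: it times one request from the outside and records whether it succeeded, without looking at any internal metric. It is not how the dashboards above are produced; `probe`, `http_fetch`, and `ProbeResult` are hypothetical names introduced for this example.

```python
import time
from typing import Callable, NamedTuple
from urllib.request import urlopen


class ProbeResult(NamedTuple):
    ok: bool        # True if the fetch finished without raising
    seconds: float  # wall-clock duration of the fetch


def probe(fetch: Callable[[], object],
          clock: Callable[[], float] = time.monotonic) -> ProbeResult:
    """Time one call to `fetch`; any exception counts as a failed probe."""
    start = clock()
    try:
        fetch()
        ok = True
    except Exception:
        ok = False
    return ProbeResult(ok, clock() - start)


def http_fetch(url: str, timeout: float = 10.0) -> Callable[[], object]:
    """Build a fetch callable that downloads the full body of `url`."""
    def fetch() -> bytes:
        # urlopen raises on connection errors and on 4xx/5xx responses,
        # which `probe` then records as a failure.
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    return fetch


# Example (the URL is only an illustration):
#   result = probe(http_fetch("https://gitlab.com/"))
#   print(result.ok, result.seconds)
```

Passing the fetch step in as a callable keeps the timing logic independent of the transport, so the same `probe` could wrap an SSH check as well as an HTTP one.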

Public Whitebox Monitoring

We offer a monitoring infrastructure site that is publicly accessible.

The public site is synced hourly with any changes we make to the private one, so its dashboards are a 1:1 copy of the private dashboards.

Some metrics are not visible on the public site because we do not keep a copy of the metrics obtained through InfluxDB.

  • Fleet overview: useful to see the fleet status from the inside of GitLab.com. Use this graph to quickly see if the workers or the database are under heavy load, and to check load balancer bandwidth.
  • Postgres Stats: useful to understand in depth how the database is behaving. Use this graph to check for spikes in exclusive locks or in active or idle-in-transaction processes.
  • Postgres Queries: use this dashboard to understand if we have blocked or slow queries, dead tuples, etc.
  • Storage Stats: use this dashboard to understand storage use and performance.
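For readers who want to see where numbers like these come from, here is a minimal sketch that reads comparable counters directly from PostgreSQL's standard statistics views (`pg_stat_activity`, `pg_locks`, `pg_stat_user_tables`). It assumes a DB-API cursor connected to the database; the `QUERIES` names and the `collect` helper are hypothetical, and the dashboards above may compute their figures differently.

```python
from typing import Any, Dict

# Queries against standard PostgreSQL statistics views.
# The metric names on the left are illustrative, not the dashboards' own.
QUERIES: Dict[str, str] = {
    "idle_in_transaction":
        "SELECT count(*) FROM pg_stat_activity "
        "WHERE state = 'idle in transaction'",
    "active":
        "SELECT count(*) FROM pg_stat_activity WHERE state = 'active'",
    # Counts only the lock mode literally named ExclusiveLock; other
    # modes (e.g. AccessExclusiveLock) would need their own filter.
    "exclusive_locks":
        "SELECT count(*) FROM pg_locks WHERE mode = 'ExclusiveLock'",
    "dead_tuples":
        "SELECT coalesce(sum(n_dead_tup), 0) FROM pg_stat_user_tables",
}


def collect(cursor: Any) -> Dict[str, int]:
    """Run every query on a DB-API cursor and return name -> value."""
    results: Dict[str, int] = {}
    for name, sql in QUERIES.items():
        cursor.execute(sql)
        # Each query returns a single row with a single column.
        results[name] = int(cursor.fetchone()[0])
    return results
```

Sampling `collect` at a fixed interval and plotting the values over time would show the kind of spikes the Postgres Stats dashboard is meant to reveal.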

Private Whitebox Monitoring

  • Host Stats: useful to dive deep into a specific host to understand what is going on with it. Select a host from the dropdown at the top.
  • Business Stats: shows the number of pushes, new repos and CI builds.
  • Daily overview: shows endpoints with their number of calls and performance metrics. Useful to understand what is slow in general.