2019-11-04 Spike in 500 errors, gitaly CPU starvation

Some GitLab.com actions were returning 500 status responses. Gitaly nodes appeared to be running at 100% CPU, as shown at https://dashboards.gitlab.net/d/gitaly-main/gitaly-overview?orgId=1&from=now-1h&to=now


  • 09:12 UTC, PagerDuty alerts the on-call engineer that Gitaly is below its SLO
  • 09:22 UTC, Alert resolves

No actions were taken to resolve the incident while it was ongoing.
