2019-09-04 CPU pinned at 100% on file-01-stor-gprd

Summary

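CPU usage on file-01-stor-gprd was pinned at 100%, degrading Gitaly for about 2 hours. Manually lowering the SSHUploadPack concurrency limit on file-01 to 15 brought CPU usage down immediately and cleared the alert.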

  • Service(s) affected: ~"Service:Gitaly"
  • Team attribution:
  • Minutes downtime or degradation: ~2 hours

Timeline

2019-09-04

  • 22:09 UTC - Alert "CPU use percent is extremely high on file-01-stor-gprd.c.gitlab-production.internal for the past 2 hours" triggered
  • 23:59 UTC - We manually lowered the SSHUploadPack concurrency limit on file-01 to 15. CPU usage dropped immediately and the alert cleared. https://gitlab.slack.com/archives/C101F3796/p1567641560389700
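
To illustrate why lowering a concurrency limit can relieve CPU pressure, here is a minimal Python sketch of a limiter that caps how many expensive calls run at once for a single repository. It is a generic illustration and not Gitaly's implementation: the names `PerRepoLimiter`, `measure_peak`, and `fake_upload_pack` are hypothetical, and the assumption that the limit is applied per repository is ours, since the timeline only says the SSHUploadPack limit on file-01 was set to 15.

```python
import threading
import time


class PerRepoLimiter:
    """Hypothetical limiter: caps concurrent calls per repository key."""

    def __init__(self, max_per_repo):
        self._max = max_per_repo
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore_for(self, repo):
        # Create one semaphore per repository on first use.
        with self._lock:
            if repo not in self._semaphores:
                self._semaphores[repo] = threading.BoundedSemaphore(self._max)
            return self._semaphores[repo]

    def run(self, repo, fn):
        # Callers beyond the limit block here until a slot frees up.
        with self._semaphore_for(repo):
            return fn()


def measure_peak(limit, callers):
    """Start `callers` threads against one repo and return peak concurrency."""
    limiter = PerRepoLimiter(limit)
    state = {"active": 0, "peak": 0}
    state_lock = threading.Lock()

    def fake_upload_pack():
        # Stand-in for an expensive request such as a clone or fetch.
        with state_lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)
        with state_lock:
            state["active"] -= 1

    threads = [
        threading.Thread(target=limiter.run, args=("repo-a", fake_upload_pack))
        for _ in range(callers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print("peak with limit 15:", measure_peak(15, 60))
```

With 60 simultaneous callers and a limit of 15, no more than 15 requests do work at any moment; the rest wait in the queue instead of competing for CPU.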
Edited Sep 05, 2019 by Alejandro Rodríguez