GitLab.com outage 2017-10-23

Timeline of events (all times UTC):

  • 13:16 load average on api-08 goes up to 19, which seems to drive the load on nfs-file-12 way up
  • 13:17 load average on nfs-file-12 goes up to 400+
  • 13:18 to 13:20 number of requests per second goes down from ~500 to ~70
  • 13:20 load average on the frontend fleet goes up to 30 on average
  • 13:25 to 13:27 number of requests per second goes down again to ~60
  • 13:25 web-05 and web-10 went MIA
  • 13:26 load average on nfs-file-10 goes up to 300
  • 14:05 web-10 came up after issuing reboot from the Azure panel
  • 14:16 web-02 came up after issuing reboot from the Azure panel
  • 14:24 web-09 came up after issuing reboot from the Azure panel
Edited Oct 23, 2017 by Victor Lopez