Increased CPU load on web nodes
Please note: if the incident relates to sensitive data or is security related, consider labeling this issue with security and marking it confidential.
- Main Slack thread in #backend: https://gitlab.slack.com/archives/C8HG8D9MY/p1558080883153200
- Slack thread in #g_verify: https://gitlab.slack.com/archives/C0SFP840G/p1558084380233200
Summary
A brief summary of what happened. Try to make it as executive-friendly as possible.
Service(s) affected :
Team attribution :
Minutes downtime or degradation :
- Post deployment patch - https://ops.gitlab.net/gitlab-com/gl-infra/patcher/merge_requests/88
 
Timeline
2019-05-16
- 20:27 UTC deployer: Marin Jankovski is starting a deploy pipeline of 11.11.0-rc2.ee.0 on gprd
 
2019-05-17
- 00:03 UTC patcher: Alex Hanselka is starting a deploy pipeline of post-deployment-patch on gstg
 - 00:12 UTC spike in CPU utilization on all web nodes in gprd
 - 00:19 UTC patcher: Alex Hanselka finished a deploy of post-deployment-patch on gstg
 - 00:19 UTC patcher: Alex Hanselka is starting a deploy pipeline of post-deployment-patch on cny
 - 00:23 UTC patcher: Alex Hanselka finished a deploy of post-deployment-patch on cny
 - 00:25 UTC patcher: Alex Hanselka is starting a deploy pipeline of post-deployment-patch on gprd
 - 00:51 UTC deployer: Marin Jankovski finished a deploy of 11.11.0-rc2.ee.0 on gprd
 - 01:27 UTC patcher: Alex Hanselka finished a deploy of post-deployment-patch on gprd
 - 07:37 UTC HighCPU alerts on web nodes
 - 07:50 UTC GitLabComLatencyWebCritical alerts
 - 08:20 UTC status.io incident opened
 - 08:53 UTC blocking all paths that end with deploy_keys.json in HAProxy, to no effect
 - 10:34 UTC deployer: John Jarvis is starting a deploy pipeline of 11.11.0-rc1.ee.0 on gprd (Rollback)
 - 14:35 UTC ha-ctl process killed manually to make the rollback deployment pipeline move again
 - 14:50 UTC GitLabComLatencyWebCritical resolved
 - 15:26 UTC status.io incident resolved
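
For readers unfamiliar with the 08:53 UTC mitigation, the matching rule it describes (block any request whose path ends with deploy_keys.json) can be sketched as below. This is only an illustration of the suffix match, not the HAProxy configuration that was actually applied during the incident; the function name `is_blocked` and the example paths are hypothetical, and the sketch assumes the query string is not part of the match.

```python
from urllib.parse import urlsplit

# Suffix taken from the 08:53 UTC timeline entry.
BLOCKED_SUFFIX = "deploy_keys.json"


def is_blocked(url: str) -> bool:
    """Return True when the URL's path ends with the blocked suffix.

    Assumption: only the path is inspected, so a query string such as
    "?page=2" does not affect the result.
    """
    path = urlsplit(url).path
    return path.endswith(BLOCKED_SUFFIX)


if __name__ == "__main__":
    # Hypothetical request paths, used only to exercise the rule.
    for candidate in (
        "/group/project/deploy_keys.json",
        "/group/project/deploy_keys.json?page=2",
        "/group/project/deploy_keys",
    ):
        print(candidate, is_blocked(candidate))
```

As the timeline notes, blocking these paths had no effect on CPU load, so the sketch documents what was tried rather than a fix.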
 
