Load on file-cny-01 is high
Summary
We are seeing gRPC timeouts quite frequently.
We are seeing lots of git upload-pack processes for gitlab-org/gitlab.
More information will be added as we investigate the issue.
Timeline
All times UTC.
2020-04-22
- 11:20 - @stanhu opens incident issue, notification in #incident-management via Production-watch APP
- 11:40 - @mwasilewski-gitlab determines that only the gitlab-org/gitlab project repository is being impacted.
- 14:32 - @AnthonySandoval joins call as IMOC
  - After receiving a brief update and confirming the scope of the issue is internal to the gitlab project on canary, @AnthonySandoval decides to keep the incident at S3 and to defer status page communications.
- 14:40 - The canary load is being monitored for specific gRPC calls to identify git actions that are increasing load. A fork of the gitlab project is identified as having similarly, unusually high activity.
- 14:53 - @nnelson observes that load on the canary node continues to decline.
- 15:00 - @igorwwwwwwwwwwwwwwwwwwww browses through remote_ip in Kibana, but we've determined there is no single IP we can attribute the requests to.
- 15:08 - @igorwwwwwwwwwwwwwwwwwwww confirms that the Workhorse GitUploadPack endpoint invokes the PostUploadPack RPC call on Gitaly. Identifying the upstream source of the load.
- 15:24 - @AnthonySandoval declares the incident resolved without an identified root cause. SREs will continue to investigate the logs for additional insights.
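The remote_ip check at 15:00 can be approximated offline with a small script. The sketch below is illustrative only: it assumes the logs have been exported as one JSON object per line with a `remote_ip` field, and the function name `top_sources` and the sample addresses are hypothetical, not part of any GitLab tooling. It counts requests per source and reports each source's share, so it is easy to see whether a single IP dominates.

```python
import json
from collections import Counter


def top_sources(log_lines, field="remote_ip", top_n=3):
    """Return (value, count, share) for the most frequent values of `field`.

    Assumes each log line is a JSON object; lines that are not are skipped.
    """
    counts = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not valid JSON, ignore
        if not isinstance(record, dict):
            continue  # valid JSON but not an object
        value = record.get(field)
        if value is not None:
            counts[value] += 1
    total = sum(counts.values())
    # most_common() is empty when nothing was counted, so no division by zero
    return [(value, n, n / total) for value, n in counts.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the incident logs.
    sample = [
        '{"remote_ip": "10.0.0.1", "grpc.method": "PostUploadPack"}',
        '{"remote_ip": "10.0.0.2", "grpc.method": "PostUploadPack"}',
        '{"remote_ip": "10.0.0.1", "grpc.method": "PostUploadPack"}',
    ]
    for ip, count, share in top_sources(sample):
        print(f"{ip}\t{count}\t{share:.0%}")
```

If the largest share is small and the counts are spread evenly, that matches the finding above that no single IP accounts for the requests.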
Resources
- If the Situation Zoom room was utilised, recording will be automatically uploaded to Incident room Google Drive folder (private)