2020-03-11: Gitaly error rate is high on file-45
Summary
More information will be added as we investigate the issue.
Timeline
All times UTC.
2020-03-11
- 17:37 - Gitaly error rate abruptly rises. On one Gitaly node (file-45), CPU and memory usage rise rapidly. PagerDuty alert: https://gitlab.pagerduty.com/incidents/PRHLS53
- 17:55 - @cindy and @nnelson identify the specific project receiving the excess traffic.
- 18:04 - The extra workload ends abruptly. Resource usage returns to normal on file-45.
Resources
@cindy's Kibana graph showing the Gitaly gRPC calls correlated with the workload spike:
https://log.gprd.gitlab.net/goto/283f590166882b80982dc661aadcb560
(Not including a screenshot to protect the identity of the project.)