Determine the cause of staging failures related to the urgent-cpu-bound shard
Two people raised concerns after we moved the urgent-cpu-bound queue from VMs into Kubernetes. After restoring the VM in staging, preprod continues to show the same errors. At the time, I was unable to determine the root cause. Let's rope in some people on this issue to see what we need to investigate. This should be considered a blocker for moving this shard to Kubernetes in production.
Threads that started the discussion:
- https://gitlab.slack.com/archives/C3JJET4Q6/p1593701658496800
- https://gitlab.slack.com/archives/CB3LSMEJV/p1593734524127900
Bits of investigative detail were captured in the latter thread.
Potential Problems:
- Unable to create a project from a template: #984 (comment 374728626)
- Issue related to external-diff storage: #984 (comment 374315984)
- Source MRs reported as nonexistent: gitlab-org/gitlab#227585 (closed)
Edited by Stan Hu