QA Pipeline Exceed Timeouts Causing Pipeline Failures across Omnibus
During triage, troubleshooting pipeline build failures.
Tracked failure to QA jobs exceeding runtime of 2 hours, our established timeout.
- Original Merge Request that is spawning the failing job: !5041 (merged)
- Top level job spawning the failure https://gitlab.com/gitlab-org/omnibus-gitlab/-/jobs/1143810872
- QA Pipeline that is causing the timeout: https://gitlab.com/gitlab-org/gitlab-qa-mirror/-/pipelines/279645427
In particular, there is a Gitaly 500 error that looks like the most likely causal factor as seen in job https://gitlab.com/gitlab-org/gitlab-qa-mirror/-/jobs/1144106685 and summarized below.
Failures:
1) Create Gitaly Cluster replication queue allows replication of different repository after interruption
Failure/Error: repository.push_all_branches
QA::Support::Run::CommandError:
The command HOME="/tmp/qa-netrc-credentials/20" git push --all 2>&1 failed (1) with the following output:
error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500 Internal Server Error
fatal: the remote end hung up unexpectedly
fatal: the remote end hung up unexpectedly
Everything up-to-date
These failures are cascading across dep builds, and potentially all triggered package builds.