`Trigger:gitlab-docker` sometimes pushes a corrupt image. Fail or retry job instead?
We sometimes see all gitlab-qa test jobs fail because the omnibus-gitlab Docker image is corrupted somehow. E.g.: https://gitlab.com/gitlab-org/gitlab-qa/pipelines/119068388
In the failed jobs we see errors like:
Unable to find image 'registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:d710d138f3c3a9a2b5f744e304f115023a75a6af' locally
docker: Error response from daemon: manifest for registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:d710d138f3c3a9a2b5f744e304f115023a75a6af not found: manifest unknown: manifest unknown.
Sometimes it's not clear what went wrong, but in this case the push reported an error and the job still succeeded, e.g. https://gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/-/jobs/442031297
{"status":"Pushing","progressDetail":{"current":1944770560,"total":1885250170},"progress":"[==================================================\u003e] 1.945GB","id":"5d6a7e1e00b1"}
{"status":"Pushed","progressDetail":{},"id":"5d6a7e1e00b1"}
{"errorDetail":{"message":"unknown blob"},"error":"unknown blob"}
Pushed registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:d710d138f3c3a9a2b5f744e304f115023a75a6af
Job succeeded
I expect the error is a problem with the registry, but would it be possible to fail the job if there's an error like that? That way we wouldn't run all the downstream jobs that would fail anyway.
Or retry the job?
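
To illustrate the kind of check I mean, here is a minimal sketch in Python. It is only an illustration: I'm not assuming anything about how the job actually performs the push. `check_push_stream`, `push_with_retry`, and the `push` callable are hypothetical names, and the only assumption is that the push output is available as a sequence of lines like the ones in the log above.

```python
import json
import time


class PushError(RuntimeError):
    """Raised when the push progress stream reports an error."""


def check_push_stream(lines):
    """Raise PushError on the first error entry found in the push output."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            # Plain log lines such as "Pushed registry..." are not JSON; skip them.
            continue
        if isinstance(event, dict) and ("error" in event or "errorDetail" in event):
            detail = event.get("errorDetail") or {}
            message = event.get("error") or detail.get("message") or "unknown push error"
            raise PushError(message)


def push_with_retry(push, attempts=3, delay=5):
    """Run push() until its output is error-free or the attempts run out.

    `push` is a hypothetical callable that performs the push and returns
    the output lines. Returns the attempt number that succeeded; re-raises
    PushError after the final failed attempt so the job fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            check_push_stream(push())
            return attempt
        except PushError:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

With a check like this, the `{"errorDetail":{"message":"unknown blob"},"error":"unknown blob"}` line would raise instead of being followed by "Job succeeded", so the downstream gitlab-qa jobs would not start against a missing image. Retrying inside the job and failing after the last attempt would cover both options.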