release-tools - publish job succeeds even though some part of it failed
During the last patch release, I noticed that the security_release_publish job succeeded even though an action in the rake task failed. Examples:
-
https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/14761861#L100
2024-07-24 13:32:46.610450 F ReleaseTools::RemoteRepository -- Failed to push -- {:remote=>:canonical, :ref=>"v17.2.1-ee", :output=>"send-pack: unexpected disconnect while reading sideband packet\nfatal: the remote end hung up unexpectedly\nConnection to gitlab.com closed by remote host.\r\n"}
-
https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/14761862#L84
2024-07-24 13:37:05.759228 F ReleaseTools::RemoteRepository -- Failed to push -- {:remote=>:security, :ref=>"17-1-stable", :output=>"remote: error: cannot lock ref 'refs/heads/17-1-stable': is at f3bc738690c23756db4b21438a95a65898eb22a4 but expected f9bdf6edbb62fbcfed6019b70a3edcac2432d75d \nTo gitlab.com:gitlab-org/security/cluster-integration/gitlab-agent.git\n ! [remote rejected] 17-1-stable -> 17-1-stable (failed to update ref)\nerror: failed to push some refs to 'gitlab.com:gitlab-org/security/cluster-integration/gitlab-agent.git'\n"}
Because the jobs were reported as successful, these failures went unnoticed, which puts the reliability of the release pipeline at risk.
We should also check whether the monthly release pipeline behaves the same way, so that both can be fixed.
Exit Criteria
-
When a task in a publish job fails, the job should fail
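A minimal sketch of the behavior this exit criterion asks for, not the actual release-tools code. It assumes each publish action can be wrapped in a callable that raises on failure; PublishFailed and run_publish_steps are hypothetical names introduced only for illustration. Every step still runs, but any failure is collected and re-raised at the end instead of only being logged:

```ruby
# Hypothetical sketch: none of these names exist in release-tools.
class PublishFailed < StandardError; end

# Runs every step, records the ones that raise, and raises once at the end
# so that the caller sees a failure instead of a silent success.
# `steps` is assumed to be a Hash of step name => callable.
def run_publish_steps(steps)
  failures = []

  steps.each do |name, step|
    begin
      step.call
    rescue StandardError => e
      # Keep going so the remaining steps still run, but remember the failure.
      failures << "#{name}: #{e.message}"
    end
  end

  return if failures.empty?

  raise PublishFailed, "#{failures.size} publish step(s) failed: #{failures.join('; ')}"
end
```

If something like this were called from a rake task, the unhandled PublishFailed would abort rake with a non-zero exit status, which in turn fails the CI job.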
Edited by Dat Tang