Add the ability to retry failed jobs in a downstream pipeline for `trigger:` jobs in CI
Problem to solve
When there is a `trigger:` job with `strategy: depend` and the downstream job fails, it is currently not possible to continue the upstream pipeline with a retry.
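
For reference, a minimal sketch of the kind of bridge job being described. The job name, stage, project path, and branch are placeholders chosen for illustration, not taken from our actual configuration:

```yaml
# Hypothetical upstream (bridge) job; names and paths are placeholders.
trigger-downstream-qa:
  stage: test
  trigger:
    project: full/path/to/project   # downstream project (placeholder)
    branch: master                  # downstream ref (placeholder)
    strategy: depend                # upstream job mirrors the downstream pipeline status
```

With `strategy: depend`, the bridge job takes on the status of the downstream pipeline, so a downstream failure marks this job as failed.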
This is problematic when the downstream job is flaky or when occasional intermittent errors are expected in integration tests. Internally, we run into this limitation when we trigger gitlab-QA as a downstream job.
In this specific example, `strategy: depend` is set on the job. The downstream job failed once, and then we retried it and it passed. Although all the downstream jobs for sanity are green, the upstream job remains failed.
The feature we would like is to add a retry to the `gstg-qa-sanity` job, so we can re-trigger the downstream pipeline, where hopefully a subsequent run will pass.
As a workaround for now, instead of using the `trigger:` keyword for multi-project pipelines, we are using `curl` with the pipeline trigger API.
This example triggers a downstream pipeline using the `$CI_JOB_TOKEN` and waits up to 5 minutes for it to succeed. Note that the status check requires a Gold plan.
    script:
      - apk add curl jq
      # URL-encoded form of /full/path/to/project
      - project="full%2fpath%2fto%2fproject"
      - |
        # Trigger the downstream pipeline and capture its id and URL
        resp=$(curl --fail -s --request POST --form "token=$CI_JOB_TOKEN" --form ref=master "https://gitlab.com/api/v4/projects/$project/trigger/pipeline")
        id=$(echo "$resp" | jq -r '.id')
        web_url=$(echo "$resp" | jq -r '.web_url')
        echo "Waiting for pipeline $web_url ..."
        # the status check below does not work on a free plan, you must have a gold account
        # poll every 15 seconds, 20 times (5 minutes in total)
        for retry in $(seq 1 20); do
          status=$(curl -s --header "JOB-TOKEN: $CI_JOB_TOKEN" "https://gitlab.com/api/v4/projects/$project/pipelines/$id" | jq -r '.status')
          echo "Got pipeline status $status, retry $retry/20"
          [[ $status == "success" || $status == "failed" ]] && break
          sleep 15
        done
        if [[ $status != "success" ]]; then
          echo "$web_url has status $status, failing"
          exit 1
        fi
This behaves the same as using the `trigger:` keyword, with the benefit that the job can be retried after a downstream failure.
For cases where we need to depend on the success of the downstream pipeline, it requires additional polling (which can also use the `$CI_JOB_TOKEN`).
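
Because the workaround runs as an ordinary job rather than a bridge job, it can be retried manually or combined with the `retry:` keyword. A rough sketch follows, assuming the trigger-and-poll logic has been moved into a script; the job name, image, and script path are hypothetical and only illustrate the idea:

```yaml
# Hypothetical job wrapping the curl workaround; names are placeholders.
trigger-downstream-with-retry:
  image: alpine:latest            # assumption: an image where `apk` is available
  retry: 2                        # re-run the whole job (and so re-trigger downstream) up to 2 more times
  script:
    - apk add curl jq
    - ./trigger-and-wait.sh       # hypothetical script containing the trigger and polling logic
```

Note that each retry here starts a new downstream pipeline rather than retrying only the failed jobs in the existing one, which is why a retry option on the bridge job itself would still be preferable.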
User experience goal
- We will add a retry button for bridge jobs that retries the failed jobs in the downstream pipeline
- The retry button will appear on a bridge job only if there is at least one failed job in the downstream pipeline