New keyword to require that optional jobs succeed when run
Summary
Some pipelines include jobs that are not always required (security checks early in the dev process, pre-deploy jobs, etc.) but that must succeed whenever they do run. Today there is no combination of keywords that lets a user declare a job as optional while requiring it to succeed when it runs. This can lead to stale artifacts being used downstream.
Details available here: https://gitlab.zendesk.com/agent/tickets/325961 (internal)
Use Case
Consider a pipeline whose structure is described in this pseudocode:
    pipe: main
      stage: ci
        job: trigger: ci
          child-pipe: ci
            stage: ci-1
              job: create docker svc1
                artifact: svc1.image-info.json
              job: create docker svc2
                artifact: svc2.image-info.json
            stage: summary
              job: summary
                needs:
                  - svc1 optionally
                  - svc2 optionally
                cache:
                  key: commit-sha + '_ci'
                  paths: [ *.image-info.json ]
      stage: cd
        job: trigger alpha-cd
          child-pipe: alpha-cd
            stage: cd
              job: cd-svc1
                cache:
                  policy: pull only
                  key: commit-sha + '_ci'
                logic: if svc1.image-info.json is found - deploy it
              job: cd-svc2
                cache:
                  policy: pull only
                  key: commit-sha + '_ci'
                logic: if svc2.image-info.json is found - deploy it
Now, consider that the CI jobs became flaky for some reason. Let's assume that the pipeline produced svc1 successfully, but not svc2.
As a result, the developer opens the CI pipeline and replays the flaky jobs until all Docker images are produced.
However, by now ci-summary has already run and includes the image info of svc1 only, even though the build page shows a green "V" for all jobs. In other words, IT PRESENTS A FALSE INDICATION, WHICH IS AN INTEGRITY ISSUE.
Now, when the developer plays alpha-cd, only svc1 gets deployed.
If they go back to the CI pipeline and replay ci-summary, this time ci-summary is able to pull all image infos, and the CDs fired after it will deploy both services.
What is the current behavior?
optional: true is not communicated well, and it is not enough: if a job depends optionally on another job that failed, we have a problem.
After the failed job is replayed, the build page shows a green "V" for all jobs, i.e. IT PRESENTS A FALSE INDICATION, WHICH IS AN INTEGRITY ISSUE.
We need a way to distinguish between two cases:
a) The needed job may or may not run, and its result does not matter.
b) The needed job may or may not run, but if it does run, it MUST succeed.
What we have now is needs[].optional: true, which, as I now understand it, implements only case a): it does not matter whether the needed job ends with success, which covers both jobs that were skipped and jobs that ran and failed.
That is an issue that should be addressed.
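For reference, the optional dependency discussed above might be declared along these lines in .gitlab-ci.yml today. This is a minimal sketch, not the actual project configuration: the job names (build-svc1, build-svc2), the script line, and the cache layout are assumptions made for illustration.

```yaml
# Sketch of the summary job from the use case.
# Job names and the script are hypothetical placeholders.
summary:
  stage: summary
  needs:
    # Each needed job is declared optional.
    - job: build-svc1
      optional: true
    - job: build-svc2
      optional: true
  script:
    # Hypothetical: collect whatever image info files are available.
    - ls *.image-info.json
  cache:
    key: "${CI_COMMIT_SHA}_ci"
    paths:
      - "*.image-info.json"
```

As described above, nothing in this declaration lets the author say that build-svc2, if it does run, must succeed before the result of summary can be trusted.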
Proposal
Introduce a new keyword that combines needs:optional and allow_failure: false, applied only when the needed job actually runs, so that the pipeline status properly reflects the outcome of that job.
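As a rough illustration, such a declaration could look something like the sketch below. The keyword name require_success_if_run is purely hypothetical and does not exist in GitLab CI; this proposal does not fix a name or syntax. The job names and script are likewise assumptions.

```yaml
# HYPOTHETICAL syntax: require_success_if_run is NOT an existing GitLab CI keyword.
summary:
  stage: summary
  needs:
    - job: build-svc1
      optional: true                # the needed job does not have to run
      require_success_if_run: true  # hypothetical: but if it runs, it must succeed
    - job: build-svc2
      optional: true
      require_success_if_run: true  # hypothetical
  script:
    # Hypothetical placeholder script.
    - ls *.image-info.json
```

With something along these lines, a needed job that ran and failed would presumably be reflected in the status of summary and of the pipeline, instead of everything showing green.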