Decouple allowing failure from skipping manual jobs
This concerns GitLab CI/CD. For a manual job, you have to set allow_failure: true so that the whole pipeline can complete without the manual job being triggered. We use this for a manual job that deploys the commit to a testing environment. We don't want to deploy every commit, but leave that choice to the team, so a manual job makes sense for us. Sometimes deployments fail (for various reasons that don't matter here). Our idea was to add another job, in a later stage, that "cleans up" failed deployments. The cleanup job is set to when: on_failure.
The issue is that, since we want the pipeline to complete without deploying, we have to set the deployment job to allow_failure: true, and since failure is then allowed, a failed deployment does not count as a failure and the cleanup job is skipped.
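To make the setup concrete, here is a minimal sketch of the kind of configuration I mean. The job names, stage names, and script paths are placeholders for illustration, not our real pipeline:

```yaml
stages:
  - deploy
  - cleanup

deploy_testing:
  stage: deploy
  when: manual
  # Needed so the pipeline can finish without anyone triggering this job.
  allow_failure: true
  script:
    - ./deploy.sh testing    # placeholder deploy script

cleanup_testing:
  stage: cleanup
  # Intended to run when deploy_testing fails. Because deploy_testing has
  # allow_failure: true, its failure is not counted and this job is skipped.
  when: on_failure
  script:
    - ./cleanup.sh testing   # placeholder cleanup script
```

With this configuration, triggering deploy_testing and having it fail leaves the pipeline in a "passed with warnings" state, and cleanup_testing never runs.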
I think there's some confusion between a job being "allowed to fail" and a job being "allowed to be skipped". It appears to me that setting allow_failure: true on manual jobs is a hack with unintended side effects such as the one sketched above. It would make more sense to me to have an additional parameter, such as allow_skip: true, which would mean that the pipeline can continue (and succeed) without that job being run.
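As a sketch of what I have in mind: note that allow_skip is NOT an existing GitLab CI keyword, it is only the syntax proposed here, and the job names and scripts are again placeholders:

```yaml
deploy_testing:
  stage: deploy
  when: manual
  # PROPOSED keyword, does not exist in GitLab CI today:
  # the pipeline may continue and succeed if this job is never triggered.
  allow_skip: true
  # A real failure of the deployment should count as a failure again.
  allow_failure: false
  script:
    - ./deploy.sh testing    # placeholder deploy script

cleanup_testing:
  stage: cleanup
  # Would now run whenever a triggered deployment fails.
  when: on_failure
  script:
    - ./cleanup.sh testing   # placeholder cleanup script
```

The point of separating the two settings is that "nobody triggered the job" and "the job ran and failed" become distinct outcomes, so later jobs with when: on_failure can react to the second without being affected by the first.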