Cancel the pipeline immediately if any job fails
Release Notes
You can now configure a pipeline to be cancelled immediately when a job fails. Please see the documentation for additional details. Special thanks to @zillemarco for contributing to the feature.
Problem to solve
We should allow users to configure a pipeline so that when one job fails, all other already-started jobs are cancelled. This reduces the CI/CD minutes consumed by a pipeline that would fail anyway.
Proposal
Acceptance Criteria
When
workflow:
  auto_cancel:
    on_job_failure: all # or: none, default none
is configured then a single job failure will cause all jobs in the pipeline to be cancelled.
When this configuration is not included then the pipeline will behave as it does today (won't be cancelled).
Cascading to child pipelines is required, but specifying a cancellation reason is not.
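As a rough illustration of the acceptance criteria above, the following Ruby sketch reads the on_job_failure value from a CI configuration string and falls back to none when the keyword is absent. The helper on_job_failure_setting and the constant ALLOWED_VALUES are hypothetical names made up for this example and are not part of GitLab's codebase; only the YAML keys come from this issue.

```ruby
require 'yaml'

# Hypothetical helper for illustration only; not GitLab's actual config parser.
ALLOWED_VALUES = %w[all none].freeze

def on_job_failure_setting(ci_yaml)
  config = YAML.safe_load(ci_yaml) || {}
  # A missing keyword means "behave as today", i.e. "none".
  value = config.dig('workflow', 'auto_cancel', 'on_job_failure') || 'none'
  raise ArgumentError, "unsupported on_job_failure value: #{value}" unless ALLOWED_VALUES.include?(value)

  value
end

configured = <<~YAML
  workflow:
    auto_cancel:
      on_job_failure: all
YAML

unconfigured = <<~YAML
  test:
    script: echo hello
YAML

puts on_job_failure_setting(configured)   # prints "all"
puts on_job_failure_setting(unconfigured) # prints "none"
```

Rejecting unknown values up front is a design choice of this sketch; it keeps a typo such as on_job_failure: al from silently behaving like none.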
Engineering
Looking at it again, I might even make the keyword more specific, to be workflow:cancel_on_job_failure.
My thinking is that workflow:cancel_on_job_failure: all would persist somewhere attached to the Pipeline record, and then we'd check in the BuildFinishedWorker:
  # frozen_string_literal: true
  module Ci
    class BuildFinishedWorker # rubocop:disable Scalability/IdempotentWorker
      ...
      def process_build(build)
        # We execute these in sync to reduce IO.
        build.update_coverage
        Ci::BuildReportResultService.new.execute(build)
        build.execute_hooks
        ChatNotificationWorker.perform_async(build.id) if build.pipeline.chat?
        build.track_deployment_usage
        build.track_verify_environment_usage
        build.remove_token!
        if build.failed? && !build.auto_retry_expected?
+         if build.pipeline.cancel_on_job_failure == :all # Something like this
+           ::Ci::CancelPipelineWorker.perform_async(build.pipeline_id, build.pipeline_id) # Add param for passing cascade_to_children: true?
+         end
+
          ::Ci::MergeRequests::AddTodoWhenBuildFailsWorker.perform_async(build.id)
        end
  module Ci
    class Pipeline < Ci::ApplicationRecord
+     def cancel_on_job_failure
+       # Return the config value of the workflow:cancel_on_job_failure keyword.
+       # This can be a simple string ("all") for now, or become something more
+       # complicated later, e.g. { states: ['created', 'pending'] }
+     end
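To make the proposed check easier to follow outside the Rails codebase, here is a self-contained Ruby sketch of the same decision. ToyJob, ToyPipeline and handle_finished_job are hypothetical stand-ins invented for this example and do not mirror GitLab's models or workers. The set of statuses treated as cancellable is also an assumption, and cascading to child pipelines is modeled as plain recursion.

```ruby
# Toy stand-ins for illustration only; these are not GitLab's models.
ToyJob = Struct.new(:name, :status, :auto_retry_expected) do
  def failed?
    status == :failed
  end

  # Assumption: only unfinished jobs can be cancelled.
  def cancellable?
    %i[created pending running].include?(status)
  end
end

ToyPipeline = Struct.new(:jobs, :cancel_on_job_failure, :children) do
  # Cancel every unfinished job, then cascade to child pipelines.
  def cancel_running
    jobs.select(&:cancellable?).each { |job| job.status = :canceled }
    children.each(&:cancel_running)
  end
end

# Mirrors the proposed condition: a failed job that will not be retried
# cancels the pipeline only when the setting is :all.
def handle_finished_job(job, pipeline)
  return unless job.failed? && !job.auto_retry_expected
  return unless pipeline.cancel_on_job_failure == :all

  pipeline.cancel_running
end

child = ToyPipeline.new([ToyJob.new('child-test', :running, false)], :none, [])
failed = ToyJob.new('rspec', :failed, false)
jobs = [failed, ToyJob.new('lint', :running, false), ToyJob.new('build', :success, false)]
pipeline = ToyPipeline.new(jobs, :all, [child])

handle_finished_job(failed, pipeline)
p pipeline.jobs.map(&:status) # => [:failed, :canceled, :success]
p child.jobs.map(&:status)    # => [:canceled]
```

Note that the failed and already successful jobs keep their status; only unfinished jobs are cancelled, in both the parent and the child pipeline.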
In the future, to extend this with more granular configuration, we'd replace the simple pipeline.cancel_running call with something more complex:
cancel_on_job_failure:
  state: [created, pending]
        if build.failed? && !build.auto_retry_expected?
-         build.pipeline.cancel_running if build.pipeline.cancel_on_job_failure == :all
+         cancel_all_immediately_cancellable_jobs if build.pipeline.cancel_on_job_failure.present?
          ::Ci::MergeRequests::AddTodoWhenBuildFailsWorker.perform_async(build.id)
        end
      end
+     def cancel_all_immediately_cancellable_jobs
+       # a bunch of query logic that passes configuration options into Ci::BuildCancelService, etc.
+     end
I'm not going to sketch out the whole future implementation here, but I'm just demonstrating that by inserting a check in the BuildFinishedWorker, we're working in a relatively flexible asynchronous place where we can make calls to queue other asynchronous querying and cancellation functionality.
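For a sense of what the more granular variant might look like, the Ruby sketch below gives one hypothetical reading of state: [created, pending]. The cancellable_jobs helper, the UNFINISHED_STATES constant and the hash-based job records are assumptions made for illustration; they are not a proposed API and skip all of the real query logic.

```ruby
# Hypothetical sketch: pick the jobs to cancel from the configured value.
# `setting` is either the string "all" or a hash such as
# { 'state' => %w[created pending] }; anything else cancels nothing.
UNFINISHED_STATES = %w[created pending running].freeze

def cancellable_jobs(jobs, setting)
  states =
    case setting
    when 'all' then UNFINISHED_STATES
    when Hash then Array(setting['state']) & UNFINISHED_STATES
    else []
    end

  jobs.select { |job| states.include?(job[:status]) }
end

jobs = [
  { name: 'build', status: 'success' },
  { name: 'rspec', status: 'running' },
  { name: 'deploy', status: 'created' },
  { name: 'docs', status: 'pending' }
]

p cancellable_jobs(jobs, 'all').map { |j| j[:name] }
# => ["rspec", "deploy", "docs"]
p cancellable_jobs(jobs, { 'state' => %w[created pending] }).map { |j| j[:name] }
# => ["deploy", "docs"]
p cancellable_jobs(jobs, nil).map { |j| j[:name] }
# => []
```

Intersecting the configured states with UNFINISHED_STATES means a finished job is never selected, even if a configuration lists a state such as success.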
What is NOT included in the MVC
- A new failure type will not be reported, pipelines will report as canceled, jobs will report as canceled.
- No extra error reporting in the job log will be inserted.
- Configurable through new Workflow syntax (struck out, see the note after the example)
workflow:
  auto_cancel:
    on_job_failure: all
I scratched that last bullet point about workflow syntax, because a bunch of us got together and agreed that introducing a very specific syntax directly under the workflow keyword is a relatively light-touch way to do exactly what's asked in this issue, while leaving us room to iterate and make improvements, more fine-tuned configuration, etc.
So for this (useful!) MVC, we will only configure this at the workflow level, and have the only allowed value all apply to all jobs in the Pipeline.
Considerations
- It should be optional and off by default.
- Possibly a GitLab CI yaml level configuration.
- Deployment jobs should not be canceled for MVC.
How does this work with interruptible?
Importantly, it does not. The interruptible configuration, while being named very generically, is a very specific functionality where a Pipeline may be cancelled by a newer, different pipeline running on the same ref. That is the only application of it.
This change, specifically, is to enable Pipelines to cancel themselves after a single job failure.
If these two configurations are to intersect in the future, we'll have to decide how to do that. The naming of interruptible makes it somewhat difficult because there are so many different kinds of interruption that customers want. This is a future concern, and will not be addressed here.
What does success look like, and how can we measure that?
- For GitLab pipelines, we should see a decrease of at least 10% in the duration of failed pipelines
- We'd expect to see XX pipelines on GitLab.com with this configuration added 30 days after GA
For the internal customer
- Create an ability to define a pipeline as fail-fast
- When a job fails, it should immediately cancel all running jobs in the pipeline and set the pipeline status to failed
For all users
- All other jobs of a pipeline are cancelled if one job fails.
Links / references
- TravisCI is considering a similar feature - https://github.com/travis-ci/travis-ci/issues/2062
- Jenkins has a feature called failFast - https://jenkins.io/doc/book/pipeline/syntax/#parallel
This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.