Provides users the option to force-cancel a canceling job from the UI
Release Notes
A user with the Maintainer role will be able to cancel a job that has been stuck in the 'canceling' state from the Job details page.
Status update (2024-10-17)
- Specific to the related bug with the Runner Kubernetes executor: a fix was merged in v17.3 and backported as patches to GitLab Runner v16.11 through v17.2.
- The following patches have been released as of 2024-07-27:
* GitLab Runner v17.2.1 / GitLab Runner Helm Chart v0.67.1
* GitLab Runner v17.1.1 / GitLab Runner Helm Chart v0.66.1
* GitLab Runner v17.0.2 / GitLab Runner Helm Chart v0.65.2
* GitLab Runner v16.11.3 / GitLab Runner Helm Chart v0.64.3
A separate issue (#483290 (closed)) also causes jobs to be stuck in canceling, but only in specific circumstances. See #483290 (comment 2097069681) to determine the appropriate fix.
Overview
This issue is being opened as per the documentation.
Description: A GitLab Premium customer reports that jobs are stuck in "Waiting for resource," but the UI shows a status of canceling rather than running or pending.
- Project: /sparksuite-family/hoa-express/main-stack/
- Job IDs: 7027551018, 7027751534
- Job status: canceling
- How often the problem occurs: Problem began occurring last week and has been seen sporadically since then.
- Steps to reproduce the problem: They have not been able to reproduce this issue consistently. It has been seen multiple times in the last week.
The job has been re-run since the initial failure, but you can see a recording of the issue here: https://images.sparksuite.com/v/4QCsZEKKoJOs7jcmEyku
Zendesk ticket (internal link only)
Troubleshooting notes
User/Customer | GitLab Hosted or Self-Managed Runner | Runner Executor |
---|---|---|
Wes Cossick | Self-Managed Runner | Docker Machine |
Niklas van Schrick | Self-Managed Runner | Kubernetes |
SFDC | Self-Managed Runner | Kubernetes |
Internal link | Kubernetes | |
Jon Benson | Self-Managed Runner | |
Implementation Guide
Allow users to force-cancel a canceling job if it is stuck in canceling.
A job can end up stuck in canceling due to infrastructure issues (for example, a runner running out of memory) or because a runner was forcibly killed.
From a backend perspective we can add to CommitStatus:
event :cancel do
  transition canceling: :canceled
  transition running: :canceling, if: :supports_canceling?
  transition CANCELABLE_STATUSES.map(&:to_sym) + [:manual] => :canceled
end
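The effect of the proposed transition order can be sketched in plain Ruby, without the state machine library. This is only an illustration under stated assumptions: `next_status_on_cancel` is a hypothetical helper, the status list is a simplified stand-in rather than GitLab's actual `CANCELABLE_STATUSES`, and rules are assumed to be evaluated top to bottom with the first match winning.

```ruby
# Simplified stand-in for the real constant; the exact contents are an assumption.
CANCELABLE_STATUSES = %w[created waiting_for_resource preparing pending running scheduled].freeze

# Hypothetical helper mirroring the order of the proposed `cancel` transitions.
def next_status_on_cancel(status, supports_canceling:)
  # New rule: a second cancel on a job stuck in canceling forces it to canceled.
  return :canceled if status == :canceling
  # Existing rule: a running job first moves to canceling when the runner supports it.
  return :canceling if status == :running && supports_canceling
  # Existing rule: any other cancelable (or manual) job is canceled directly.
  return :canceled if (CANCELABLE_STATUSES.map(&:to_sym) + [:manual]).include?(status)

  status # no rule applies, for example for a job that already finished
end

first  = next_status_on_cancel(:running, supports_canceling: true) # first cancel
second = next_status_on_cancel(first, supports_canceling: true)    # force-cancel
puts "#{first} -> #{second}"
```

In this sketch the first cancel leaves a running job in canceling, and only a second cancel request forces it to canceled, which matches the force-cancel behavior described above.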
- Update the job's cancel button text and tooltip when the job status is canceling, as per the design (collaborate with the group::pipeline execution frontend developers @pburdette and @jivanvl).
- canceling should also be added to CANCELABLE_STATUSES at the job level but not at the pipeline level. Example (added Jan 27, 2025):
module Ci
  module HasStatus
    extend ActiveSupport::Concern

    # ... other status constants remain the same ...

    # Base CANCELABLE_STATUSES that will be inherited by Build
    CANCELABLE_STATUSES = (ALIVE_STATUSES + ['scheduled'] - ['canceling']).freeze
  end
end

module Ci
  class Build < Ci::ApplicationRecord
    include HasStatus

    # Override CANCELABLE_STATUSES to include 'canceling'
    CANCELABLE_STATUSES = (HasStatus::CANCELABLE_STATUSES + ['canceling']).freeze

    # ... rest of Ci::Build class remains the same ...
  end
end
- Add unit, integration, and feature specs.
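One thing worth checking when overriding a constant at the job level is how existing code looks it up. The sketch below uses plain Ruby stand-ins (no Rails) to show the difference. The class names, status lists, and the `cancelable_lexical?` and `cancelable?` methods are hypothetical; how GitLab's own code reads the constant is not stated here and would need to be verified.

```ruby
# Simplified stand-ins for illustration only; status lists are assumptions.
module HasStatus
  ALIVE_STATUSES = %w[created waiting_for_resource preparing pending running canceling].freeze
  CANCELABLE_STATUSES = (ALIVE_STATUSES + ['scheduled'] - ['canceling']).freeze

  # Lexical lookup: always resolves to HasStatus::CANCELABLE_STATUSES,
  # even when the including class defines its own constant.
  def cancelable_lexical?
    CANCELABLE_STATUSES.include?(status)
  end

  # Dynamic lookup: honors a constant redefined by the including class.
  def cancelable?
    self.class::CANCELABLE_STATUSES.include?(status)
  end
end

# Job level: overrides the constant to include 'canceling'.
class Build
  include HasStatus
  CANCELABLE_STATUSES = (HasStatus::CANCELABLE_STATUSES + ['canceling']).freeze

  attr_reader :status

  def initialize(status)
    @status = status
  end
end

# Pipeline level: keeps the base constant.
class Pipeline
  include HasStatus

  attr_reader :status

  def initialize(status)
    @status = status
  end
end

puts Build.new('canceling').cancelable?         # job-level override applies
puts Build.new('canceling').cancelable_lexical? # module constant is used instead
puts Pipeline.new('canceling').cancelable?      # pipeline level is unaffected
```

If any shared method or scope refers to the constant lexically inside the module, the job-level override would be silently ignored, so the specs could usefully cover both the job and the pipeline case.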
Design Proposal
- Add a force-cancel button when a job is in canceling status.
- Add a tooltip to the button with text: "Force cancel a job stuck in Canceling state"