
Provide users the option to force-cancel a canceling job from the UI

Release Notes

A user with the Maintainer role can cancel a job that is stuck in the 'canceling' state from the Job details page.

Status update (2024-10-17)

  • Specific to the related bug with the Runner Kubernetes executor: a fix was merged in v17.3 and was also used to patch GitLab Runner v16.11 through v17.2.
  • The following patches have been released as of 2024-07-27:
      • GitLab Runner v17.2.1 / GitLab Runner Helm Chart v0.67.1
      • GitLab Runner v17.1.1 / GitLab Runner Helm Chart v0.66.1
      • GitLab Runner v17.0.2 / GitLab Runner Helm Chart v0.65.2
      • GitLab Runner v16.11.3 / GitLab Runner Helm Chart v0.64.3
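
As a rough aid for checking a runner against the releases listed above, here is a minimal Ruby sketch. The helper name `runner_fix_status` and its return symbols are hypothetical and not part of GitLab or GitLab Runner. The version numbers come from the list above, and versions older than v16.11 are reported as not covered because this status update does not mention them.

```ruby
require "rubygems" # provides Gem::Version (normally loaded by default)

# Minimum patched release per minor line, copied from the list above.
PATCHED_RELEASES = {
  "16.11" => Gem::Version.new("16.11.3"),
  "17.0"  => Gem::Version.new("17.0.2"),
  "17.1"  => Gem::Version.new("17.1.1"),
  "17.2"  => Gem::Version.new("17.2.1")
}.freeze

# The fix itself was merged in v17.3.
FIX_MERGED_IN = Gem::Version.new("17.3.0")

# Hypothetical helper: classifies a runner version string such as "v17.2.0".
def runner_fix_status(version_string)
  version = Gem::Version.new(version_string.delete_prefix("v"))
  return :fixed if version >= FIX_MERGED_IN

  minor_line = version.segments.first(2).join(".")
  minimum = PATCHED_RELEASES[minor_line]
  return :not_covered if minimum.nil? # no patch listed for this minor line

  version >= minimum ? :fixed : :needs_patch
end

runner_fix_status("v17.2.0")  # => :needs_patch
runner_fix_status("v16.11.3") # => :fixed
runner_fix_status("v16.10.0") # => :not_covered
```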

A separate issue (#483290 (closed)) also causes jobs to get stuck in canceling, but only in specific circumstances. See #483290 (comment 2097069681) to determine the appropriate fix.

Overview

This issue is being opened as per the documentation.

Description: A GitLab Premium customer reports that jobs are stuck in "Waiting for resource," but the UI shows their status as canceling rather than running or pending.

  • Project: /sparksuite-family/hoa-express/main-stack/

  • Job IDs: 7027551018, 7027751534

  • Job status: canceling

  • How often the problem occurs: Problem began occurring last week and has been seen sporadically since then.

  • Steps to reproduce the problem:

    They have not been able to reproduce this issue consistently. It has been seen multiple times in the last week.

The job has been re-run since the initial failure, but you can see a recording of the issue here: https://images.sparksuite.com/v/4QCsZEKKoJOs7jcmEyku

Zendesk ticket (internal link only)

Troubleshooting notes

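| User/Customer      | GitLab Hosted or Self-Managed Runner | Runner Executor |
|--------------------|--------------------------------------|-----------------|
| Wes Cossick        | Self-Managed Runner                  | Docker Machine  |
| Niklas van Schrick | Self-Managed Runner                  | Kubernetes      |
| SFDC               | Self-Managed Runner                  | Kubernetes      |
| Internal link      |                                      | Kubernetes      |
| Jon Benson         | Self-Managed Runner                  |                 |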

Implementation Guide

Allow users to force-cancel a job that is stuck in canceling.

A job could end up stuck in canceling due to infrastructure issues (for example, a runner ran out of memory) or because a runner was forcibly killed.

From a backend perspective we can add to CommitStatus:

    event :cancel do
      transition canceling: :canceled
      transition running: :canceling, if: :supports_canceling?
      transition CANCELABLE_STATUSES.map(&:to_sym) + [:manual] => :canceled
    end
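
To make the effect of the rule order easier to see, the following is a plain-Ruby sketch of the same decision logic, assuming the state machine applies the first transition that matches. The method name `next_status_on_cancel` and the constant `SAMPLE_CANCELABLE_STATUSES` are hypothetical, and the status list is an illustrative subset rather than the real `CANCELABLE_STATUSES`.

```ruby
# Illustrative subset only (assumption); the real list lives in the GitLab codebase.
SAMPLE_CANCELABLE_STATUSES = %w[created pending running scheduled].freeze

# Hypothetical model of the `cancel` event: rules are tried in order, first match wins.
def next_status_on_cancel(status, supports_canceling:)
  # New rule: cancelling a job that is already canceling forces it to canceled.
  return :canceled if status == :canceling

  # Graceful cancel: a running job moves to canceling when the runner supports it.
  return :canceling if status == :running && supports_canceling

  # Existing rule: every other cancelable status goes straight to canceled.
  cancelable = SAMPLE_CANCELABLE_STATUSES.map(&:to_sym) + [:manual]
  return :canceled if cancelable.include?(status)

  status # no transition applies, for example for a finished job
end

next_status_on_cancel(:running, supports_canceling: true)   # => :canceling
next_status_on_cancel(:canceling, supports_canceling: true) # => :canceled
next_status_on_cancel(:running, supports_canceling: false)  # => :canceled
```

In this model a second cancel request is what moves a stuck job out of canceling, which is the behavior the force-cancel button relies on.
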
  • Update the job's cancel button text and tooltip when the job is canceling, as per the design. (Collaborate with the group::pipeline execution frontend developers @pburdette and @jivanvl.)
  • Canceling should also be added to the CANCELABLE_STATUSES at the job level but not at the pipeline level. Example (added Jan 27, 2025):
    module Ci
      module HasStatus
        extend ActiveSupport::Concern

        # ... other status constants remain the same ...

        # Base CANCELABLE_STATUSES that will be inherited by Build
        CANCELABLE_STATUSES = (ALIVE_STATUSES + ['scheduled'] - ['canceling']).freeze
      end
    end

    module Ci
      class Build < Ci::ApplicationRecord
        include HasStatus

        # Override CANCELABLE_STATUSES to include 'canceling'
        CANCELABLE_STATUSES = (HasStatus::CANCELABLE_STATUSES + ['canceling']).freeze

        # ... rest of Ci::Build class remains the same ...
      end
    end
  • Add unit, integration, and feature specs.
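
As a starting point for the unit-level checks, here is a self-contained sketch of what such a spec could assert about the job-level versus pipeline-level lists. It uses plain Ruby instead of RSpec so it runs without the GitLab codebase. The `Sketch` namespace, the `Pipeline` and `Build` stand-in classes, the `cancelable_status?` and `check` helpers, and the abbreviated `ALIVE_STATUSES` value are all assumptions made for illustration.

```ruby
module Sketch
  module HasStatus
    # Abbreviated, assumed value; the real constant has more entries.
    ALIVE_STATUSES = %w[created pending running canceling].freeze
    CANCELABLE_STATUSES = (ALIVE_STATUSES + ['scheduled'] - ['canceling']).freeze
  end

  # Stand-in for a pipeline-level class that keeps the inherited list.
  class Pipeline
    include HasStatus

    def self.cancelable_status?(status)
      CANCELABLE_STATUSES.include?(status)
    end
  end

  # Stand-in for Ci::Build, which overrides the list to add 'canceling'.
  class Build
    include HasStatus

    CANCELABLE_STATUSES = (HasStatus::CANCELABLE_STATUSES + ['canceling']).freeze

    def self.cancelable_status?(status)
      CANCELABLE_STATUSES.include?(status)
    end
  end
end

# Tiny assertion helper so the sketch does not depend on a test framework.
def check(description, condition)
  raise "FAILED: #{description}" unless condition

  puts "ok: #{description}"
end

check "job-level list includes canceling", Sketch::Build.cancelable_status?('canceling')
check "pipeline-level list excludes canceling", !Sketch::Pipeline.cancelable_status?('canceling')
check "running stays cancelable at both levels",
      Sketch::Build.cancelable_status?('running') && Sketch::Pipeline.cancelable_status?('running')
```

The override works because the constant defined inside `Build` shadows the one that `include HasStatus` makes visible, while `Pipeline` continues to resolve the base list.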

Design Proposal

  1. Add a force-cancel button when a job is in canceling status.
  2. Add a tooltip to the button with text: Force cancel a job stuck in Canceling state

force-cancel-job.png

Edited by Rutvik Shah