Issue #3376
Issue created Jun 08, 2016 by NeroBurner@NeroBurner

Long running jobs canceled in GitLab UI, but runner continues process

Note (revised 2022-04-28)

If you are still experiencing issues similar to the one described here, add a comment with your details to the "CI process does not receive SIGTERM on termination" issue.

Overview

I have a long-running compile job (~40 minutes). I made two pushes, one right after the other, and stopped the first running job from the UI. The UI tells me that the job is canceled, but the second job stays at pending.

I suspect that the runner finishes the job and is not properly terminated. Is there a way to test my hypothesis?

I'm using the shell executor for the runner (GitLab and the runner are both on Ubuntu 16.04).

Edit: as described in a comment below, here are the steps to reproduce the problem:

Create a project with a simple .gitlab-ci.yml file:

build:
  stage: build
  tags:
    - ubuntu_amd64
  script: 
    - ping localhost

Start a pipeline and cancel it. Canceling should also terminate the ping command, but it doesn't.

On the runner, check whether the process is still running:

ps aux | grep ping
gitlab-+ 19828  0.0  0.0   8656  1724 ?        S    07:59   0:00 ping localhost

Or just kill it with killall ping (use sudo if the runner runs as a different user).

Proposal

At the moment we simply kill the process group with SIGKILL and ignore the result. Instead, we should let the process shut down gracefully by first sending SIGTERM and then, after a specific timeout, sending SIGKILL to the process. This will help ensure that processes are terminated properly. We already have this implemented in the custom executor and should try to reuse that code to implement the same feature.

Merge Requests

  1. Extract process killer from custom executor
  2. Extract commander interface from custom executor
  3. Add Process groups to process pkg
  4. Use the same termination commands on Windows
    • On Windows, the shell executor calls taskkill, while the process package just calls process.Kill(). Investigate which one is better, or whether we should use both.
  5. Rename test file
  6. Send SIGTERM then SIGKILL for shell executor

Original merge request

Edited Apr 28, 2022 by Darren Eastman