
Stage play manual jobs may randomly leave some jobs in skipped state

Summary

When manual jobs are used in a `needs`-based DAG flow, running them with the stage-level play button can cause improper state transitions and leave some downstream jobs in a skipped state.

Steps to reproduce

  1. Add the following pipeline definition to any CI/CD-enabled project:
`.gitlab-ci.yml`:

```yaml
image: bash:latest

stages:
  - first-auto
  - second-manual
  - third-auto
  - fourth-manual
  - fifth-manual

first-auto:
  stage: first-auto
  script: echo

second-manual-job-one:
  stage: second-manual
  script: echo
  needs:
    job: first-auto
  when: manual

second-manual-job-two:
  stage: second-manual
  script: echo
  needs:
    job: first-auto
  when: manual

third-auto-job-one:
  stage: third-auto
  script: echo
  needs:
    job: second-manual-job-one

third-auto-job-two:
  stage: third-auto
  script: echo
  needs:
    job: second-manual-job-two

fourth-manual-job-one:
  stage: fourth-manual
  script: echo
  needs:
    - job: second-manual-job-one
    - job: third-auto-job-one
  when: manual

fourth-manual-job-two:
  stage: fourth-manual
  script: echo
  needs:
    - job: second-manual-job-two
    - job: third-auto-job-two
  when: manual

fifth-manual:
  stage: fifth-manual
  script: echo
  needs:
    - job: first-auto
    - job: fourth-manual-job-one
    - job: fourth-manual-job-two
  when: manual
```

  2. Visit the created pipeline.

  3. Wait for the job in the "first-auto" stage to complete.

  4. Use the stage-level play icon (not the per-job one) to run both jobs of the "second-manual" stage.

  5. Observe (this may take a few retries of steps 2-4) that even after these jobs complete, some jobs in the "third-auto" and "fourth-manual" stages remain in a skipped state.

Note: Sometimes the problem appears only after the stage-level play button is used on the "fourth-manual" stage, if the randomness has not affected any jobs in earlier stages.

Also note: This behavior does not reproduce if you use the per-manual-job play buttons instead of the stage-level ones.

Example Project

Example pipeline where one of the two 'fourth-manual' stage jobs remained skipped and would not auto-transition: https://gitlab.com/gitlab-gold/hchouraria/sample-ci/-/pipelines/375727600

Example pipeline where one of the two 'third-auto' stage jobs remained skipped and would not auto-transition: https://gitlab.com/gitlab-gold/hchouraria/sample-ci/-/pipelines/375730159

What is the current bug behavior?

Some jobs in subsequent stages are randomly left in a skipped state after the stage-level play button is used.

What is the expected correct behavior?

The stage-level play button must behave the same as the job-level play buttons and transition the jobs of the next stages correctly every time.
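
To make the expectation concrete, the following is a minimal sketch that encodes the `needs` graph from the pipeline definition in the steps to reproduce and flags jobs that are reported as skipped even though every job they need has succeeded. It is a simplified model for illustration, not GitLab's actual pipeline processing logic. The `stuck_jobs` helper and the `observed` snapshot are hypothetical; the snapshot only resembles the kind of state described in this report.

```python
# Simplified model of the `needs` graph from the pipeline definition in this report.
NEEDS = {
    "first-auto": [],
    "second-manual-job-one": ["first-auto"],
    "second-manual-job-two": ["first-auto"],
    "third-auto-job-one": ["second-manual-job-one"],
    "third-auto-job-two": ["second-manual-job-two"],
    "fourth-manual-job-one": ["second-manual-job-one", "third-auto-job-one"],
    "fourth-manual-job-two": ["second-manual-job-two", "third-auto-job-two"],
    "fifth-manual": ["first-auto", "fourth-manual-job-one", "fourth-manual-job-two"],
}


def stuck_jobs(statuses):
    """Return jobs reported as skipped although every job they need succeeded."""
    return sorted(
        job
        for job, needed in NEEDS.items()
        if statuses.get(job) == "skipped"
        and all(statuses.get(dep) == "success" for dep in needed)
    )


# Hypothetical snapshot taken after both "second-manual" jobs were played
# with the stage-level button and finished successfully.
observed = {
    "first-auto": "success",
    "second-manual-job-one": "success",
    "second-manual-job-two": "success",
    "third-auto-job-one": "success",
    "third-auto-job-two": "skipped",
    "fourth-manual-job-one": "manual",
    "fourth-manual-job-two": "skipped",
    "fifth-manual": "skipped",
}

print(stuck_jobs(observed))  # ['third-auto-job-two']
```

In this model only `third-auto-job-two` is flagged: its single needed job succeeded, so it should have started. The other skipped jobs still wait on jobs that have not succeeded yet, so they are not counted as stuck.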

Relevant logs and/or screenshots

Unexpected skipped states that occur at any stage when the stage buttons are used:

Screen_Shot_2021-09-23_at_11.44.19

Screen_Shot_2021-09-23_at_11.44.35

Screen_Shot_2021-09-23_at_11.44.58

Output of checks

This bug happens on GitLab.com

It was also reported on GitLab 13.12 by a premium customer through a support ticket: https://gitlab.zendesk.com/agent/tickets/236412 (internal link)

Possible fixes

TBD

Workarounds

  • Avoid the stage-level play button and instead start each manual job directly with its per-job play button
  • If jobs appear stuck as in the screenshots above, retry (re-run) the corresponding job in the prior stage
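
The first workaround could also be scripted. The sketch below lists a pipeline's jobs through the GitLab REST API and plays each manual job of one stage through the per-job play endpoint, rather than through the stage-level button. The function names, the `API` base URL, and the use of a personal access token are assumptions for illustration. Whether the per-job API path avoids the problem in the same way as the per-job buttons in the UI has not been verified in this report.

```python
import json
import urllib.request

API = "https://gitlab.com/api/v4"  # assumption: adjust for a self-managed instance


def manual_jobs_in_stage(jobs, stage):
    """Pick the IDs of jobs in `stage` that are still waiting for a manual play."""
    return [
        job["id"]
        for job in jobs
        if job["stage"] == stage and job["status"] == "manual"
    ]


def api_request(method, path, token):
    """Send one authenticated request to the GitLab REST API and decode the JSON reply."""
    request = urllib.request.Request(
        f"{API}{path}", method=method, headers={"PRIVATE-TOKEN": token}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def play_stage_per_job(project_id, pipeline_id, stage, token):
    """Play every manual job of a stage one by one via the per-job endpoint."""
    jobs = api_request(
        "GET",
        f"/projects/{project_id}/pipelines/{pipeline_id}/jobs?per_page=100",
        token,
    )
    for job_id in manual_jobs_in_stage(jobs, stage):
        api_request("POST", f"/projects/{project_id}/jobs/{job_id}/play", token)
```

A call such as `play_stage_per_job(project_id, pipeline_id, "second-manual", token)` would then stand in for the stage-level button. For the second workaround, the same `api_request` helper could call the per-job retry endpoint (`POST /projects/:id/jobs/:job_id/retry`) on the prior stage's job.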
Edited by Furkan Ayhan