when: manual is blocking pipeline even though it shouldn't

Summary

I wrote a two-stage pipeline in which a job in the second stage is marked as waiting for an upstream job even though it is a manual job. As a result, the pipeline is blocked even though all the non-manual jobs have run.

Steps to reproduce

The full gitlab-ci configuration is given below.

There are two sets of jobs: jobs for the MR and jobs for master. I encountered the problem with an MR, but I suppose the same would happen with the master branch.

For both the MR and master, there are again two sets of jobs: jobs that must run every time and manual jobs that can be run if desired. The reason is that the first set of jobs rebuilds only the needed artefacts, based on the master branch, but sometimes we want to trigger a build of everything, and that's what the manual jobs are for.

For each of these 4 combinations, there are two stages:

  • first the build stage with 2 jobs, 1 to validate things and 1 to build artefacts,
  • and then the deploy stage, which depends on the job that builds artefacts.

Note that the dependency between jobs is expressed differently for MR and master:

  • via dependencies and needs for the MR, because we want to deploy as soon as possible even if some of the build stage jobs didn't succeed,
  • only via dependencies for master, because we only want to deploy if ALL the jobs of the build stage succeeded.
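
The difference between the two styles can be sketched with a stripped-down pipeline. This is an illustration only: the job names build, validate, deploy:early and deploy:after-stage and the echo/ls scripts are placeholders that do not appear in the real configuration. Combining needs and dependencies mirrors what the report does, and whether GitLab accepts that combination may depend on the version.

```yaml
stages:
  - build
  - deploy

# Placeholder job that produces the artefacts
build:
  stage: build
  script:
    - mkdir -p dist && echo "built" > dist/app.txt
  artifacts:
    paths:
      - dist/

# Placeholder job that only checks things
validate:
  stage: build
  script:
    - echo "validating"

# MR style: needs lets the job start as soon as build is done,
# without waiting for validate
deploy:early:
  stage: deploy
  needs:
    - build
  dependencies:
    - build
  script:
    - ls dist/

# master style: no needs, so the job waits for the whole build stage;
# dependencies only restricts which artefacts are downloaded
deploy:after-stage:
  stage: deploy
  dependencies:
    - build
  script:
    - ls dist/
```
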
.gitlab-ci.yml
variables:
  YARN_CACHE_DIRECTORY: .yarn-cache/

stages:
  - build
  - deploy

.yarn-nx:
  image: node:10
  before_script:
    # required for nx to be able to find origin/master and other branches
    - git fetch
    - yarn --frozen-lockfile --non-interactive --link-duplicates --cache-folder $YARN_CACHE_DIRECTORY
  cache:
    key: yarn-cache
    paths:
      - $YARN_CACHE_DIRECTORY

.validate:
  extends:
    - .yarn-nx
  stage: build
  script:
    - yarn run format:check $BASE
    - yarn run affected:lint $BASE
    - yarn run affected:test $BASE

.build:
  extends:
    - .yarn-nx
  stage: build
  script:
    - yarn affected:build $BASE --prod
  # TODO replace with workspace when it's available
  # cf https://gitlab.com/groups/gitlab-org/-/epics/1418
  artifacts:
    expire_in: 1 day
    paths:
      - dist/

.deploy:
  stage: deploy
  image: kaniko
  script:
    - |
      for APP_DIR in $(ls -1 -d dist/apps/*/)
      do
        APP=$(basename $APP_DIR)
        echo Building docker image for $APP
        IMAGE=$BUILD_IMAGE_PREFIX/$APP:$CI_COMMIT_SHA
        echo Building and tagging $IMAGE
        /kaniko/executor --context dir://$CI_PROJECT_DIR/$APP_DIR --dockerfile $CI_PROJECT_DIR/Dockerfile --destination $IMAGE
        TAGGED_IMAGE=$BUILD_IMAGE_PREFIX/$APP:${DOCKER_TAG:-$CI_COMMIT_REF_SLUG}
        echo Tagging $TAGGED_IMAGE
        crane copy $IMAGE $TAGGED_IMAGE
      done

.yarn-nx:mr:
  variables:
    BASE: --base=origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME

.yarn-nx:master:
  variables:
    BASE: --base=$CI_COMMIT_BEFORE_SHA

.yarn-nx:all:
  variables:
    BASE: --all

.deploy:mr:
  variables:
    # using a different name for the intermediary images allows us to delete them easily from gitlab
    BUILD_IMAGE_PREFIX: $CI_REGISTRY_IMAGE/build

.deploy:master:
  variables:
    BUILD_IMAGE_PREFIX: $CI_REGISTRY_IMAGE
    DOCKER_TAG: latest

build:mr:
  extends:
    - .build
    - .yarn-nx:mr
  only:
    - merge_requests

build:all:mr:
  extends:
    - .build
    - .yarn-nx:all
  only:
    - merge_requests
  when: manual

build:master:
  extends:
    - .build
    - .yarn-nx:master
  only:
    - master

build:all:master:
  extends:
    - .build
    - .yarn-nx:all
  only:
    - master
  when: manual

validate:mr:
  extends:
    - .validate
    - .yarn-nx:mr
  only:
    - merge_requests

validate:all:mr:
  extends:
    - .validate
    - .yarn-nx:all
  only:
    - merge_requests
  when: manual

validate:master:
  extends:
    - .validate
    - .yarn-nx:master
  only:
    - master

validate:all:master:
  extends:
    - .validate
    - .yarn-nx:all
  only:
    - master
  when: manual

deploy:mr:
  dependencies:
    - build:mr
  # this job will be triggered as soon as the build job is done
  needs:
    - build:mr
  extends:
    - .deploy
    - .deploy:mr
  only:
    - merge_requests

deploy:all:mr:
  dependencies:
    - build:all:mr
  # this job will be triggered as soon as the build job is done
  needs:
    - build:all:mr
  extends:
    - .deploy
    - .deploy:mr
  only:
    - merge_requests
  when: manual

deploy:master:
  dependencies:
    - build:master
  # we don't use needs because we want the whole build stage to be done
  extends:
    - .deploy
    - .deploy:master
  only:
    - master

deploy:all:master:
  dependencies:
    - build:all:master
  # we don't use needs because we want the whole build stage to be done
  extends:
    - .deploy
    - .deploy:master
  only:
    - master
  when: manual

Actual behavior

In practice, if I don't trigger any manual job, I end up with this: [image]

As you can see, the pipeline is blocked by the deploy:all:mr job.
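
A reduced configuration along the following lines might be enough to reproduce the blocked state. It is an untested sketch that keeps only the four MR build and deploy jobs from the configuration above; the echo scripts are placeholders for the real build and deploy steps, and the artefacts, cache and variables are left out on the assumption that they play no role in the problem.

```yaml
stages:
  - build
  - deploy

build:mr:
  stage: build
  script:
    - echo "placeholder for the affected build"
  only:
    - merge_requests

build:all:mr:
  stage: build
  script:
    - echo "placeholder for the full build"
  only:
    - merge_requests
  when: manual

# Runs automatically once build:mr is done
deploy:mr:
  stage: deploy
  needs:
    - build:mr
  script:
    - echo "placeholder for the deploy"
  only:
    - merge_requests

# Manual job whose upstream job is manual too;
# this is the job reported as blocking the pipeline
deploy:all:mr:
  stage: deploy
  needs:
    - build:all:mr
  script:
    - echo "placeholder for the full deploy"
  only:
    - merge_requests
  when: manual
```

If the pipeline stays blocked with this reduction, the validate jobs and the master jobs can be ruled out as a cause.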

Expected behavior

  • I'm expecting deploy:mr to run automatically as soon as build:mr has succeeded (but the pipeline to be considered finished only if both stages succeeded, of course)
  • I'm expecting deploy:all:mr to be runnable manually as soon as build:all:mr has succeeded.
  • I'm expecting deploy:master to run automatically as soon as both build:master and validate:master have succeeded (and not be influenced by build:all:master and validate:all:master since they are manual)
  • I'm expecting deploy:all:master to be runnable manually as soon as build:master, validate:master AND build:all:master have succeeded.

Environment description

This is running on GitLab.com.