Child pipeline job does not wait for parent pipeline job to complete and fails
Summary
A job in a child pipeline that needs a parent pipeline job does not wait for that job to complete before attempting to download its artifacts.
Steps to reproduce
I have tested this on GitLab CE 14.8.3 and gitlab.com. Create the following two YAML files and run a pipeline.
.gitlab-ci.yml
test_parent:
  script:
    - echo "HELLO" > test_parent_child.log
    - echo "$CI_PIPELINE_ID"
    - sleep 60
  artifacts:
    paths:
      - test_parent_child.log

test_trigger:
  variables:
    PARENT_PIPELINE_ID: $CI_PIPELINE_ID
  trigger:
    include: .child.yml
    strategy: depend
.child.yml
test_child:
  script:
    - cat test_parent_child.log
    - echo "WORLD"
    - echo "$PARENT_PIPELINE_ID"
    - echo "$CI_PIPELINE_ID"
  needs:
    - pipeline: $PARENT_PIPELINE_ID
      job: test_parent
Example Project
Example project that shows the issue
Example of pipeline with failure
Example of pipeline that passes when retrying the failed child job
What is the current bug behavior?
When a pipeline is created, the test_parent and test_child jobs run at the same time if runners are available. The test_child job requires artifacts from the test_parent job. Because the test_parent job has not completed, the test_child job fails with the message:
This job depends on other jobs with expired/erased artifacts:
Please refer to https://docs.gitlab.com/ee/ci/yaml/index.html#dependencies
If I retry the test_child job after the test_parent job has completed, it downloads the artifacts and succeeds. The job log from the successful retry:
Running with gitlab-runner 14.8.0~beta.44.g57df0d52 (57df0d52)
on blue-2.shared.runners-manager.gitlab.com/default XxUrkriX
Preparing the "docker+machine" executor 00:06
Using Docker executor with image ruby:2.5 ...
Pulling docker image ruby:2.5 ...
Using docker image sha256:27d049ce98db4e55ddfaec6cd98c7c9cfd195bc7e994493776959db33522383b for ruby:2.5 with digest ruby@sha256:ecc3e4f5da13d881a415c9692bb52d2b85b090f38f4ad99ae94f932b3598444b ...
Preparing environment 00:01
Running on runner-xxurkrix-project-34540976-concurrent-0 via runner-xxurkrix-shared-1647438010-f8ecf6b4...
Getting source from Git repository 00:01
$ eval "$CI_PRE_CLONE_SCRIPT"
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/brandon.hussey/test-parent-child/.git/
Created fresh repository.
Checking out 1535e6c8 as main...
Skipping Git submodules setup
Downloading artifacts 00:01
Downloading artifacts for test_parent (2211114000)...
Downloading artifacts from coordinator... ok id=2211114000 responseStatus=200 OK token=iZG_XvCz
Executing "step_script" stage of the job script 00:01
Using docker image sha256:27d049ce98db4e55ddfaec6cd98c7c9cfd195bc7e994493776959db33522383b for ruby:2.5 with digest ruby@sha256:ecc3e4f5da13d881a415c9692bb52d2b85b090f38f4ad99ae94f932b3598444b ...
$ cat test_parent_child.log
HELLO
$ echo "WORLD"
WORLD
$ echo "$PARENT_PIPELINE_ID"
493738613
$ echo "$CI_PIPELINE_ID"
493738678
Cleaning up project directory and file based variables 00:00
Job succeeded
What is the expected correct behavior?
The test_child job should remain pending until test_parent completes. It should not fail immediately just because the test_parent job has not completed yet.
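The expected behavior can be sketched as a small decision rule. This is only an illustration of what I would expect, not GitLab's actual scheduling code: the `JobStatus` enum, the `FINISHED` set, and the `next_child_action` function are hypothetical names, the status list is a simplified subset, and failing the child when the parent job failed is an assumption.

```python
from enum import Enum


class JobStatus(Enum):
    # Simplified, illustrative subset of job statuses (assumption).
    CREATED = "created"
    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"


# Statuses in which the needed job will not change any further.
FINISHED = {JobStatus.SUCCESS, JobStatus.FAILED}


def next_child_action(parent_status: JobStatus) -> str:
    """Decide what should happen to test_child given test_parent's status.

    Hypothetical rule describing the expected behavior:
    - parent not finished  -> keep the child pending ("wait")
    - parent succeeded     -> start the child and download artifacts ("start")
    - parent failed        -> fail the child ("fail", an assumption)
    """
    if parent_status not in FINISHED:
        return "wait"
    if parent_status is JobStatus.SUCCESS:
        return "start"
    return "fail"


if __name__ == "__main__":
    for status in JobStatus:
        print(f"test_parent={status.value}: test_child should {next_child_action(status)}")
```

With the current behavior, the "wait" branch is effectively missing: test_child fails while test_parent is still running, instead of staying pending.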
Relevant logs and/or screenshots
Output of checks
This bug happens on GitLab.com. It also happens on my GitLab CE instance running 14.8.3.
Results of GitLab environment info
I am using an Omnibus GitLab CE installation locally.
Results of GitLab application Check
N/A
Possible fixes
None