Multi-step GitLab CI jobs
Description
When one job depends on data generated by a previous job, you can use the caching or artifact uploading features of GitLab CI. However, sometimes the size or number of generated files is too large to make caching or artifact uploading feasible. For example, consider a build process that generates files in dozens or hundreds of directories.
To easily solve this, you can combine both "steps" into a single job. After the first build step finishes, the second step can run directly in the same working directory and have immediate access to all of the generated files. All is fine.
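As a minimal sketch of this workaround with the existing script syntax (the job name and commands are only illustrative), both "steps" simply become consecutive lines of one script:

```yaml
stages:
  - build

build_job:
  stage: build
  script:
    # "Step 1": the build itself
    - ./configure
    - make
    # "Step 2": runs in the same working directory, so it sees every
    # generated file; it is only reached if the commands above succeeded
    - sh validate_foobar.sh
```

Because the job stops at the first failing command, the second step never runs after a failed build, but the job as a whole just shows up as failed.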
The only problem with this solution is a UI problem: the GitLab pipeline/build web UI does not distinguish between these multiple steps. There is no way to visually indicate that step 1 succeeded but step 2 failed. You have to read the job log manually to find out.
Proposal
Introduce the concept of "steps" within a single job. In terms of .gitlab-ci.yml, here is a proposed syntax:
stages:
  - build
  - test
build_job:
  stage: build
  steps:
    - do_build:
        - ./configure
        - make
    - post_build:
        - sh validate_foobar.sh
test:
  stage: test
  script:
    - sh run_tests.sh
In this example, there is only one job in the build stage, but it is divided into two steps. They would run one after the other (dispatched together, always on the same runner), and the post_build step would only run if the do_build step succeeded. The test job is just one step, so it can use the existing syntax.
Visually, the pipeline view could be unchanged, I think. If you click into a build, it might look like this in the sidebar (for a failed do_build step):
Stage: Build
Steps:
  ➜ ⊗ do_build
  ➜ ⊝ post_build
Links / references
For additional UI inspiration, you can look at a Travis CI build log, where the web interface lets you identify and expand/collapse named sections.