Allow pipeline stages to run non-linearly and independently of one another
Description
Currently with GitLab CI, pipeline stages run one after the other, in a linear fashion (e.g. using the default stages, `test` waits for `build` to finish, and `deploy` waits for `test` to finish). However, that ordering can cause pipelines to take longer to run than necessary.
Imagine the following scenario:
- There are 5 stages:
  - `build` - Download dependencies and compile the project
  - `pretest` - Set up the testing environment (generate seed data, build database, etc.); depends on the `build` stage
  - `test` - Run automated tests; depends on the `pretest` stage
  - `prepare` - Create packages for deployment; only depends on the `build` stage
  - `deploy` - Deploy packages to a review environment (or staging or production); only run when `prepare` and `test` are green; uses artifacts from `prepare`

There are a few ways to construct this pipeline currently in GitLab, but none are optimal:
- Run each stage in the above order, independently. This means that `prepare` doesn't run until after `test` is green, so the pipeline run time is the sum of all the stages.
- Move the `prepare` stage to either before or after `pretest`. This results in the same run time as the first option.
- Move the `prepare` stage jobs into the `pretest` stage. If the `prepare` jobs take longer than the other `pretest` jobs, the `test` stage will be delayed, increasing the time before developers get a result for their build. The pipeline run time is now `build + max(pretest, prepare) + test + deploy`.
- Move the `prepare` stage jobs into the `test` stage. Similarly to above, if the `prepare` jobs take longer than the other `test` jobs, the `deploy` stage will be delayed. The pipeline run time is now `build + pretest + max(test, prepare) + deploy`.
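
To make the comparison concrete, the run times of these options can be worked out for one set of stage durations. The numbers below are made-up assumptions (minutes per stage), chosen only for illustration and not measured from a real pipeline.

```python
# Hypothetical stage durations in minutes (illustrative only).
d = {"build": 5, "pretest": 3, "test": 10, "prepare": 12, "deploy": 2}

# Options 1 and 2: every stage runs one after the other.
sequential = sum(d.values())

# Option 3: the prepare jobs share the pretest stage.
prepare_in_pretest = d["build"] + max(d["pretest"], d["prepare"]) + d["test"] + d["deploy"]

# Option 4: the prepare jobs share the test stage.
prepare_in_test = d["build"] + d["pretest"] + max(d["test"], d["prepare"]) + d["deploy"]

print(sequential, prepare_in_pretest, prepare_in_test)  # 32 29 22
```

With these numbers, every option makes some stage wait for `prepare` even though that stage does not need it.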
My suggestion is to allow multiple stages to run at the same time. In this instance, start the `pretest` and `prepare` jobs when `build` is green, start `test` when `pretest` is green, and start `deploy` when `prepare` and `test` are green. The pipeline run time would be `build + max(prepare, pretest + test) + deploy`. The `prepare` stage would only affect the pipeline's run time if it takes longer than the `pretest` and `test` stages together, and even then its impact would be smaller than in the options above.
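
The suggested run time is the longest chain of dependent stages through the pipeline. A minimal sketch of that calculation follows, assuming hypothetical durations (in minutes) and a hand-written dependency map for the scenario; the names `DURATIONS`, `REQUIRES`, `finish_time` and `pipeline_run_time` are illustrative and not part of GitLab.

```python
from functools import lru_cache

# Hypothetical stage durations in minutes (illustrative only).
DURATIONS = {"build": 5, "pretest": 3, "test": 10, "prepare": 12, "deploy": 2}

# Stage -> stages that must be green before it may start (the scenario above).
REQUIRES = {
    "build": [],
    "pretest": ["build"],
    "test": ["pretest"],
    "prepare": ["build"],
    "deploy": ["prepare", "test"],
}


@lru_cache(maxsize=None)
def finish_time(stage):
    """Earliest finish time: a stage starts once its slowest requirement is done."""
    start = max((finish_time(dep) for dep in REQUIRES[stage]), default=0)
    return start + DURATIONS[stage]


def pipeline_run_time():
    """The pipeline is finished when its last stage finishes."""
    return max(finish_time(stage) for stage in REQUIRES)


print(pipeline_run_time())  # 20 = build + max(prepare, pretest + test) + deploy
```

With these numbers `prepare` finishes at minute 17 and `test` at minute 18, so `deploy` is held up only by the tests, not by packaging.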
Proposal
Update the `stages` declaration in `.gitlab-ci.yml` to support defining stage dependencies. I suggest that it would look like the following (using the above scenario):

    stages:
      - build
      - pretest
      - test
      - prepare:
        - requires:
          - build
      - deploy

By default, a stage requires all previous stages to complete before it can run. I use `requires` here to avoid confusion with the `dependencies` keyword (for artifact loading) in jobs. The special value of `previous` could be used for the default behavior (require all previous stages), e.g.

    stages:
      - build
      - test
      - deploy:
        - requires: previous

is equivalent to

    stages:
      - build
      - test
      - deploy

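
One way to read these rules is as a small normalization step that turns the `stages` list into an explicit "stage requires these stages" map. The sketch below assumes the YAML has already been parsed into plain Python lists and dicts, and that a stage with options appears as a one-key mapping whose value is a list of option mappings; `resolve_requires` is a hypothetical helper, not an existing GitLab function, and it only handles the forms shown in this proposal (a list of stage names or the word `previous`).

```python
def resolve_requires(stages):
    """Map each stage name to the list of stages it requires.

    Each entry of `stages` is either a plain stage name or a one-key
    mapping such as {"prepare": [{"requires": ["build"]}]}.
    """
    resolved = {}
    for entry in stages:
        previous = list(resolved)  # every stage declared so far, in order
        if isinstance(entry, str):
            name, requires = entry, "previous"
        else:
            (name, options), = entry.items()
            # Merge the list of single-key option mappings into one dict.
            merged = {k: v for option in options for k, v in option.items()}
            requires = merged.get("requires", "previous")
        resolved[name] = previous if requires == "previous" else list(requires)
    return resolved


scenario = ["build", "pretest", "test", {"prepare": [{"requires": ["build"]}]}, "deploy"]
print(resolve_requires(scenario)["prepare"])  # ['build']
print(resolve_requires(scenario)["deploy"])   # ['build', 'pretest', 'test', 'prepare']
```

Under this reading, writing `requires: previous` on a stage produces the same map as leaving `requires` out entirely.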
Additionally, one can define an empty `requires` list (i.e. `requires: []`) to run the stage at the same time as the first defined stage. To prevent circular `requires` declarations, a stage may only require stages that are listed before it (e.g. above, the `test` stage could not require the `prepare` stage).
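
The ordering rule is simple to check mechanically, which is part of its appeal: one pass over the stages in declaration order is enough, and no general cycle detection is needed. A minimal sketch, assuming the declaration has already been reduced to a map from stage name to required stage names (keys in declaration order); `validate_requires` and `initial_stages` are hypothetical names used only for illustration.

```python
def validate_requires(requires_by_stage):
    """Raise ValueError if a stage requires itself, a later stage, or an unknown stage.

    Keys must be in declaration order (dicts keep insertion order in Python 3.7+).
    """
    seen = set()
    for stage, requires in requires_by_stage.items():
        for required in requires:
            if required not in seen:
                raise ValueError(
                    f"stage {stage!r} may only require stages listed before it, "
                    f"not {required!r}"
                )
        seen.add(stage)


def initial_stages(requires_by_stage):
    """Stages with an empty requires list start as soon as the pipeline does."""
    return [stage for stage, requires in requires_by_stage.items() if not requires]


pipeline = {"build": [], "lint": [], "test": ["build"]}
validate_requires(pipeline)      # passes silently
print(initial_stages(pipeline))  # ['build', 'lint']
```

Because every requirement points backwards in the list, a cycle cannot be expressed at all.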
In addition to the backend support for this syntax, there will need to be changes to the UI to support the non-linearity of stages.
Additional scenarios
A few other scenarios supported by this proposal that would otherwise not be supported:
- Tests could be grouped by their stage. For example, there could be `linting`, `unit tests`, and `integration tests` stages that all run at the same time, but are used to give a light description of the tests within them. (For GitLab CE, I had to look up what `rubocop` was.)
- Tests could be run as soon as they have everything they need. I have some tests that only require the Git repo (i.e., no downloading third-party dependencies). Some, but not all, tests require the database to be seeded; for these, I seed the database only in the tests that need it, but this results in duplication of effort.