Improve CI pipeline
Introduction
It seems like our CI pipeline has grown organically and we are up to runtimes between 12 and 20 minutes again.
Let's visualize our pipeline first and try to document what every stage and job does as of today:
graph RL
subgraph pre-build
PRE_BUILD:::stage
build_docker_image:::immediate
update_screenshots:::manual
danger-review:::immediate
end
subgraph test
TEST:::stage
build
visual
build & unit_tests & generate_utility_classes & lint -.- TEST:::stage
sast & license_management & dependency_scanning & code_quality -.- TEST:::stage
visual_gitlab_integration
end
subgraph publish
PUBLISH:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
end
subgraph deploy
DEPLOY:::stage
pages("pages – master only"):::only_master
review
review_stop:::manual
upload_artifacts
end
subgraph deployed
DEPLOYED:::stage
create_integration_branch:::manual
end
DEPLOYED --> PUBLISH --> DEPLOY --> TEST --> PRE_BUILD
publish_to_npm -.->|"waits for"| DEPLOY
visual & visual_gitlab_integration -->|"needs"| build_docker_image
review & pages & upload_artifacts & publish_to_npm -->|"needs"| build
create_integration_branch -->|"needs"| upload_artifacts
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
Okay, there is a lot to digest here. Let's cover some basics:
- #ecb613 colored boxes are our stages
- #aaaaaa colored boxes are manual jobs
- #80ff00 colored boxes are jobs which are executed immediately, as soon as the pipeline starts
- #20dfaf colored boxes are jobs which are only executed on master
We start at the PRE_BUILD stage and go over
Problem 1: Implicit dependency on pre-build
Normally, GitLab CI starts executing the jobs of a stage only once every job from the previous stage has finished. This means that every job in the TEST stage has to wait until everything in the PRE_BUILD stage is finished.
For example, lint and unit_tests wait even though they don't use anything produced by the previous jobs.
Luckily for us, DAG now supports an empty needs: []. A lot of those jobs can therefore be executed immediately once we enable that feature.
Proposal 1: Give every job without a dependency needs: []
Proposal 2: Move danger-review to test, because it seems to belong there semantically
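A minimal sketch of what Proposal 1 could look like in .gitlab-ci.yml. The job and stage names come from the diagram above. All script lines are placeholders and assumptions, not the real commands.

```yaml
# Sketch only: every script line below is a placeholder, not the real command.
stages:
  - pre-build
  - test

build_docker_image:
  stage: pre-build
  script:
    - echo "build and push the CI image"  # placeholder

lint:
  stage: test
  needs: []  # no dependencies: starts as soon as the pipeline starts
  script:
    - echo "run the linter"  # placeholder

unit_tests:
  stage: test
  needs: []  # same here, does not wait for the pre-build stage
  script:
    - echo "run the unit tests"  # placeholder

visual:
  stage: test
  needs: ["build_docker_image"]  # real dependency, taken from the diagram
  script:
    - echo "run the visual tests"  # placeholder
```

With this layout only visual (and, analogously, visual_gitlab_integration) would still wait for build_docker_image. The jobs with an empty needs list would no longer be blocked by the pre-build stage.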
Problem 2: Separation of concerns when it comes to build & test & deploy
Right now the test stage contains both build and test scripts. This can be a bit confusing, as the build scripts are not where one might expect.
Proposal: move the build script to a new build stage between test and deploy. Together with the aforementioned empty needs feature, the build script can actually start immediately.
Additionally, if we have a look at the build job and its dependencies, we see something concerning:
graph TD
subgraph publish
PUBLISH:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
end
subgraph deploy
DEPLOY:::stage
pages("pages – master only"):::only_master
review
upload_artifacts
end
subgraph deployed
DEPLOYED:::stage
create_integration_branch:::manual
end
subgraph test
TEST:::stage
build
end
DEPLOYED --> PUBLISH --> DEPLOY --> TEST
publish_to_npm -->|"waits for"| DEPLOY
review & pages & upload_artifacts & publish_to_npm -->|"needs"| build
create_integration_branch -->|"needs"| upload_artifacts
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
There are 4 jobs depending on build:
- review: Our review app deployment for feature branches
- upload_artifacts: Creates an npm package tarball, which can be used for testing purposes, for example in create_integration_branch
- pages, on master: Our GitLab Pages deployment
- publish_to_npm, on master: Publishing new versions to NPM
Interestingly, two of those jobs seem to depend on yarn run storybook-static (review/pages) and two of them on yarn run build (upload_artifacts/publish_to_npm). The build job currently
Proposal 3: Split build into build_storybook and build_package and move it to a new build stage.
Proposal 4: Looking at upload_artifacts, it could probably be merged into build_package, as it is only zipping the build results of build
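Proposals 3 and 4 could be sketched roughly as follows. The two yarn commands are the ones mentioned above. The artifact paths, the yarn pack step standing in for upload_artifacts, and the review script are assumptions made for illustration.

```yaml
# Sketch only: artifact paths and the review script are hypothetical.
stages:
  - pre-build
  - test
  - build
  - deploy  # remaining stages omitted

build_storybook:
  stage: build
  needs: []  # can start immediately
  script:
    - yarn run storybook-static
  artifacts:
    paths:
      - storybook/  # hypothetical output directory

build_package:
  stage: build
  needs: []  # can start immediately
  script:
    - yarn run build
    - yarn pack  # assumption: replaces the tarball step of upload_artifacts
  artifacts:
    paths:
      - dist/  # hypothetical output directory
      - "*.tgz"  # tarball written by yarn pack

review:
  stage: deploy
  needs: ["build_storybook"]  # only waits for the storybook build
  script:
    - echo "deploy the review app"  # placeholder
```

The point of the split is that review and pages would only wait for build_storybook. Consumers of the package, such as publish_to_npm and create_integration_branch, would only wait for build_package.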
Minor stage semantics
Proposal 5: Move publish_to_npm to the deploy stage, because it shouldn't depend on the success of the GitLab Pages deployment
Proposal 6: Create a new manual stage and move all manual jobs to that stage, so that they can be discovered more easily.
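As a rough illustration of Proposals 5 and 6, the stage list and two of the affected jobs might look like this. The scripts are placeholders, and the use of only: master is an assumption about how the "master only" restriction is expressed today.

```yaml
# Sketch only: scripts are placeholders.
stages:
  - pre-build
  - test
  - build
  - deploy
  - manual  # new last stage that collects all manual jobs

publish_to_npm:
  stage: deploy  # same stage as pages, so it no longer waits for pages to succeed
  only:
    - master
  script:
    - echo "publish the package"  # placeholder

review_stop:
  stage: manual
  when: manual  # only runs when triggered by hand
  script:
    - echo "stop the review app"  # placeholder
```

create_integration_branch and update_screenshots would move to the manual stage in the same way.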
All proposed changes coming together
The pipeline would look like this:
graph RL
subgraph pre-build
PRE_BUILD:::stage
build_docker_image:::immediate
end
subgraph test
danger-review:::immediate
lint:::immediate
generate_utility_classes:::immediate
unit_tests:::immediate
visual
visual_gitlab_integration
dependency_scanning & code_quality & license_management & sast -.- TEST:::stage
end
subgraph build
build_storybook:::immediate
BUILD:::stage
build_package:::immediate
end
subgraph deploy
DEPLOY:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
review
pages("pages – master only"):::only_master
end
subgraph manual
MANUAL:::stage
review_stop:::manual
create_integration_branch:::manual
update_screenshots:::manual
end
MANUAL --> DEPLOY --> BUILD --> TEST --> PRE_BUILD
publish_to_npm -.->|"waits for"| BUILD
visual & visual_gitlab_integration & update_screenshots -->|"needs"| build_docker_image
create_integration_branch & publish_to_npm -->|"needs"| build_package
review & pages -->|"needs"| build_storybook
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
Follow up issues
To be created by @leipert
- Investigate running danger-review and review only on merge requests
- Move the build_docker_image step out of GitLab UI, as it is not updated that often and creates an unnecessary dependency for visual, visual_gitlab_integration and update_screenshots
- Remove generate_utility_classes, as it is implicitly done by build_storybook and build_package
- Try to utilize caching in order to reduce build times
- Use the mermaid chart from above and document everything in GitLab UI docs/.
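Two of these follow-ups map onto existing GitLab CI keywords. The fragment below is only a starting point for the investigation, not a tested configuration. The cached path and the script are assumptions.

```yaml
# Sketch only: cache path and script are hypothetical.
cache:
  key:
    files:
      - yarn.lock  # new cache key whenever the lockfile changes
  paths:
    - node_modules/  # hypothetical, depends on how dependencies are installed

danger-review:
  stage: test
  needs: []
  only:
    - merge_requests  # run in merge request pipelines only
  script:
    - echo "run danger"  # placeholder
```

One thing to verify before adopting only: merge_requests is its effect on the other jobs. Merge request pipelines only contain jobs that opt in, so the rest of the pipeline may need matching rules.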