Improve CI pipeline
Introduction
It seems like our CI pipeline has grown organically and we are up to runtimes between 12 and 20 minutes again.
Let's visualize our pipeline first and try to document what every stage and job does as of today:
graph RL
subgraph pre-build
PRE_BUILD:::stage
build_docker_image:::immediate
update_screenshots:::manual
danger-review:::immediate
end
subgraph test
TEST:::stage
build
visual
build & unit_tests & generate_utility_classes & lint -.- TEST:::stage
sast & license_management & dependency_scanning & code_quality -.- TEST:::stage
visual_gitlab_integration
end
subgraph publish
PUBLISH:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
end
subgraph deploy
DEPLOY:::stage
pages("pages – master only"):::only_master
review
review_stop:::manual
upload_artifacts
end
subgraph deployed
DEPLOYED:::stage
create_integration_branch:::manual
end
DEPLOYED --> PUBLISH --> DEPLOY --> TEST --> PRE_BUILD
publish_to_npm -.->|"waits for"| DEPLOY
visual & visual_gitlab_integration -->|"needs"| build_docker_image
review & pages & upload_artifacts & publish_to_npm -->|"needs"| build
create_integration_branch -->|"needs"| upload_artifacts
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
Okay, there is a lot to digest here. Let's cover some basics:
- #ecb613 colored boxes are our stages
- #aaaaaa colored boxes are manual jobs
- #80ff00 colored boxes are jobs which are executed immediately, as soon as the pipeline starts
- #20dfaf colored boxes are jobs which are only executed on master
We start at the PRE_BUILD stage and go over TEST, DEPLOY and PUBLISH to DEPLOYED.
Problem 1: Implicit dependency on pre-build
By default, GitLab CI starts executing the jobs of a stage only once every job from the previous stage
has finished. This means that every job in the TEST stage has to wait until everything from the
PRE_BUILD stage is finished.
For example, lint and unit_tests wait, even though they don't utilize anything from the
previous stage.
Luckily for us, DAG now supports an empty needs: []. So a lot of those jobs can
be executed immediately by enabling that feature.
Proposal 1: Give every job without a dependency needs: []
Proposal 2: Move danger-review to test, because it seems to belong there semantically
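Proposals 1 and 2 could look roughly like the following sketch for our .gitlab-ci.yml. The job names are taken from the chart above; the script commands are placeholders made up for illustration, not the commands the jobs really run.

```yaml
# Sketch only: all script commands below are assumed placeholders.
lint:
  stage: test
  needs: []            # no dependency, so it starts as soon as the pipeline starts
  script:
    - yarn run lint    # hypothetical script name

unit_tests:
  stage: test
  needs: []
  script:
    - yarn run test    # hypothetical script name

danger-review:
  stage: test          # Proposal 2: moved from pre-build to test
  needs: []            # still executed immediately
  script:
    - echo "run danger here"   # placeholder
```

Jobs that really use the docker image, like visual, would keep their explicit needs on build_docker_image instead of an empty list.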
Problem 2: Separation of concerns when it comes to build & test & deploy
Right now the test stage contains both build and test scripts. This can be a bit confusing, as the build scripts are not where one might expect.
The proposal is to move the build script to a new build stage between test and deploy. Together with the aforementioned empty needs feature, the build script can actually start immediately.
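As a rough sketch, the new stage could be wired up like this. It assumes that the stage names match the subgraph names in the chart and that the stages run in the order the chart suggests; the script line is a placeholder for whatever the build job runs today.

```yaml
stages:
  - pre-build
  - test
  - build        # new stage between test and deploy
  - deploy
  - publish
  - deployed

build:
  stage: build
  needs: []      # starts immediately, even though the stage comes later
  script:
    - yarn run build   # placeholder, the real job may run more than this
```

The stage then only describes where the job belongs semantically; the empty needs decides when it actually starts.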
Additionally, if we have a look at the build job and the jobs that depend on it, we see something concerning:
graph TD
subgraph publish
PUBLISH:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
end
subgraph deploy
DEPLOY:::stage
pages("pages – master only"):::only_master
review
upload_artifacts
end
subgraph deployed
DEPLOYED:::stage
create_integration_branch:::manual
end
subgraph test
TEST:::stage
build
end
DEPLOYED --> PUBLISH --> DEPLOY --> TEST
publish_to_npm -->|"waits for"| DEPLOY
review & pages & upload_artifacts & publish_to_npm -->|"needs"| build
create_integration_branch -->|"needs"| upload_artifacts
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
There are 4 jobs depending on build:
- review: Our review app deployment for feature branches
- upload_artifacts: Creates an npm package tarball, which can be used for testing purposes, for example: create_integration_branch
- pages, on master: Our GitLab Pages deployment
- publish_to_npm, on master: Publishing new versions to NPM
Interestingly, two of those jobs seem to depend on yarn run storybook-static (review/pages) and two of them on yarn run build (upload_artifacts/publish_to_npm). The build job currently runs both.
Proposal 3: Split build into build_storybook and build_package and move both to a new build stage.
Proposal 4: Looking at upload_artifacts, it could probably be merged into build_package, as it is only zipping the build results of build
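Proposals 3 and 4 might translate into something like the sketch below. The two yarn commands are the ones mentioned above. The artifact paths, the yarn pack step standing in for what upload_artifacts does, and the placeholder deploy scripts are assumptions.

```yaml
build_storybook:
  stage: build
  needs: []
  script:
    - yarn run storybook-static
  artifacts:
    paths:
      - public/        # assumed output directory

build_package:
  stage: build
  needs: []
  script:
    - yarn run build
    - yarn pack        # Proposal 4: assumed stand-in for upload_artifacts
  artifacts:
    paths:
      - dist/          # assumed output directory
      - "*.tgz"        # tarball written by yarn pack

review:
  stage: deploy
  needs: [build_storybook]   # no longer waits for the package build
  script:
    - echo "deploy review app"   # placeholder

create_integration_branch:
  stage: deployed
  when: manual
  needs: [build_package]     # previously needed upload_artifacts
  script:
    - echo "create integration branch"   # placeholder
```

With this split, review and pages only wait for the storybook build, and publish_to_npm and create_integration_branch only wait for the package build.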
Minor stage semantics
Proposal 5: Move publish_to_npm to the deploy stage, because it shouldn't depend on the success of the GitLab Pages deployment
Proposal 6: Create a new manual stage and move all manual jobs to that stage, in order for them to be discovered more easily.
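A minimal sketch of Proposals 5 and 6, assuming the stage is simply called manual and that build_package exists as described in Proposal 3. The scripts are placeholders.

```yaml
stages:
  - pre-build
  - test
  - build
  - deploy
  - manual       # Proposal 6: all manual jobs live here

publish_to_npm:
  stage: deploy              # Proposal 5: no longer in a stage after pages
  needs: [build_package]
  only:
    - master
  script:
    - echo "publish package"   # placeholder

update_screenshots:
  stage: manual
  when: manual               # only runs when triggered by hand
  needs: [build_docker_image]
  script:
    - echo "update screenshots"   # placeholder
```

review_stop and create_integration_branch would move to the manual stage in the same way.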
All proposed changes coming together
The pipeline would look like this:
graph RL
subgraph pre-build
PRE_BUILD:::stage
build_docker_image:::immediate
end
subgraph test
danger-review:::immediate
lint:::immediate
generate_utility_classes:::immediate
unit_tests:::immediate
visual
visual_gitlab_integration
dependency_scanning & code_quality & license_management & sast -.- TEST:::stage
end
subgraph build
build_storybook:::immediate
BUILD:::stage
build_package:::immediate
end
subgraph deploy
DEPLOY:::stage
publish_to_npm("publish_to_npm – master only"):::only_master
review
pages("pages – master only"):::only_master
end
subgraph manual
MANUAL:::stage
review_stop:::manual
create_integration_branch:::manual
update_screenshots:::manual
end
MANUAL --> DEPLOY --> BUILD --> TEST --> PRE_BUILD
publish_to_npm -.->|"waits for"| BUILD
visual & visual_gitlab_integration & update_screenshots -->|"needs"| build_docker_image
create_integration_branch & publish_to_npm -->|"needs"| build_package
review & pages -->|"needs"| build_storybook
classDef stage fill:#ecb613
classDef manual fill:#aaa
classDef immediate fill:#80ff00
classDef only_master fill:#20dfaf
Follow up issues
To be created by @leipert
- Investigate running danger-review and review only on merge requests
- Move the build_docker_image step out of GitLab UI, as it is not updated that often and creates an unnecessary dependency for visual, visual_gitlab_integration and update_screenshots
- Remove generate_utility_classes, as it is implicitly done by build_storybook and build_package
- Try to utilize caching in order to reduce build times
- Use the mermaid chart from above and document everything in GitLab UI docs/.
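Two of the follow-ups above, running a job only on merge requests and caching, could start from a sketch like this one. It assumes dependencies end up in node_modules/ and that a yarn.lock is checked in; the hidden .yarn_cache job and the script lines are hypothetical.

```yaml
danger-review:
  stage: test
  needs: []
  rules:
    # only create this job in merge request pipelines
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
  script:
    - echo "run danger here"   # placeholder

# Hypothetical shared cache definition, reused via extends.
.yarn_cache:
  cache:
    key:
      files:
        - yarn.lock            # cache is rebuilt when the lockfile changes
    paths:
      - node_modules/

unit_tests:
  extends: .yarn_cache
  stage: test
  needs: []
  script:
    - yarn install --frozen-lockfile
    - yarn run test            # hypothetical script name
```

A job with such a rule only appears in merge request pipelines, while jobs without rules keep running in branch pipelines, so how the two kinds of pipelines interact is part of what the investigation would need to cover.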