feat: switch docker builds to use buildx instead of native
Description
This PR switches the build pipelines to use Docker's buildx for building images instead of the native docker build engine. By doing this we are now able to cache multi-stage builds, which is of particular importance for the MFEs.
Grove was using BuildKit before with the BUILDKIT_INLINE_CACHE=1 modifier that was added to the builds. When enabled, the cache information for each build step is included in the final image. However, when using multi-stage builds, the layers for the intermediate stages are not included, which is why most MFE builds do not use the cache.
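For context, here is a rough sketch of what an inline-cache build looks like. It is not copied from Grove's pipeline: the helper name inline_cache_cmd and the image path are hypothetical, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# Dry-run sketch of an inline-cache build (assumed shape, not Grove's actual job).
# inline_cache_cmd is a hypothetical helper: $1 = image repository, $2 = build context.
inline_cache_cmd() {
    # BUILDKIT_INLINE_CACHE=1 embeds cache metadata in the pushed image;
    # --cache-from then points the next build at that image.
    printf '%s ' docker build \
        --build-arg BUILDKIT_INLINE_CACHE=1 \
        --cache-from "$1:latest" \
        --tag "$1:latest"
    printf '%s\n' "$2"
}

# registry.example.com/grove/mfe is a placeholder image path.
inline_cache_cmd registry.example.com/grove/mfe .
```

Only the final image is tagged and pushed here, so the intermediate stages of a multi-stage Dockerfile have nothing to be restored from.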
It's possible to remediate this with plain docker by pushing an image for each build stage, but for the MFE containers that means pushing at least 6, and up to 20, different images whose names would need to be hardcoded somewhere.
Docker's buildx command allows doing this by adding extra options to cache these layers. The route chosen here is to create an extra image at build time tagged build-cache. This is pushed in addition to the latest image for any of our required images. We then add a --cache-from for this image so that it is used for the next build. This means that intermediate stages no longer miss the cache on subsequent builds.
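The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the exact command from this PR: the helper name buildx_cmd and the image path are hypothetical, and the use of --cache-to with type=registry and mode=max is my reading of the registry cache backend documentation linked under References. The function prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch of a buildx build with a separate build-cache image.
# buildx_cmd is a hypothetical helper: $1 = image repository, $2 = build context.
buildx_cmd() {
    # --cache-from reads the cache image left behind by the previous build.
    # --cache-to writes a fresh cache image; mode=max (assumed here) also
    # exports the layers of intermediate stages, not just the final one.
    printf '%s ' docker buildx build \
        --cache-from "type=registry,ref=$1:build-cache" \
        --cache-to "type=registry,ref=$1:build-cache,mode=max" \
        --tag "$1:latest" \
        --push
    printf '%s\n' "$2"
}

# registry.example.com/grove/mfe is a placeholder image path.
buildx_cmd registry.example.com/grove/mfe .
```

As far as I know, exporting a registry cache needs a builder that is not on the default docker driver, for example one created with docker buildx create --use, so the printed command may fail on a stock Docker setup without that step.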
- First build: https://gitlab.com/opencraft/devstacks/keith/grove-do/-/pipelines/822935067
- Second build cached: https://gitlab.com/opencraft/devstacks/keith/grove-do/-/pipelines/823136385
- A build after making a change to the discussion MFE: https://gitlab.com/opencraft/devstacks/keith/grove-do/-/jobs/4014854351
References
- https://docs.docker.com/build/cache/backends/registry/
- https://www.docker.com/blog/image-rebase-and-improved-remote-cache-support-in-new-buildkit/
- https://github.com/moby/buildkit/issues/1981
- https://medium.com/titansoft-engineering/docker-build-cache-sharing-on-multi-hosts-with-buildkit-and-buildx-eb8f7005918e
Caveat
There is an existing issue which precludes using this procedure for Shared Runners. I ran into this issue when trying to build using buildx for the first time. However, it's only an issue for the first build. Users can either use a custom runner or run the build locally to bypass this and use the cache.
Supporting information
Testing instructions
- View any of the pipelines linked above.
- If time permits, check out the grove-do repository. Run a pipeline for any instance and verify the caching behaviour.
Checklist
If any of the items below are not applicable, do not remove them; put a check in them instead.
- All providers include the new feature/change
- All affected providers can provision new clusters
- Unit tests are added/updated
- Documentation is added/updated
- The TOOLS_CONTAINER_IMAGE_VERSION in ci_vars.yml is updated
- The grove-template repository is updated