Reduce GDK-in-a-box image size

What does this merge request do and why?

The current GDK-in-a-box image is massive (21.4GB uncompressed) because it contains a lot of overhead (compile caches, full repo history) that most use cases for this image do not need.

This MR addresses this by:

  • cleaning up the various cache folders that hold caches from dependency builds and compilations (so NOT the caches from e.g. the GitLab frontend components and templates)
  • using the shallow option when cloning the repo.
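A cleanup like this only shrinks the image if it runs in the same build step that created the files, because files deleted in a later layer still occupy space in the earlier one. Below is a minimal sketch of such a cleanup helper. The function name `prune_dirs` and the example paths are hypothetical and are not taken from this MR's actual script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove build-time cache directories and report
# how much space each one held. The caller decides which paths to pass.
prune_dirs() {
  local dir size_kb
  for dir in "$@"; do
    # Skip paths that do not exist so the build step never fails on a missing cache.
    [ -d "$dir" ] || continue
    size_kb=$(du -sk "$dir" | cut -f1)
    rm -rf -- "$dir"
    printf 'removed %s (%s KB)\n' "$dir" "$size_kb"
  done
}

# Example call (assumed paths, adjust to the caches your build really creates):
#   prune_dirs "$HOME/.cache/go-build" "$HOME/.cache/yarn"
```

Passing the paths as arguments keeps the decision about what to keep (for example the frontend caches) explicit at the call site.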

This reduces the image by 50%, from 21.4GB to 10.7GB:

gitlab-gdk-in-a-box-optimized                                             1106765995b9   44 hours ago    10.7GB
registry.gitlab.com/gitlab-org/gitlab-development-kit/gitlab-gdk-in-a-box 2f9f5af02262   3 days ago      21.4GB

5 largest layers in the image

Command used to get that info:
docker history registry.gitlab.com/gitlab-org/gitlab-development-kit/gitlab-gdk-in-a-box \
  --format=json --no-trunc --human=false \
  | jq -s -r '
      . |= sort_by(.Size | tonumber)
      | .[]
      | [ ((.Size | tonumber) / 1e6 | round | tostring) + " MB",
          (.CreatedBy | sub("\\|2 git_checkout_branch=.* git_remote_origin_url= "; "")) ]
      | @tsv' \
  | tail -n5

before

84 MB     RUN /bin/bash -c mkdir -p ~/.config/mise...
139 MB    # debian.sh --arch 'arm64' out/ 'bookworm' '@1762202650'
1798 MB   RUN /bin/sh -c ARCH=$(uname -m)...
4625 MB   RUN /bin/bash -c eval "$(~/.local/bin/mise activate bash)"...
14713 MB  RUN /bin/bash -c ./install-config-gdk.sh...

after

84 MB     RUN /bin/bash -c mkdir -p ~/.config/mise...
97 MB     # debian.sh --arch 'arm64' out/ 'bookworm' '@1762202650'
1537 MB   RUN /bin/sh -c ARCH=$(uname -m)...
2577 MB   RUN /bin/bash -c eval "$(~/.local/bin/mise activate bash)"...
6360 MB   RUN /bin/bash -c ./install-config-gdk.sh...
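Putting the two listings side by side shows where the savings come from. The sketch below is plain arithmetic on the figures above and needs no Docker. The labels in the first column are shorthand invented for this example, not real layer names.

```shell
#!/usr/bin/env bash
# Per-layer sizes in MB, copied from the before/after listings above.
# Columns: label (illustrative shorthand), size before, size after.
layer_savings() {
  awk '
    { saved = $2 - $3
      printf "%-22s %6d MB saved (%.0f%%)\n", $1, saved, (saved / $2) * 100
      total += saved }
    END { printf "%-22s %6d MB saved\n", "total", total }
  ' <<'EOF'
mise-config            84    84
debian-base           139    97
arch-specific-run    1798  1537
mise-activate-run    4625  2577
install-config-gdk  14713  6360
EOF
}

layer_savings
```

The total comes to 10704 MB, which matches the overall drop of roughly 10.7GB, and the install-config-gdk.sh step accounts for most of it (8353 MB).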

What is NOT optimized

  • multi stage builds
    Multi-stage builds are normally a no-brainer, but in this case they aren't.
    The gdk-in-a-box image is not the typical use case for a container image, as it basically mimics what a VM does.
    This anti-pattern makes sense here, so switching to multi-stage builds (and e.g. losing the overhead of the packages needed for compilation) would have no real benefit.
  • compiled resources
    Removing all cached data has a big impact on the initial loading times of the frontend, while the disk space these caches take up is small.
    So that cache is not removed.

Why does this matter?

Container image size is often seen as something you shouldn't really need to care about.
But it has an impact on multiple levels for many stakeholders:
it affects costs (more storage, bandwidth) and time (longer pull times) for both developers and end users.

This optimization on its own may not have a directly visible impact on those levels, but at scale it contributes towards it.

How to set up and validate locally

The easiest way to test this is to build the image locally using the following steps. These steps mimic what happens in the CI pipeline.

cd support/gdk-in-a-box/container
cp ../../../packages_debian.txt ../../../.mise-version ./
docker build -t gitlab-gdk-in-a-box-optimized .

Afterwards, you can validate that this works by following the set-up steps in the docs.

You will most likely need to change the port mapping for port 3000, as it will already be in use if you're running a GDK setup.

docker run -d -h gitlab-gdk-in-a-box-optimized.local --name gitlab-gdk-in-a-box-optimized \
-p 2022:2022 \
-p 2222:2222 \
-p 3000:3000 \
-p 3005:3005 \
-p 3010:3010 \
-p 3038:3038 \
-p 5100:5100 \
-p 5778:5778 \
-p 9000:9000 \
gitlab-gdk-in-a-box-optimized
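If port 3000 is taken, only the host side of the mapping needs to change; the container side stays 3000. Here is a small sketch that builds the `-p` flags with a configurable host port for the web UI. The function name `gdk_port_flags` and the host port 3001 in the example are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the -p flags from the command above, mapping the
# container's port 3000 to a host port of your choice. All other ports are
# published unchanged.
gdk_port_flags() {
  local web_host_port="${1:-3000}" port
  for port in 2022 2222 3000 3005 3010 3038 5100 5778 9000; do
    if [ "$port" = 3000 ]; then
      printf '%s %s:3000 ' -p "$web_host_port"
    else
      printf '%s %s:%s ' -p "$port" "$port"
    fi
  done
  printf '\n'
}

# Example (not executed here):
#   docker run -d -h gitlab-gdk-in-a-box-optimized.local \
#     --name gitlab-gdk-in-a-box-optimized \
#     $(gdk_port_flags 3001) gitlab-gdk-in-a-box-optimized
# With host port 3001, the UI is reachable at http://localhost:3001.
```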

Browse to http://localhost:3000 (change the port if needed), and verify you have a smooth experience logging into GitLab.

Test the gdk commands by connecting to the running container:

docker exec -it gitlab-gdk-in-a-box-optimized /bin/bash

What did I test

  • the speed
    • are the GitLab UI response times similar to / better than the default image?
      yes, frontend loads as fast, no delay or compilation penalty
    • are gdk commands similar in speed?
      all are similar, except gdk update, which is slower because it does a fetch but somehow ignores the shallow clone
      (there is logic to check for this, but it somehow goes wrong. I'll create a separate MR for this, as it's not related to this improvement itself.)
  • the functionality
    • is the gdk functionality still working?
      yes, commands like gdk stop|start|restart|configure|update still work as before.
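For the `gdk update` observation, it can help to confirm that a checkout really is shallow before and after the command runs. Below is a minimal sketch using standard git commands. The helper names are hypothetical, and this is not GDK's own detection logic.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around standard git commands.

# Clone only the most recent commit of the default branch.
# $1 = repository URL, $2 = target directory.
shallow_clone() {
  git clone --quiet --depth 1 "$1" "$2"
}

# Print "true" for a shallow checkout and "false" for a full one.
is_shallow() {
  git -C "$1" rev-parse --is-shallow-repository
}

# Print the number of commits reachable from HEAD.
commit_count() {
  git -C "$1" rev-list --count HEAD
}
```

Comparing `is_shallow` and `commit_count` on the GitLab checkout before and after `gdk update` should show whether the update deepened the history.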

Impacted categories

The following categories relate to this merge request:

Merge request checklist

  • This MR references an issue describing the change.
  • This change is backward compatible. If not, include steps to communicate to users.
  • Tests added for new functionality. If not, raise an issue to follow-up.
  • Observability added/updated (logging, metrics, tracing).
  • Documentation added/updated.
  • Announcement added for notable changes.
  • gdk doctor test added.
Edited by Mattias Michaux
