Build multi-architecture images

Cameron Swords requested to merge enable-multiarchitecture-builds into master

What does this MR do?

This MR makes it easy for projects that use ci-templates to create a multi-architecture (arm64, amd64) Docker image.

Projects can build a multi-architecture image by following these steps:

  1. Set the variable BUILD_MULTI_ARCHITECTURE_IMAGE: "true".
  2. Create TARGETOS and TARGETARCH arguments in the Dockerfile. Pass them to go build by setting GOOS=$TARGETOS and GOARCH=$TARGETARCH.
  3. Depending on the project, there may be additional steps to get the project to compile and run on arm64.
  4. Consider running a smoke test to verify correct working behaviour of the arm64 version of the image.
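
As a rough illustration of step 2, a build helper called from a Dockerfile RUN step might map the build arguments onto the Go toolchain variables. This is a minimal sketch, assuming the Dockerfile declares ARG TARGETOS and ARG TARGETARCH; the function name set_go_target, the linux/amd64 fallbacks, and the /analyzer output path are hypothetical and not taken from ci-templates.

```shell
#!/bin/sh
# Sketch of a build helper that a Dockerfile RUN step could call.
# Assumes the Dockerfile declares `ARG TARGETOS` and `ARG TARGETARCH`,
# so the values are visible as environment variables during RUN.
# set_go_target and the linux/amd64 fallbacks are hypothetical.

set_go_target() {
  # Fall back to linux/amd64 when the arguments are not set,
  # which keeps the single-architecture behaviour.
  export GOOS="${TARGETOS:-linux}"
  export GOARCH="${TARGETARCH:-amd64}"
}

# Usage inside the build stage (not run here):
#   set_go_target && go build -o /analyzer .
```

With the fallbacks in place, a Dockerfile that is built without the arguments should still produce an amd64 binary.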

Projects can continue to build a single-architecture image without making any changes.

See semgrep multi-architecture for an example of using these templates and building a multi-architecture image. See semgrep single-architecture for an example of using these templates and building a single-architecture image (current behaviour).

How this works

Instead of requiring a user on an arm64 host to emulate amd64 in order to run the image, this change produces an image that can run natively on both architectures. Emulation is used only during the build, on the amd64 runner, to produce a native arm64 image in the output.

This MR is guided by the blog post Faster Multi-Platform Builds: Dockerfile Cross-Compilation Guide. DAST browserker already follows this approach, so many of the problems encountered have already been addressed.

Building multi-architecture images

docker buildx provides functionality for building multi-architecture images. First, arm64 emulation support is installed using tonistiigi/binfmt. This assumes the GitLab runner uses an amd64 architecture. arm64 is uninstalled before it is installed, because this was found to fix an issue where the build would occasionally fail.

docker run --privileged --rm tonistiigi/binfmt --uninstall arm64
docker run --privileged --rm tonistiigi/binfmt --install arm64

Now that the runner can build both natively (amd64) and for the emulated architecture (arm64), docker is instructed to build for both platforms.

docker buildx build --platform=linux/arm64,linux/amd64 ... --file "${DOCKERFILE}"

Note that docker buildx requires a builder. For more information, see Build architecture.
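
The sequence described above can be sketched as a small dry-run helper that prints the commands in order rather than running them. This is an illustration only: the function name multi_arch_build_plan, the builder name multiarch-builder, and the choice of --tag and --push are assumptions, not the arguments elided in the command above.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for a multi-architecture build.
# multi_arch_build_plan and the builder name are hypothetical; --tag and
# --push are assumed flags, not the ones elided in the MR's command.

multi_arch_build_plan() {
  image=$1
  dockerfile=${2:-Dockerfile}
  platforms="linux/arm64,linux/amd64"

  # Re-register arm64 emulation (uninstall first, then install).
  echo "docker run --privileged --rm tonistiigi/binfmt --uninstall arm64"
  echo "docker run --privileged --rm tonistiigi/binfmt --install arm64"
  # buildx requires a builder; create one and select it.
  echo "docker buildx create --name multiarch-builder --use"
  # Build both platforms and push the resulting manifest list.
  echo "docker buildx build --platform=${platforms} --tag ${image} --file ${dockerfile} --push ."
}

# Usage (review the output before piping it to sh):
#   multi_arch_build_plan "registry.example.com/analyzer:5"
```

Printing the plan keeps the sketch safe to run anywhere; a real job would execute the commands directly.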

Docker build will run each command in the Dockerfile twice, once for each architecture.

#37 [linux/amd64 stage-5  8/12] COPY --from=tracking /analyzer-tracking /analyzer-tracking
...
#48 [linux/arm64 stage-5  8/12] COPY --from=tracking /analyzer-tracking /analyzer-tracking
...

The output is an image that can be used for both arm64 and amd64.

> docker manifest inspect registry.gitlab.com/cam_swords/semgrep-multi-architecture:5

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 3674,
         "digest": "sha256:1dbd30c5dd7c2b7e5889801420e85e995a6b02bfadda024632a1752ffe37ac7b",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 3674,
         "digest": "sha256:82fdd27caf8ddf761c5f9058a55f6b596a9e31285158eaac7f2147154e3687df",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      }
   ]
}

The multi-architecture image is essentially empty; it simply references two different images, one for each architecture.

Tagging multi-architecture images

Tagging multi-architecture images is harder than building them because, to my knowledge, docker buildx does not provide this functionality. Using the normal docker tag [source] [target] on a multi-architecture source image loses the multi-architecture information in the target image.

To tag, the manifests in the source image are found using docker manifest inspect. Each architecture-specific image is pulled, tagged, and pushed to the new target using a temporary architecture-specific tag. A new manifest is then created and pushed, which references the newly tagged images.
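
The tagging procedure can be sketched roughly as follows. This is not the script from this MR: the function names manifest_platforms and retag_commands are hypothetical, the parsing assumes the pretty-printed layout that docker manifest inspect prints (a JSON tool would be more robust), and the target is assumed to include a tag so that the temporary names take the form target-arch. The sketch prints the docker commands instead of running them.

```shell
#!/bin/sh
# Sketch of re-tagging a multi-architecture image. Function names are
# hypothetical. Commands are printed, not executed.

# Read `docker manifest inspect` output on stdin and print one
# "<architecture> <digest>" pair per manifest entry. Relies on the
# digest line appearing before the architecture line in each entry.
manifest_platforms() {
  awk -F'"' '/"digest"/ { digest = $4 } /"architecture"/ { print $4, digest }'
}

# Read "<architecture> <digest>" pairs on stdin and print the commands
# that pull, tag, and push each image, then create and push the manifest.
retag_commands() {
  source_repo=$1
  target=$2
  tagged=""
  while read -r arch digest; do
    temp_tag="${target}-${arch}"
    echo "docker pull ${source_repo}@${digest}"
    echo "docker tag ${source_repo}@${digest} ${temp_tag}"
    echo "docker push ${temp_tag}"
    tagged="${tagged} ${temp_tag}"
  done
  echo "docker manifest create ${target}${tagged}"
  echo "docker manifest push ${target}"
}

# Usage:
#   docker manifest inspect "$SOURCE" \
#     | manifest_platforms | retag_commands "$SOURCE_REPO" "$TARGET"
```

Pulling by digest selects one architecture-specific image at a time, which is what lets each one be tagged individually before the new manifest is assembled.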

Docker build caching

Docker builds use the container registry as a cache to reduce build times. See --cache-from.

It is also recommended that the run cache is used when building the Go analyzer. See semgrep multi-architecture for an example of how to do this.

These cached images should likely be added to a cleanup policy.

Limitations

  • Docker image builds will take longer because each step is run twice, and one of those is run emulating a different architecture.

Future optimizations

The release of GitLab arm64 runners means emulation isn't required to build an arm64 image. Instead, there would be three jobs to build a Docker image per FIPS/non-FIPS variant. These would be:

  • build-arm64. Runs on an arm64 runner, produces a native arm64 image.
  • build-amd64. Runs on an amd64 runner, produces a native amd64 image.
  • build. Builds a multi-architecture image containing manifests that reference the architecture-specific images built in build-arm64 and build-amd64.

If folks would like, I'm happy to take on this work after this MR is merged.

Implementation concerns

Calling scripts using wget

Adding a script to the job scripts is unwieldy once the script is large enough. For this reason, wget is used to download scripts which are then immediately executed.
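
A minimal sketch of the download-and-execute pattern, assuming a POSIX shell with wget available. The function names fetch and fetch_and_run are hypothetical, and the script URL is whatever the job passes in; nothing here reflects the actual script locations used by ci-templates.

```shell
#!/bin/sh
# Sketch of downloading a script with wget and running it immediately.
# fetch and fetch_and_run are hypothetical helper names.

# Download URL ($1) to a local file ($2). -q silences progress output
# and -O selects the output file.
fetch() {
  wget -q -O "$2" "$1"
}

# Download the script at the given URL, run it with any remaining
# arguments, clean up, and return the script's exit status.
fetch_and_run() {
  script_url=$1
  shift
  script_file=$(mktemp)
  status=0
  if fetch "$script_url" "$script_file"; then
    sh "$script_file" "$@" || status=$?
  else
    echo "failed to download ${script_url}" >&2
    status=1
  fi
  rm -f "$script_file"
  return "$status"
}

# Usage:
#   fetch_and_run "$SCRIPT_URL" arg1 arg2
```

Keeping the download in its own function makes the failure case explicit, so a failed download stops the job instead of executing an empty file.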

Each script uses an if statement based on the BUILD_MULTI_ARCHITECTURE_IMAGE environment variable to ensure that old behaviour is preserved for projects that do not want to build a multi-architecture image.
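
The guard might look something like the sketch below. It assumes that only the exact value "true" opts in and that DOCKERFILE defaults to Dockerfile; the function name build_command and the simplified command lines are illustrative, not the scripts in this MR.

```shell
#!/bin/sh
# Sketch of the opt-in guard. build_command is a hypothetical name and
# the printed commands are simplified for illustration.

build_command() {
  dockerfile=${DOCKERFILE:-Dockerfile}
  # Only the exact value "true" opts in; unset or any other value keeps
  # the existing single-architecture build.
  if [ "${BUILD_MULTI_ARCHITECTURE_IMAGE:-false}" = "true" ]; then
    echo "docker buildx build --platform=linux/arm64,linux/amd64 --file ${dockerfile} ."
  else
    echo "docker build --file ${dockerfile} ."
  fi
}

# Usage:
#   BUILD_MULTI_ARCHITECTURE_IMAGE=true build_command
```

Defaulting to the single-architecture branch means a project that never sets the variable sees no change in behaviour.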

Evidence this works

The following table compares the normal and multi-architecture image builds. Note the increased build time (see Future optimizations). Also note that running the multi-architecture image on an arm64 machine does not produce the platform mismatch WARNING.

Semgrep (normal)
https://gitlab.com/cam_swords/semgrep (diff from main)
Build time 2 minutes 2 seconds
docker manifest inspect registry.gitlab.com/cam_swords/semgrep:5
{
	"schemaVersion": 2,
	"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
	"config": {
		"mediaType": "application/vnd.docker.container.image.v1+json",
		"size": 10812,
		"digest": "sha256:5ab5601006c9afaa575e7b20d5c52fe55e76df79f3fa20510357b241a55fd27b"
	},
	"layers": [...]
}
docker run -ti --rm registry.gitlab.com/cam_swords/semgrep:5
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Using rules from https://gitlab.com/gitlab-org/security-products/sast-rules/-/tree/v2.5.5
[INFO] [Semgrep] [2024-06-12T11:09:34Z] ▶ GitLab Semgrep analyzer v5.3.0
[INFO] [Semgrep] [2024-06-12T11:09:34Z] ▶ Detecting project
[WARN] [Semgrep] [2024-06-12T11:09:34Z] ▶ No match in /

Semgrep (multi-arch)
https://gitlab.com/cam_swords/semgrep-multi-architecture (diff from main)
Build time 8 minutes 54 seconds
docker manifest inspect registry.gitlab.com/cam_swords/semgrep-multi-architecture:5
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 3255,
         "digest": "sha256:7bcf897312786db476df82b6a2deaa0a61acfde610d89cbbf3d8ea49ae8f07c2",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 3255,
         "digest": "sha256:c99b23608cf6e99cf199eee5fdec881fb031501a3bce07c15a967aab138db7f0",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      }
   ]
}
docker run -ti --rm registry.gitlab.com/cam_swords/semgrep-multi-architecture:5
Using rules from https://gitlab.com/gitlab-org/security-products/sast-rules/-/tree/v2.5.5
[INFO] [Semgrep] [2024-06-12T11:10:12Z] ▶ GitLab Semgrep analyzer v5.3.0
[INFO] [Semgrep] [2024-06-12T11:10:12Z] ▶ Detecting project
[WARN] [Semgrep] [2024-06-12T11:10:12Z] ▶ No match in /

What are the relevant issue numbers?

Provide multi-architecture images for Sec analy... (gitlab-org&13757)

The original issue was Provide multi-architecture images for Sec analy... (gitlab-org/gitlab#355809 - closed).

Does this MR meet the acceptance criteria?

Edited by Lucas Charles
