Resolve "Fix 5mb limit for artifact verification"

Background

The Pipeline Security group is working towards providing users with SLSA Level 3 provenance attestations. As a simplified TL;DR, in the context of GitLab, a provenance statement is a JSON document that correlates the SHA-256 sum of an artifact with the build information. A worker then digitally signs the statement; the resulting signature, called a provenance attestation, is stored as a “Sigstore Bundle” blob. To calculate the SHA-256 sum, we need to read the artifacts using SupplyChain::ArtifactsReader.
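As a rough illustration of the correlation described above, the following standalone Ruby sketch hashes a file in chunks and wraps the digest in a statement-like JSON structure. The helper names (`sha256_of`, `build_statement`) and the `build_info` fields are hypothetical, and the layout is only loosely modelled on a provenance statement; this is not the code used by SupplyChain::ArtifactsReader or the signing worker.

```ruby
require 'digest'
require 'json'

# Hypothetical helper: stream the file in fixed-size chunks so memory use
# stays flat regardless of the artifact size.
def sha256_of(path, chunk_size: 1024 * 1024)
  digest = Digest::SHA256.new
  File.open(path, 'rb') do |io|
    while (chunk = io.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end

# Hypothetical helper: pair the artifact digest with build information.
# The keys are illustrative assumptions, not GitLab's actual schema.
def build_statement(artifact_name:, sha256:, build_info:)
  {
    'subject' => [{ 'name' => artifact_name, 'digest' => { 'sha256' => sha256 } }],
    'predicate' => { 'buildDefinition' => build_info }
  }
end
```

For example, `JSON.generate(build_statement(artifact_name: 'test.txt', sha256: sha256_of('test.txt'), build_info: { 'job' => 'build-job' }))` would yield the JSON document that a signing step could then consume.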

What does this MR do and why?

At the moment, artifact verification has a hard limit of 5 MB. See the linked issue, "Fix 5mb limit for artifact verification (#588512)", for more information.

The TL;DR is that I meant to reference a variable that is 100.megabytes in production, as documented in "Set maximum artifact size". The variable I was actually using has the value 5.megabytes, as reported in the linked issue: "Fix 5mb limit for artifact verification (#588512)".

Current behavior

A 5 MB limit is enforced.

Expected behavior

A 100 MB limit is enforced, based on max_artifact_size. This limit is consistent with the values we had in mind when we discussed this issue; the 5 MB limit was introduced in error. More information on how specifically we aim to prevent and handle disk exhaustion is in this thread and in ADR 005: Perform sha256 calculation in PublishProvenanceService.
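To make the difference between the two limits concrete, here is a minimal standalone sketch of a size check. The method name `validate_artifact_size!`, the constants, and the choice to treat the maximum as inclusive are assumptions made for illustration; the real check lives in SupplyChain::ArtifactsReader and may differ.

```ruby
MEBIBYTE = 1024 * 1024

OLD_LIMIT = 5 * MEBIBYTE    # 5_242_880 bytes, the value used in error
NEW_LIMIT = 100 * MEBIBYTE  # 104_857_600 bytes, the intended maximum

# Local stand-in for the error class, so the sketch runs outside GitLab.
class ArtifactTooLarge < StandardError; end

# Hypothetical validation step: raise when the artifact exceeds the maximum.
# Assumption: an artifact of exactly max_bytes is still accepted.
def validate_artifact_size!(name, size_bytes, max_bytes)
  return if size_bytes <= max_bytes

  raise ArtifactTooLarge,
        "#{name} is too large: #{size_bytes / MEBIBYTE} MiB " \
        "exceeds maximum of #{max_bytes / MEBIBYTE} MiB"
end
```

With these definitions, a 99 MiB file passes against `NEW_LIMIT` but would have been rejected against `OLD_LIMIT`, while a 101 MiB file is rejected by both.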

Additionally, this feature is opt-in (GitLab SLSA | GitLab Docs) and is behind the slsa_provenance_statement feature flag.

Further reading:

Values in GDK.

> Gitlab::CurrentSettings.current_application_settings.max_artifacts_content_include_size
=> 5242880
> Ci::JobArtifact.max_artifact_size(type: :archive, project: build.project)
  Plan Load (0.5ms)  SELECT "plans".* FROM "plans" WHERE "plans"."name" = 'default' LIMIT 1 
=> 104857600 # 100 MB

Testing

Create a .gitlab-ci.yml file as follows:

build-job:
  stage: build
  variables:
    ATTEST_BUILD_ARTIFACTS: true
  id_tokens:
    SIGSTORE_ID_TOKEN:
      aud: sigstore
  script:
    - dd if=/dev/zero of=test.txt  bs=101M  count=1
    - sha256sum test.txt
  artifacts:
    paths:
      - test.txt

Call ArtifactsReader on the produced build:

ar = SupplyChain::ArtifactsReader.new(build);
[...]
[14] pry(main)> ar.files
SupplyChain::ArtifactsReader::ArtifactTooLarge: Job build-job (395): test.txt is too large: 101 MiB exceeds maximum of 100 MiB
from /Users/samroque-worcel/code/gdk/gitlab/lib/supply_chain/artifacts_reader.rb:83:in `validate_artifact!'

Change .gitlab-ci.yml to produce a 99 MB file (for example, by setting bs=99M in the dd command) and repeat the process:

[18] pry(main)> ar.files { |file| p file }
"test.txt"
=> ["test.txt"]

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #588512 (closed)

Edited by Sam Roque-Worcel
