Resolve "Fix 5mb limit for artifact verification"
Background
The Pipeline Security group is working towards providing users with SLSA Level 3 provenance attestations. In short, in the context of GitLab, a provenance statement is a JSON document that correlates the SHA-256 digest of an artifact with its build information. A worker then digitally signs the statement, producing a provenance attestation that is stored as a “Sigstore Bundle” blob. To calculate the SHA-256 digest, we need to read the artifacts using SupplyChain::ArtifactsReader.
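For readers unfamiliar with the digest step, here is a minimal sketch of computing a SHA-256 digest in chunks, so that a large artifact is never loaded into memory at once. It is not the actual SupplyChain::ArtifactsReader code: the `sha256_of` helper and the 1 MiB chunk size are assumptions made for illustration.

```ruby
require 'digest'
require 'stringio'

# Hypothetical helper, not taken from SupplyChain::ArtifactsReader.
# Reads the IO in fixed-size chunks and feeds each chunk to the digest.
# The 1 MiB default chunk size is an arbitrary choice for this sketch.
def sha256_of(io, chunk_size: 1024 * 1024)
  digest = Digest::SHA256.new
  # IO#read(length) returns nil at end of input, which ends the loop.
  while (chunk = io.read(chunk_size))
    digest.update(chunk)
  end
  digest.hexdigest
end

# Usage with an in-memory IO; a File opened in binary mode works the same way.
puts sha256_of(StringIO.new('hello'))
```

The resulting hex string should match what `sha256sum` prints for the same bytes.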
What does this MR do and why?
At the moment, there is a hard limit of 5 MB. See the linked issue "Fix 5mb limit for artifact verification (#588512)" for more information.
In short, I referenced a variable that I believed was set to 100.megabytes in production, as documented in "Set maximum artifact size". The variable I actually used had the value 5.megabytes, as reported in the linked issue (#588512).
Current behavior
A 5 MB limit is enforced.
Expected behavior
A 100 MB limit is enforced, based on max_artifact_size. This limit is consistent with the values we had in mind when we discussed this issue; the 5 MB limit was introduced in error. More information on how we aim to prevent and handle disk exhaustion is in this thread and in ADR 005: Perform sha256 calculation in PublishProvenanceService.
Additionally, this feature is opt-in (GitLab SLSA | GitLab Docs) and is behind the slsa_provenance_statement feature flag.
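The expected behavior can be sketched as a simple size guard. This is an illustration only: the `validate_artifact_size!` method, the top-level `ArtifactTooLarge` class, and the inclusive comparison are assumptions, not the real implementation. Only the message format is modeled on the error shown in the Testing section.

```ruby
# Hypothetical names throughout; this is not the code in
# lib/supply_chain/artifacts_reader.rb.
class ArtifactTooLarge < StandardError; end

MEBIBYTE = 1024 * 1024

# Raises when the artifact is larger than the allowed maximum.
# Assumption: an artifact exactly at the limit is accepted.
def validate_artifact_size!(name, size_bytes, max_bytes)
  return if size_bytes <= max_bytes

  raise ArtifactTooLarge,
        "#{name} is too large: #{size_bytes / MEBIBYTE} MiB " \
        "exceeds maximum of #{max_bytes / MEBIBYTE} MiB"
end

max = 100 * MEBIBYTE

# A 99 MiB artifact passes silently.
validate_artifact_size!('test.txt', 99 * MEBIBYTE, max)

# A 101 MiB artifact is rejected.
begin
  validate_artifact_size!('test.txt', 101 * MEBIBYTE, max)
rescue ArtifactTooLarge => e
  puts e.message
end
```

With the previous 5 MB value passed as `max_bytes`, the same guard would reject any artifact above 5 MiB, which is the behavior reported in the issue.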
Further reading:
- We discussed the size in this comment in Calculate sha256 digest of artifact on PublishProvenanceService (#559267).
- Other discussion on limits within the original MR: Resolve "Calculate sha256 digest of artifact on PublishProvenanceService" (!201682)
Values in GDK:
> Gitlab::CurrentSettings.current_application_settings.max_artifacts_content_include_size
=> 5242880
> Ci::JobArtifact.max_artifact_size(type: :archive, project: build.project)
Plan Load (0.5ms) SELECT "plans".* FROM "plans" WHERE "plans"."name" = 'default' LIMIT 1
=> 104857600 # 100 MB
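As a quick sanity check on the two byte counts above, the conversion can be done in plain Ruby. ActiveSupport's `megabytes` helper is 1024-based, so 5.megabytes is 5 * 1024 * 1024 bytes. The `bytes_to_mib` helper below is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper for illustration: converts a byte count to MiB.
def bytes_to_mib(bytes)
  bytes / (1024.0 * 1024)
end

puts bytes_to_mib(5_242_880)    # value of max_artifacts_content_include_size => 5.0
puts bytes_to_mib(104_857_600)  # value of max_artifact_size => 100.0
```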
Testing
Create a .gitlab-ci.yml file as follows:
build-job:
  stage: build
  variables:
    ATTEST_BUILD_ARTIFACTS: true
  id_tokens:
    SIGSTORE_ID_TOKEN:
      aud: sigstore
  script:
    - dd if=/dev/zero of=test.txt bs=101M count=1
    - sha256sum test.txt
  artifacts:
    paths:
      - test.txt
Call ArtifactsReader on the produced build:
ar = SupplyChain::ArtifactsReader.new(build);
[...]
[14] pry(main)> ar.files
SupplyChain::ArtifactsReader::ArtifactTooLarge: Job build-job (395): test.txt is too large: 101 MiB exceeds maximum of 100 MiB
from /Users/samroque-worcel/code/gdk/gitlab/lib/supply_chain/artifacts_reader.rb:83:in `validate_artifact!'
Change .gitlab-ci.yml to produce a 99 MB file (for example, bs=99M) and repeat the process:
[18] pry(main)> ar.files { |file| p file }
"test.txt"
=> ["test.txt"]
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #588512 (closed)