[Rails] Generate SLSA provenance statement of job artifact archives

What does this MR do and why?

  • What: This MR provides a simple mechanism for generating provenance statements from a Ci::Build object.
  • Why: GitLab Runner can generate SLSA 1.0 provenance statements, and the provenance statement format is currently documented as the Provenance Metadata Format. As part of Phase 2: Generate provenance statement in control plane, we have to generate provenance data for the completed job in the Rails backend.

References

#546150 (closed)

How to set up and validate locally

  1. Set up GDK with a runner as described here.
  2. Create a sample workflow that generates an artifact. Example below.
  3. Interact with the methods provided by the new implementation.
cat .gitlab-ci.yml
build-job:
  stage: build
  script:
    - echo "Hello, $GITLAB_USER_LOGIN!"
    - echo "Hello, $GITLAB_USER_LOGIN!" > test.txt
  artifacts:
    paths:
      - test.txt

Find the build and generate a provenance statement. The output below is pretty-printed for convenience.

> build = Ci::Build.find(<ID>)
> build.provenance_statement.to_json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "artifacts.zip",
      "digest": {
        "sha256": "717a1ee89f0a2829cf5aad57054c83615675b04baa913bdc19999d7519edf3f2"
      }
    }
  ],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {
    "buildDefinition": {
      "buildType": "https://gitlab.com/gitlab-org/gitlab-runner/-/blob/4d7093e1/PROVENANCE.md",
      "externalParameters": [
        "CI_PIPELINE_ID",
        "CI_PIPELINE_URL",
        "CI_JOB_ID",
[...]
      ],
      "internalParameters": {
        "architecture": "arm64",
        "executor": "docker",
        "job": 412,
        "name": "9-mfdkBG"
      },
      "resolvedDependencies": [
        {
          "uri": "http://gdk.test:3000/root/kjhkjh",
          "digest": {
            "sha256": "a288201509dd9a85da4141e07522bad412938dbe"
          }
        }
      ]
    },
    "runDetails": {
      "builder": {
        "id": "http://gdk.test:3000/groups/root/-/runners/33",
        "version": {
          "gitlab-runner": "4d7093e1"
        }
      },
      "metadata": {
        "invocationID": 412,
        "startedOn": "2025-06-05T01:33:18Z",
        "finishedOn": "2025-06-05T01:33:23Z"
      }
    }
  }
}

This can be compared to the SLSA schema.

I've tried to get codesign to attest my statement, but while support for that is incoming, it is not yet available. In the meantime, I've compared the above provenance statement to the schema, and it conforms.
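
Until an external tool can parse the statement, a rough structural check can be done by hand in Ruby. The following is only a sketch: `provenance_statement_problems` is a hypothetical helper that is not part of this MR, and the fields it checks are my reading of the in-toto Statement v1 and SLSA provenance v1 layouts shown in the output above, not a full schema validation.

```ruby
require 'json'

# Hypothetical helper (not part of this MR): returns a list of problems
# found in a provenance statement passed in as a JSON string.
REQUIRED_TOP_LEVEL = %w[_type subject predicateType predicate].freeze

def provenance_statement_problems(json)
  statement = JSON.parse(json)

  # Every top-level key of the in-toto statement should be present.
  problems = REQUIRED_TOP_LEVEL.reject { |key| statement.key?(key) }
                               .map { |key| "missing #{key}" }

  unless statement['_type'] == 'https://in-toto.io/Statement/v1'
    problems << 'unexpected _type'
  end
  unless statement['predicateType'] == 'https://slsa.dev/provenance/v1'
    problems << 'unexpected predicateType'
  end

  # Each subject should carry a 64-character hex SHA256 digest.
  Array(statement['subject']).each do |subject|
    sha256 = subject.dig('digest', 'sha256').to_s
    problems << "bad sha256 for #{subject['name']}" unless sha256.match?(/\A\h{64}\z/)
  end

  # Assumed minimum for the predicate: a build type and a builder id.
  predicate = statement['predicate'] || {}
  problems << 'missing buildDefinition.buildType' unless predicate.dig('buildDefinition', 'buildType')
  problems << 'missing runDetails.builder.id' unless predicate.dig('runDetails', 'builder', 'id')

  problems
end
```

In a Rails console this could be called as `provenance_statement_problems(build.provenance_statement.to_json)`; an empty array means none of the checks above failed, which is weaker than conformance to the full schema.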

Schema validation testing

Schema validations could be useful, particularly if we could ensure that a separate tool (such as codesign) is able to parse our provenance statement. At the moment, though, because of the issue linked above, this is not possible with complete provenance statements. Once we progress to Phase 3 of the epic, however, and are performing attestation (i.e. signing), we will be able to verify that the signature is correct using codesign.

Performance Analysis

At a high level, this code does not need to be high-performance: it is not called in any of the application's core loops, only occasionally when a provenance statement is generated. Additionally, the most CPU-intensive portions of attestation generation happen outside of this code. For example, SHA256 hashing of the archive file occurs when the archive is generated, and signature generation also happens elsewhere. Stackprof confirms this:

% bundle exec stackprof tmp/provenance_statement_spec.rb:54.dump --text --limit 20
==================================
  Mode: wall(1000)
  Samples: 16389 (12.43% miss rate)
  GC: 2519 (15.37%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
      2395  (14.6%)        1820  (11.1%)     Bootsnap::CompileCache::Native.fetch
      1711  (10.4%)        1711  (10.4%)     (marking)
     10255  (62.6%)        1415   (8.6%)     Kernel.require
      1280   (7.8%)        1280   (7.8%)     PG::Connection#exec
       889   (5.4%)         889   (5.4%)     Kernel#sleep

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #546150 (closed)

Edited by Sam Roque-Worcel
