[Rails] Generate SLSA provenance statement of job artifact archives
What does this MR do and why?
- What: This MR provides a simple mechanism for generating provenance statements from a Ci::Build object.
- Why: GitLab Runner can generate SLSA 1.0 provenance statements, and the provenance statement format is currently documented as the Provenance Metadata Format. As part of Phase 2: Generate provenance statement in control plane, we have to generate provenance data for the completed job in the Rails backend.
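For readers unfamiliar with the format, the following is a minimal sketch in plain Ruby of how an in-toto statement wrapping a SLSA provenance predicate could be assembled. It is not the implementation in this MR: `JobInfo` and `build_statement` are hypothetical names introduced for illustration, and the layout simply mirrors the sample output shown later in this description.

```ruby
require "json"

# Hypothetical stand-in for job data; this is NOT the real Ci::Build model.
JobInfo = Struct.new(:id, :artifact_name, :artifact_sha256,
                     :started_at, :finished_at, keyword_init: true)

# Hypothetical helper: wraps job data in an in-toto Statement v1 envelope
# with a SLSA v1 provenance predicate.
def build_statement(job, build_type:, builder_id:, external_parameters:)
  {
    "_type" => "https://in-toto.io/Statement/v1",
    "subject" => [
      { "name" => job.artifact_name,
        "digest" => { "sha256" => job.artifact_sha256 } }
    ],
    "predicateType" => "https://slsa.dev/provenance/v1",
    "predicate" => {
      "buildDefinition" => {
        "buildType" => build_type,
        "externalParameters" => external_parameters,
        "internalParameters" => { "job" => job.id }
      },
      "runDetails" => {
        "builder" => { "id" => builder_id },
        "metadata" => {
          "invocationID" => job.id,
          "startedOn" => job.started_at,
          "finishedOn" => job.finished_at
        }
      }
    }
  }
end

job = JobInfo.new(
  id: 412,
  artifact_name: "artifacts.zip",
  artifact_sha256: "717a1ee89f0a2829cf5aad57054c83615675b04baa913bdc19999d7519edf3f2",
  started_at: "2025-06-05T01:33:18Z",
  finished_at: "2025-06-05T01:33:23Z"
)

statement = build_statement(
  job,
  build_type: "https://gitlab.com/gitlab-org/gitlab-runner/-/blob/4d7093e1/PROVENANCE.md",
  builder_id: "http://gdk.test:3000/groups/root/-/runners/33",
  external_parameters: ["CI_JOB_ID"]
)
puts JSON.pretty_generate(statement)
```

Keeping the statement as a plain hash with string keys means it serializes with `to_json` without any custom encoding.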
References
- Spec: https://slsa.dev/spec/v1.1/provenance
- ADR: gitlab-com/content-sites/handbook!13413 (diffs, comment 2508568705)
- AD: #537049 (closed)
- Runner helper: https://docs.gitlab.com/ci/runners/configure_runners/#artifact-provenance-metadata
- CI/CD component: #538030 (comment 2535567071)
- PoC: #539007 (closed)
- PR1: !190882 (closed)
- PR2: !192851 (closed)
How to set up and validate locally
- Set up GDK with a runner as described here.
- Create a sample workflow that generates an artifact. Example below.
- Interact with the methods provided by the new implementation.
cat .gitlab-ci.yml
build-job:
  stage: build
  script:
    - echo "Hello, $GITLAB_USER_LOGIN!"
    - echo "Hello, $GITLAB_USER_LOGIN!" > test.txt
  artifacts:
    paths:
      - test.txt
Find the build and generate a provenance statement. The output below is pretty-printed for convenience.
> build = Ci::Build.find(<ID>)
> build.provenance_statement.to_json
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [
    {
      "name": "artifacts.zip",
      "digest": {
        "sha256": "717a1ee89f0a2829cf5aad57054c83615675b04baa913bdc19999d7519edf3f2"
      }
    }
  ],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {
    "buildDefinition": {
      "buildType": "https://gitlab.com/gitlab-org/gitlab-runner/-/blob/4d7093e1/PROVENANCE.md",
      "externalParameters": [
        "CI_PIPELINE_ID",
        "CI_PIPELINE_URL",
        "CI_JOB_ID",
        [...]
      ],
      "internalParameters": {
        "architecture": "arm64",
        "executor": "docker",
        "job": 412,
        "name": "9-mfdkBG"
      },
      "resolvedDependencies": [
        {
          "uri": "http://gdk.test:3000/root/kjhkjh",
          "digest": {
            "sha256": "a288201509dd9a85da4141e07522bad412938dbe"
          }
        }
      ]
    },
    "runDetails": {
      "builder": {
        "id": "http://gdk.test:3000/groups/root/-/runners/33",
        "version": {
          "gitlab-runner": "4d7093e1"
        }
      },
      "metadata": {
        "invocationID": 412,
        "startedOn": "2025-06-05T01:33:18Z",
        "finishedOn": "2025-06-05T01:33:23Z"
      }
    }
  }
}
This can be compared to the SLSA schema.
I've tried to get codesign to attest my statement, but while support for that is incoming, it is not supported yet. In the meantime, I've compared the above provenance statement to the schema, and it conforms.
Schema validation testing
Schema validation could be useful, particularly if we could ensure that a separate tool (such as codesign) is able to parse our provenance statement. At the moment, though, because of the issue linked above, this is not possible with complete provenance statements. Once we progress to Phase 3 of the epic and perform attestation (that is, signing), we will be able to verify that the signature is correct using codesign.
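Until an external tool can parse the statement, a lightweight structural check can catch obvious regressions. The sketch below is a presence check only, not full schema validation, and it is not part of this MR: `provenance_problems` is a hypothetical helper, and the list of fields it checks reflects my reading of the in-toto Statement v1 and SLSA v1 provenance required fields. It assumes the input is a parsed JSON object (a Hash with string keys).

```ruby
require "json"

# Hypothetical helper: returns a list of problems found in a parsed
# statement. An empty list means the basic shape looks right.
def provenance_problems(statement)
  problems = []

  unless statement["_type"] == "https://in-toto.io/Statement/v1"
    problems << "unexpected _type"
  end
  unless statement["predicateType"] == "https://slsa.dev/provenance/v1"
    problems << "unexpected predicateType"
  end

  subjects = statement["subject"]
  if !subjects.is_a?(Array) || subjects.empty?
    problems << "subject must be a non-empty array"
  else
    subjects.each_with_index do |subject, i|
      digest = subject.is_a?(Hash) ? subject["digest"] : nil
      problems << "subject[#{i}] has no digest" unless digest.is_a?(Hash) && !digest.empty?
    end
  end

  predicate = statement["predicate"]
  predicate = {} unless predicate.is_a?(Hash)

  # Fields treated as required inside buildDefinition in this sketch.
  %w[buildType externalParameters].each do |key|
    if predicate.dig("buildDefinition", key).nil?
      problems << "buildDefinition.#{key} missing"
    end
  end

  if predicate.dig("runDetails", "builder", "id").nil?
    problems << "runDetails.builder.id missing"
  end

  problems
end
```

In a console session this could be used as `provenance_problems(JSON.parse(build.provenance_statement.to_json))`, where `provenance_statement` is the method introduced by this MR and an empty array would be the expected result.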
Performance Analysis
At a high level, this code does not need to be highly performant: it is not called in any of the application's core loops, only occasionally when a provenance statement is generated. Additionally, the most CPU-intensive portions of attestation generation happen outside of this code. For example, SHA256 hashing of the archive file occurs when the archive is generated, and signatures are likewise generated outside of this code. Stackprof confirms this:
% bundle exec stackprof tmp/provenance_statement_spec.rb:54.dump --text --limit 20
==================================
Mode: wall(1000)
Samples: 16389 (12.43% miss rate)
GC: 2519 (15.37%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
      2395  (14.6%)        1820  (11.1%)     Bootsnap::CompileCache::Native.fetch
      1711  (10.4%)        1711  (10.4%)     (marking)
     10255  (62.6%)        1415   (8.6%)     Kernel.require
      1280   (7.8%)        1280   (7.8%)     PG::Connection#exec
       889   (5.4%)         889   (5.4%)     Kernel#sleep
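For context on the hashing cost mentioned above, here is a minimal sketch of how a `subject` entry with a SHA256 digest could be computed for a file on disk using Ruby's standard library. As noted, the MR does not hash the archive itself, so this is purely illustrative: `subject_for` is a hypothetical helper, not code from this change.

```ruby
require "digest"

# Hypothetical helper: builds one "subject" entry for a file on disk.
def subject_for(path)
  {
    "name" => File.basename(path),
    # Digest::SHA256.file reads the file in chunks rather than loading it
    # into memory at once, which matters for large archives.
    "digest" => { "sha256" => Digest::SHA256.file(path).hexdigest }
  }
end
```

Because the digest is already available once the archive exists, statement generation can reuse the stored value instead of re-reading the file, which is consistent with the profile above showing no hashing frames.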
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #546150 (closed)