Introduce support for hermetic jobs in GitLab CI

Problem Statement

Currently, GitLab CI jobs receive a significant amount of information injected automatically, making it challenging to create truly hermetic and deterministic builds. This includes:

  • Variables inherited from multiple levels (global, group, project, pipeline, job)
  • Artifacts from previous jobs automatically made available
  • Environment variables from the runner environment
  • Implicit dependencies that aren't explicitly declared

This automatic injection model, while convenient for rapid development, creates barriers for teams that need reproducible, secure, and compliant build processes.

What Are Hermetic Builds?

Hermetic builds are builds where:

  • All inputs are explicitly declared and controlled
  • There are no implicit dependencies and no implicit access to the environment
  • The same inputs always produce identical outputs
  • The build process is isolated from the host environment
  • All dependencies are versioned and immutable
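
As a rough illustration of "versioned and immutable" dependencies with today's GitLab CI keywords, a job can pin its base image by digest and refuse lockfile changes. This is only a sketch: `<digest>` is a placeholder to be replaced with the digest of an image you have audited, and `yarn build` assumes a hypothetical `build` script in package.json.

```yaml
build:
  # Pin by digest, not only by tag: a tag can be re-pushed, a digest cannot.
  # <digest> is a placeholder, not a real value.
  image: node:18.17.0@sha256:<digest>
  script:
    - yarn install --frozen-lockfile   # fail instead of silently updating yarn.lock
    - yarn build                       # hypothetical package.json script
```

Pinning alone does not make the job hermetic, since variables and artifacts are still injected implicitly, but it removes one common source of drift.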

Benefits of Hermetic Builds

SLSA Compliance

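  • Supply Chain Levels for Software Artifacts (SLSA) v0.1 listed hermetic builds as a Level 4 requirement; SLSA v1.0 no longer requires hermeticity, but its Build Level 3 requires isolated builds, which hermetic jobs help achieve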
  • Enables generation of provenance attestations with complete build input tracking
  • Supports software supply chain security initiatives and compliance requirements
  • Critical for organizations pursuing higher SLSA build levels

Improved Caching

  • Precise cache keys: When all inputs are explicit, cache keys can be computed accurately
  • Cache hit optimization: Better cache reuse across different environments and time periods
  • Reduced build times: More effective caching leads to faster CI/CD pipelines
  • Storage efficiency: Avoid cache pollution from implicit dependencies
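
For comparison, existing GitLab CI can already derive a cache key from a small set of declared files through `cache:key:files`. The sketch below assumes a Node.js project with `package.json` and `yarn.lock`. To the best of our understanding of the documentation, the key is computed from the most recent commits that changed the listed files, so it covers only those files and not the other inputs a job consumes.

```yaml
install:
  image: node:18.17.0
  cache:
    key:
      files:            # key changes only when one of these files changes
        - package.json
        - yarn.lock
    paths:
      - node_modules/
  script:
    - yarn install --frozen-lockfile
```

A hermetic job could extend the same idea to every declared input (variables, artifacts, and files), which is what would make the key precise.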

Enhanced Reproducibility

  • Bit-for-bit reproducibility: Same inputs produce identical outputs, provided the build tools themselves are deterministic (for example, no embedded timestamps)
  • Cross-environment consistency: Builds work the same locally, in CI, and in production
  • Time-independent builds: Builds from months ago can be reproduced exactly
  • Debugging advantages: Easier to isolate and reproduce build issues

Security Benefits

  • Reduced attack surface: No implicit access to environment variables or artifacts
  • Dependency transparency: All dependencies are explicitly declared and auditable
  • Supply chain integrity: Prevents injection of malicious dependencies
  • Compliance readiness: Meets requirements for SOC 2, ISO 27001, and other security standards

Current Challenges in GitLab CI

Automatic Variable Injection

# Current behavior - variables from many sources are automatically available
build:
  script: echo $SOME_VARIABLE # Could come from runner, project, group, etc.
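
GitLab CI offers a partial mitigation today through the existing `inherit` keyword. A minimal sketch follows: it stops a job from inheriting the top-level `default` settings and the top-level YAML `variables`, but it does not affect project or group CI/CD variables, predefined variables, or the runner's own environment, so the job is still not hermetic. `SOME_VARIABLE` is redeclared here purely for illustration.

```yaml
variables:
  SOME_VARIABLE: "from the top-level variables block"

build:
  inherit:
    default: false     # ignore top-level `default:` settings
    variables: false   # ignore top-level YAML `variables:`
  variables:
    SOME_VARIABLE: "declared on the job"   # the only YAML-level value this job sees
  script:
    - echo "$SOME_VARIABLE"
```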

Implicit Artifact Dependencies

# Current behavior - artifacts from previous jobs are automatically downloaded
test:
  needs: [build]
  script: ./built_binary # Available without explicit declaration
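
Artifact downloads can already be narrowed per job with existing keywords. The sketch below assumes a `build` job like the one above; the `lint` and `package` jobs and their scripts are hypothetical.

```yaml
lint:
  needs:
    - job: build
      artifacts: false   # wait for build, but download none of its artifacts
  script:
    - ./run_lint.sh      # hypothetical script

package:
  stage: deploy
  dependencies: []       # download no artifacts from earlier stages
  script:
    - ./package.sh       # hypothetical script
```

Both are opt-outs: the default remains implicit download, which is the behavior a hermetic mode would invert.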

Proposed Solution (TBD)

The solution below builds on existing and related features that are under consideration.

Strict Job Mode

Introduce a new strict: true option for individual jobs, plus a project setting that makes all jobs hermetic by default.

build:
  strict: true
  inputs: # define job inputs
    node_version:
      default: "18.17.0"
    build_type:
      default: "release"
    files:
      type: files # Checksums of these files can be calculated to detect whether the same set of inputs was provided.
      default: ["src/**/*", "Dockerfile"]
  run:
    - name: "Checkout code"
      step: components/repo/checkout@v1
    - name: "Reproduce job from cache"
      step: components/cache/reproduce@v1 # If none of the inputs have changed, reuse cached outputs.
      inputs:
        job_inputs: $[[job.inputs]]
    - name: "Download artifacts"
      step: components/artifacts/download@v1
      inputs:
        files:
          - job: prepare
            paths: ["package.json", "yarn.lock"]
    - name: "Docker build"
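      # BUILD_TYPE and NODE_VERSION are illustrative build-arg names.
      exec: docker build --build-arg BUILD_TYPE=$[[job.inputs.build_type]] --build-arg NODE_VERSION=$[[job.inputs.node_version]] .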

Key Features

  1. Explicit Input Declaration
    • All inputs/variables must be explicitly declared in inputs.
    • Only specified artifacts are made available
    • File dependencies explicitly listed
  2. Environment Isolation (provided by CI Functions)
    • No runner environment variables injected
    • Restricted network access (allowlist-based)
    • Isolated file system access
  3. Provenance Generation
    • Automatic SLSA provenance attestation generation
    • Complete input tracking and recording
    • Cryptographic signing of build outputs

Implementation Considerations

Backward Compatibility

  • New feature should be opt-in via strict: true
  • Existing pipelines continue to work unchanged
  • Gradual migration path for teams

Performance Impact

  • Input validation and isolation may add overhead
  • Caching improvements should offset performance costs
  • Hermetic jobs declare every input, which may make it easier to run independent jobs in parallel

Integration Points

  • Container registry integration for reproducible base images
  • Artifact storage with content addressing
  • Integration with GitLab's dependency proxy

Use Cases

Financial Services

  • Regulatory compliance requirements
  • Audit trail for all build inputs
  • Reproducible builds for incident investigation

Open Source Projects

  • SLSA compliance for package repositories
  • Reproducible releases for security verification
  • Supply chain transparency

Enterprise Security

  • Zero-trust build environments
  • Compliance with internal security policies
  • Reduced blast radius of compromised dependencies

Success Metrics

  • Number of projects adopting hermetic builds
  • Reduction in build time variance
  • Improvement in cache hit rates
  • SLSA compliance adoption
  • Security incident reduction

Other references

https://gitlab.com/gitlab-org/gitlab/-/issues/498467

Edited by Fabio Pitino