DS - Dependency Resolution MVP (#20461) · Epics · GitLab.org

DS - Dependency Resolution MVP

**What is Dependency Resolution (formerly called "build support")?**

In the context of Dependency Scanning, what we used to call "build support" refers to the means of obtaining the project’s dependency list or graph. This does not necessarily require building the project exactly as the user would to release or deploy it. However, producing the dependency list often has similar requirements: a working build environment that can execute package manager, build tool, or 3rd party tool commands. For instance, being able to run `mvn dependency:tree` to build a Maven project's dependency list. Going forward, we will call this functionality **dependency resolution** to avoid confusion and better indicate the goal.

### Problem to solve

GitLab's dependency scanning relies on lockfiles or graphfiles as the entry point for accurate dependency detection and vulnerability analysis. However, approximately 50% of projects require some form of dependency resolution mechanism to generate these files. Whether these files are committed to the repository depends on the ecosystem and project practices.

Unlike the legacy Gemnasium analyzer, which supports that capability, the original Dependency Scanning analyzer delegated this responsibility to users, expecting them to generate lockfiles/graphfiles in preceding CI jobs. While this approach offers flexibility, it presents some challenges:

- **User Experience Gaps**: Many users expect dependency scanning to work out of the box without manual CI configuration. The requirement to set up custom build jobs creates friction for adoption and raises the barrier to entry for security scanning.
- **Limitations to enablement at scale**: [Scan Execution Policies](https://docs.gitlab.com/ee/user/application_security/policies/scan_execution_policies.html) enforce security scanning across projects.
However, without this necessary dependency resolution step, projects without pre-existing lockfiles or ad-hoc customization could not benefit from dependency scanning analysis.

### Product decisions

Leaning on the product principle "accuracy is a dial", we want to offer a multi-tiered approach to our coverage. The intent is to deliver a feature that works out of the box and always provides minimum scan results, even if these can be somewhat inaccurate and incomplete. It's a starting point for users to onboard on Dependency Scanning, and from there they have the opportunity to fine-tune and configure the scanner to increase results quality.

This will take the following form:

- If a lockfile/graphfile is present, the dependency scanning analyzer will consume it directly.
- Otherwise, it will attempt an automatic dependency resolution if the project requires it (Maven, Python's `requirements.txt`, Gradle, etc.).
- Finally, if that dependency resolution attempt has failed, the analyzer will fall back to parsing the dependency manifest file (e.g. `pom.xml`).

**Manifest Parsing** capability will be introduced separately and work is tracked in a dedicated child epic: https://gitlab.com/groups/gitlab-org/-/work_items/20457. We will start with a spike to evaluate the feasibility and set clear expectations on this approach: https://gitlab.com/gitlab-org/gitlab/-/work_items/584568

### Dependency Resolution solution

After re-evaluating the needs and the technical solutions, it appears viable to offer an automatic dependency resolution similar to the build support provided by the Gemnasium analyzer. However, following the lessons learned with Gemnasium, we want to drastically reduce the maintenance overhead of such functionality, making it more sustainable for our team. Fortunately, the spike has shown the opportunity to greatly simplify the dependency resolution for Maven, Python (`requirements.txt`, `Pipfile`, `setup.py`), and maybe Gradle projects too.
Again, leaning on the `accuracy is a dial` principle, we can provide a much simpler solution by focusing on a working solution for the major use cases and deferring unsupported use cases to either the manifest parsing fallback or manual lockfile/graphfile generation.

| Technology | Image | Resolution Command | Output |
|------------|-------|-------------------|--------|
| Maven | `ubi9/openjdk-21` (pristine) | `mvn dependency:tree` | `maven.graph.json` |
| Gradle | `ubi9/openjdk-17` (modified to include Gradle 8) | `gradle dependencies` | `gradle.graph.txt` |
| Python | `ubi9/python312` (modified to include pip-tools 7) | `pip-compile` | `pipcompile.lock.txt` |

**Key simplifications**:

- **Maven**: Use a single Java version (21 LTS) with the built-in `mvn dependency:tree` command, avoiding the complexity of supporting multiple Java versions.
- **Python**: Use a single Python version (3.12) with `pip-tools` 7. This command line tool handles multiple Python project formats (`requirements.in`, `setup.py`, `pyproject.toml`, etc.) with a single implementation.
- **Gradle**: Use a single Java version (17 LTS) with Gradle 8. Use the native `gradle dependencies` command, prioritizing the Gradle wrapper if present.
- **SBT**: No automatic dependency resolution support initially; users must provide a `dependencies-compile.dot` file (`sbt dependencyDot`) or rely on the manifest parsing fallback.
- **Go**: No automatic dependency resolution support initially; users must provide a `go.graph` file (`go mod graph > go.graph`) or rely on the manifest parsing fallback.
- The exact scope of coverage for each technology will be further adjusted in the implementation issues as necessary.

#### Implementation

The team has explored multiple options to integrate this dependency resolution step in the existing CI workflow. None of them was a clear winner and they all come with advantages and downsides.
However, the advanced solutions, which offer more flexibility and capabilities, often come at a much higher cost and complexity, and with increased risk.

We eventually settled on using **Preceding "build" job(s)**. As we dived into the discussions and refined the solution, we determined that running a bare script for the detection logic would not be sustainable in the long term. This is when the idea came up to use the DS analyzer image as a service to the resolution job. This brings in the existing detection logic of the DS analyzer while still running native build tool commands in their respective environment (vanilla image).

#### Overview

We implement automatic dependency resolution using _preceding resolution jobs_ that run ecosystem-native tools in vanilla (or close to vanilla) images, where the DS analyzer runs as a CI/CD [**service**](https://docs.gitlab.com/ci/services/). The service container is responsible for detecting compatible manifests (using the mounted `CI_PROJECT_DIR`) and generating **tailored native tool instructions** that the main job's container then executes to produce the relevant lockfiles/graphfiles. These files are exported as job artifacts and consumed by the regular `dependency-scanning` job that runs in a following stage of the same pipeline.

#### Architecture

```
┌─────────────────────────────────────────────────────────────┐
│ resolve-maven-dependencies job                              │
│ image: maven:3.9-eclipse-temurin-21 (vanilla)               │
│ stage: build                                                │
│ services:                                                   │
│   - ds-analyzer (writes script to $CI_PROJECT_DIR/.ds/)     │
│                                                             │
│ script:                                                     │
│   - while [ ! -f .ds/resolve.sh ]; do sleep 1; done         │
│   - cat .ds/resolve.sh  # Transparency                      │
│   - sh .ds/resolve.sh                                       │
│       └── Runs mvn dependency:tree                          │
│       └── Generates maven.graph.json                        │
│                                                             │
│ artifacts:                                                  │
│   - "**/maven.graph.json"                                   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│ dependency-scanning job (SSOT for the DS analysis)          │
│ image: $DS_ANALYZER_IMAGE                                   │
│ stage: test                                                 │
│                                                             │
│ script:                                                     │
│   - /analyzer run                                           │
│       └── Consumes all lockfiles/graphfiles                 │
│       └── Performs vulnerability scan                       │
│       └── Runs static reachability (semgrep)                │
│       └── Generates single report                           │
│                                                             │
│ artifacts:                                                  │
│   - gl-dependency-scanning-report.json                      │
│   - gl-sbom-*.cdx.json                                      │
└─────────────────────────────────────────────────────────────┘
```

#### Example CI Template

```yaml
spec:
  inputs:
    # ... other inputs ...
    # Input name matches the `inputs.enabled_dependency_resolution` references in the rules below.
    enabled_dependency_resolution:
      type: string
      default: "maven,gradle,python"
      description: "Comma-separated list of technologies for automatic dependency resolution. Set to empty to disable all."
---
# Shared base: the DS analyzer runs as a service and writes .ds/resolve.sh,
# which the job's main container waits for, prints, and then executes.
.resolve-dependencies-base:
  services:
    - name: $DS_ANALYZER_IMAGE
      alias: ds-analyzer
      command: ["/analyzer", "detect-and-write-scripts"]
  stage: build
  allow_failure: true
  script:
    - while [ ! -f .ds/resolve.sh ]; do sleep 1; done
    - cat .ds/resolve.sh
    - sh .ds/resolve.sh

resolve-maven-dependencies:
  extends: .resolve-dependencies-base
  image: maven:3.9-eclipse-temurin-21
  artifacts:
    paths: ["**/maven.graph.json"]
  rules:
    - if: $[[ inputs.enabled_dependency_resolution ]] !~ /maven/
      when: never
    - exists: ['**/pom.xml']

resolve-gradle-dependencies:
  extends: .resolve-dependencies-base
  image: gradle:8.5-jdk21
  artifacts:
    paths: ["**/gradle.graph.json"]
  rules:
    - if: $[[ inputs.enabled_dependency_resolution ]] !~ /gradle/
      when: never
    - exists: ['**/build.gradle', '**/build.gradle.kts']

resolve-python-dependencies:
  extends: .resolve-dependencies-base
  image: ghcr.io/astral-sh/uv:python3.12-bookworm
  artifacts:
    paths: ["**/uv.lock", "**/requirements.txt"]
  rules:
    - if: $[[ inputs.enabled_dependency_resolution ]] !~ /python/
      when: never
    - exists: ['**/requirements.in', '**/pyproject.toml', '**/Pipfile']

# Single job that consumes every generated lockfile/graphfile and produces the reports.
dependency-scanning:
  image: $DS_ANALYZER_IMAGE
  stage: test
  script:
    - /analyzer run
  artifacts:
    paths:
      - gl-dependency-scanning-report.json
      - "**/gl-sbom-*.cdx.json"
    reports:
      dependency_scanning: gl-dependency-scanning-report.json
      cyclonedx: "**/gl-sbom-*.cdx.json"
```

#### Benefits

- **Single source of truth**: One `dependency-scanning` job performs the DS analysis and produces the SBOM and DS reports
- **Minimal image requirements**: Resolution jobs only need a POSIX shell and the build tool
- **Clear separation of concerns**: Resolution jobs generate files, DS job analyzes them
- **Backward compatible**: Matches documented flow (build job → DS job)
- **3rd party SBOM ready**: Clear destination for custom SBOM processing (DS job)
- **Transparent debugging**: `cat .ds/resolve.sh` shows exact build commands
- **Full customization**: Users can override the resolution job's script entirely
- **Graceful empty runs**: If `rules:exists` yields a false positive, the job exits 0 with no downstream impact
- **Selective disable**: Users can disable specific resolution jobs via
`enabled_dependency_resolution` input
- **Predictable timing model**: Service waits once at start for checkout, then all execution is in main container

---

Several details will be further refined in the implementation issues, and an architecture design document is being written to reflect the decision: https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223
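
The three-tier approach described under "Product decisions" (consume an existing lockfile/graphfile, otherwise attempt automatic resolution, otherwise fall back to manifest parsing) can be sketched roughly as follows. This is an illustrative sketch only, not the analyzer's implementation: the `Source` enum, the `pick_source` function, and the `resolve` callable are hypothetical names, and only the Maven file names (`maven.graph.json`, `pom.xml`) come from this epic.

```python
from enum import Enum
from pathlib import Path
from typing import Callable, Optional


class Source(Enum):
    LOCKFILE = "lockfile"      # tier 1: a lockfile/graphfile is already present
    RESOLUTION = "resolution"  # tier 2: automatic dependency resolution succeeded
    MANIFEST = "manifest"      # tier 3: fall back to parsing the manifest


# File names taken from the epic, Maven only, for illustration.
GRAPH_FILE = "maven.graph.json"
MANIFEST_FILE = "pom.xml"


def pick_source(project_dir: Path,
                resolve: Callable[[Path], bool]) -> Optional[Source]:
    """Return which tier provides the dependency list, or None if no Maven files exist.

    `resolve` is an assumed callable that attempts dependency resolution and
    returns True when it produced a graph file.
    """
    if (project_dir / GRAPH_FILE).exists():
        return Source.LOCKFILE
    if not (project_dir / MANIFEST_FILE).exists():
        return None
    if resolve(project_dir):
        return Source.RESOLUTION
    return Source.MANIFEST
```

The ordering reflects the "accuracy is a dial" idea: each later tier is only reached when the more accurate one before it is unavailable.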

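
To make the job selection in the example CI template easier to reason about, here is a small sketch of which resolution jobs would be expected to run for a given `enabled_dependency_resolution` value and set of repository files. It is an approximation under stated assumptions: `jobs_to_run` and `MANIFEST_NAMES` are hypothetical names, the manifest names are copied from the template's `rules:exists` entries, and the sketch matches exact comma-separated tokens whereas the template uses a regex (substring) match.

```python
# Manifest file names copied from the `rules:exists` entries of the example template.
MANIFEST_NAMES = {
    "maven": ["pom.xml"],
    "gradle": ["build.gradle", "build.gradle.kts"],
    "python": ["requirements.in", "pyproject.toml", "Pipfile"],
}


def jobs_to_run(enabled: str, repo_files: list) -> list:
    """Return the resolution job names expected to be added to the pipeline.

    `enabled` mirrors the comma-separated input value; `repo_files` is a list
    of repository-relative paths using "/" as separator.
    """
    enabled_set = {item.strip() for item in enabled.split(",") if item.strip()}
    jobs = []
    for tech, names in MANIFEST_NAMES.items():
        if tech not in enabled_set:
            continue  # technology disabled through the input
        # Equivalent of a '**/<name>' glob: match on the file's base name.
        if any(path.split("/")[-1] in names for path in repo_files):
            jobs.append(f"resolve-{tech}-dependencies")
    return jobs
```

For example, with the default input value and a repository containing `pom.xml` and `svc/build.gradle.kts`, the sketch selects the Maven and Gradle jobs; with an empty input value it selects none.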