
Link to correct input file when analyzed artifact is not in repo

Proposal

The Dependency Scanning analyzer requires a project with a supported package manager to supply the correct lockfile for analysis. Some package managers don't have a lockfile out of the box and require a plugin to generate one. In addition, some user projects don't follow the convention of committing their lockfiles to their repository.

After analysis, the analyzer generates an SBOM with a gitlab:dependency_scanning:input_file property set to the analyzed file (example). The Dependency List and Vulnerability Report use this property to report the source of each dependency. For lock and graph files generated during pipeline execution (pipdeptree.json, pipenv.graph.json, etc.), the resulting links will be broken because those files are not in the repository. Several proposals have been discussed in Dependency and vulnerability lists should link ... (#519626), with a solution in the monolith being favoured.

However, the monolith approach may be more involved and slower (issues include properly representing scan evidence, creating reliable mapping logic between lockfiles and requirements files, and velocity of creating and updating this feature). Initially the monolith was proposed because it seems like the natural place for a feature more oriented around UI/UX. However, as more user feedback is received on usability, doing this in the Dependency Scanning analyzer looks more attractive (example internal request for help issue).

Making this change in the analyzer puts the file "evidence" logic closer to where analysis happens, where more information is available. Reviews and releases also move a lot faster there, which lets the team iterate more quickly on updated mappings and logic changes.

It also doesn't preclude a future iteration to move this functionality to the monolith.

Solution

The analyzer needs to be updated to know whether the analyzed file is checked in to the repo. If the file is not checked in, the analyzer looks up the file against a mapping of possible "build" files. If these exist in the repo, they are reported in the gitlab:dependency_scanning:input_file property.

This change would happen before the SBOM is created. The analyzer would implement an InputFileResolver component that:

  • Clones the repository using the commit SHA from CI_COMMIT_SHA
  • Checks if the analyzed file (e.g. pipdeptree.json) is committed to the repo
  • When not committed, looks up a list of possible input files from a predefined mapping for the PackageManager of the resolved Project
  • Returns the first available input file that exists in the repository
  • Uses this result to set the correct file in the SBOM gitlab:dependency_scanning:input_file property
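
The resolver flow above could be sketched roughly as follows. This is a minimal illustration, not the analyzer's actual code: the TrackedFunc seam, the Candidates field, the "pip" package manager key, and the choice to fall back to the analyzed file when no candidate exists are all assumptions made for the example. Cloning the repository and writing the SBOM property are left out.

```go
package main

import "fmt"

// TrackedFunc reports whether a path is committed to the repository.
// Hypothetical seam: a real implementation might query the git tree.
type TrackedFunc func(path string) bool

// InputFileResolver is a sketch of the component named in the proposal.
// Field names are illustrative only.
type InputFileResolver struct {
	IsTracked TrackedFunc
	// Candidates maps a package manager name to possible input files,
	// listed in priority order.
	Candidates map[string][]string
}

// Resolve returns the path to report in the input_file property.
func (r InputFileResolver) Resolve(packageManager, analyzedFile string) string {
	// A committed analyzed file is reported unchanged.
	if r.IsTracked(analyzedFile) {
		return analyzedFile
	}
	// Otherwise report the first candidate that exists in the repository.
	for _, candidate := range r.Candidates[packageManager] {
		if r.IsTracked(candidate) {
			return candidate
		}
	}
	// Assumption: with no match, keep the analyzed file rather than fail.
	return analyzedFile
}

func main() {
	tracked := map[string]bool{"requirements.txt": true}
	r := InputFileResolver{
		IsTracked:  func(p string) bool { return tracked[p] },
		Candidates: map[string][]string{"pip": {"requirements.txt", "requirements.prod.txt"}},
	}
	fmt.Println(r.Resolve("pip", "pipdeptree.json")) // prints "requirements.txt"
}
```

Injecting the existence check as a function keeps the lookup logic testable without a real checkout.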

The initial implementation will use a hard-coded mapping of common patterns:

dependencies.lock → build.gradle, pom.xml, package.json
pipdeptree.json → requirements.txt, requirements.prod.txt  
package-lock.json → package.json, yarn.lock
Cargo.lock → Cargo.toml
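
In Go, the mapping above might be held as a map keyed by the analyzed file's base name. The sketch below also assumes that candidates should be looked for in the same directory as the analyzed file, which matters for monorepos; that rule, and the names defaultInputFiles and candidatesFor, are hypothetical and not taken from the proposal.

```go
package main

import (
	"fmt"
	"path"
)

// defaultInputFiles mirrors the hard-coded mapping listed above.
// Values are in priority order.
var defaultInputFiles = map[string][]string{
	"dependencies.lock": {"build.gradle", "pom.xml", "package.json"},
	"pipdeptree.json":   {"requirements.txt", "requirements.prod.txt"},
	"package-lock.json": {"package.json", "yarn.lock"},
	"Cargo.lock":        {"Cargo.toml"},
}

// candidatesFor returns candidate input file paths for an analyzed file.
// Assumption: candidates live in the same directory as the analyzed file.
// Paths are slash-separated and relative to the repository root.
func candidatesFor(analyzedFile string) []string {
	dir, base := path.Split(analyzedFile)
	names := defaultInputFiles[base]
	out := make([]string, 0, len(names))
	for _, name := range names {
		out = append(out, path.Join(dir, name))
	}
	return out
}

func main() {
	for _, c := range candidatesFor("services/api/pipdeptree.json") {
		fmt.Println(c)
	}
}
```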

User supplied mapping

Some users may have custom mappings (e.g. requirements.prod.txt). They could be allowed to customize the mapping via CLI arguments or a mapping YAML file:

file_mapping:
  pipdeptree.json:
    - requirements.txt
    - requirements.prod.txt
  dependencies.lock:
    - build.gradle
    - build.gradle.kts

This should be validated and refined in a follow-up issue.
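
Pending that follow-up, one possible way to combine a user-supplied mapping with the built-in one is sketched below. It assumes that a user's list replaces the default list for the same key and that empty user lists are ignored; neither rule is decided in this proposal. The names FileMapping and mergeMappings are hypothetical, and parsing the YAML or CLI input into the map is left out.

```go
package main

import "fmt"

// FileMapping maps an analyzed file name to candidate input files,
// in priority order.
type FileMapping map[string][]string

// mergeMappings returns a new mapping containing the defaults with the
// user's entries applied on top. Neither input is modified.
// Assumption: a user entry replaces the default entry for the same key.
func mergeMappings(defaults, user FileMapping) FileMapping {
	merged := make(FileMapping, len(defaults)+len(user))
	for key, files := range defaults {
		merged[key] = append([]string(nil), files...)
	}
	for key, files := range user {
		if len(files) == 0 {
			continue // assumption: an empty list does not erase a default
		}
		merged[key] = append([]string(nil), files...)
	}
	return merged
}

func main() {
	defaults := FileMapping{
		"pipdeptree.json": {"requirements.txt", "requirements.prod.txt"},
		"Cargo.lock":      {"Cargo.toml"},
	}
	user := FileMapping{
		"dependencies.lock": {"build.gradle", "build.gradle.kts"},
	}
	fmt.Println(mergeMappings(defaults, user))
}
```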

Implementation plan

  1. Create InputFileResolver component in the dependency scanning analyzer
  2. Add file existence checking against the git tree using git ls-files and specifying the analyzed file
  3. Implement fallback logic with hard-coded mapping for known dependency file patterns (e.g. dependencies.lock -> [build.gradle, build.gradle.kts])
  4. Integrate with SBOM generation to set correct gitlab:dependency_scanning:input_file property
  5. Update main.go to utilize new functionality to override result.Path when alternative is available
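
As an illustration of step 2, the existence check could shell out to git ls-files roughly like this. The sketch assumes git is on PATH and that repoDir is the root of a checkout; the gitRunner seam and the function names are hypothetical. The -z flag makes git print NUL-terminated, unquoted paths, so the output can be compared to the requested path exactly.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"strings"
)

// gitRunner runs git with args inside repoDir and returns its stdout.
// It is injectable so the logic can be exercised without a repository.
type gitRunner func(repoDir string, args ...string) ([]byte, error)

// execGit is the real runner; it assumes a git binary is available.
func execGit(repoDir string, args ...string) ([]byte, error) {
	cmd := exec.Command("git", args...)
	cmd.Dir = repoDir
	return cmd.Output()
}

// isTracked reports whether file is in the git index of the repo at repoDir.
// file is expected to be relative to repoDir.
func isTracked(run gitRunner, repoDir, file string) (bool, error) {
	// Normalize "./a/b" to "a/b" so it matches git's output format.
	file = filepath.ToSlash(filepath.Clean(file))
	out, err := run(repoDir, "ls-files", "-z", "--", file)
	if err != nil {
		return false, fmt.Errorf("git ls-files %q: %w", file, err)
	}
	// An untracked path yields empty output; a directory would list its
	// files, so require an exact match on the path itself.
	for _, entry := range strings.Split(string(out), "\x00") {
		if entry == file {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	tracked, err := isTracked(execGit, ".", "requirements.txt")
	if err != nil {
		fmt.Println("could not query git:", err)
		return
	}
	fmt.Println("requirements.txt tracked:", tracked)
}
```

Note that git treats the argument as a pathspec, so file names containing glob characters would need extra care in a real implementation.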

/cc @duncan_harris @hacks4oats

Edited by Igor Frenkel