Discovery, Show dependency path in the dependency list and vulnerabilities

Problem to solve

When Dependency Scanning reports a vulnerability affecting a transient (AKA transitive) dependency, users don't know how this vulnerable transient dependency relates to the dependencies they have declared in the project dependency file. This makes it difficult for them to establish whether there's a threat, or whether the vulnerability can be safely ignored. It also makes it difficult to figure out what actions are required to upgrade or remove the vulnerable dependency in order to address the vulnerability.

Intended users

Proposal

design a UX to communicate the dependency path(s) of a vulnerable project dependency, and figure out which dependency path(s) are best to show, considering the UX and technical constraints
identify the data structure we need to build to extract the path, and find libraries that can be used to do that
create a PoC to prove we can generate these paths in the Gemnasium analyzer, at a reasonable cost
find out the best way to communicate the dependency paths to the frontend, and how to export the dependencies as a job artifact
establish a list of supported package managers, and identify possible limitations

Latest iteration

early iteration I

📽 video review

Links / references

/cc @NicoleSchwartz @gonzoyumo @kmann @annabeldunstone

Discovery conclusion

Proposed MVC to display path on dependency page
Display path in merge request #219095 (closed)
Display path on object page: #219093 (closed)
UX testing for future iterations: ux-research#913 (closed)
Discovery issue looking at showing multiple paths and improving data view add issue

Suggested backend issues:

Extend the Dependency Scanning report format to add a vulnerability path to a transient dependency
Make Gemnasium generate dependency paths to vulnerable dependencies; this covers all package managers and files Dependency Scanning currently supports, except go.sum (the lock file for projects using Go modules); see discussion
Make the backend parse the dependency paths, and present them (to the frontend)

frontend issue: #227326 (closed)

The following detailed notes can be used to create these issues, or their parent epic:

Detailed notes

Problem to solve

(same as #198034 (closed))

Further details

Dependency Scanning parses and scans the lock file the package manager generates automatically. The lock file lists all the project dependencies, including the transient dependencies, that is, the dependencies of the dependencies. Users don't manually edit the lock file. Instead, they edit the main dependency file, and declare the top-level dependencies the project explicitly uses. It's the package manager's responsibility to build the full dependency list, and store that list in a lock file.

Most lock files give the exact relationship between the dependencies, but this information is currently ignored when parsing the files, in the Gemnasium analyzer project.

Proposal

In the Dependency List, show a path between a vulnerable, transient dependency and a top-level dependency. The UI shows only one path, but it makes it clear that there might be other paths leading to that dependency.

In order to show the dependency path in the UI, the Gemnasium analyzer builds a dependency graph, and leverages this graph to calculate paths to the vulnerable dependencies; these paths are added to the Dependency Scanning report. The addition to the Dependency Scanning report format is to be defined.

Technically, Gemnasium builds and queries the dependency graph using a Go library like gonum/path. See Proof of Concept: gitlab-org/security-products/analyzers/gemnasium!81 (closed)
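As a rough illustration of the path calculation, here is a minimal sketch that finds one shortest path from a top-level dependency to a vulnerable dependency using a plain breadth-first search. It deliberately uses only the Go standard library rather than gonum/path, and the names Graph and ShortestPath, as well as the sample package names, are hypothetical and not taken from the Gemnasium code base.

```go
package main

import "fmt"

// Graph maps a package name to the names of its direct dependencies.
// This adjacency-list shape is an assumption made for the sketch.
type Graph map[string][]string

// ShortestPath returns one shortest path from any top-level dependency
// to target, or nil when target cannot be reached.
func ShortestPath(g Graph, topLevel []string, target string) []string {
	parent := map[string]string{} // child -> node it was first reached from
	visited := map[string]bool{}
	queue := []string{}
	for _, t := range topLevel {
		if !visited[t] {
			visited[t] = true
			queue = append(queue, t)
		}
	}
	for len(queue) > 0 {
		node := queue[0]
		queue = queue[1:]
		if node == target {
			// Walk back to the top-level dependency, then reverse.
			path := []string{node}
			for {
				p, ok := parent[node]
				if !ok {
					break
				}
				path = append(path, p)
				node = p
			}
			for i, j := 0, len(path)-1; i < j; i, j = i+1, j-1 {
				path[i], path[j] = path[j], path[i]
			}
			return path
		}
		for _, dep := range g[node] {
			if !visited[dep] {
				visited[dep] = true
				parent[dep] = node
				queue = append(queue, dep)
			}
		}
	}
	return nil
}

func main() {
	g := Graph{
		"rails":      {"actionpack", "activesupport"},
		"actionpack": {"rack"},
		"puma":       {"nio4r"},
	}
	fmt.Println(ShortestPath(g, []string{"rails", "puma"}, "rack"))
	// prints: [rails actionpack rack]
}
```

Because the search stops at the first hit, it returns a single path even when several exist, which matches the proposal of showing only one path in the UI.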

Supported languages and package managers

See supported languages and package managers in Dependency Scanning documentation, and related discussion.

The dependency graph can be built from the lock file when the lock file is in the project repository and is supported, as is the case for Bundler, npm, yarn, and PHP Composer. Go modules are the exception.

When there's no lock file, or when it's not supported, the graph can be built from the output of the command Gemnasium uses to list the transient dependencies, like pipdeptree or the Gemnasium plugins for Java. This approach makes it possible to support pip, Pipenv, Gradle, Maven, and sbt.
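To make the command-output approach concrete, here is a minimal sketch that turns JSON output from pipdeptree into an adjacency list. It assumes the output is an array of objects with a "package" object and a "dependencies" array, each carrying a "key" field; that shape should be checked against the pipdeptree version actually in use. The names pipdeptreeEntry and GraphFromPipdeptree are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pipdeptreeEntry mirrors the ASSUMED shape of one element of the
// JSON array printed by pipdeptree; only the fields used here are mapped.
type pipdeptreeEntry struct {
	Package struct {
		Key string `json:"key"`
	} `json:"package"`
	Dependencies []struct {
		Key string `json:"key"`
	} `json:"dependencies"`
}

// GraphFromPipdeptree builds a map from each package to its direct dependencies.
func GraphFromPipdeptree(data []byte) (map[string][]string, error) {
	var entries []pipdeptreeEntry
	if err := json.Unmarshal(data, &entries); err != nil {
		return nil, fmt.Errorf("parsing pipdeptree output: %w", err)
	}
	graph := make(map[string][]string, len(entries))
	for _, e := range entries {
		deps := make([]string, 0, len(e.Dependencies))
		for _, d := range e.Dependencies {
			deps = append(deps, d.Key)
		}
		graph[e.Package.Key] = deps
	}
	return graph, nil
}

func main() {
	// Hand-written sample, not captured from a real run.
	sample := []byte(`[
	  {"package": {"key": "flask"}, "dependencies": [{"key": "jinja2"}]},
	  {"package": {"key": "jinja2"}, "dependencies": []}
	]`)
	graph, err := GraphFromPipdeptree(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(graph["flask"])
}
```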

There's no dependency graph for Go projects because go.sum doesn't provide the information needed to build the graph. All other package managers are supported. This limitation should be documented.

Also, if we implement lock file support for pip or Pipenv, these lock files would not provide the information needed to build the dependency graph.

In the future, we'll be able to build a dependency graph from poetry.lock, the lock file Poetry generates.

Next steps

The UI might show other paths, in a modal view for instance. See discussion.
Gemnasium will favor a dependency path that connects the transient dependency to a runtime top-level dependency. These paths are more important than the ones that connect to development dependencies. See discussion.
The dependency path is shown for all components listed in the Dependency List, and not only for the affected one. For this to be implemented, it's necessary to export the full dependency graph.
The full dependency graph is exported as a job artifact, to be processed by the Rails backend and/or the frontend. The graph might be exported as part of the BoM. See discussion.
In case the lock file doesn't provide enough information to build the graph, the Gemnasium analyzer runs a command to get the data that's needed. In particular, it runs go mod graph to get graph information for projects using Go modules.
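For the Go modules case in the last item, here is a minimal sketch of how the output of go mod graph could be turned into an adjacency list. It relies on each output line holding two space-separated fields, a module followed by one of its requirements. The function name ParseModGraph and the sample module names are hypothetical, and the sketch parses text that has already been captured rather than invoking the command itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// ParseModGraph reads `go mod graph` output and returns a map from each
// module to the modules it requires, in the order they appear.
func ParseModGraph(output string) (map[string][]string, error) {
	graph := map[string][]string{}
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // tolerate blank lines
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("unexpected line in module graph: %q", line)
		}
		graph[fields[0]] = append(graph[fields[0]], fields[1])
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return graph, nil
}

func main() {
	// Hand-written sample in the same two-column layout.
	out := "example.com/app github.com/pkg/errors@v0.9.1\n" +
		"example.com/app golang.org/x/text@v0.3.2\n" +
		"golang.org/x/text@v0.3.2 golang.org/x/tools@v0.0.0-20180917221912-90fa682c2a6e\n"
	graph, err := ParseModGraph(out)
	if err != nil {
		panic(err)
	}
	for _, dep := range graph["example.com/app"] {
		fmt.Println("example.com/app ->", dep)
	}
}
```

The keys keep the @version suffix exactly as printed, so a caller that matches against package names from the report would need to strip or compare versions separately.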

Permissions and Security

No change.

Documentation

The documentation states which package managers dependency paths are available for.

Availability & Testing

The dependency path is part of the generated Dependency Scanning report, and checked during QA for gemnasium, gemnasium-maven, and gemnasium-python.

Also, feature tests are needed to ensure dependency paths are presented in the UI, when the data is available.

What does success look like, and how can we measure that?

Users can easily explore the dependency path that leads from a vulnerable dependency listed in a lock file (or not listed at all if there's no lock file) to the main requirements file. Ultimately, they understand what causes the vulnerable dependency to be installed.

What is the type of buyer?

GitLab Ultimate

Links / references

#198034 (closed)

Edited Jul 08, 2020 by Mark Florian