Document Dependency Scanning paradigm: multiple lock files, or one requirements/parent build file per job
Problem to solve
It's not clear whether a Dependency Scanning analyzer scans multiple dependency files, or a single file.
We need documentation of how it works today. We also need to solicit feedback from users who experience pain with the current behavior, so that we can decide on a single rule for both the users running the scanners and the developers implementing them in the future. This rule needs to be documented so that users can predict how scanners will behave.
Intended users
Further details
Currently there seems to be a discrepancy between the Gemnasium-based analyzers:
- gemnasium scans multiple lock files all at once
- gemnasium-python and gemnasium-maven only scan a single file
These should be codified as the two rules that are in effect today.
Proposal
Open a new issue asking users whether they have experienced problems as a result of the current behavior, and which of the following might have solved them:
When executing in a CI job, a Dependency Scanning analyzer would either process:
- multiple lock files (default)
- one requirements file (fallback)
When processing a requirements file, the analyzer installs the project dependencies using the package manager, which is expensive in both time and bandwidth. This is why analyzers should NOT process multiple requirements files by default. It also makes sense to run multiple dependency scanning jobs to process multiple requirements files, since this reduces the overall execution time of the pipeline.
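The one-job-per-requirements-file idea can be sketched as follows. This is only an illustration: the job names and the `REQUIREMENTS_FILE` variable are hypothetical placeholders, since this issue leaves the actual variables to be defined later.

```python
def plan_jobs(requirements_files, variable="REQUIREMENTS_FILE"):
    """Build one job description per requirements file.

    `variable` is a hypothetical name for whatever setting ends up
    telling an analyzer which single requirements file to process.
    """
    jobs = {}
    # Sort so that job numbering is stable across pipeline runs.
    for index, path in enumerate(sorted(requirements_files), start=1):
        jobs[f"dependency_scanning_{index}"] = {"variables": {variable: path}}
    return jobs
```

Because each resulting job installs dependencies for only one file, the jobs can run in parallel instead of paying every installation cost in sequence.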
Analyzers should first attempt to parse and process lock files, because this is both more accurate (lock files reflect the exact versions used in production) and much cheaper (no need to install the dependencies). They should process a single requirements file as a fallback, or when explicitly requested to do so (variables to be defined later).
This already reflects the way gemnasium, gemnasium-python, and gemnasium-maven currently behave.
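The proposed rule could be expressed roughly as below. This is a minimal sketch, not the analyzers' actual code: the file-name sets are illustrative examples only, and the `requested` parameter stands in for the explicit-request variables that are still to be defined.

```python
from pathlib import PurePosixPath

# Illustrative file names only; each real analyzer defines its own list.
LOCK_FILES = {"Gemfile.lock", "package-lock.json", "yarn.lock", "composer.lock"}
REQUIREMENTS_FILES = {"requirements.txt", "pom.xml"}


def select_files(paths, requested=None):
    """Return the dependency files a single scanning job would process."""
    # An explicit request always wins and yields exactly one file.
    if requested is not None:
        return [requested]
    ordered = sorted(paths)
    # Default: process every lock file found, since parsing them is cheap.
    locks = [p for p in ordered if PurePosixPath(p).name in LOCK_FILES]
    if locks:
        return locks
    # Fallback: at most one requirements file, since installing is expensive.
    reqs = [p for p in ordered if PurePosixPath(p).name in REQUIREMENTS_FILES]
    return reqs[:1]
```

Picking the first requirements file in sorted order is an arbitrary choice made here to keep the sketch deterministic. How the fallback file should be chosen is one of the details the new documentation needs to state.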
Implementation plan
Use this commit as a starting point
- Add a new column, "Processes multiple files?", to the "Supported languages and package managers" section. This column should link to a new section in the docs, possibly named "How multiple files are processed".
- In this new section, add the following sub-sections:
  - Ruby
  - Python
  - Java
  - PHP, NuGet, Go, <everything else>
Provide detailed information in each of the above sub-sections, explaining how files are processed. See this commit for a starting point.
Documentation
To be documented in https://docs.gitlab.com/ee/user/application_security/dependency_scanning/index.html
Testing
none
What does success look like, and how can we measure that?
Users can easily predict how dependency files are scanned. More specifically, they know whether a dependency scanning job scans one file or multiple files, and how to configure their CI pipeline to scan multiple requirements files.
What is the type of buyer?
TODO