Report exact location of vulnerable npm module installed during dependency scan
Problem to solve
When running Dependency Scanning on an npm or yarn project, and when the git repo doesn't contain a node_modules directory, the report created by the Retire.js analyzer says that all vulnerabilities originate from package.json, even if the affected module is a vendored library detected in node_modules. This is confusing:
- There's no distinction between A. affected modules and B. affected vendored libraries found in these modules.
- There's no distinction between A. a vendored library a module uses at runtime and B. a vendored library used for testing.
- The module where the vendored library has been detected isn't mentioned.
Also, vulnerabilities with the same module name and the same CVE id might collide because of this, even if:
- One has been detected in a module listed in the lock file, and the other has been detected in vendored library, in one of those modules.
- The vulnerabilities come from vendored libraries detected in different modules.
Examples:
Current behavior has been implemented when fixing #13827 (closed). It was decided to set the vulnerability location to package.json when the vulnerable file does not belong to the repo, to avoid broken links.
Intended users
- Sasha (Software Developer)
- Devon (DevOps Engineer)
- Sidney (Systems Administrator)
- Sam (Security Analyst)
Further details
When the scanned npm/yarn project has no node_modules directory, the Retire.js analyzer installs the dependencies prior to the scan so that the Retire.js CLI can track vendored library and report vulnerabilities for these. Vendored libraries are detected based on their filename or fingerprint. The Retire.js analyzer sets the location to package.json so that users can jump from the vulnerability modal to the affected dependency file.
Proposal
Add a property to the vulnerability location to indicate that the file (where the vulnerability has been found) isn't in the git repository, and that the UI shouldn't link to it. The analyzer is responsible for detecting files that are not part of the repo, and to report them as such. The Security Report Schema is extended to carry this information between the analyzer producing the report, and the Rails backend consuming it. When serving the frontend, the Rails backend renders a link to the vulnerable only if the Security Report says it's part of the repo.
Please note that the Rails backend already has the information needed to establish whether a file is in the repository. However, the analyzer is responsible for providing this information, for efficiency: we assume that the Rails backend wouldn't be able to serve the frontend quick enough if it had to access the git repo. Ideally the backend would parse the security reports when the pipeline ends, and resolve the vulnerable paths once for all. Unfortunately, right now it parses the reports each time it serves the frontend, so it can't enhance the report with reusable data.
This proposal isn't specific to Retire.js, and it could be leveraged by other analyzers in the future. For instance, Gemnasium could use this when scanning a lock file, and communicate that this file isn't part of the repo; it's been created by some build job, by the job before_script, or by Gemnasium itself.
Figuring out whether a file is part of the git repo could be implemented in the common library, for reusability and separation of concerns. Ideally there's no change in the analyzer out of the upgrade of the common library.
Since all Dependency Scanning ship with git, we could use the git checkout or git ls-files to tell if a file or directory is in the repo being scanned. git checkout HEAD xyz fails if xyz isn't in the repo. git ls-files xyz never fails, but its output only contains repository files.
% git checkout HEAD README.md && echo OK
Updated 0 paths from a59a4ba
OK
% git checkout HEAD MISSING && echo OK
error: pathspec 'MISSING' did not match any file(s) known to git
% git ls-files MISSING && echo OK
OK
% git ls-files README.md && echo OK
README.md
OK
As a next step, the Secure Report Schema and the Dependency Scanning analyzers could be extended to report both the file where the vulnerability was detected AND the corresponding project file or dependency file. For example, it would report both the exact path of the vendored library, in the node_modules directory, and the path of the corresponding package.json that caused the installation of this vendored library. The backend and frontend would be updated to leverage this information, and they would link to the project/dependency file when the exact vulnerability location doesn't match the content of the git repo.
Old proposals:
A generic solution is to accurately report the dependency path, from the analyzer generating the report to the UI. See #198034 (closed) Pros: Cons:Proposal A: Dependency path
Proposal C though.
Pros: cheap compared to #198034 (closed) (Proposal A)Proposal B: No link to generated file
Another option is to report the exact location, and communicate that the corresponding file has been generated at run-time and is not clickable. This is Proposal 3 of #13827 (comment 234565550).
The analyzer would be responsible for reporting sets of locations, and the Rails backend would be responsible to figuring out what exists in the repo, and what should be presented to users. Implementation plan: Pros: Cons:Proposal C: Report both locations, let the backend decides what to show
Report both the exact location of the vulnerable dependency and the location of the editable dependency file, and make the Rails backend present the more accurate location if there's a match in the git repo, or else the location of the editable dependency file.
Pros: accurate Cons: This is Proposal 3 of #13827 (comment 234565550).Proposal D: Report both locations, mark generated file as such
Another option is to both locations (dependency file and lock file), and communicate whether a particular file is part of the git repo. The analyzer is responsible for marking generated files as such, in the report. The backend and frontend only render links for files that are part of the git repo. New fields must be added to the report format.
Implementation plan
-
update the Security Report Schemas, and add a new file to communicate that a vulnerability locationdoesn't belong to the git repo -
update the issuepackage of the common library to track files that aren't part of the git repo, and report them as such -
upgrade to this new version of common, update QA to cover this, and release a new version of retire.js -
update the Rails backend so that it doesn't generate link for vulnerable files that aren't part of the git repo
The frontend implementation doesn't change.
Permissions and Security
No change.
Documentation
No change.
Availability & Testing
To be tested using existing test projects:
- https://gitlab.com/gitlab-org/security-products/tests/js-npm
- https://gitlab.com/gitlab-org/security-products/tests/js-yarn
What does success look like, and how can we measure that?
Users can clearly distinguish multiple occurrences of the same vulnerability (same security advisory and same affected package) affecting a JavaScript library that is vendored in the node_modules directory. They can selectively dismiss some of these occurrences if they consider they are false-positives.