Based on the desire to add support for the following languages:
.NET Framework / Core (NuGet)
C++
C
C#
Please do some research and propose which should be the first to be researched and have a proof of concept done. POC can be either adding support ourselves, or integration of an OSS project
Dependency file support (as opposed to lock file support) could later be implemented by running conan or nuget to generate the lock file before parsing it. The lock file parsers would then be reused.
@gonzoyumo as this is engineering research, should we time box it? I'd like the engineering research done in 13.1 and the chosen language researched and POC'd in 13.2
@fcatteau I'd like to take this opportunity to write down a list of requirements to add support for a new language/package manager. This would allow anyone to quickly identify if adding support is doable and possibly having it handled by the wider community. Thanks!
Nicole Schwartz changed title from Decide next language to add to Dependency Scanning to Engineering research - Decide next language to add to Dependency Scanning
I'm taking some notes on the requirements, and this is a draft for what should eventually land in the handbook. cc @gonzoyumo
Gemnasium integration
To be integrated into Gemnasium, a package manager must be able to generate a lock file that lists all the project dependencies, including the transitive dependencies (dependencies of the dependencies explicitly requested by the user). The dependency list must contain the package names and versions, so that the dependencies can be compared to the advisories of gemnasium-db, the Gemnasium Vulnerability Database. Dependencies must have comparable versions, so that Gemnasium can establish whether the version the project depends on falls within the range of affected versions given in the vulnerability database.
If the projects supported by the package manager usually have no lock file, there must be a command line that generates a lock file for any given dependency file. If this command fetches packages from external registries, then the command line must have settings for the proxies to these registries, and for custom certificates to use when connecting to these over HTTPS. The proxies and custom certificates can either be given as CLI options or environment variables (preferred), or be set in a configuration file.
Scanner integration
To be integrated into GitLab Dependency Scanning, a security scanner supporting a package manager must be able to generate all the mandatory fields defined in the Dependency Scanning JSON schema. There should be no technical constraint preventing the scanner from being bundled in a Docker image. It should be possible to build the Docker image using a Dockerfile, at any time and without any human intervention. If the scanner fetches external resources or connects to remote servers during its execution, it must provide a way to define proxies for these resources and servers, and to use custom certificates. See Secure Scanner integration guide.
Also, the Secure Scanner integration guide should be updated with requirements for offline mode support. I don't know whether offline mode support is required though.
I've created two checklists I'm going to use when evaluating package managers for .NET, C, and C++. cc @xlgmokha
Gemnasium integration checklist
The package manager meets at least one of these two conditions:
Supported projects have a lock file that lists both transitive and direct dependencies.
There is a CLI command that generates such a lock file, and this CLI and its dependencies can be bundled in a Linux-based Docker image.
requirements for lock files:
predictable file names
contain both transitive and direct dependencies
give dependency name
give dependency version
provide a format version, or another way of telling if the file format is supported
give dependencies of dependencies, in order to build a dependency graph (optional, see #198034 (closed))
requirements for package versions:
package versions are comparable (three-way comparison can be implemented)
requirements for package names:
package names can be compared; false-positives are acceptable
package names can be resolved to package URLs, or package URLs are provided
requirements for vulnerability database:
When a security advisory is published on NVD, it's possible to tell this relates to a package handled by the package manager. The name of the vulnerable package (as listed in the lock file) can be extracted.
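To illustrate the "three-way comparison" requirement from the checklist, here is a minimal sketch. It assumes versions made only of dot-separated integers, and `compare_versions` is a hypothetical helper, not something taken from Gemnasium.

```python
from itertools import zip_longest


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 when a is lower than, equal to, or higher than b."""
    # Assumption: versions are dot-separated integers such as "1.9.4".
    left = [int(part) for part in a.split(".")]
    right = [int(part) for part in b.split(".")]
    # Missing trailing parts count as zero, so "2.0" equals "2.0.0".
    for x, y in zip_longest(left, right, fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0
```

Once such a function exists for a package type, checking a version against a range of affected versions only takes comparisons against the range bounds.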
Dependency Scanning Tool integration checklist
can run as a CLI (no GUI)
can run in a Linux-based Docker image
generates all the required fields defined in the Dependency Scanning JSON schema
can run offline, or using custom proxies and custom certificates
As we can see, the dependencies (top-level key) is a JSON object where the keys are the package names, and the values are JSON objects that give the resolved version and a list of dependencies. We have the information needed to implement packages.lock.json support in Gemnasium. cc @xlgmokha
Also, the version top-level key makes it possible to check whether the file format is supported.
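A rough sketch of what parsing could look like, based only on the description above. The sample document, the `resolved` field name, the set of supported format versions, and the extra grouping level handled by `iter_packages` are all assumptions to verify against real lock files.

```python
import json

SUPPORTED_FORMAT_VERSIONS = {1}  # assumption: only format version 1 is handled

# Hypothetical sample, shaped after the description above.
SAMPLE = """
{
  "version": 1,
  "dependencies": {
    "Newtonsoft.Json": {"resolved": "12.0.3"},
    "Serilog.Sinks.Console": {
      "resolved": "3.1.1",
      "dependencies": {"Serilog": "2.5.0"}
    }
  }
}
"""


def iter_packages(entries):
    """Yield (name, entry) pairs, descending into any extra grouping level."""
    for key, value in entries.items():
        if "resolved" in value:
            yield key, value
        else:
            # Assumption: some files group packages one level deeper,
            # for example per target framework.
            yield from iter_packages(value)


def parse_lock_file(text):
    """Return {package name: {"version": ..., "requires": [...]}}."""
    doc = json.loads(text)
    if doc.get("version") not in SUPPORTED_FORMAT_VERSIONS:
        raise ValueError(f"unsupported lock file version: {doc.get('version')!r}")
    return {
        name: {
            "version": entry["resolved"],
            # Works whether the dependencies are given as a list or an object.
            "requires": sorted(entry.get("dependencies", {})),
        }
        for name, entry in iter_packages(doc["dependencies"])
    }
```

Calling `parse_lock_file(SAMPLE)` gives one entry per package with its resolved version and the names of the packages it requires, which is enough to match advisories and to build a dependency graph.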
@fcatteau Will the name of the file always be packages.lock.json or is that controlled through configuration? Is there a single lock file for a solution or will there be one for each project attached to a solution?
I'm asking because I was wondering if we need to do a recursive scan into project directories or if we can expect the file to be in the root next to the solution file. Also, I'm not sure if solution files are still used or not. dotnet cli does seem to offer generating one, so I assume yes, but I would like to suggest that we double check.
@xlgmokha AFAIK the lock file lives with the project file, and not with the solution file. So there might be multiple lock files in a repo, which is fine since gemnasium already looks for lock files recursively. To be double checked though.
In any case, I see no reason for parsing the solution file, and hopefully this reduces the complexity of NuGet support.
Yes, we might need to connect to the package registry in order to track new releases, and Show when a component is out of date. It's out of scope but it's definitely worth sharing - thank you!
AFAIK the lock file lives with the project file, and not with the solution file.
@fcatteau I assume the same. I just wanted to make sure.
In any case, I see no reason for parsing the solution file, and hopefully this reduces the complexity of NuGet support
Yep, good point. The solution file points to the project files but if we're recursively scanning for the project files then there is no need to parse the solution file.
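For reference, the recursive lookup discussed here is cheap to express. This is only a sketch of the idea, not how gemnasium does it, and it assumes the lock file always keeps the default packages.lock.json name (still to be confirmed, see the question above).

```python
from pathlib import Path

LOCK_FILE_NAME = "packages.lock.json"  # assumed default name


def find_lock_files(root):
    """Return every lock file found under root, in a stable order."""
    return sorted(Path(root).rglob(LOCK_FILE_NAME))
```

With one lock file per project directory, a repo containing several projects simply yields several paths, and the solution file never has to be read.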
We recently introduced issue templates on gemnasium-db for Package Type Support Request and Schema Change Request issues so that we have a place where we can sort out the details of information that is required by the analyzer upfront. I took the liberty of creating one related to NuGet for group::composition analysis.
NuGet uses Semantic Versioning 2.0.0, and a Maven-like syntax for version ranges. All that could be supported in the Gemnasium vrange library since we already have SemVer support and Maven support. We might have to implement a new vrange plugin that combines SemVer versions with Maven version ranges though. We might be able to implement that in Go, in gemnasium/semver. cc @julianthome
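To make the combination concrete, here is a small sketch that checks a version against an interval-style range such as `[1.0,2.0)`. It is written in Python for brevity rather than Go, the helper names are hypothetical, and pre-release and build metadata are deliberately left out.

```python
import re


def parse_version(text):
    """Turn "12.0.3" into a tuple of integers, padded with zeros."""
    # Simplification: pre-release and build metadata are not handled here.
    parts = [int(p) for p in text.strip().split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


# Opening bracket, optional lower bound, optional comma and upper bound,
# closing bracket.
RANGE = re.compile(r"^([\[(])\s*([^,\])]*)\s*(?:(,)\s*([^\])]*))?\s*([\])])$")


def in_range(version, spec):
    """True when version falls within the interval described by spec."""
    v = parse_version(version)
    m = RANGE.match(spec.strip())
    if m is None:
        # A bare version such as "1.0" means "this version or higher".
        return v >= parse_version(spec)
    open_b, low, comma, high, close_b = m.groups()
    if comma is None:
        # "[1.0]" is an exact match.
        return v == parse_version(low)
    if low.strip():
        lo = parse_version(low)
        if v < lo or (open_b == "(" and v == lo):
            return False
    if high and high.strip():
        hi = parse_version(high)
        if v > hi or (close_b == ")" and v == hi):
            return False
    return True
```

Square brackets make a bound inclusive and parentheses make it exclusive, so `in_range("2.0.0", "[1.0,2.0)")` is false while `in_range("1.5.0", "[1.0,2.0)")` is true.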
The path of a Conan dependency combines a package name, a version, a user, and a channel. It's something like mypkg/0.1@user/channel. See requires attribute.
@julianthome I suggest we ignore the user and the channel when matching the package with a security advisory of the Gemnasium Vulnerability DB. We might have false-positives, but it's better than having false-negatives. I still have to learn more on Conan channels though.
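A sketch of that matching rule, assuming references always follow the name/version@user/channel layout described above. `ConanRef`, `parse_ref` and `advisory_key` are hypothetical names used for illustration only.

```python
from typing import NamedTuple, Optional


class ConanRef(NamedTuple):
    name: str
    version: str
    user: Optional[str]
    channel: Optional[str]


def parse_ref(ref):
    """Split "mypkg/0.1@user/channel" into its four parts."""
    package, _, origin = ref.partition("@")
    name, _, version = package.partition("/")
    user, _, channel = origin.partition("/")
    return ConanRef(name, version, user or None, channel or None)


def advisory_key(ref):
    """Key used for matching: user and channel are deliberately dropped."""
    parsed = parse_ref(ref)
    return (parsed.name, parsed.version)
```

Two references that differ only by user or channel produce the same key, which is where the false positives mentioned above would come from.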
The central repo for Conan packages is conan.io, and the corresponding Conan user is conan. See opencv/4.1.1@conan/stable for instance. Interestingly, the Conan channel is used to indicate the stability. cc @julianthome
Also, each node comes with its list of dependencies (what it requires), and the first node represents the project itself. All that makes it easy to build a dependency graph. See #198034 (closed).
The package name and version can easily be extracted from the pref field.
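As a sketch of how the graph could be built from such nodes: the `NODES` sample below is hypothetical and only mirrors the description (a first node for the project, a `pref` per package, a list of required node ids). The key names, the nesting, and the text after `#` in `pref` are assumptions and would need checking against an actual lock file.

```python
# Hypothetical nodes, shaped after the description above; a real lock file
# may use different key names or nesting.
NODES = {
    "0": {"pref": None, "requires": ["1", "2"]},  # the project itself
    "1": {"pref": "poco/1.9.4#rev:pkgid", "requires": ["2"]},
    "2": {"pref": "openssl/1.0.2u#rev:pkgid", "requires": []},
}


def name_and_version(pref):
    """Extract ("poco", "1.9.4") from a pref that starts with "poco/1.9.4"."""
    # Drop anything after "#" (assumed revision data) and after "@"
    # (user and channel), then split the name from the version.
    reference = pref.split("#", 1)[0].split("@", 1)[0]
    name, _, version = reference.partition("/")
    return name, version


def build_graph(nodes):
    """Map each package to the packages it requires; skip the project node."""
    graph = {}
    for node in nodes.values():
        if not node.get("pref"):
            continue  # assumed: the project node has no package reference
        graph[name_and_version(node["pref"])] = [
            name_and_version(nodes[i]["pref"]) for i in node.get("requires", [])
        ]
    return graph
```

`build_graph(NODES)` returns one entry per package, keyed by name and version, listing the packages it requires.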
Here's the corresponding conanfile.txt:
[requires]
poco/1.9.4
openssl/1.0.2u
I got the lock file by running this command after installing Conan:
No, not really. We would be very grateful though if you could create a corresponding Package Type Request issue on gemnasium-db (by selecting the Package Type Request issue template), which would help us understand the type of data you need for adding Conan support.
I didn't say what it would take to add Conan support to Gemnasium's vrange library. This should be straightforward since Conan uses node-semver (Python package) in its version ranges. Actually, we should be able to reuse vrange/npm without any modification. cc @julianthome
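For illustration, the simplest part of that range syntax (space-separated comparators that must all hold) can be evaluated as below. This is not vrange/npm, just a sketch with hypothetical helper names. It only accepts full three-part numeric versions and ignores tilde, caret, `||` and pre-release handling.

```python
import operator
import re

OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
# Simplification: only full numeric versions such as "1.9.4" are accepted.
COMPARATOR = re.compile(r"^(>=|<=|>|<|=)?(\d+\.\d+\.\d+)$")


def parse_version(text):
    """Turn "1.9.4" into (1, 9, 4)."""
    return tuple(int(p) for p in text.split("."))


def satisfies(version, expression):
    """True when version meets every space-separated comparator."""
    v = parse_version(version)
    for token in expression.split():
        m = COMPARATOR.match(token)
        if m is None:
            raise ValueError(f"unsupported comparator: {token!r}")
        op, bound = m.groups()
        # A comparator without an operator is treated as an exact match.
        if not OPS[op or "="](v, parse_version(bound)):
            return False
    return True
```

For example, `satisfies("1.9.4", ">1.1.0 <2.1.0")` is true, while `satisfies("2.1.0", ">1.1.0 <2.1.0")` is false.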
By the way, in order to implement Conan support for Dependency Scanning we would end up adding security advisories for system libraries to gemnasium-db. This would be a step toward implementing Container Scanning on top of gemnasium-db. Actually, there might be an issue about that, and I might be the author of the issue. cc @julianthome
@NicoleSchwartz I've updated the issue with the discovery conclusion. I'm not sure users usually keep lock files in the repos of their NuGet or Conan projects, or if they would be willing to do that, but in any case lock file support is the first step before supporting the main build/dependency file. (Also, keeping lock files in the repos is considered a best practice.) cc @gonzoyumo
@NicoleSchwartz In the end I considered we didn't need a PoC to check feasibility. We know enough about the lock files Conan and NuGet generate to prove that these can be supported. I've updated the proposal accordingly. cc @gonzoyumo
@fcatteau if I'm reading the conclusion correctly, there is no preference from a technical point of view to start with .NET Framework/Core/C# (with NuGet) or C/C++ (with Conan), right?
I'm closing this research as work is done. @NicoleSchwartz let's catch up to figure out which one to start with and organize epics accordingly.
@gonzoyumo You're right. Both NuGet and Conan can be supported, at a comparable cost. I'll add that to the conclusions of this discovery. cc @NicoleSchwartz