Dependency Scanning on vendored libraries

Problem to solve

Dependency Scanning relies exclusively on dependency files. Components that are present in the application but not declared in these files are not detected.

Target audience

Sasha.

Further details

There are multiple ways to include a dependency or an asset in a project, depending on the language and framework. Here are a few examples that are not currently supported:

  • Go can vendor libraries directly in the project file tree.
  • Java can use JAR files.
  • Python can include Python egg files.
  • JavaScript assets are often vendored.
  • PHP can use .phar files.
  • C# has .nupkg files.
  • Node modules can be vendored and renamed in package.json.

Proposal

To scan vendored libraries, Dependency Scanning operates in 3 steps:

  1. Look for supported files, based on file extensions.
  2. Attempt to resolve these files to packages (type, name, and version).
  3. Compare the packages that have been found to the security advisories of the vulnerability database.

Step 3 is already implemented in Gemnasium.

Steps 1 and 2 would coexist with the existing scanning of dependency files.

Step 1 is an optimization: the scanner only calculates the checksum of compatible files.
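As a rough illustration of step 1, the sketch below walks a project tree and keeps only files with a supported extension. The extension list and the function name `find_candidate_files` are assumptions made for this example, drawn from the file types mentioned above; they are not part of Gemnasium.

```python
import os

# Hypothetical extension list, taken from the examples in this issue.
SUPPORTED_EXTENSIONS = {".jar", ".egg", ".phar", ".nupkg", ".js"}


def find_candidate_files(root):
    """Yield paths under root whose file extension is supported (step 1)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively, so "lib.JAR" is kept too.
            extension = os.path.splitext(name)[1].lower()
            if extension in SUPPORTED_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

Only the files yielded here would go on to step 2, so unsupported files are never read or hashed.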

Source files can be resolved to packages based on:

  • filename
  • path/URI
  • header
  • checksums

The scan of vendored libraries has two optimizations:

  • It skips files whose file extension isn't supported. (Step 1)
  • It resolves a file to a package first by checking the filename and path (cheap), then by reading the file contents (more expensive), and finally by calculating the checksum (even more expensive).
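The cheap-to-expensive ordering could be sketched as a chain of resolvers that stops at the first match. Everything here is hypothetical: the `Package` tuple, the resolver names, the filename and header patterns, the "npm" package type, and the use of SHA-1 for checksums are assumptions chosen for illustration only.

```python
import hashlib
import os
import re
from typing import Callable, Iterable, NamedTuple, Optional


class Package(NamedTuple):
    type: str
    name: str
    version: str


# Hypothetical patterns and checksum table, for illustration only.
FILENAME_RE = re.compile(r"([a-z][a-z0-9_]*)-(\d+(?:\.\d+)+)(?:\.min)?\.js")
HEADER_RE = re.compile(rb"/\*!?\s*([A-Za-z][\w.-]*)\s+v?(\d+(?:\.\d+)+)")
KNOWN_SHA1 = {
    "73cdd262799aab850abbe694cd3bfb709ea23627": Package("npm", "dojo", "1.4.1"),
}


def by_filename(path: str) -> Optional[Package]:
    """Cheapest: inspect the file name only; the file is never opened."""
    match = FILENAME_RE.fullmatch(os.path.basename(path))
    return Package("npm", match.group(1), match.group(2)) if match else None


def by_header(path: str) -> Optional[Package]:
    """More expensive: read the first bytes and look for a version banner."""
    with open(path, "rb") as handle:
        match = HEADER_RE.search(handle.read(4096))
    if match is None:
        return None
    return Package("npm", match.group(1).decode(), match.group(2).decode())


def by_checksum(path: str) -> Optional[Package]:
    """Most expensive: hash the whole file and look the digest up."""
    with open(path, "rb") as handle:
        digest = hashlib.sha1(handle.read()).hexdigest()
    return KNOWN_SHA1.get(digest)


Resolver = Callable[[str], Optional[Package]]
RESOLVERS = [by_filename, by_header, by_checksum]  # cheapest first


def resolve(path: str, resolvers: Iterable[Resolver] = RESOLVERS) -> Optional[Package]:
    """Try each resolver in order and return the first package found."""
    for resolver in resolvers:
        package = resolver(path)
        if package is not None:
            return package
    return None
```

With this ordering, a file named `dojo-1.4.1.min.js` is resolved from its name alone, and the file is only opened when the filename does not match.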

See Retire.js's extractors, like the one for the dojo package:

    {
        "extractors" : {
            "func"               : [ "dojo.version.toString()" ],
            "uri"                : [ "/(?:dojo-)?(§§version§§)/dojo(\\.min)?\\.js" ],
            "filename"           : [ "dojo-(§§version§§)(\\.min)?\\.js" ],
            "filecontentreplace" : [ "/dojo.version=\\{major:([0-9]+),minor:([0-9]+),patch:([0-9]+)/$1.$2.$3/" ],
            "hashes"             : {
                "73cdd262799aab850abbe694cd3bfb709ea23627" : "1.4.1",
                "c8c84eddc732c3cbf370764836a7712f3f873326" : "1.4.0",
                "d569ce9efb7edaedaec8ca9491aab0c656f7c8f0" : "1.0.0",
                "ad44e1770895b7fa84aff5a56a0f99b855a83769" : "1.3.2",
                "8fc10142a06966a8709cd9b8732f7b6db88d0c34" : "1.3.1",
                "a09b5851a0a3e9d81353745a4663741238ee1b84" : "1.3.0",
                "2ab48d45abe2f54cdda6ca32193b5ceb2b1bc25d" : "1.2.3",
                "12208a1e649402e362f528f6aae2c614fc697f8f" : "1.2.0",
                "72a6a9fbef9fa5a73cd47e49942199147f905206" : "1.1.1"
            }
        }
    }
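To show how such an extractor might be applied, here is a minimal sketch that uses the "filename" and "hashes" entries above. It assumes that "§§version§§" is a placeholder to be replaced by a version regular expression before matching, and that the hash keys are SHA-1 digests (they are 40 hexadecimal digits long). The `VERSION_PATTERN` value and both function names are illustrative and are not taken from Retire.js.

```python
import hashlib
import re

# Assumption: an illustrative pattern for dotted numeric versions such as "1.4.1".
VERSION_PATTERN = r"\d+(?:\.\d+)+"

# Subset of the dojo extractor shown above.
EXTRACTOR = {
    "filename": ["dojo-(§§version§§)(\\.min)?\\.js"],
    "hashes": {
        "73cdd262799aab850abbe694cd3bfb709ea23627": "1.4.1",
        "c8c84eddc732c3cbf370764836a7712f3f873326": "1.4.0",
    },
}


def version_from_filename(filename, extractor=EXTRACTOR):
    """Return the version captured by the first matching filename pattern."""
    for template in extractor["filename"]:
        pattern = template.replace("§§version§§", VERSION_PATTERN)
        match = re.search(pattern, filename)
        if match:
            return match.group(1)
    return None


def version_from_hash(content, extractor=EXTRACTOR):
    """Return the version whose digest equals the SHA-1 of content (assumed)."""
    digest = hashlib.sha1(content).hexdigest()
    return extractor["hashes"].get(digest)


print(version_from_filename("dojo-1.4.1.min.js"))  # prints "1.4.1"
```

The hash lookup only succeeds for byte-identical copies of a known release, which is why it serves as a last resort after the cheaper filename and content checks.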

What does success look like, and how can we measure that?

Dependency Scanning detection is improved: components that are vendored rather than declared in dependency files are now recognized and checked for vulnerabilities.

Links / references

Edited by Fabien Catteau