Enrich Dependency Scanning report with more metadata

Refs: gitlab-org/gitlab-ee#5043

Description

The current MR widget shows a simple list of vulnerabilities reported by Dependency Scanning tools with:

  • Priority
  • Title
  • Affected file path

This can be easily enriched to provide more useful data to the users.

1) The Priority property is currently mixing different meanings:

  • Bundler Audit: Criticality
  • RetireJS: severity

At first glance, severity and criticality look close to each other, but they are not semantically identical, so we need to check that each tool uses the right word for the value it reports. We could also leverage an existing industry standard such as the Common Vulnerability Scoring System (CVSS) to provide details about the severity (when available).
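To make the CVSS idea concrete, here is a minimal Python sketch that maps a CVSS v3 base score to its qualitative rating (None, Low, Medium, High, Critical). The function name `cvss_v3_rating` and the "Unknown" label for a missing score are assumptions made for illustration; none of the tools discussed here is known to expose a score in this form.

```python
from typing import Optional


def cvss_v3_rating(score: Optional[float]) -> str:
    """Map a CVSS v3 base score to its qualitative severity rating."""
    if score is None:
        # Assumption: label used when a tool reports no score at all.
        return "Unknown"
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

A shared scale like this would let the widget show one consistent label even when a tool calls the value "Criticality" and another calls it "severity".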

2) There is a lot more information available (or potentially available) that we can display

As an example, here is what can be displayed for issues coming from Gemnasium: (screenshot: gemnasium_vulnerability)

We could definitely improve the UX by adding some (if not all) of them. For example, in the current state, a user who sees the report has to:

  • Copy/paste the vulnerability title into Google to find out what it is
  • Look through the affected file (no line provided) for the corresponding vulnerability

This could easily be enhanced by embedding a short description and/or providing a link to the vulnerability source or the CVE database, etc. Providing the corresponding line in the impacted file could save a good amount of time too.
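As a rough illustration of the "link to the CVE database" idea, the sketch below looks for a CVE identifier in a free-text field and builds a link to the NVD detail page. Choosing NVD as the target, and the names `CVE_PATTERN` and `cve_link`, are assumptions made for this example; the identifier in the usage line is only a format sample.

```python
import re
from typing import Optional

# CVE identifiers look like CVE-YYYY-NNNN, with four or more digits at the end.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def cve_link(text: str) -> Optional[str]:
    """Return an NVD link for the first CVE id found in text, or None."""
    match = CVE_PATTERN.search(text)
    if match is None:
        return None
    return "https://nvd.nist.gov/vuln/detail/" + match.group(0).upper()


# Format sample only, not a reference to a specific advisory.
print(cve_link("Advisory: cve-2016-0001"))
```

Identifiers that are not CVEs (bug or issue references, for instance) return None here and would need their own handling.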

Proposal

  1. Gather all the metadata available from the different SAST tools
  2. Find the most valuable and common metadata that could be shown by default
  3. Find the less valuable metadata that could be shown in an extended view (or only in full report)

1. Available data

Here is a list of the available properties for each tool we currently rely on:

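| Property \ Tool | Gemnasium | Bundler Audit | RetireJS |
|---|---|---|---|
| Type | Dep-scan | Dep-scan | Dep-scan (hybrid) |
| severity | ❌ | ✅ | ✅ |
| title | ✅ | ✅ | ✅ |
| file | ✅ | ⚠ | ✅ |
| start line | ❌ | ❌ | ❌ |
| end line | ❌ | ❌ | ❌ |
| external id (e.g. CVE) | ✅ | ✅ | ⚠ |
| urls | ✅ | ✅ | ✅ |
| internal doc/explanation | ✅ | ❌ | ❌ |
| solution | ✅ | ✅ | ❌ |
| confidence | ❌ | ❌ | ❌ |
| affected item (e.g. class or package) | ✅ | ✅ | ✅ |
| source code extract | ❌ | ❌ | ❌ |
| internal id | ✅ | ❌ | ❌ |
| date | ✅ | ❌ | ❌ |
| credits | ✅ | ❌ | ❌ |
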
  • ✅ => we have that data
  • ⚠ => we have that data but it's partially reliable, or we need to extract that data from unstructured content
  • ❌ => we don't have that data, or obtaining it would require developing specific or inefficient/unreliable logic.
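One possible way to approach steps 2 and 3 of the proposal is to derive the "common" properties mechanically from the availability table above. The sketch below encodes that table (✅ as "yes", ⚠ as "partial", ❌ as "no") and groups properties by how many tools provide them. The bucket names and the rule itself are assumptions; the rule only captures "common", not "valuable", which remains a product decision.

```python
# Columns, in order: Gemnasium, Bundler Audit, RetireJS.
AVAILABILITY = {
    "severity": ("no", "yes", "yes"),
    "title": ("yes", "yes", "yes"),
    "file": ("yes", "partial", "yes"),
    "start line": ("no", "no", "no"),
    "end line": ("no", "no", "no"),
    "external id": ("yes", "yes", "partial"),
    "urls": ("yes", "yes", "yes"),
    "internal doc/explanation": ("yes", "no", "no"),
    "solution": ("yes", "yes", "no"),
    "confidence": ("no", "no", "no"),
    "affected item": ("yes", "yes", "yes"),
    "source code extract": ("no", "no", "no"),
    "internal id": ("yes", "no", "no"),
    "date": ("yes", "no", "no"),
    "credits": ("yes", "no", "no"),
}


def bucket(statuses):
    """Classify one property from its per-tool availability."""
    if all(s != "no" for s in statuses):
        return "default"      # every tool provides it, at least partially
    if any(s != "no" for s in statuses):
        return "extended"     # only some tools provide it
    return "unavailable"      # no tool provides it


def group_properties(availability):
    """Group property names into default / extended / unavailable."""
    groups = {"default": [], "extended": [], "unavailable": []}
    for prop, statuses in availability.items():
        groups[bucket(statuses)].append(prop)
    return groups


for name, props in group_properties(AVAILABILITY).items():
    print(f"{name}: {', '.join(props)}")
```

Under this rule, title, file, external id, urls and affected item land in the default group, while severity lands in the extended group because Gemnasium does not provide it.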

A word on "hybrid" tools:

  • RetireJS is a dependency scan tool that has 2 detection engines:
    • The first one works like a usual dependency scan: it looks at the known dependency file (package.json only) for listed dependencies.
    • The second one is more like a static analysis tool: it scans all the files in the given path (recursively), looking for matching filenames and code patterns in file contents.

Here is the list of mapped properties for each tool, plus some relevant comments to save time when implementing:

| Property \ Tool | Gemnasium | Bundler Audit | RetireJS |
|---|---|---|---|
| Type | Dep-scan | Dep-scan | Dep-scan (hybrid) |
| severity | ❌ | Criticality | severity |
| title | title | Title | summary |
| file | file | ⚠ (only Gemfile.lock is supported) | filename (or package.json) |
| start line | ❌ | ❌ | ❌ |
| end line | ❌ | ❌ | ❌ |
| external id (e.g. CVE) | identifier (can be CVE or other) | Advisory | ⚠ identifiers.cve or .bug or .issue |
| urls | urls | URL | info |
| internal doc/explanation | description | ❌ | ❌ |
| solution | solution, fixed versions | Solution | ❌ |
| confidence | ❌ | ❌ | ❌ |
| affected item | package type + name + version | Name, Version | component + version, parent packages |
| source code extract | ❌ | ❌ | ❌ |
| internal id | uuid | ❌ | ❌ |
| date | date | ❌ | ❌ |
| credits | credits | ❌ | ❌ |
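
The mapping above could translate into one small adapter per tool that produces a common record. The following Python sketch assumes each tool's finding is already parsed into a flat dict keyed by the field names listed in the table; the real output shapes of the tools may differ (nesting, lists versus strings), and the common schema and function names are hypothetical.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def _common(tool: str, **fields: Any) -> Record:
    """Build a record in the (hypothetical) common schema."""
    record: Record = {
        "tool": tool, "severity": None, "title": None, "file": None,
        "external_id": None, "urls": [], "description": None,
        "solution": None, "affected_item": None,
    }
    record.update(fields)
    return record


def from_gemnasium(r: Record) -> Record:
    # No severity is available. The affected item is left unset because
    # the table does not give the exact key names for the package fields.
    return _common(
        "Gemnasium", title=r.get("title"), file=r.get("file"),
        external_id=r.get("identifier"), urls=list(r.get("urls", [])),
        description=r.get("description"), solution=r.get("solution"),
    )


def from_bundler_audit(r: Record) -> Record:
    url = r.get("URL")
    return _common(
        "Bundler Audit", severity=r.get("Criticality"), title=r.get("Title"),
        file="Gemfile.lock",  # the only file this tool supports
        external_id=r.get("Advisory"), urls=[url] if url else [],
        solution=r.get("Solution"),
        affected_item=f"{r.get('Name')} {r.get('Version')}",
    )


def from_retirejs(r: Record) -> Record:
    ids = r.get("identifiers", {})
    # Prefer a CVE, then fall back to a bug or issue reference.
    external = ids.get("cve") or ids.get("bug") or ids.get("issue")
    if isinstance(external, list):
        external = external[0] if external else None
    return _common(
        "RetireJS", severity=r.get("severity"), title=r.get("summary"),
        file=r.get("filename", "package.json"), external_id=external,
        urls=list(r.get("info", [])),
        affected_item=f"{r.get('component')} {r.get('version')}",
    )
```

Keeping one adapter per tool means the widget only has to know the common schema, and a missing property simply stays at its empty default.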

/cc @bikebilly

Edited Apr 11, 2018 by Olivier Gonzalez