
GitLab Semgrep and Spotbugs analyzers are causing duplicates in vulnerability report

Summary

A change was recently made in "Sort vulnerability links and identifiers" (gitlab-org/security-products/analyzers/report!116, merged; Adam Cohen; 18.3) to produce deterministic output by sorting the vulnerabilities[].identifiers[] field in the report produced by GitLab secure analyzers. However, sorting this field introduced a bug: the first element of vulnerabilities[].identifiers[] has special significance, because it is treated as the primary identifier and must always remain the first element.

The primary identifier is used by the Rails monolith to determine which vulnerabilities already exist and which are new. If the vulnerabilities[].identifiers[] list is sorted when a new pipeline runs and the first element is moved to a different position (for example, the end of the list), the primary identifier changes, which causes duplicate entries to show up in the Vulnerability report.
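The effect can be illustrated with a minimal Python sketch. It is a simplified model, not the monolith's actual matching logic: the `primary_identifier` helper, the identifier values, and the sort key (type, then value) are all assumptions made for illustration.

```python
def primary_identifier(vulnerability):
    """Return the (type, value) pair of the first identifier in the list."""
    first = vulnerability["identifiers"][0]
    return (first["type"], first["value"])


# Hypothetical finding with semgrep_id as the first (primary) identifier.
finding = {
    "identifiers": [
        {"type": "semgrep_id", "value": "bandit.B303-1"},
        {"type": "cwe", "value": "327"},
        {"type": "bandit_test_id", "value": "B303"},
    ]
}

# Sort the whole list, including index 0 (the sort key is an assumption).
sorted_finding = {
    "identifiers": sorted(
        finding["identifiers"], key=lambda i: (i["type"], i["value"])
    )
}

before = primary_identifier(finding)
after = primary_identifier(sorted_finding)

# The primary identifier differs, so a comparison keyed on it would treat
# the same finding as a new vulnerability.
print(before)  # ('semgrep_id', 'bandit.B303-1')
print(after)   # ('bandit_test_id', 'B303')
```

In this model, `semgrep_id` sorts after the other identifier types, so it ends up at the end of the list and a different identifier takes the primary position.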

This bug affects the following analyzer versions:

Note: spotbugs and semgrep were the only analyzers impacted by this bug, since all the other analyzers only ever produce a single element in the vulnerabilities[].identifiers[] list, so their primary identifier cannot be displaced by sorting.

GitLab Advanced SAST (GLAS) was not impacted because it is still using report v6.0.0. gemnasium is using report v5.13.0, so it is not impacted either.

See also https://gitlab.com/gitlab-com/request-for-help/-/issues/3457#note_2787772223

Steps to reproduce

  1. Create a new project rfh-3457-4 and add a gl-sast-report.json where semgrep_id is the first element in the list of identifiers:

  2. View the Vulnerability report:


    Vulnerability report shows the following vulnerability severity counts:

    • 4 high
    • 10 medium
    • 2 low
  3. Update the status of all vulnerabilities to Confirmed in the Vulnerability report:


  4. Update gl-sast-report.json and place semgrep_id as the last element in the list of identifiers:

  5. Update gl-sast-report.json and place semgrep_id as the first element in the list of identifiers:

  6. View the Vulnerability report:

    Notice that the vulnerability counts have doubled

    • From
      • 4 high
      • 10 medium
      • 2 low
    • To
      • 8 high
      • 20 medium
      • 4 low
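The report edits in steps 4 and 5 could be scripted rather than done by hand. The following is a minimal sketch, assuming the report is a JSON object with a `vulnerabilities` list whose entries each carry an `identifiers` list with a `type` field; the `move_semgrep_id` helper and the sample identifier values are hypothetical.

```python
import copy


def move_semgrep_id(report, position):
    """Return a copy of the report with semgrep_id identifiers moved first or last."""
    if position not in ("first", "last"):
        raise ValueError("position must be 'first' or 'last'")
    result = copy.deepcopy(report)
    for vuln in result.get("vulnerabilities", []):
        ids = vuln.get("identifiers", [])
        semgrep = [i for i in ids if i.get("type") == "semgrep_id"]
        others = [i for i in ids if i.get("type") != "semgrep_id"]
        if position == "first":
            vuln["identifiers"] = semgrep + others
        else:
            vuln["identifiers"] = others + semgrep
    return result


# Hypothetical in-memory report; in practice this would be loaded from
# gl-sast-report.json with json.load and written back with json.dump.
report = {
    "vulnerabilities": [
        {
            "identifiers": [
                {"type": "semgrep_id", "value": "bandit.B303-1"},
                {"type": "cwe", "value": "327"},
            ]
        }
    ]
}

step4 = move_semgrep_id(report, "last")   # semgrep_id becomes the last element
step5 = move_semgrep_id(step4, "first")   # semgrep_id is the first element again
```

The helper works on a deep copy, so the input report is left unchanged and each step's output can be committed separately.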


Example Project

rfh-3457-4

What is the current bug behavior?

Duplicate vulnerabilities show up in the Vulnerability report when switching to semgrep v6.7.0

What is the expected correct behavior?

Duplicate vulnerabilities should not appear in the Vulnerability report

Implementation Plan

  1. Update the vulnerabilities[].identifiers[] sorting logic so that it only sorts the elements after index 0. In other words, the element at index 0 (the primary identifier) must not be moved.
  2. Release report v6.2.1 with the fix from step 1.
  3. Update the following analyzers to report v6.2.1:
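The sorting change described in step 1 could look roughly like the following. This is a Python sketch of the idea only, not the report library's code; the function name `sort_identifiers_keep_primary` and the sort key (type, then value) are assumptions.

```python
def sort_identifiers_keep_primary(identifiers):
    """Sort all identifiers except the first, which stays at index 0."""
    if len(identifiers) < 2:
        # Nothing to reorder for empty or single-element lists.
        return list(identifiers)
    primary, rest = identifiers[0], identifiers[1:]
    return [primary] + sorted(rest, key=lambda i: (i["type"], i["value"]))


# Hypothetical identifiers; semgrep_id is the primary identifier.
identifiers = [
    {"type": "semgrep_id", "value": "bandit.B303-1"},
    {"type": "owasp", "value": "A02:2021"},
    {"type": "cwe", "value": "327"},
]

result = sort_identifiers_keep_primary(identifiers)
print([i["type"] for i in result])  # ['semgrep_id', 'cwe', 'owasp']
```

The output stays deterministic for the remaining elements while the primary identifier stays in place, which is the property the fix needs to preserve.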
Edited by Adam Cohen