
Improve Vulnerability Tracking: Add tracking field

What does this MR do?

This merge request adds the optional tracking field to the security reports.

If the tracking field is not present, behavior falls back to using the current location-based tracking methods.
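
The fallback rule can be sketched in Python as follows. This is only an illustration, not the GitLab backend code: the function name `tracking_source` is hypothetical, and treating an empty `items` list the same as a missing field is an assumption.

```python
from typing import Any, Dict


def tracking_source(vulnerability: Dict[str, Any]) -> str:
    """Return which data a consumer would track this finding by.

    Hypothetical helper: "tracking" when the optional field is usable,
    otherwise "location" (the existing location-based behavior).
    """
    tracking = vulnerability.get("tracking")
    # Assumption: an empty or missing "items" list counts as "not present".
    if isinstance(tracking, dict) and tracking.get("items"):
        return "tracking"
    return "location"


finding_old = {"location": {"file": "new_file.c", "start_line": 6}}
finding_new = {
    "location": {"file": "new_file.c", "start_line": 6},
    "tracking": {
        "type": "source",
        "items": [{"file": "path/to/file1.ext", "start_line": 10, "end_line": 20}],
    },
}

print(tracking_source(finding_old))  # location
print(tracking_source(finding_new))  # tracking
```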

This field is intended to be used by a post-analyzer to perform better location tracking. See https://gitlab.com/groups/gitlab-org/-/epics/4690 and gitlab-org/gitlab#293706 (closed).

Video Overview & Expected FAQs

Video (4:27): a high-level summary of improving vulnerability tracking with taggr. See the longer, in-depth brown-bag video for more details.

[Expected] FAQs

Why is items an array?

items is an array so that we can support use cases where vulnerability uniqueness is determined by an ordered collection of data. The current method of having one item via the location field is the same as having an array of tracking items with one entry.

This solves a current use case, as well as future use cases:

  • current: Fuzzing
    • Uses unique stack traces (an ordered collection of physical locations) to determine uniqueness
  • future: Graph-based analyzers
    • Can provide us with source->sink control flows, also giving us an ordered collection of physical locations

Each item in the tracking array has tracking signatures calculated. These are concatenated together in the GitLab backend (see the GitLab backend MR) to form one final signature.
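
As a rough sketch of the concatenation step, the following Python joins the per-item signature values for one algorithm, in order, and hashes the result. The newline separator, the use of SHA-256, and the function name `final_signature` are assumptions made for illustration; the actual logic lives in the GitLab backend MR and may differ.

```python
import hashlib
from typing import Any, Dict, List, Optional


def final_signature(items: List[Dict[str, Any]], algorithm: str) -> Optional[str]:
    """Combine per-item signature values for one algorithm into one digest.

    Returns None when there are no items or any item lacks a signature
    for the requested algorithm.
    """
    if not items:
        return None
    values = []
    for item in items:
        value = next(
            (s["value"] for s in item.get("signatures", []) if s["algorithm"] == algorithm),
            None,
        )
        if value is None:
            return None
        values.append(value)
    # Assumption: newline separator and SHA-256; the order of items is preserved.
    joined = "\n".join(values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


items = [
    {"file": "a.c", "signatures": [{"algorithm": "scope_offset", "value": "a.c|main:2"}]},
    {"file": "b.c", "signatures": [{"algorithm": "scope_offset", "value": "b.c|helper:5"}]},
]
forward = final_signature(items, "scope_offset")
backward = final_signature(list(reversed(items)), "scope_offset")
print(forward != backward)  # True: the collection is ordered
```

Because the items are joined in order, two findings with the same locations in a different order (for example, different stack traces) produce different final signatures.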

Are we getting rid of `location`?

No, it’s still useful metadata. It’s point-data, not tracking data. Olivier: this might also still be mandatory for tracking any other kind of vulnerability that doesn’t impact “source code” (or anything that can be converted into an AST?).

Why can’t we do this in the GitLab backend?

It’s too heavy and needs a lot of dependencies.

How do we handle updates/fixes/versions to tracking algorithms [post-analyzers]?

We are handling versioning by explicitly not supporting versioning. An updated post-analyzer must be a new revision of the post-analyzer, and is treated as a distinct algorithm.

For example, taggr currently implements the initial scope_offset algorithm. If a bug were found in its implementation, the new revision of taggr should identify its algorithm as scope_offset_rev2 (or something similar). Docker container images of analyzers would then package both the original taggr version and the latest revision.

This would result in items in the JSON security report having both scope_offset and scope_offset_rev2 algorithms in the signatures field:

"tracking": {
    "type": "source",
    "items": [
        {
            "file": "...",
            "start_line": 0,
            "end_line": 1,
            "signatures": [
                { "algorithm": "scope_offset", "value": "AAA" },
                { "algorithm": "scope_offset_rev2", "value": "AAA" }
            ]
        }
    ]
}

This provides maximum overlap between previously-calculated signatures that exist in GitLab's database and newly calculated signatures. If we did not continue to produce signatures from previous algorithm revisions, then we may not be able to track a vulnerability from its old location to its new location.
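
The overlap argument can be illustrated with a small Python sketch. It assumes, hypothetically, that previously seen signatures are stored as (algorithm, value) pairs and that a finding matches when any pair is shared; `matches_stored` is an illustrative name, not a GitLab function, and the values are placeholders.

```python
from typing import Dict, List, Set, Tuple

Signature = Dict[str, str]


def matches_stored(stored: Set[Tuple[str, str]], new: List[Signature]) -> bool:
    """True when any newly calculated signature was already stored."""
    return any((s["algorithm"], s["value"]) in stored for s in new)


# Signatures stored before the scope_offset_rev2 revision existed.
stored = {("scope_offset", "AAA")}

# A report that still carries the original algorithm overlaps with the database.
both = [
    {"algorithm": "scope_offset", "value": "AAA"},
    {"algorithm": "scope_offset_rev2", "value": "AAA"},
]
# A report with only the new revision has nothing in common with it.
rev2_only = [{"algorithm": "scope_offset_rev2", "value": "AAA"}]

print(matches_stored(stored, both))       # True
print(matches_stored(stored, rev2_only))  # False
```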

Examples

Example JSON Report - without calculated signatures

{
  "version": "0.0.0",
  "vulnerabilities": [
    {
      "category": "sast",
      "name": "Predictable pseudorandom number generator",
      "message": "Predictable pseudorandom number generator",
      "description": "This random generator (java.util.Random) is predictable",
      "cve": "818bf5dacb291e15d9e6dc3c5ac32178:PREDICTABLE_RANDOM:groovy/src/main/groovy/com/gitlab/security_products/tests/App.groovy:47",
      "severity": "Medium",
      "confidence": "Medium",
      "scanner": {
        "id": "custom scanner",
        "name": "Custom Scanner",
        "version": "0.0.0",
        "vendor": { "name": "Custom Vendor" }
      },
      "location": {
        "file": "new_file.c",
        "start_line": 6,
        "end_line": 6,
        "class": "",
        "method": "main",
        "dependency": {
          "package": {}
        }
      },
      "identifiers": [
        {
          "type": "find_sec_bugs_type",
          "name": "Find Security Bugs-PREDICTABLE_RANDOM",
          "value": "PREDICTABLE_RANDOM",
          "url": "https://find-sec-bugs.github.io/bugs.htm#PREDICTABLE_RANDOM"
        },
        {
          "type": "cwe",
          "name": "CWE-330",
          "value": "330",
          "url": "https://cwe.mitre.org/data/definitions/330.html"
        }
      ],
      "tracking": {
        "type": "source",
        "items": [
          {
            "file": "path/to/file1.ext",
            "start_line": 10,
            "end_line": 20
          },
          {
            "file": "path/to/file2.ext",
            "start_line": 10,
            "end_line": 20
          },
          {
            "file": "path/to/file3.ext",
            "start_line": 10,
            "end_line": 20
          }
        ]
      }
    }
  ],
  "remediations": []
}

Example JSON Report - with calculated signatures

{
  "version": "0.0.0",
  "vulnerabilities": [
    {
      "category": "sast",
      "name": "Predictable pseudorandom number generator",
      "message": "Predictable pseudorandom number generator",
      "description": "This random generator (java.util.Random) is predictable",
      "cve": "818bf5dacb291e15d9e6dc3c5ac32178:PREDICTABLE_RANDOM:groovy/src/main/groovy/com/gitlab/security_products/tests/App.groovy:47",
      "severity": "Medium",
      "confidence": "Medium",
      "scanner": {
        "id": "custom scanner",
        "name": "Custom Scanner",
        "version": "0.0.0",
        "vendor": { "name": "Custom Vendor" }
      },
      "location": {
        "file": "new_file.c",
        "start_line": 6,
        "end_line": 6,
        "class": "",
        "method": "main",
        "dependency": {
          "package": {}
        }
      },
      "identifiers": [
        {
          "type": "find_sec_bugs_type",
          "name": "Find Security Bugs-PREDICTABLE_RANDOM",
          "value": "PREDICTABLE_RANDOM",
          "url": "https://find-sec-bugs.github.io/bugs.htm#PREDICTABLE_RANDOM"
        },
        {
          "type": "cwe",
          "name": "CWE-330",
          "value": "330",
          "url": "https://cwe.mitre.org/data/definitions/330.html"
        }
      ],
      "tracking": {
        "type": "source",
        "items": [
          {
            "file": "path/to/file1.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file1.ext|scope1|scope2:2" }
            ]
          },
          {
            "file": "path/to/file2.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file2.ext|scope1|scope2:2" }
            ]
          },
          {
            "file": "path/to/file3.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file3.ext|scope1|scope2:2" }
            ]
          }
        ]
      }
    }
  ],
  "remediations": []
}

Analyzer's View

An analyzer declares its intent for tracking the vulnerability by defining a tracking field for each vulnerability finding in the report:

"tracking": {
  "type": "source",
  "items": [
    {
      "file": "path/to/file1.ext",
      "start_line": 10,
      "end_line": 20
    },
    {
      "file": "path/to/file2.ext",
      "start_line": 10,
      "end_line": 20
    },
    {
      "file": "path/to/file3.ext",
      "start_line": 10,
      "end_line": 20
    }
  ]
}

The type is one of:

  • hash - Simply hash the data provided to calculate the fingerprint (Deferred, see gitlab-org/gitlab#329385)
  • source - The location is within structured data [source code] and should be tracked using the best available source-specific tracking algorithms
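
A minimal Python sketch of checking a tracking field against the two types above. The required keys for source items (file, start_line, end_line) and the non-empty items rule are inferred from the examples in this description rather than taken from the report schema, and `validate_tracking` is a hypothetical helper.

```python
from typing import Any, Dict, List

KNOWN_TYPES = {"hash", "source"}
# Assumption: keys inferred from the examples, not taken from the schema.
SOURCE_ITEM_KEYS = ("file", "start_line", "end_line")


def validate_tracking(tracking: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the field looks valid."""
    problems = []
    tracking_type = tracking.get("type")
    if tracking_type not in KNOWN_TYPES:
        problems.append(f"unknown type: {tracking_type!r}")
    items = tracking.get("items")
    # Assumption: at least one item is expected.
    if not isinstance(items, list) or not items:
        problems.append("items must be a non-empty array")
        return problems
    if tracking_type == "source":
        for index, item in enumerate(items):
            for key in SOURCE_ITEM_KEYS:
                if key not in item:
                    problems.append(f"items[{index}] is missing {key}")
    return problems


ok = {"type": "source", "items": [{"file": "path/to/file1.ext", "start_line": 10, "end_line": 20}]}
bad = {"type": "sauce", "items": [{"file": "path/to/file1.ext"}]}

print(validate_tracking(ok))   # []
print(validate_tracking(bad))  # ["unknown type: 'sauce'"]
```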

Inserting Calculated Tracking Fingerprint Values

A post-analyzer may run and update an existing security report's tracking field values to contain the calculated, algorithm-specific signatures:

"tracking": {
  "type": "source",
  "items": [
    {
      "file": "path/to/file1.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file1.ext|scope1|scope2:2" }
      ]
    },
    {
      "file": "path/to/file2.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file2.ext|scope1|scope2:2" }
      ]
    },
    {
      "file": "path/to/file3.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file3.ext|scope1|scope2:2" }
      ]
    }
  ]
}

This is the full form of the tracking field, with the signatures array populated for every item.
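
The post-analyzer step can be sketched in Python as below. This is not how taggr is implemented: `add_signatures` and `placeholder_value` are hypothetical, and the placeholder simply joins the file and start line instead of running the real scope_offset algorithm.

```python
import json
from typing import Any, Callable, Dict

Item = Dict[str, Any]


def add_signatures(report: Dict[str, Any], algorithm: str, compute: Callable[[Item], str]) -> None:
    """Add one signature per tracking item, in place, without duplicating an algorithm."""
    for vulnerability in report.get("vulnerabilities", []):
        tracking = vulnerability.get("tracking")
        if not tracking or tracking.get("type") != "source":
            continue  # findings without source tracking are left untouched
        for item in tracking.get("items", []):
            signatures = item.setdefault("signatures", [])
            if any(s["algorithm"] == algorithm for s in signatures):
                continue  # this algorithm already produced a signature
            signatures.append({"algorithm": algorithm, "value": compute(item)})


def placeholder_value(item: Item) -> str:
    # Stand-in only; a real post-analyzer would parse the source file here.
    return f"{item['file']}:{item['start_line']}"


report = {
    "vulnerabilities": [
        {"tracking": {"type": "source", "items": [
            {"file": "path/to/file1.ext", "start_line": 10, "end_line": 20}
        ]}}
    ]
}
add_signatures(report, "placeholder_algorithm", placeholder_value)
print(json.dumps(report["vulnerabilities"][0]["tracking"]["items"][0]["signatures"]))
```

Running the function a second time with the same algorithm name leaves the report unchanged, which keeps the step safe to repeat in a pipeline.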

Availability and Testing

  • Review and add/update tests for this feature/bug
