Improve Vulnerability Tracking: Add tracking field
What does this MR do?
This merge request adds the optional `tracking` field to the security reports.
If the `tracking` field is not present, behavior falls back to using the current location-based tracking methods.
This field is intended to be used by a post-analyzer to perform better location tracking. See https://gitlab.com/groups/gitlab-org/-/epics/4690 and gitlab-org/gitlab#293706 (closed)
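The fallback rule can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the backend code: the function name `tracking_inputs` is hypothetical, and treating an empty `items` array the same as a missing `tracking` field is an assumption.

```python
def tracking_inputs(vulnerability: dict) -> tuple[str, list[dict]]:
    """Return which field drives tracking and the data it provides.

    Hypothetical helper for illustration only.
    """
    tracking = vulnerability.get("tracking")
    # Assumption: an empty `items` array is handled like a missing field.
    if tracking and tracking.get("items"):
        return "tracking", tracking["items"]
    # No usable `tracking` field: fall back to the location-based method.
    return "location", [vulnerability.get("location", {})]
```

A finding that carries only `location` takes the second branch, so existing reports keep working unchanged.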
Video Overview & Expected FAQs
Video (4:27): a high-level summary of improving vulnerability tracking with taggr. See the longer, in-depth brown-bag video for more details.
[Expected] FAQs
Why is `items` an array?
`items` is an array so that we can support use cases where vulnerability uniqueness is determined by an ordered collection of data. The current method of having one item via the `location` field is the same as having an array of tracking `items` with one entry.
This solves a current use case, as well as future use cases:
- current: Fuzzing
  - Uses unique stack traces (an ordered collection of physical locations) to determine uniqueness
- future: Graph-based analyzers
  - Can provide us with source->sink control flows, also giving us an ordered collection of physical locations
Each item in the tracking array has tracking signatures calculated. These are concatenated together in the GitLab backend (see the GitLab backend MR) to form one final signature.
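A rough Python sketch of the concatenation step follows. It is not the backend implementation (see the GitLab backend MR for that): the function name `final_signature`, the newline separator, and the SHA-256 digest are all assumptions chosen for the example.

```python
import hashlib
from typing import Optional


def final_signature(items: list[dict], algorithm: str) -> Optional[str]:
    """Combine the per-item signature values for one algorithm, in order.

    Hypothetical helper: the separator and the digest are assumptions,
    not the format used by the GitLab backend.
    """
    values = []
    for item in items:
        # Pick the value this item carries for the requested algorithm.
        value = next(
            (s["value"] for s in item.get("signatures", [])
             if s["algorithm"] == algorithm),
            None,
        )
        if value is None:
            # Without a value for every item, no final signature is formed.
            return None
        values.append(value)
    if not values:
        return None
    # Order matters: the items form an ordered collection of locations.
    joined = "\n".join(values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

Because the values are joined in array order, two findings that visit the same locations in a different order produce different final signatures, which is the property the stack-trace and source->sink use cases rely on.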
Are we getting rid of `location`?
No, it’s still useful metadata. It’s point-data, not tracking data. Olivier: this might also still be mandatory for tracking any other kind of vulnerability that doesn’t impact “source code” (or anything that can be converted into an AST?).
Why can’t we do this in the GitLab backend?
It’s too heavy and needs a lot of dependencies.
How do we handle updates/fixes/versions to tracking algorithms [post-analyzers]?
We are handling versioning by explicitly not supporting versioning. An updated post-analyzer must be a new revision of the post-analyzer, and is treated as a distinct algorithm.
For example, taggr currently implements the initial `scope_offset` algorithm. Suppose a bug were found in its implementation - the new revision of taggr should identify its algorithm as `scope_offset_rev2` (or something similar). Docker container images of analyzers would then package up both the original taggr version and the latest revision.
This would result in items in the JSON security report having both `scope_offset` and `scope_offset_rev2` algorithms in the `signatures` field:
"tracking": {
  "type": "source",
  "items": [
    {
      "file": "...",
      "start_line": 0,
      "end_line": 1,
      "signatures": [
        { "algorithm": "scope_offset", "value": "AAA" },
        { "algorithm": "scope_offset_rev2", "value": "AAA" }
      ]
    }
  ]
}
This provides maximum overlap between previously-calculated signatures that exist in GitLab's database and newly calculated signatures. If we did not continue to produce signatures from previous algorithm revisions, then we may not be able to track a vulnerability from its old location to its new location.
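To make the overlap argument concrete, here is a small Python sketch. The matching rule (one shared algorithm/value pair is enough), the function names, and the `"BBB"` value are assumptions for illustration; the actual comparison lives in the GitLab backend.

```python
def signature_pairs(signatures: list[dict]) -> set[tuple[str, str]]:
    """Turn a `signatures` list into a set of (algorithm, value) pairs."""
    return {(s["algorithm"], s["value"]) for s in signatures}


def is_same_vulnerability(stored: list[dict], incoming: list[dict]) -> bool:
    """Hypothetical rule: one shared (algorithm, value) pair is a match."""
    return bool(signature_pairs(stored) & signature_pairs(incoming))


# Signatures saved before the revised algorithm existed.
stored = [{"algorithm": "scope_offset", "value": "AAA"}]

# A newer report carrying both revisions still overlaps with stored data.
both_revisions = [
    {"algorithm": "scope_offset", "value": "AAA"},
    {"algorithm": "scope_offset_rev2", "value": "BBB"},  # made-up value
]

# A report that dropped the original algorithm has nothing to compare.
only_rev2 = [{"algorithm": "scope_offset_rev2", "value": "BBB"}]
```

Under this rule `is_same_vulnerability(stored, both_revisions)` holds while `is_same_vulnerability(stored, only_rev2)` does not, which is why the older algorithm keeps being emitted alongside its revision.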
Examples
Example JSON Report - without calculated signatures
{
  "version": "0.0.0",
  "vulnerabilities": [
    {
      "category": "sast",
      "name": "Predictable pseudorandom number generator",
      "message": "Predictable pseudorandom number generator",
      "description": "This random generator (java.util.Random) is predictable",
      "cve": "818bf5dacb291e15d9e6dc3c5ac32178:PREDICTABLE_RANDOM:groovy/src/main/groovy/com/gitlab/security_products/tests/App.groovy:47",
      "severity": "Medium",
      "confidence": "Medium",
      "scanner": {
        "id": "custom scanner",
        "name": "Custom Scanner",
        "version": "0.0.0",
        "vendor": { "name": "Custom Vendor" }
      },
      "location": {
        "file": "new_file.c",
        "start_line": 6,
        "end_line": 6,
        "class": "",
        "method": "main",
        "dependency": {
          "package": {}
        }
      },
      "identifiers": [
        {
          "type": "find_sec_bugs_type",
          "name": "Find Security Bugs-PREDICTABLE_RANDOM",
          "value": "PREDICTABLE_RANDOM",
          "url": "https://find-sec-bugs.github.io/bugs.htm#PREDICTABLE_RANDOM"
        },
        {
          "type": "cwe",
          "name": "CWE-330",
          "value": "330",
          "url": "https://cwe.mitre.org/data/definitions/330.html"
        }
      ],
      "tracking": {
        "type": "source",
        "items": [
          {
            "file": "path/to/file1.ext",
            "start_line": 10,
            "end_line": 20
          },
          {
            "file": "path/to/file2.ext",
            "start_line": 10,
            "end_line": 20
          },
          {
            "file": "path/to/file3.ext",
            "start_line": 10,
            "end_line": 20
          }
        ]
      }
    }
  ],
  "remediations": []
}
Example JSON Report - WITH calculated signatures
{
  "version": "0.0.0",
  "vulnerabilities": [
    {
      "category": "sast",
      "name": "Predictable pseudorandom number generator",
      "message": "Predictable pseudorandom number generator",
      "description": "This random generator (java.util.Random) is predictable",
      "cve": "818bf5dacb291e15d9e6dc3c5ac32178:PREDICTABLE_RANDOM:groovy/src/main/groovy/com/gitlab/security_products/tests/App.groovy:47",
      "severity": "Medium",
      "confidence": "Medium",
      "scanner": {
        "id": "custom scanner",
        "name": "Custom Scanner",
        "version": "0.0.0",
        "vendor": { "name": "Custom Vendor" }
      },
      "location": {
        "file": "new_file.c",
        "start_line": 6,
        "end_line": 6,
        "class": "",
        "method": "main",
        "dependency": {
          "package": {}
        }
      },
      "identifiers": [
        {
          "type": "find_sec_bugs_type",
          "name": "Find Security Bugs-PREDICTABLE_RANDOM",
          "value": "PREDICTABLE_RANDOM",
          "url": "https://find-sec-bugs.github.io/bugs.htm#PREDICTABLE_RANDOM"
        },
        {
          "type": "cwe",
          "name": "CWE-330",
          "value": "330",
          "url": "https://cwe.mitre.org/data/definitions/330.html"
        }
      ],
      "tracking": {
        "type": "source",
        "items": [
          {
            "file": "path/to/file1.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file1.ext|scope1|scope2:2" }
            ]
          },
          {
            "file": "path/to/file2.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file2.ext|scope1|scope2:2" }
            ]
          },
          {
            "file": "path/to/file3.ext",
            "start_line": 10,
            "end_line": 20,
            "signatures": [
              { "algorithm": "scope_offset", "value": "path/to/file3.ext|scope1|scope2:2" }
            ]
          }
        ]
      }
    }
  ],
  "remediations": []
}
Analyzer's View
An analyzer declares its intent for tracking the vulnerability by defining a `tracking` field for each vulnerability finding in the report:
"tracking": {
  "type": "source",
  "items": [
    {
      "file": "path/to/file1.ext",
      "start_line": 10,
      "end_line": 20
    },
    {
      "file": "path/to/file2.ext",
      "start_line": 10,
      "end_line": 20
    },
    {
      "file": "path/to/file3.ext",
      "start_line": 10,
      "end_line": 20
    }
  ]
}
With `type` being either:
- `hash` (Deferred, see gitlab-org/gitlab#329385) - Simply hash the data provided to calculate the fingerprint
- `source` - The location is within structured data [source code] and should be tracked using the best available source-specific tracking algorithms
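An analyzer author might sanity-check the field before writing the report. The Python sketch below is not the report schema: the required keys are inferred from the examples in this description, rejecting `hash` reflects only that it is deferred, and the names `SUPPORTED_TYPES`, `REQUIRED_ITEM_KEYS`, and `tracking_errors` are hypothetical.

```python
# Assumption: only "source" is accepted while "hash" is deferred.
SUPPORTED_TYPES = {"source"}
# Assumption: keys taken from the example items, not from the schema.
REQUIRED_ITEM_KEYS = ("file", "start_line", "end_line")


def tracking_errors(tracking: dict) -> list[str]:
    """Return the problems found in a `tracking` field (empty if none)."""
    errors = []
    if tracking.get("type") not in SUPPORTED_TYPES:
        errors.append(f"unsupported type: {tracking.get('type')!r}")
    items = tracking.get("items")
    if not isinstance(items, list) or not items:
        errors.append("items must be a non-empty array")
        return errors
    for index, item in enumerate(items):
        for key in REQUIRED_ITEM_KEYS:
            if key not in item:
                errors.append(f"items[{index}] is missing {key!r}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets an analyzer log every issue in one pass.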
Inserting Calculated Tracking Fingerprint Values
A post-analyzer may run and update an existing security report's `tracking` field values to contain the calculated, algorithm-specific signatures:
"tracking": {
  "type": "source",
  "items": [
    {
      "file": "path/to/file1.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file1.ext|scope1|scope2:2" }
      ]
    },
    {
      "file": "path/to/file2.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file2.ext|scope1|scope2:2" }
      ]
    },
    {
      "file": "path/to/file3.ext",
      "start_line": 10,
      "end_line": 20,
      "signatures": [
        { "algorithm": "scope_offset", "value": "path/to/file3.ext|scope1|scope2:2" }
      ]
    }
  ]
}
This is the full form of the `tracking` field, with the `signatures` array fully populated for each item.
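The post-analyzer step could look roughly like the Python sketch below. It is not taggr: `add_signatures` and `placeholder_signature` are hypothetical names, and the placeholder value is a stand-in that does not implement `scope_offset`.

```python
import copy
from typing import Callable


def add_signatures(report: dict, algorithm: str,
                   compute: Callable[[dict], str]) -> dict:
    """Return a copy of `report` with one signature added per tracking item."""
    updated = copy.deepcopy(report)
    for vulnerability in updated.get("vulnerabilities", []):
        tracking = vulnerability.get("tracking")
        if not tracking or tracking.get("type") != "source":
            continue  # nothing to do for findings without source tracking
        for item in tracking.get("items", []):
            signatures = item.setdefault("signatures", [])
            # Skip algorithms that already have a value so reruns are safe.
            if any(s["algorithm"] == algorithm for s in signatures):
                continue
            signatures.append({"algorithm": algorithm, "value": compute(item)})
    return updated


def placeholder_signature(item: dict) -> str:
    """Stand-in only: NOT the scope_offset algorithm implemented by taggr."""
    return f"{item['file']}:{item['start_line']}-{item['end_line']}"
```

Calling `add_signatures(report, "placeholder", placeholder_signature)` leaves the input untouched and returns a report whose items each carry a `signatures` array; running several algorithm revisions in sequence would append one entry per revision.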
Availability and Testing
- Review and add/update tests for this feature/bug
Approvals
- ~"group::composition analysis" @fcatteau
- ~"group::static analysis" @theoretick
- ~"group::dynamic analysis" @cam_swords
- ~"group::fuzz testing" (add your name when approving)