Expand security analyzer common functionality to include fingerprinting of vulnerability source
Problem to solve
Our Secure analyzers need to uniquely fingerprint each occurrence of a vulnerability so that we have a stable key to compare against. In most cases the line number and rule identifier are enough to do this; however, this approach has two problems:
- Fingerprints should remain the same if a line number changes. Since vulnerability feedback is tied to a fingerprint, any change to the file that shifts the line number results in a non-matching fingerprint, and the feedback is no longer associated with the vulnerability (see related discussion #6590).
- If a line has multiple occurrences of the same class of vulnerability, each occurrence results in the same comparison key.
Several of our analyzers are currently susceptible to this:
Sasha, Software Developer, https://design.gitlab.com/research/personas#persona-sasha
Sam, Security Analyst, https://design.gitlab.com/research/personas#persona-sam
Add functionality to the common library for generating a digest of the source code for a given vulnerability.
At a minimum this will require reading a file by line; however, for accurate fingerprinting we would want to read by column as well (see the problems listed above).
What does success look like, and how can we measure that?
Vulnerability occurrence fingerprints are stable even when the line number has changed