Engineering Discovery: first-class vulnerabilities

Product discovery (high-level choices): https://gitlab.com/gitlab-org/gitlab-ee/issues/8493

This iteration is about the technical engineering discovery to understand how we can implement first-class vulnerabilities in our current architecture.

Notes from conversation about this topic at Contribute.

NOTE: much of this issue has refocused on UX and Product discovery as many points needed to be revisited. See discussion below for details on initial JTBD, interactions, and direction.

Development log

Status

Jobs To Be Done defined
Data model defined
Implementation steps finalized

Decisions

Terminology

Finding

Something with the potential to be vulnerable. This is a subgroup of the existing Vulnerability Occurrences in a pre-merge state (within Merge Terminology). Findings are not persisted in the database.

Priority Finding

Something with the potential to be vulnerable. This is a subgroup of the existing Vulnerability Occurrences in a post-merge state (within default branches). Priority Finding is persisted in the database and maps to a record in vulnerability_occurrences DB table. A Priority Finding is an "occurrence" that has not been confirmed (and didn't spawn a Vulnerability entity yet) but it has become a part of the production-deployed codebase. It can be in either confirmed or unconfirmed state. A confirmed Priority Finding is the one that has a Vulnerability referencing it.

Vulnerability

A confirmed entity that represents a vulnerability within the codebase. This is a result of the "promotion" of a Priority Finding. A Vulnerability is either in a state of open (some unresolved Findings exist) or resolved (no more** unresolved Findings exist).

**A vulnerability MUST have occurrences. Even externally-sourced weaknesses such as HackerOne reports require triaging, so they will first appear as weakness and require promotion to a Vulnerability.

Triaging

Triaging is the act of promoting a Finding to a Vulnerability.

This behavior will be introduced from a functional perspective within #11205 (closed).

Triaging is not something that is meant to occur through the MR widget and primarily through dashboard views, hence we say Priority Finding promotion, not just Finding promotion. This is an important distinction. The existing MR procedures of dismissal are preserved as feedback is directly associated with vulnerability_occurrences.

When interacting with Vulnerabilities, a dismissal would result in creating cascading feedback for all associated vulnerability_occurrences. This way we preserve the existing DB relationships, maintain backward compatibility, and allow more granular interactions; i.e. for a Vulnerability w/ 2 Findings we dismiss 1 Finding (within test directory) but create an Issue to address 2nd Finding (within src directory).

Data Model

Naming consideration: vulnerability_occurrences should become findings to more accurately capture the potential of a vulnerability, where "occurrences" implies a confirmation
vulnerabilities have many vulnerability_occurrences
vulnerabilities have many issues through vulnerability_feedback (through vulnerability_occurrences)
a vulnerability is "resolved" when all findings are "resolved"
a vulnerability only has 2 states: open and closed. All possible closed sub-states (i.e. "duplicate", "wont fix", "false positive") should be dismissals of the finding
vulnerabilities.severity and vulnerabilities.confidence are aggregates of the associated findings, however they can be overridden by the user.
- By default, vulnerability.severity is the highest-severity of linked occurrences
- By default, vulnerability.confidence is the lowest-confidence of linked occurrences
vulnerability_feedback remains associated with vulnerability_occurrences. To dismiss a vulnerability that has 2 findings, dismissal feedback is applied to all associated findings as a cascading transaction
vulnerability_feedback.feedback_type uniqueness constraint must be dropped in order to allow a finding to have multiple-associated issues
~~we're not going to add report_type attribute to Vulnerabilities because the Vulnerability has a higher level of abstraction than report and may combine findings from different report types~~ for the MVC, we're adding the report_type to the Vulnerabilities (details here).

Interactions

Vulnerability is an epic-like object
Vulnerability is a project-scoped object (not a group-scoped, like Epics)
Naming consideration: findings are all occurrences. "Priority findings" are present within the default branch.
a 1st Class Vuln (vulnerability) can be created from a finding
a 1st Class Vuln (vulnerability) must be associated with findings (a vulnerability cannot be created in isolation as findings contain the specifics of the vulnerability; i.e. location)
Externally-sourced findings (i.e. HackerOne reports) will always be created as findings to allow triaging and potential promotion to vulnerabilities (vulnerability_occurrences.scanner.name == "hackerone")
a finding must be promoted to a vulnerability to allow other findings to be associated with that vulnerability (bulk-associating findings to a vulnerability on creation will not be supported, but this could be an implementation detail)
a finding can be linked to an existing vulnerability (and only one)
a finding can be disassociated with a vulnerability (unless it's the last-associated finding)
vulnerabilities should be considered "resolved" when all findings are "resolved".
- This will not be an automatic action initially as pipeline-completion (when we detect that findings no longer exist) cannot accurately be tied to a production deployment (to which a vulnerability SLA should be associated).
Externally-sourced findings must be manually resolved and will never be auto-closed
a Vulnerability is created confidential similarly to confidential Issues

UX

a board with mockups, flows, critical assumptions, and open questions

Backend MVC Implementation Plan

Rename existing Vulnerabilities API to Vulnerability Findings API
Create vulnerabilities table (see schema below)
Create Vulnerabilities API with the following capabilities

Get a list of project's vulnerabilities
Create vulnerability from finding (the old term is "occurrence")
~~Delete vulnerability (and disassociate from occurrence)~~ removed
Update vulnerability status:
- Dismiss vulnerability (and dismiss all associated occurrences)
- Resolve vulnerability (and resolve all associated occurrences)

[Optional, depending on capacity] Associate finding to (existing) vulnerability

Frontend MVC Implementation Plan

Rename Security Dashboard Vulnerability List to Finding List
Add secondary tab for Vulnerabilities
- GET /api/v4/projects/:id/vulnerabilities
- GET /api/v4/vulnerabilities/:id
Add interaction to "promote" a Finding to a Vulnerability
- POST /api/v4/vulnerabilities { finding_id: <finding_id> }
~~Add interaction to delete a vulnerability~~
- ~~DELETE /api/v4/projects/4/vulnerabilities/:primary_identifier_id-:location_fingerprint-:scanner_id~~ removed because is considered unnecessary
Add interaction to mark a vulnerability as resolved
- POST /api/v4/vulnerabilities/:id/resolve
Add interaction to mark a vulnerability as dismissed
- POST /api/v4/vulnerabilities/:id/dismiss
[Optional, depending on capacity] Add interaction to associate finding with existing vulnerability
- POST /api/v4/vulnerabilities/:id/attach { vulnerability_id: <vulnerability_id> }: post-MVC, extracted into #34531 (closed)

Schema

vulnerabilities

create_table "vulnerabilities" do |t|
    # epic-like (or issue-like) attributes
    t.bigint "milestone_id"
    t.bigint "epic_id"
    t.bigint "project_id", null: false
    t.bigint "author_id", null: false
    t.bigint "updated_by_id"
    t.bigint "last_edited_by_id"
    t.date "start_date"
    t.date "due_date"
    t.datetime "last_edited_at"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
    t.string "title", null: false
    t.string "title_html", null: false
    t.text "description"
    t.text "description_html"
    t.bigint "start_date_sourcing_milestone_id"
    t.bigint "due_date_sourcing_milestone_id"
    t.integer "state", limit: 2, default: 1, null: false
    t.bigint "closed_by_id"
    t.datetime "closed_at"
    t.integer "report_type", limit: 2, default: 0, null: false

    # vulnerability-like attributes
    t.integer "severity", limit: 2, null: false # auto-calculated as highest-severity finding, but overridable
    t.boolean "severity_overridden", default: false
    t.integer "confidence", limit: 2, null: false # auto-calculated as lowest-confidence finding, but overridable
    t.boolean "confidence_overridden", default: false
end

add_foreign_key :vulnerabilities, :projects
add_foreign_key :vulnerabilities, :milestones
add_foreign_key :vulnerabilities, :epics
add_foreign_key :vulnerabilities,  :users, :author_id
add_foreign_key :vulnerabilities,  :users, :updated_by_id
add_foreign_key :vulnerabilities,  :users, :last_edited_by_id
add_foreign_key :vulnerabilities,  :users, :closed_by_id

* iid (project-scoped internal ID) was removed from the initial schema by the maintainer recommendation

vulnerability_occurrences

add_reference :vulnerability_occurrences, :vulnerabilities, :bigint, index: true, foreign_key: true

Edited Oct 29, 2019 by Victor Zagorodny