Skip to content

chore: Limit MarkDroppedAsResolved lookup to primary_ids only

What does this MR do and why?

Previously, when primary_identifiers were provided (only in the case of SAST), we would resolve any identifier types that were resolved on the default branch regardless of whether they were present within the report itself.

This isn't an issue if a pipeline contains all scanners that return no primary identifiers.

This is also not an issue if a pipeline contains all scanners that return primary identifiers.

However, we could accidentally auto-resolve vulnerabilities which now co-exist in merged reports which then contain a subset of the total identifiers after merging multiple together.

See discovery and description of scenario in #368284 (comment 1220388069)

database changes

Refactor join on Vulnerability.by_primary_identifier_ids

This join is only used within MarkDroppedAsResolvedWorker and joins unnecessarily across all identifiers when all we care about is the primary identifier.

Updated query plan https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14174/commands/49741

Reindex common_finder_query with resolved_on_default_branch

Drop previously added index with !100044 (merged) replacing with union of fields from index_vuln_reads_common_query_on_resolved_on_default_branch.

❯ be rake db:migrate
main: == 20230104220137 ReindexVulnReadsOnDefaultBranchWithCommonQuery: migrating ===
main: -- transaction_open?()
main:    -> 0.0001s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.1246s
main: -- index_exists?(:vulnerability_reads, [:project_id, :state, :report_type, :vulnerability_id], {:name=>"index_vuln_reads_common_query_on_resolved_on_default_branch", :where=>"resolved_on_default_branch IS TRUE", :order=>{:vulnerability_id=>:desc}, :algorithm=>:concurrently})
main:    -> 0.0071s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0002s
main: -- add_index(:vulnerability_reads, [:project_id, :state, :report_type, :vulnerability_id], {:name=>"index_vuln_reads_common_query_on_resolved_on_default_branch", :where=>"resolved_on_default_branch IS TRUE", :order=>{:vulnerability_id=>:desc}, :algorithm=>:concurrently})
main:    -> 0.0234s
main: -- execute("RESET statement_timeout")
main:    -> 0.0004s
main: == 20230104220137 ReindexVulnReadsOnDefaultBranchWithCommonQuery: migrated (0.1652s)

main: == 20230104224020 DropVulnReadsOnDefaultBranchIndex: migrating ================
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0006s
main: -- indexes(:vulnerability_reads)
main:    -> 0.0054s
main: -- remove_index(:vulnerability_reads, {:algorithm=>:concurrently, :name=>"index_vuln_reads_on_resolved_on_default_branch"})
main:    -> 0.0027s
main: == 20230104224020 DropVulnReadsOnDefaultBranchIndex: migrated (0.0135s) =======
Index creation
CREATE INDEX index_vuln_reads_common_query_on_resolved_on_default_branch ON vulnerability_reads USING btree (project_id, state, report_type, vulnerability_id DESC) WHERE (resolved_on_default_branch IS TRUE);

Response:

The query has been executed. Duration: 3.862 min

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14246/commands/50001

Query utilizing new index

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/14246/commands/50006

How to set up and validate locally

  1. Feature.enabled(:sec_mark_dropped_findings_as_resolved)
  2. Run a pipeline containing non-overlapping vulnerabilities identifiable by both semgrep and another scanner (i.e. spotbugs)
  3. Remove the semgrep rule (but ensure semgrep's ruleset is not empty or auto-resolution will not occur)
  4. Run a new pipeline
  5. Ensure the spotbugs result is not auto-resolved

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Lucas Charles

Merge request reports