Optimize SecurityFinding scopes: by_state, recently_detected, undismissed_by_vulnerability

Follow-up to !214777 (merged) to optimize the three scopes that are causing query performance issues.

Context

In !214777 (merged), we identified that the following scopes are causing significant performance problems due to expensive NOT EXISTS subqueries that join through vulnerability_occurrences:

scope :by_state, ->(states) do
  states = Array(states).map(&:to_s)

  relation = where(
    'EXISTS (?)',
    Vulnerability.select(1)
      .with_states(states)
      .joins(:findings)
      .where('vulnerability_occurrences.uuid = security_findings.uuid')
  )

  # If the given list of states includes `detected` we should return
  # the findings which does not exist on main branch as well.
  relation = relation.or(recently_detected) if states.include?('detected')

  relation
end

scope :recently_detected, -> do
  where(
    'NOT EXISTS (?)',
    Vulnerabilities::Finding.select(1)
      .where('vulnerability_occurrences.uuid = security_findings.uuid')
  )
end

scope :undismissed_by_vulnerability, -> do
  where(
    'NOT EXISTS (?)',
    Vulnerability.select(1)
      .dismissed
      .joins(:findings)
      .where('vulnerability_occurrences.uuid = security_findings.uuid')
  )
end

Problem

These scopes generate expensive NOT EXISTS subqueries that:

  • Load significant amounts of data from vulnerability_occurrences table
  • Cause the query planner to scan many rows even when filtering by severity
  • Contribute to query timeouts on deep pagination

When state: :detected is applied, the query becomes even worse with an OR condition that includes both EXISTS and NOT EXISTS checks.

Potential Solutions

  1. Add index on vulnerability_occurrences.uuid - Currently no index exists on the UUID column in the vulnerabilities table, forcing the join through vulnerability_occurrences
  2. Denormalize state information - Consider storing vulnerability state directly on security_findings to avoid the join
  3. Refactor the query logic - Explore alternative approaches to filtering by vulnerability state without expensive subqueries
  4. Use materialized views - Pre-compute valid findings for common filter combinations
Edited by 🤖 GitLab Bot 🤖