Optimize SecurityFinding scopes: by_state, recently_detected, undismissed_by_vulnerability
Follow-up to !214777 (merged) to optimize the three scopes that are causing query performance issues.
Context
In !214777 (merged), we identified that the following scopes are causing significant performance problems due to expensive NOT EXISTS subqueries that join through vulnerability_occurrences:
scope :by_state, ->(states) do
states = Array(states).map(&:to_s)
relation = where(
'EXISTS (?)',
Vulnerability.select(1)
.with_states(states)
.joins(:findings)
.where('vulnerability_occurrences.uuid = security_findings.uuid')
)
# If the given list of states includes `detected` we should return
# the findings which does not exist on main branch as well.
relation = relation.or(recently_detected) if states.include?('detected')
relation
end
scope :recently_detected, -> do
where(
'NOT EXISTS (?)',
Vulnerabilities::Finding.select(1)
.where('vulnerability_occurrences.uuid = security_findings.uuid')
)
end
scope :undismissed_by_vulnerability, -> do
where(
'NOT EXISTS (?)',
Vulnerability.select(1)
.dismissed
.joins(:findings)
.where('vulnerability_occurrences.uuid = security_findings.uuid')
)
endProblem
These scopes generate expensive NOT EXISTS subqueries that:
- Load significant amounts of data from
vulnerability_occurrencestable - Cause the query planner to scan many rows even when filtering by severity
- Contribute to query timeouts on deep pagination
When state: :detected is applied, the query becomes even worse with an OR condition that includes both EXISTS and NOT EXISTS checks.
Potential Solutions
- Add index on vulnerability_occurrences.uuid - Currently no index exists on the UUID column in the vulnerabilities table, forcing the join through vulnerability_occurrences
- Denormalize state information - Consider storing vulnerability state directly on security_findings to avoid the join
- Refactor the query logic - Explore alternative approaches to filtering by vulnerability state without expensive subqueries
- Use materialized views - Pre-compute valid findings for common filter combinations
Related Issues
- #574214 (closed) (original timeout issue)
- !214777 (merged) (keyset pagination fix)
Edited by 🤖 GitLab Bot 🤖