Add scanner_reported_severity column to security_findings
## Summary
Eliminates the expensive LEFT JOIN in FindingsFinder by adding a scanner_reported_severity column to security_findings and updating the ingestion logic to store effective severity directly.
## What does this MR do?

### Phase 1: Schema Change

- Adds a `scanner_reported_severity` smallint column to store the original scanner-reported severity
- Creates feature flag `security_findings_effective_severity` (`gitlab_com_derisk`)
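The schema change itself is minimal. Assuming a standard Rails migration with no default and no `NOT NULL` constraint (so no table rewrite or backfill is triggered), the resulting DDL would be roughly:

```sql
-- Illustrative DDL only; the actual migration lives under db/migrate/...
ALTER TABLE security_findings
  ADD COLUMN scanner_reported_severity smallint;
```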
### Phase 2: Write Path (Ingestion)

- Updates `StoreFindingsService#store_finding_batch` to look up existing severity overrides
- Stores both `scanner_reported_severity` (original) and `severity` (effective, with overrides applied)
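A minimal sketch of the write-path logic (method and variable names are illustrative, not the actual `StoreFindingsService` code): the scanner's value is always recorded, and any existing override replaces it as the effective `severity`.

```ruby
# Illustrative sketch: compute the two severity values stored per finding.
# `overrides` maps finding UUID => overridden severity (smallint enum value).
def severity_attributes(finding_uuid, scanner_severity, overrides)
  {
    # Original value as reported by the scanner, always persisted
    scanner_reported_severity: scanner_severity,
    # Effective severity: the override wins when one exists
    severity: overrides.fetch(finding_uuid, scanner_severity)
  }
end

overrides = { "uuid-1" => 7 } # e.g. severity raised to "critical"
severity_attributes("uuid-1", 5, overrides)
# => { scanner_reported_severity: 5, severity: 7 }
severity_attributes("uuid-2", 6, overrides)
# => { scanner_reported_severity: 6, severity: 6 }
```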
### Phase 3: Read Path (FindingsFinder)

- When the feature flag is enabled, filters directly on `security_findings.severity` without the LEFT JOIN
- Removes the `COALESCE` expression that required joining to `vulnerability_occurrences`
## Changes

| File | Description |
|---|---|
| `db/migrate/...` | Add `scanner_reported_severity` smallint column |
| `config/feature_flags/...` | Feature flag `security_findings_effective_severity` |
| `ee/app/services/security/store_findings_service.rb` | Look up overrides during ingestion |
| `ee/app/finders/security/findings_finder.rb` | Remove LEFT JOIN when flag enabled |
| `ee/spec/...` | Tests for both changes |
## Query Plan Comparison

### Test Setup

Using `scan_id = 384406048` (gitlab-org/gitlab, partition 235, 45 findings with severities 6-7).

### Before (with LEFT JOIN)
```sql
SELECT security_findings.*
FROM security_findings
LEFT OUTER JOIN vulnerability_occurrences ON vulnerability_occurrences.uuid = security_findings.uuid
WHERE security_findings.scan_id = 384406048
  AND security_findings.partition_number = 235
  AND security_findings.deduplicated = TRUE
  AND COALESCE(vulnerability_occurrences.severity, security_findings.severity) IN (5, 6, 7)
ORDER BY security_findings.severity DESC, security_findings.id ASC
LIMIT 20;
```
```
Limit  (cost=2763.03..2763.06 rows=9 width=1110) (actual time=0.328..0.331 rows=20 loops=1)
  ->  Sort  (cost=2763.03..2763.06 rows=9 width=1110) (actual time=0.327..0.329 rows=20 loops=1)
        Sort Key: security_findings.severity DESC, security_findings.id
        Sort Method: top-N heapsort  Memory: 49kB
        ->  Nested Loop Left Join  (cost=1.01..2762.89 rows=9 width=1110) (actual time=0.024..0.298 rows=45 loops=1)
              Filter: (COALESCE(vulnerability_occurrences.severity, security_findings.severity) = ANY ('{5,6,7}'::integer[]))
              ->  Index Scan using security_findings_235_scan_id_deduplicated_idx on security_findings_235 security_findings  (cost=0.44..609.34 rows=598 width=1110) (actual time=0.014..0.034 rows=45 loops=1)
                    Index Cond: ((scan_id = 384406048) AND (deduplicated = true))
                    Filter: (partition_number = 235)
              ->  Index Scan using index_vuln_findings_on_uuid_including_vuln_id_1 on vulnerability_occurrences  (cost=0.57..3.59 rows=1 width=18) (actual time=0.005..0.005 rows=1 loops=45)
                    Index Cond: (uuid = security_findings.uuid)
Planning Time: 0.419 ms
Execution Time: 0.363 ms
```
- Cost: 2763
- Execution: 0.363 ms
- Requires a `Nested Loop Left Join` to `vulnerability_occurrences` (45 index lookups)
### After (without LEFT JOIN)
```sql
SELECT security_findings.*
FROM security_findings
WHERE security_findings.scan_id = 384406048
  AND security_findings.partition_number = 235
  AND security_findings.deduplicated = TRUE
  AND security_findings.severity IN (5, 6, 7)
ORDER BY security_findings.severity DESC, security_findings.id ASC
LIMIT 20;
```
```
Limit  (cost=621.03..621.08 rows=20 width=1110) (actual time=0.062..0.064 rows=20 loops=1)
  ->  Sort  (cost=621.03..621.92 rows=355 width=1110) (actual time=0.061..0.062 rows=20 loops=1)
        Sort Key: security_findings.severity DESC, security_findings.id
        Sort Method: top-N heapsort  Memory: 49kB
        ->  Index Scan using security_findings_235_scan_id_deduplicated_idx on security_findings_235 security_findings  (cost=0.44..611.59 rows=355 width=1110) (actual time=0.020..0.041 rows=45 loops=1)
              Index Cond: ((scan_id = 384406048) AND (deduplicated = true))
              Filter: ((partition_number = 235) AND (severity = ANY ('{5,6,7}'::integer[])))
Planning Time: 0.203 ms
Execution Time: 0.087 ms
```
- Cost: 621 (~4.5x reduction)
- Execution: 0.087 ms (~4x faster)
- Simple index scan, no join to `vulnerability_occurrences`
### Summary
| Metric | Before (JOIN) | After (no JOIN) | Improvement |
|---|---|---|---|
| Cost | 2763 | 621 | 4.5x lower |
| Execution time | 0.363 ms | 0.087 ms | 4x faster |
| Join operations | 45 index lookups | 0 | Eliminated |
Note: The postgres.ai explain commands returned 0 rows due to clone timing, but EXPLAIN ANALYZE executed directly on the clone returned the expected 45 rows and 20-row results shown above.
## Why a large table exception?

The `security_findings` table is large, but:

- The new column is a `smallint` (2 bytes) with no default
- The table has a 30-90 day retention, so old data cycles out naturally
- The performance benefit of eliminating the LEFT JOIN significantly outweighs the storage cost
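To put the storage cost in perspective (the row count below is a hypothetical illustration, not a measured figure), even at 100 million retained rows a 2-byte column adds on the order of a couple hundred megabytes of raw data, before alignment padding:

```ruby
# Back-of-envelope storage estimate for the new column (illustrative only).
bytes_per_value = 2              # smallint
rows = 100_000_000               # hypothetical retained row count
total_mb = (bytes_per_value * rows) / 1_000_000.0
puts "~#{total_mb.round} MB of raw column data"
# prints "~200 MB of raw column data"
```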
## Key Design Decisions

- **No backfill needed** - the 30-90 day retention means old data cycles out naturally
- **Eventual consistency** - the next pipeline run picks up any new overrides
- **Scanner severity always stored** - even when the flag is off, `scanner_reported_severity` is populated for future use
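The flag-gated read path can be sketched as follows. This is illustrative only: the `Feature` module below is a stand-in for GitLab's feature flag API, and `severity_filter_sql` is a hypothetical helper, not the actual `FindingsFinder` code.

```ruby
# Stand-in for GitLab's Feature.enabled? API (sketch only).
module Feature
  FLAGS = { security_findings_effective_severity: true }

  def self.enabled?(name)
    FLAGS.fetch(name, false)
  end
end

# Hypothetical helper showing which predicate each path applies.
def severity_filter_sql(severities)
  if Feature.enabled?(:security_findings_effective_severity)
    # Flag on: filter directly on the column, no join needed
    "security_findings.severity IN (#{severities.join(',')})"
  else
    # Flag off: legacy COALESCE over the LEFT JOINed table
    "COALESCE(vulnerability_occurrences.severity, security_findings.severity) " \
      "IN (#{severities.join(',')})"
  end
end

puts severity_filter_sql([5, 6, 7])
# prints "security_findings.severity IN (5,6,7)"
```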
## MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist.
## Related issues
Closes #590349
Related: https://gitlab.com/gitlab-com/request-for-help/-/issues/4212
Edited by Gregory Havenga