Skip to content

Remove duplicate broken container scanning findings

Can Eldem requested to merge remove-duplicated-cs-findings-without-vid into master

What does this MR do?

Incident issue

#246797 (closed)

Latest migration couldn't remove duplicated migrations because it didn't cover the ones without vulnerability_id. (this normally should not be the case and groupthreat insights is working on it)

This migration removes duplications created without vulnerability ids.

--- this one is just to check data not part of migration
SELECT aaa.id bad, 
       aaa.vulnerability_id as bad_vulnerability_id,
       bbb.id good,
       bbb.vulnerability_id as  good_vulnerability_id
FROM   vulnerability_occurrences AS aaa 
       JOIN vulnerability_occurrences bbb 
              ON bbb.location_fingerprint = Decode( 
                 Encode(aaa. location_fingerprint, 
                 'escape'), 
                    'hex') 
                 AND bbb.report_type = 2 
                 AND bbb.id <> aaa.id 
                 AND aaa.scanner_id = bbb.scanner_id 
                 AND aaa.primary_identifier_id = bbb.primary_identifier_id 
                 AND bbb.project_id = aaa.project_id 
WHERE  aaa.report_type = 2 
       AND Length(aaa.location_fingerprint) = 40;

Migration Query

explain SELECT id, project_id, primary_identifier_id, location_fingerprint, scanner_id
FROM "vulnerability_occurrences"
WHERE "vulnerability_occurrences"."id" BETWEEN 231411 AND 2312411 AND "vulnerability_occurrences"."report_type" = 2 AND (length(location_fingerprint) = 40) and vulnerability_id is null

https://explain.depesz.com/s/N7er

Time: 1.820 s
  - planning: 0.343 ms
  - execution: 1.820 s
    - I/O read: 1.801 s
    - I/O write: 0.000 ms
UPDATE "vulnerability_occurrences" 
SET    "location_fingerprint" = '\x43dcc8b28550993b1f8c1ca2d09d324e26bf5ac4' 
WHERE  "vulnerability_occurrences"."id" = 239934 

https://explain.depesz.com/s/iCE6O

Time: 80.390 ms
  - planning: 0.186 ms
  - execution: 80.204 ms
    - I/O read: 78.849 ms
    - I/O write: 0.000 ms
explain DELETE FROM "vulnerability_occurrences" WHERE "vulnerability_occurrences"."id" = 1

https://explain.depesz.com/s/Pcn2

Time: 668.960 ms
  - planning: 0.267 ms
  - execution: 668.693 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Conformity

Related cascade deletes

ALTER TABLE ONLY public.vulnerability_occurrences
    ADD CONSTRAINT fk_97ffe77653 FOREIGN KEY (vulnerability_id) REFERENCES public.vulnerabilities(id) ON DELETE SET NULL;

---- 
ALTER TABLE ONLY public.vulnerability_occurrence_identifiers
    ADD CONSTRAINT fk_rails_e4ef6d027c FOREIGN KEY (occurrence_id) REFERENCES public.vulnerability_occurrences(id) ON DELETE CASCADE;


ALTER TABLE ONLY public.vulnerability_occurrence_pipelines
    ADD CONSTRAINT fk_rails_dc3ae04693 FOREIGN KEY (occurrence_id) REFERENCES public.vulnerability_occurrences(id) ON DELETE CASCADE;

----

Estimated times

  • 1_500 /1000 = 2 loop
  • 1000 * 800ms = 800_000 ms each loops update time
  • 800_000 * 2 = 160_000 (~160 seconds)
  • 4 (waiting min) + 2 = 6 min
Edited by Can Eldem

Merge request reports