Skip to content

Drop invalid records from vulnerability_feedback table

Michał Zając requested to merge 324893-drop-invalid-feedbacks into master

What does this MR do and why?

Related to #372251 (closed)

Database review

Total count: 380 282

SELECT count(*) FROM vulnerability_feedback;

Rows without matching uuid

Count: 106724

SELECT COUNT(v_f.finding_uuid)
FROM vulnerability_feedback v_f
LEFT JOIN vulnerability_occurrences v_o
ON v_f.finding_uuid = v_o.uuid::uuid
WHERE v_o.uuid IS NULL
AND v_f.finding_uuid IS NOT NULL;

Rows without matching project_fingerprint

Count: 7638

SELECT COUNT(*)
FROM vulnerability_feedback v_f
LEFT JOIN vulnerability_occurrences v_o
ON v_f.project_fingerprint::bytea = v_o.project_fingerprint
WHERE v_f.finding_uuid IS NULL
AND v_o.id IS NULL;

Query plan

TODO: This is really slow, either add index, rewrite this or make two separate migrations

Batch selection query
SELECT
  "vulnerability_feedback".*
FROM
  (
    (
      SELECT
        "vulnerability_feedback".*
      FROM
        "vulnerability_feedback"
        LEFT JOIN vulnerability_occurrences ON vulnerability_feedback.finding_uuid = vulnerability_occurrences.uuid :: uuid
      WHERE
        (
          vulnerability_feedback.finding_uuid IS NOT NULL
          AND vulnerability_occurrences.id IS NULL
        )
    )
    UNION
      (
        SELECT
          "vulnerability_feedback".*
        FROM
          "vulnerability_feedback"
          LEFT JOIN vulnerability_occurrences ON vulnerability_feedback.project_fingerprint :: bytea = vulnerability_occurrences.project_fingerprint
        WHERE
          (
            vulnerability_feedback.finding_uuid IS NULL
            AND vulnerability_occurrences.id IS NULL
          )
      )
  ) vulnerability_feedback
WHERE
  "vulnerability_feedback"."id" BETWEEN 1
  AND 100
  AND "vulnerability_feedback"."id" >= 1

Cold cache:

Warm cache:

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Michał Zając

Merge request reports