Skip to content

Add violation_data to scan_result_policy_violations

What does this MR do and why?

This MR adds violation_data to scan_result_policy_violations table so that details about policy violations can be persisted and resurfaced to the user.

The following cases will be covered by this field (there's not going to be any filtering based on this data):

  • Context:
  • Violations:
    • scan_finding rules: newly detected and previously existing uuid that caused violation. Example:
      {"violations": {"uuids": {"newly_detected": ["48cf6bb7-3c8b-5bad-a07a-0104d50ab74f"], "previously_existing": ["6885fb09-428e-5471-ab39-762366eefcba"]}}}
    • license_scanning rules: violated license names. Example:
      {"violations": {"licenses": ["MIT"]}
    • any_merge_request rules: whether unsigned or any commit caused the violation and approvals to be required
      {"violations": {"commits": "UNSIGNED"}}
  • Errors:
    • scan removed. Example:
      {"errors": [{error: "SCAN_REMOVED", missing_scans: ["sast"]]}
    • scanner missing. Example:
      {"errors": [{error: "SCANNER_MISSING", missing_scanners: ["sast"]]}

We want to be able to display the reasons for policy violations in &12191:

Bot designs

Database

Security::ScanResultPolicyViolation.upsert_all(attrs, unique_by: %w[scan_result_policy_id merge_request_id])

New query:

INSERT INTO "scan_result_policy_violations" ("scan_result_policy_id","merge_request_id","project_id","violation_data","created_at","updated_at") VALUES (204490, 2004, 179, '{"errors":[{"error":"SCAN_REMOVED","missing_scans":["sast"]}]}', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP) ON CONFLICT ("scan_result_policy_id","merge_request_id") DO UPDATE SET updated_at=(CASE WHEN ("scan_result_policy_violations"."project_id" IS NOT DISTINCT FROM excluded."project_id" AND "scan_result_policy_violations"."violation_data" IS NOT DISTINCT FROM excluded."violation_data") THEN "scan_result_policy_violations".updated_at ELSE CURRENT_TIMESTAMP END),"project_id"=excluded."project_id","violation_data"=excluded."violation_data" RETURNING "id"

Execution plan from local because violation_data column doesn't exist on production yet: https://explain.depesz.com/s/evCM

Previous query:

INSERT INTO "scan_result_policy_violations" ("scan_result_policy_id","merge_request_id","project_id","created_at","updated_at") VALUES (204490, 2004, 179, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP) ON CONFLICT  DO NOTHING RETURNING "id"

https://explain.depesz.com/s/f2tY

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #433390 (closed)

Edited by Martin Čavoj

Merge request reports