Skip to content

Remove `[analyzer]_pipeline` metrics join on ci_builds"

Subashis Chakraborty requested to merge 338010-latest-successful-by-build into master

What does this MR do and why?

[analyzer]_pipeline service ping metrics are sent for every CI job that contains a scan, not just for those that are part of successful non-retried jobs. This is an attempt to reduce the load on the database.

Related to #338010 (closed)

We took similar approach in this MR.

Impact to the database

Query

Before:

SELECT 
  "security_scans".* 
FROM 
  "security_scans" 
  INNER JOIN "ci_builds" ON "ci_builds"."id" = "security_scans"."build_id" 
  AND "ci_builds"."type" = 'Ci::Build' 
WHERE 
  "ci_builds"."status" = 'success' 
  AND (
    "ci_builds"."retried" = FALSE 
    OR "ci_builds"."retried" IS NULL
  ) 
  AND "security_scans"."scan_type" = 1 
  AND "security_scans"."created_at" BETWEEN '2021-09-27 19:55:50.275002' 
  AND '2021-09-29 19:55:45.806409'

The explain plan generated from #database_lab can be found at the following location https://explain.depesz.com/s/yE1T

After

A query like the following will be made for each scan type (of which there are nine).

SELECT 
  "security_scans".* 
FROM 
  "security_scans" 
WHERE 
  "security_scans"."scan_type" = 1 
  AND "security_scans"."created_at" BETWEEN '2021-09-27 19:55:50.275002' 
  AND '2021-09-29 19:55:45.806409'

This is an expected improvement on the previous query as it doesn't have to join with the ci_builds table and filter by the latest successful build.

The new query searches using the index on scan_type. This query is run against scan_type 1 (SAST), which accounts for the vast majority of Security Scans.

The explain plan generated from #database_lab can be found at the following location https://explain.depesz.com/s/pzXV.

Edited by Subashis Chakraborty

Merge request reports