[Proposal] Add `namespace_id` to `vulnerability_reads` table

Why are we doing this work

The "Group-level vulnerability report" page is mostly timing out as we are trying to generate the response by searching a huge amount of data. We've already fixed the timeout issues for the "project-level vulnerability report" page by using the unnested_in_filters but applying the same solution for the group-level queries might not be a good solution.

Therefore, we can add the namespace_id into the vulnerability_reads table and use that attribute for filtering the vulnerabilities instead of filtering by all the projects belonging to a specific group.

The query filtering by projects

WITH RECURSIVE "base_and_descendants" AS (
  (
    SELECT 
      "namespaces"."id"
    FROM
      "namespaces"
    WHERE
      "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970 
  )
  UNION 
  (
    SELECT
      "namespaces"."id"
    FROM
      "namespaces",
      "base_and_descendants"
    WHERE
      "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id"
  )
),
"group_projects" AS (
  SELECT
    "projects"."id" AS "project_id"
  FROM
    "projects"
  INNER JOIN "base_and_descendants" ON "base_and_descendants"."id" = "projects"."namespace_id"
  INNER JOIN "project_settings" ON "project_settings"."project_id" = "projects"."id" AND "project_settings"."has_vulnerabilities" IS TRUE
  WHERE
    "projects"."archived" = FALSE AND "projects"."pending_delete" = FALSE
)
SELECT
    "vulnerability_reads".*
FROM
    "group_projects",
    unnest('{0, 4}'::smallint[]) AS "report_types" ("report_type"),
    unnest('{1, 4}'::smallint[]) AS "states" ("state"),
    unnest(ARRAY["group_projects"."project_id"]::integer[]) as projects(id),
    LATERAL (
        SELECT
            "vulnerability_reads".*
        FROM
            "vulnerability_reads"
        WHERE
            "vulnerability_reads"."project_id" = "projects"."id"
            AND "vulnerability_reads"."severity" = 5
            AND (vulnerability_reads."report_type" = "report_types"."report_type")
            AND (vulnerability_reads."state" = "states"."state")
        ORDER BY
            "vulnerability_reads"."severity" DESC,
            "vulnerability_reads"."vulnerability_id" DESC
        LIMIT 20) AS vulnerability_reads
ORDER BY
    "vulnerability_reads"."severity" DESC,
    "vulnerability_reads"."vulnerability_id" DESC
LIMIT 20

Cold cache execution plan(~8.5s): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10979/commands/39444
Warm cache execution plan(~36ms): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10979/commands/39445

The query filtering by namespaces

WITH RECURSIVE "base_and_descendants" AS MATERIALIZED (
  (
    SELECT 
      "namespaces"."id"
    FROM
      "namespaces"
    WHERE
      "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970 
  )
  UNION 
  (
    SELECT
      "namespaces"."id"
    FROM
      "namespaces",
      "base_and_descendants"
    WHERE
      "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id"
  )
)
SELECT
    "vulnerability_reads".*
FROM
    "base_and_descendants",
    unnest('{0, 4}'::smallint[]) AS "report_types" ("report_type"),
    unnest('{1, 4}'::smallint[]) AS "states" ("state"),
    unnest(ARRAY["base_and_descendants"."id"]::integer[]) as namespaces(id),
    LATERAL (
        SELECT
            "vulnerability_reads".*
        FROM
            "vulnerability_reads"
        WHERE
            "vulnerability_reads"."namespace_id" = "namespaces"."id"
            AND "vulnerability_reads"."severity" = 5
            AND (vulnerability_reads."report_type" = "report_types"."report_type")
            AND (vulnerability_reads."state" = "states"."state")
        ORDER BY
            "vulnerability_reads"."severity" DESC,
            "vulnerability_reads"."vulnerability_id" DESC
        LIMIT 20) AS vulnerability_reads
ORDER BY
    "vulnerability_reads"."severity" DESC,
    "vulnerability_reads"."vulnerability_id" DESC
LIMIT 20

Without index execution plan (~42ms): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10979/commands/39475
With index warm cache execution plan(~17ms): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10979/commands/39478

As seen in the above execution plans, by adding the namespace_id column to the vulnerability_reads table we can achieve a tangible performance improvement.

Relevant links

"Unnested in filters" optimization introduced by: Filter vulnerability reads by the cartesian pro... (!90207 - merged)

Implementation plan

database Add namespace_id(bigint) column to vulnerability_reads table along with the FK constraints
database Create index on vulnerability_reads table btree(namespace_id, state, severity, vulnerability_id DESC)
database Change existing triggers to set namespace_id of vulnerability_reads records
database Create migration to backfill the namespace_id column of existing vulnerability_reads records
backend Adjust VulnerabilityReadsFinder to use unnested_in_filters with base_and_descendants CTE when the vulnerable is a group.

Verification steps

TBD

Edited Jul 15, 2022 by Mehmet Emin INAC