Skip to content

List group vulnerability scanners using vulnerability statistics

What does this MR do and why?

Resolve a cross-database join in EE::Group#vulnerability_scanners by JOINing vulnerability_statistics to list vulnerability_scanners in a group and its subgroups. Before this change, the method joined vulnerability_scanners with projects and namespaces, but these tables belong to different gitlab schemas.

The implementation leverages the traversal_ids and archived columns added to vulnerability_scanners in !177591 (merged). It uses the by_group and unarchived scopes added to Vulnerabilities::Statistic as part of !177892 (merged).

The change is behind a feature flag named group_vulnerability_scanners_using_statistics. Feature flag rollout issue is #513300 (closed).

As a result of this change, #vulnerability_scanners might return scanners of projects that have been marked for deletion (i.e. such as projects.delete_pending is true). In similar queries, we decided that listing such projects was acceptable. See #437635 (comment 1749715754)

Specs update

  • Add context for when the feature flag is disabled.
  • Create vulnerability statistics related to the scanners retrieved by the query. This is necessary for the join to work.
  • Replace contain_exactly matcher with include and .not_to include. This is necessary to ignore the extra scanners that are being created when calling the vulnerability_statistic factory while keeping the code simple.
  • Add an unrelated scanner that belongs to a project that's not in the group being selected.

References

Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

bundle exec bin/rspec -E '#vulnerability_scanners' ee/spec/models/ee/group_spec.rb
bundle exec bin/rspec -E 'when listing scanners for group' ee/spec/graphql/resolvers/vulnerabilities/scanners_resolver_spec.rb

SQL queries

The following queries have been collected using the ActiveRecord::Base.logger when running the specs. The traversal IDs have been change to match gitlab-org.

The query plans of the original queries have been collected before the ones of the new queries. As a result, the new queries might benefit from warm cache, but that would be limited to vulnerability_scanners and vulnerability_occurrences.

vulnerability_statistics.traversal_ids and .archived were backfilled on console.postgres.ai before running the new queries. See !177892 (comment 2297685662)

The new queries use idx_vulnerability_statistics_on_traversal_ids_and_letter_grade index introduced in !177892 (diffs).

Query triggered by EE::Group#vulnerability_scanners:

SELECT
  "vulnerability_scanners".* 
FROM 
  "vulnerability_scanners" 
WHERE 
  "vulnerability_scanners"."project_id" IN (
    SELECT 
      "vulnerability_statistics"."project_id" 
    FROM 
      "vulnerability_statistics" 
    WHERE 
      "vulnerability_statistics"."traversal_ids" >= '{9970}' 
      AND "vulnerability_statistics"."traversal_ids" < '{9971}' 
      AND "vulnerability_statistics"."archived" = FALSE
  )

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35212/commands/109340

Query triggered by ScannersResolver:

SELECT 
  DISTINCT ON (
    "vulnerability_scanners"."external_id", 
    "report_types"."report_type"
  ) "vulnerability_scanners".*, 
  "report_types"."report_type" AS "report_type" 
FROM 
  "vulnerability_scanners" 
  JOIN LATERAL (
    SELECT 
      "vulnerability_occurrences"."report_type" 
    FROM 
      "vulnerability_occurrences" 
    WHERE 
      "vulnerability_occurrences"."scanner_id" = "vulnerability_scanners"."id" 
    LIMIT 
      1
  ) report_types ON true 
WHERE 
  "vulnerability_scanners"."project_id" IN (
    SELECT 
      "vulnerability_statistics"."project_id" 
    FROM 
      "vulnerability_statistics" 
    WHERE 
      "vulnerability_statistics"."traversal_ids" >= '{9970}' 
      AND "vulnerability_statistics"."traversal_ids" < '{9971}' 
      AND "vulnerability_statistics"."archived" = FALSE
  ) 
ORDER BY 
  "vulnerability_scanners"."external_id" ASC, 
  "report_types"."report_type" ASC

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35212/commands/109341

SQL queries prior to this change

Query triggered by EE::Group#vulnerability_scanners:

SELECT 
  "vulnerability_scanners".* 
FROM 
  "vulnerability_scanners" 
WHERE 
  "vulnerability_scanners"."project_id" IN (
    SELECT 
      "projects"."id" 
    FROM 
      "projects" 
    WHERE 
      "projects"."namespace_id" IN (
        SELECT 
          "namespaces"."id" 
        FROM 
          "namespaces" 
        WHERE 
          "namespaces"."type" = 'Group' 
          AND (
            traversal_ids @> ('{9970}')
          )
      ) 
      AND "projects"."archived" = FALSE 
      AND "projects"."pending_delete" = FALSE
  )

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35212/commands/109338

Query triggered by ScannersResolver:

SELECT 
  DISTINCT ON (
    "vulnerability_scanners"."external_id", 
    "report_types"."report_type"
  ) "vulnerability_scanners".*, 
  "report_types"."report_type" AS "report_type" 
FROM 
  "vulnerability_scanners" 
  JOIN LATERAL (
    SELECT 
      "vulnerability_occurrences"."report_type" 
    FROM 
      "vulnerability_occurrences" 
    WHERE 
      "vulnerability_occurrences"."scanner_id" = "vulnerability_scanners"."id" 
    LIMIT 
      1
  ) report_types ON true 
WHERE 
  "vulnerability_scanners"."project_id" IN (
    SELECT 
      "projects"."id" 
    FROM 
      "projects" 
    WHERE 
      "projects"."namespace_id" IN (
        SELECT 
          "namespaces"."id" 
        FROM 
          "namespaces" 
        WHERE 
          "namespaces"."type" = 'Group' 
          AND (
            traversal_ids @ > ('{9970}')
          )
      ) 
      AND "projects"."archived" = FALSE 
      AND "projects"."pending_delete" = FALSE
  )

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35212/commands/109339

Related to #513032 (closed)

Edited by Fabien Catteau

Merge request reports

Loading