Resolve cross-joins in Security::ProjectStatistics and Vulnerabilities::Identifier

What does this MR do and why?

Resolve cross-database join in Security::ProjectStatistics and Vulnerabilities::Identifier when querying a group and its subgroup. This solved by joining vulnerability_statistics instead of the projects and namespaces tables.

The two changes are behind two features flags. These flags can't be enabled before running the backfill BBM for the new columns added to vulnerability_statistics. The backfill BBM is implemented in !177993 (merged).

The original implementation (i.e. when the feature flags are disabled) now links to the feature flag rollout issues. The cross-join will be entirely resolved when the feature flags will be cleaned up entirely.

Feature flag rollout issues:

References

Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

bundle exec bin/rspec \
  ee/spec/graphql/resolvers/vulnerabilities/identifier_search_resolver_spec.rb \
  ee/spec/models/vulnerabilities/identifier_spec.rb \
  ee/spec/graphql/ee/types/group_type_spec.rb

SQL queries

The queries were collected by running specific specs. See !178886 (comment 2312237911) and !178886 (comment 2312229518)

The query plans were collected after doing the backfill manually, so that we don't need to wait for Resolve "Backfill vulnerability_statistics.trav... (!177993 - merged). See !178886 (comment 2314725719)

SELECT 
  SUM(
    "project_security_statistics"."vulnerability_count"
  ) 
FROM 
  "project_security_statistics" 
WHERE 
  "project_security_statistics"."project_id" IN (
    SELECT 
      "vulnerability_statistics"."project_id" 
    FROM 
      "vulnerability_statistics" 
    WHERE 
      "vulnerability_statistics"."traversal_ids" >= '{9970}' 
      AND "vulnerability_statistics"."traversal_ids" < '{9971}' 
      AND "vulnerability_statistics"."archived" = FALSE
  )

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110101

SELECT 
  DISTINCT "vulnerability_identifiers"."name" 
FROM 
  "vulnerability_identifiers" 
WHERE 
  "vulnerability_identifiers"."project_id" IN (
    SELECT 
      "vulnerability_statistics"."project_id" 
    FROM 
      "vulnerability_statistics" 
    WHERE 
      "vulnerability_statistics"."traversal_ids" >= '{9970}' 
      AND "vulnerability_statistics"."traversal_ids" < '{9971}' 
      AND "vulnerability_statistics"."archived" = FALSE
  ) 
  AND (name ILIKE '%CVE-2024-53%') 
ORDER BY 
  "vulnerability_identifiers"."name" ASC 
LIMIT 
  50

See also https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110104

For context, this query gives the number of vulnerability_statistics for the group:

    SELECT 
      "vulnerability_statistics"."project_id" 
    FROM 
      "vulnerability_statistics" 
    WHERE 
      "vulnerability_statistics"."traversal_ids" >= '{9970}' 
      AND "vulnerability_statistics"."traversal_ids" < '{9971}' 
      AND "vulnerability_statistics"."archived" = FALSE

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110100

Related to #510091 (closed)

Edited by Fabien Catteau

Merge request reports

Loading