Resolve cross-joins in Security::ProjectStatistics and Vulnerabilities::Identifier
What does this MR do and why?
Resolve cross-database join in Security::ProjectStatistics
and Vulnerabilities::Identifier when querying a group and its subgroup.
This solved by joining vulnerability_statistics
instead of the projects and namespaces tables.
The two changes are behind two features flags.
These flags can't be enabled before running the backfill BBM for the new columns added to vulnerability_statistics.
The backfill BBM is implemented in !177993 (merged).
The original implementation (i.e. when the feature flags are disabled) now links to the feature flag rollout issues. The cross-join will be entirely resolved when the feature flags will be cleaned up entirely.
Feature flag rollout issues:
- [Feature flag] Enable search_identifier_name_in... (#515039 - closed)
- [Feature flag] Enable sum_vulnerability_count_f... (#515040 - closed)
References
Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
bundle exec bin/rspec \
ee/spec/graphql/resolvers/vulnerabilities/identifier_search_resolver_spec.rb \
ee/spec/models/vulnerabilities/identifier_spec.rb \
ee/spec/graphql/ee/types/group_type_spec.rb
SQL queries
The queries were collected by running specific specs. See !178886 (comment 2312237911) and !178886 (comment 2312229518)
The query plans were collected after doing the backfill manually, so that we don't need to wait for Resolve "Backfill vulnerability_statistics.trav... (!177993 - merged). See !178886 (comment 2314725719)
SELECT
SUM(
"project_security_statistics"."vulnerability_count"
)
FROM
"project_security_statistics"
WHERE
"project_security_statistics"."project_id" IN (
SELECT
"vulnerability_statistics"."project_id"
FROM
"vulnerability_statistics"
WHERE
"vulnerability_statistics"."traversal_ids" >= '{9970}'
AND "vulnerability_statistics"."traversal_ids" < '{9971}'
AND "vulnerability_statistics"."archived" = FALSE
)
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110101
SELECT
DISTINCT "vulnerability_identifiers"."name"
FROM
"vulnerability_identifiers"
WHERE
"vulnerability_identifiers"."project_id" IN (
SELECT
"vulnerability_statistics"."project_id"
FROM
"vulnerability_statistics"
WHERE
"vulnerability_statistics"."traversal_ids" >= '{9970}'
AND "vulnerability_statistics"."traversal_ids" < '{9971}'
AND "vulnerability_statistics"."archived" = FALSE
)
AND (name ILIKE '%CVE-2024-53%')
ORDER BY
"vulnerability_identifiers"."name" ASC
LIMIT
50
See also https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110104
For context, this query gives the number of vulnerability_statistics for the group:
SELECT
"vulnerability_statistics"."project_id"
FROM
"vulnerability_statistics"
WHERE
"vulnerability_statistics"."traversal_ids" >= '{9970}'
AND "vulnerability_statistics"."traversal_ids" < '{9971}'
AND "vulnerability_statistics"."archived" = FALSE
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/35549/commands/110100
Related to #510091 (closed)