Vulnerability report fails to load for large groups/projects
Summary
The group-level and project-level dashboards are both failing with an internal server error.
The instance query works:
Steps to reproduce
Use graphql explorer to run the queries above.
Example Project
gitlab-org/gitlab
What is the current bug behavior?
500 error
What is the expected correct behavior?
Response with data.
Relevant logs and/or screenshots
Output of checks
Results of GitLab environment info
Expand for output related to GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: `sudo gitlab-rake gitlab:env:info`) (For installations from source run and paste the output of: `sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)
Results of GitLab application Check
Expand for output related to the GitLab application check
(For installations with omnibus-gitlab package run and paste the output of:
sudo gitlab-rake gitlab:check SANITIZE=true
)(For installations from source run and paste the output of:
sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true
)(we will only investigate if the tests are passing)
Possible fixes
[5:21 PM] I think the problem is, it’s trying to load all the vulnerability_occurrences records for all the scanners into the memory and then doing the distinct on them. Since we have lots of records in the vulnerability_occurrences table, it’s causing the read amplification.
[5:21 PM] The original query is running around ~350ms on postgres.ai but the one with the lateral join is running around ~13ms.
SELECT
DISTINCT ON ("vulnerability_scanners"."external_id", "report_types"."report_type")
"vulnerability_scanners".*,
report_types.report_type
FROM "vulnerability_scanners"
INNER JOIN LATERAL (SELECT report_type FROM vulnerability_occurrences WHERE vulnerability_occurrences.scanner_id = vulnerability_scanners.id LIMIT 1) AS report_types ON TRUE
WHERE
"vulnerability_scanners"."project_id" IN (SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id" FROM "base_and_descendants" AS "namespaces") AND "projects"."archived" = FALSE AND "projects"."pending_delete" = FALSE)
ORDER BY
"vulnerability_scanners"."external_id" ASC;