# Improve the performance of the vulnerability report page severity counts using fuzzy counts
## Why are we doing this work
Counting the vulnerabilities for each severity currently causes timeouts because every vulnerability record related to a group or project has to be read. To address this performance issue, we need to make a feature change and implement "fuzzy counts".

With this change, we will show the exact count when a severity has 1,000 or fewer vulnerabilities, and show "1000+" when it has more.
This is how the counts will look:
Note: This change won't affect the vulnerability list, which means we will keep showing all the vulnerabilities as we do today.
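As a rough sketch of how the threshold could be applied when rendering, assuming the per-severity counts land in a hypothetical `severity_counts(severity, cnt)` relation; the counting query below stops scanning at 1,001 rows, so a count of 1,001 is the sentinel for "more than 1,000":

```sql
-- Minimal sketch of the display rule, assuming a hypothetical
-- severity_counts(severity, cnt) relation filled by the counting query below.
-- cnt = 1001 means more than 1000 rows matched, so we render "1000+".
SELECT
    severity,
    CASE WHEN cnt > 1000 THEN '1000+' ELSE cnt::text END AS display_count
FROM severity_counts;
```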
## Implementation plan
- **backend** Implement a fuzzy counting query instead of counting all the rows (an example query is shown below)
- **database** Add the necessary indices to support the query (see the index sketch after the example query)
An example query:
```sql
SELECT
    count(*),
    "all_counts"."severity"
FROM (
    SELECT
        1,
        "severities"."severity"
    FROM
        -- One row per severity level to count
        unnest('{1,2,3,4,5,6,7}'::int[]) AS severities(severity),
        -- For each severity, scan at most 1001 matching rows, so the count
        -- is exact up to 1000 and capped at 1001 ("1000+") beyond that
        LATERAL (
            SELECT
                1
            FROM
                "vulnerability_reads"
            WHERE
                "vulnerability_reads"."traversal_ids" >= '{9970}'::bigint[] AND
                "vulnerability_reads"."report_type" IN (6, 99) AND
                "vulnerability_reads"."state" IN (1, 4) AND
                "vulnerability_reads"."archived" IS FALSE AND
                "vulnerability_reads"."severity" = "severities"."severity"
            LIMIT 1001
        ) AS "count_for_severity"
) AS "all_counts"
GROUP BY
    "all_counts"."severity"
```
## Verification steps
- Verify that the count query does not time out for the `gitlab-org` group.
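One possible way to check this, assuming access to a production clone (for example through GitLab's database lab tooling) and reusing the filters from the example query above, where `9970` is assumed to be the `gitlab-org` namespace id: time a single per-severity branch, since that is the scan the `LATERAL` join repeats for each severity.

```sql
-- Run against a production clone, not production itself.
-- The filter values mirror the example query; 9970 is the assumed
-- gitlab-org namespace id and severity = 1 is an arbitrary branch.
EXPLAIN (ANALYZE, BUFFERS)
SELECT 1
FROM "vulnerability_reads"
WHERE "vulnerability_reads"."traversal_ids" >= '{9970}'::bigint[]
  AND "vulnerability_reads"."report_type" IN (6, 99)
  AND "vulnerability_reads"."state" IN (1, 4)
  AND "vulnerability_reads"."archived" IS FALSE
  AND "vulnerability_reads"."severity" = 1
LIMIT 1001;
```

The execution time should stay comfortably below the statement timeout even for the largest severity bucket.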