Count query for issues list is very slow due to banned_users filter
Problem
In https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/16985#note_1304752816 we noticed that the query for getting the issue count on the issues list page is very expensive and it's also called very frequently. It seems the filtering out of banned_users
is an expensive part of this count.
This query can be seen in performance bar labelled graphql (getIssuesCountEE)
and is:
SELECT COUNT(*) FROM "issues" WHERE (NOT EXISTS (SELECT 1 FROM "banned_users" WHERE (issues.author_id = banned_users.user_id))) AND "issues"."project_id" = 278964 AND "issues"."issue_type" IN (0, 1, 2, 4)
This contributes as a major factor for GraphQL based slow queries in Postgres:
Solution
If filtering banned_users
from the issue count isn't essential then we could consider simplifying the query. The buffers reduces by about 50% (which is not a lot) but the speed of a fully cached response is still around 4-5x times faster which makes sense as Postgres has way less to do when there is no join:
- With
banned_users
filter: - Without
banned_users
filter:
The actual answer is also not very different when ignoring banned_users
(presumably because spam issues are also being deleted in the DB):
gitlabhq_dblab=# SELECT COUNT(*) FROM "issues" WHERE "issues"."project_id" = 278964 AND "issues"."issue_type" IN (0, 1, 2, 4);
count
--------
151115
(1 row)
gitlabhq_dblab=# SELECT COUNT(*) FROM "issues" WHERE (NOT EXISTS (SELECT 1 FROM "banned_users" WHERE (issues.author_id = banned_users.user_id))) AND "issues"."project_id" = 278964 AND "issues"."issue_type" IN (0, 1, 2, 4);
count
--------
148896
(1 row)