Performance issue with current SQL query used to count active users

Extracted from https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/5816#note_75996916:

gitlabhq_production=# explain analyze select count(*) from users where state = 'active' and ghost is not true and support_bot is not true;
                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=55866.30..55866.31 rows=1 width=8) (actual time=1310.855..1310.855 rows=1 loops=1)
   ->  Index Only Scan using yorick_test on users  (cost=0.43..50381.09 rows=2194084 width=0) (actual time=0.049..750.513 rows=2193050 loops=1)
         Index Cond: (state = 'active'::text)
         Heap Fetches: 51768
 Planning time: 0.212 ms
 Execution time: 1310.887 ms
(6 rows)

While it's about 2x faster than before, it's still 1.3 seconds. Since this code only runs very rarely I think this should be fine for the time being, but it might get slower as more and more users are added. We should definitely add an issue to redefine what "active users" means, allowing us to perhaps write a better performing SQL query.

Assignee Loading
Time tracking Loading