Skip to content

Replace indexes for counting active users

Andreas Brandl requested to merge ab-admin-page-user-active-count into master

What does this MR do?

Replace indexes for counting active users

This adjusts the partial condition for an index. The index is intended to be used when counting active users with ghost IS NOT TRUE AND bot_type IS NULL.

With the current index, this wasn't working as the partial condition didn't match the query: ghost <> TRUE is not semantically equivalent to ghost IS NOT TRUE (null semantics).

The reason we add an index particularly intended for EE is that the EE query is going to have the additional part AND bot_type IS NULL whereas the CE query doesn't. Logically, it'd be enough to have an index for ghost IS NOT TRUE. However, on GitLab.com, the query planner makes poor choices when the additional AND bot_type IS NULL part is present: It goes for the index on bot_type and doesn't use the partial index.

Note the existing index isn't being used at all according to GitLab.com index statistics. Hence we can first remove it and don't have to worry about the window of time without an index.

Relates to https://gitlab.com/gitlab-org/gitlab-ce/issues/66770

Query plans

Plans from staging:

gitlabhq_production=# explain analyze SELECT COUNT(*) FROM "users" WHERE ("users"."state" IN ('active')) AND (ghost IS NOT TRUE) AND "users"."bot_type" IS NULL;
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=101816.81..101816.82 rows=1 width=8) (actual time=937.831..937.831 rows=1 loops=1)
   ->  Seq Scan on users  (cost=0.00..97995.74 rows=1528428 width=0) (actual time=0.015..833.454 rows=1528489 loops=1)
         Filter: ((ghost IS NOT TRUE) AND (bot_type IS NULL) AND ((state)::text = 'active'::text))
         Rows Removed by Filter: 4210
 Planning time: 0.320 ms
 Execution time: 937.865 ms
(6 rows)

Time: 939.240 ms

gitlabhq_production=# CREATE INDEX CONCURRENTLY index_users_on_state_and_internal_ee ON users (state) WHERE ghost IS NOT TRUE AND bot_type IS NULL;
CREATE INDEX
Time: 5929.095 ms
gitlabhq_production=# explain analyze SELECT COUNT(*) FROM "users" WHERE ("users"."state" IN ('active')) AND (ghost IS NOT TRUE) AND "users"."bot_type" IS NULL;
                                                                               QUERY PLAN                                                                                
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=42872.82..42872.83 rows=1 width=8) (actual time=345.913..345.914 rows=1 loops=1)
   ->  Index Only Scan using index_users_on_state_and_internal_ee on users  (cost=0.43..39051.75 rows=1528428 width=0) (actual time=0.061..245.163 rows=1528489 loops=1)
         Index Cond: (state = 'active'::text)
         Heap Fetches: 113626
 Planning time: 1.654 ms
 Execution time: 345.979 ms
(6 rows)

Time: 349.880 ms

Does this MR meet the acceptance criteria?

Conformity

  • Changelog entry for user-facing changes, or community contribution. Check the link for other scenarios.
Edited by Andreas Brandl

Merge request reports