Draft: Improve performance of dormant users deactivation
What does this MR do?
This MR builds on top of !57778 (merged) to improve deactivation of dormant users.
In the above mentioned MR we started with the following code
User.dormant.find_each do |user|
if user.can_be_deactivated?
with_context(user: user) { user.deactivate }
end
end
where User.dormant
scope had a naive initial definition like
scope :dormant, -> { all }
What is a dormant user?
Dormant users are users that have no recent activity and are considered OK to be deactivated. Currently we define this in User#can_be_deactivated?
:
def can_be_deactivated?
active? && no_recent_activity? && !internal?
end
If we break this down
-
active?
- can be translated towith_state(:active)
-
!internal?
- this is!(ghost? || (bot? && !project_bot?))
, which can be expressed in SQL with theUser.non_internal
scope - https://gitlab.com/gitlab-org/gitlab/-/blob/8a31f66b5633ec593941e39364425edf13e6592b/app/models/concerns/has_user_type.rb#L27.
Both these are already combined in the User.active
scope (https://gitlab.com/gitlab-org/gitlab/-/blob/8a31f66b5633ec593941e39364425edf13e6592b/app/models/user.rb#L370):
scope :active, -> { with_state(:active).non_internal }
User#no_recent_activity?
is a bit trickier. Currently it is defined as:
def no_recent_activity?
last_active_at.to_i <= MINIMUM_INACTIVE_DAYS.days.ago.to_i
end
where last_active_at
is not a database column, but another method on User
(https://gitlab.com/gitlab-org/gitlab/-/blob/8a31f66b5633ec593941e39364425edf13e6592b/app/models/user.rb#L1840-1845):
def last_active_at
last_activity = last_activity_on&.to_time&.in_time_zone
last_sign_in = current_sign_in_at
[last_activity, last_sign_in].compact.max
end
If we want to express this in SQL it gets a bit complicated to make it in performant way - first we need to take care for each of last_activity_on
and current_sign_in_at
having a value or being NULL, and then we have to compare against the one having bigger value (i.e. being more recent). Unfortunately we can't just create a index on GREATEST(last_activity_on, current_sign_in_at)
because these two columns are of different type - date
and timestamp with timezone
.
However, it turns out that taking into account current_sign_in_at
is not needed - whenever users signs in (which updates current_sign_in_at
), last_activity_on
is also updated (because the sign-in request is POST
).
Example:
Let's look at an user without sign-in:
gitlabhq_development=# select id, username, state, user_type, last_activity_on, current_sign_in_at, updated_at from users where id = 19;
id | username | state | user_type | last_activity_on | current_sign_in_at | updated_at
----+----------+--------+-----------+------------------+--------------------+----------------------------
19 | ressie | active | | | | 2021-04-06 04:26:50.980647
(1 row)
Time: 5.785 ms
This is what we see in the logs for the sign-in request:
and if we check the database record for this user after sign-in we see
gitlabhq_development=# select id, username, state, user_type, last_activity_on, current_sign_in_at, updated_at from users where id = 19;
id | username | state | user_type | last_activity_on | current_sign_in_at | updated_at
----+----------+--------+-----------+------------------+----------------------------+----------------------------
19 | ressie | active | | 2021-04-06 | 2021-04-06 04:31:35.345049 | 2021-04-06 04:31:36.647626
(1 row)
Time: 0.837 ms
All of the above means that we can ignore current_sign_in_at
and express no recent activity as where('last_activity_on <= ?', MINIMUM_INACTIVE_DAYS.day.ago.to_date)
.
Once we combine all of the above we end up with scope like
scope :dormant, -> { active.where('last_activity_on <= ?', MINIMUM_INACTIVE_DAYS.day.ago.to_date) }
and this can be supported very well with partial index on users (id, last_activity_on) WHERE (((state)::text = 'active'::text) AND ((user_type IS NULL) OR (user_type = ANY (ARRAY[NULL::integer, 6, 4]))))
.
In order to match the current definition of no_recent_activity?
and cover users with no activity at all we should do something like
last_activity_on <= ?
OR last_activity_on IS NULL
As we know OR
queries do not perform well - https://docs.gitlab.com/ee/development/sql.html#use-unions. The recommended approach is to use UNION
, but this makes iteration (find_each
) harder, so I think we're better if we add another scope like
scope :with_no_activity, -> { active.where(last_activity_on: nil) }
and iterate each type of users separately.
This way we end up with worker like this
BATCH_SIZE = 100
def perform
return if Gitlab.com?
return unless ::Gitlab::CurrentSettings.current_application_settings.deactivate_dormant_users
# rubocop: disable CodeReuse/ActiveRecord
User.dormant.find_each(batch_size: BATCH_SIZE) { |user| deactivate(user) }
User.with_no_activity.find_each(batch_size: BATCH_SIZE) { |user| deactivate(user) }
# rubocop: enable CodeReuse/ActiveRecord
end
private
def deactivate(user)
if user.can_be_deactivated?
with_context(user: user) { user.deactivate }
end
end
Database migrations
Up
$ bundle exec rails db:migrate:up VERSION=20210406030300
== 20210406030300 AddIndexForDormantUsers: migrating ==========================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:users, [:id, :last_activity_on], {:where=>"state = 'active' AND (users.user_type IS NULL OR users.user_type IN (NULL, 6, 4))", :name=>"index_users_on_id_and_last_activity_on_for_non_internal_active", :algorithm=>:concurrently})
-> 0.0101s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- add_index(:users, [:id, :last_activity_on], {:where=>"state = 'active' AND (users.user_type IS NULL OR users.user_type IN (NULL, 6, 4))", :name=>"index_users_on_id_and_last_activity_on_for_non_internal_active", :algorithm=>:concurrently})
-> 0.0055s
-- execute("RESET ALL")
-> 0.0006s
== 20210406030300 AddIndexForDormantUsers: migrated (0.0177s) =================
Down
$ bundle exec rails db:migrate:down VERSION=20210406030300
== 20210406030300 AddIndexForDormantUsers: reverting ==========================
-- transaction_open?()
-> 0.0000s
-- indexes(:users)
-> 0.0106s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- remove_index(:users, {:algorithm=>:concurrently, :name=>"index_users_on_id_and_last_activity_on_for_non_internal_active"})
-> 0.0028s
-- execute("RESET ALL")
-> 0.0006s
== 20210406030300 AddIndexForDormantUsers: reverted (0.0156s) =================
Index creation takes ~249s on when tested on postgres.ai:
exec CREATE INDEX CONCURRENTLY index_users_on_id_and_last_activity_on_for_non_internal_active ON users USING btree (id, last_activity_on) WHERE (((state)::text = 'active'::text) AND ((user_type IS NULL) OR (user_type = ANY (ARRAY[NULL::integer, 6, 4]))))
Session: webui-i3232
% time seconds wait_event
------ ------------ -----------------------------
90.77 226.406605 IO.DataFileRead
6.19 15.434963 Running
1.09 2.709451 IO.DataFileExtend
0.90 2.248166 LWLock.WALWriteLock
0.55 1.371173 IO.BufFileWrite
0.19 0.466658 IO.DataFileWrite
0.09 0.223423 IO.SLRURead
0.08 0.211910 IO.WALWrite
0.08 0.194466 IO.BufFileRead
0.05 0.135070 IPC.ParallelCreateIndexScan
0.01 0.024559 LWLock.buffer_mapping
------ ------------ -----------------------------
100.00 249.426444
The query has been executed. Duration: 249.426 s (estimated for prod: 22.169...228.373 s)
SQL queries
Here are the raw SQL queries generated with their execution plans:
SELECT "users".* FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (NULL, 6, 4)) AND (last_activity_on <= '2020-12-30') AND "users"."id" > 3170776 ORDER BY "users"."id" ASC LIMIT 100
- With cold cache - https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3232/commands/10500.
- With warm cache - https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3232/commands/10501.
SELECT "users".* FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (NULL, 6, 4)) AND (last_activity_on IS NULL) AND "users"."id" > 3170776 ORDER BY "users"."id" ASC LIMIT 100
- With cold cache - https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3232/commands/10503.
- With warm cache - https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3232/commands/10504.
In all cases we have Index Scan using index_users_on_id_and_last_activity_on_for_non_internal_active on public.users
and records are filtered as Index Cond
and not Filter
.
Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. -
I have not included a changelog entry because _____.
-
-
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. -
Tested in all supported browsers -
Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
Related to #211754 (closed)