Cron job to delete unconfirmed users after N days
What does this MR do and why?
- The setting for this was added via !122276 (merged)
- It was later clarified that this setting should be for Premium customers, and therefore is EE only: #352514 (comment 1416519422)
- The newly-added scope was not using an existing index when I tried it so I added an index to make the query more efficient.
Database
Updated query based on latest changes to the cron worker (this is the SQL query for one batch of users using the keyset iterator):
SELECT "users"."created_at",
"users"."id",
"users"."username"
FROM "users"
WHERE "users"."user_type" = 0
AND ( "users"."state" IN ( 'active' ) )
AND "users"."confirmed_at" IS NULL
AND ( created_at < '2023-06-01 02:33:48.799079' )
AND ( ( "users"."created_at", "users"."id" ) <
( '2023-05-08 01:31:04.657909', 17067 ) )
ORDER BY "users"."created_at" DESC,
"users"."id" DESC
LIMIT 1000
Ran this using an id / created_at timestamp based on prod data:
SELECT "users"."created_at",
"users"."id",
"users"."username"
FROM "users"
WHERE "users"."user_type" = 0
AND ( "users"."state" IN ( 'active' ) )
AND "users"."confirmed_at" IS NULL
AND ( created_at < '2023-06-01 02:33:48.799079' )
AND ( ( "users"."created_at", "users"."id" ) < (
'Sat, 04 Oct 2014 17:27:28.759948000', 59387 ) )
ORDER BY "users"."created_at" DESC,
"users"."id" DESC
LIMIT 1000
Analysis:
- https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/19303/commands/63713
- https://explain.depesz.com/s/IFyG
Enqueueing job performance analysis
I then locally measured how long it takes to enqueue 1k jobs using Benchmark.ips as recommended by https://docs.gitlab.com/ee/development/performance.html
Benchmark.ips do |x|
x.report('enqueue delete user job for 1k users') do
User.first(1000).each do |user|
DeleteUserWorker.perform_async(
user.id, user.id, skip_authorization: true
)
end
end
end
=> #<Benchmark::IPS::Report:0x00000002917ae660
@data=nil,
@entries=
[#<Benchmark::IPS::Report::Entry:0x0000000291d3c528
@iterations=4,
@label="enqueue delete user job for 1k users",
@measurement_cycle=1,
@microseconds=5235790.0,
@show_total_time=true,
@stats=
#<Benchmark::IPS::Stats::SD:0x0000000291d3c5a0
@error=0,
@mean=0.7766325187584635,
@samples=
[0.8253780644224087, 0.8283343356178794, 0.6171647083001864, 0.8356529666933797]>>]>
So this shows an average of 0.777 iterations per second. Let's just round that up and say 1k jobs can be enqueued in ~1 second.
While it is hard to know how long the SQL query will take to run on prod, I think we can safely say it will run in under 20 seconds. So, 10 iterations (10k) records per run should be within safe limits if we want this worker to run in under 30 seconds.
Migration output
Migrate
% be rails db:migrate RAILS_ENV=test
main: == [advisory_lock_connection] object_id: 223900, pg_backend_pid: 13862
main: == 20230605043258 AddUnconfirmedCreatedAtIndexToUsers: migrating ==============
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0866s
main: -- index_exists?(:users, [:created_at], {:name=>"index_users_on_unconfirmed_and_created_at_for_active_humans", :where=>"confirmed_at IS NULL AND state = 'active' AND user_type IN (0)", :algorithm=>:concurrently})
main: -> 0.0068s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0003s
main: -- add_index(:users, [:created_at], {:name=>"index_users_on_unconfirmed_and_created_at_for_active_humans", :where=>"confirmed_at IS NULL AND state = 'active' AND user_type IN (0)", :algorithm=>:concurrently})
main: -> 0.0023s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20230605043258 AddUnconfirmedCreatedAtIndexToUsers: migrated (0.1076s) =====
Rollback
% be rails db:rollback:main RAILS_ENV=test
main: == [advisory_lock_connection] object_id: 233080, pg_backend_pid: 14286
main: == 20230605043258 AddUnconfirmedCreatedAtIndexToUsers: reverting ==============
main: -- transaction_open?()
main: -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main: -> 0.0856s
main: -- indexes(:users)
main: -> 0.0080s
main: -- execute("SET statement_timeout TO 0")
main: -> 0.0003s
main: -- remove_index(:users, {:algorithm=>:concurrently, :name=>"index_users_on_unconfirmed_and_created_at_for_active_humans"})
main: -> 0.0018s
main: -- execute("RESET statement_timeout")
main: -> 0.0003s
main: == 20230605043258 AddUnconfirmedCreatedAtIndexToUsers: reverted (0.1079s) =====
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.