Prune oauth_access_grants to remove old access_grants not valid anymore
As an additional step for preventing #358760 (closed) in future, oauth_access_grants
has a large number of tokens dating back to 2015 where each token expires_in
. We can add a worker to remove these periodically as well.
The table will hold records according to the agreed policy to keep its size under the limit, so that can be vacuumed periodically.
Policy
OAuth access grants: Active grants and 1 month of revoked grants
Proposal
We'll introduce a (unpartitioned) table that mirrors the oauth_access_grants
table's schema. Then we'll write a Sidekiq worker that iterates through all oauth_access_grants
, and for each row with revoked_at < 1.month.ago
, in one transaction, move the row to the archive table and delete it from production table.
Steps
Adapted from description of this issue:
First release (archive and deletion)
-
Create oauth_access_grant_archived_records
table.- Schema exactly mirrors the production
oauth_access_grants
schema
- Schema exactly mirrors the production
-
Create a new feature flag archive_revoked_access_tokens
-
Create a Sidekiq worker that archives and deletes rows from oauth_access_grants
- schedule is '0 9 2 * *' according with this
- Short-circuit
#perform
if feature flag is disabled - Iterate over all records in
oauth_access_grants
in batches, similar to how the original deletion worker did batched iteration- This obviates an additional index on
created_at
, which we would need a large table rule exception to create
- This obviates an additional index on
- For each batch of 10,000 records in
oauth_access_grants
- Filter the batch to records that are
created_at < 1.month.ago and revoked_at is nil
. - Begin a transaction
- Copy all filtered records to
oauth_access_grant_archived_records
- Delete all filtered records from
oauth_access_grants
- Commit the transaction
- Copy all filtered records to
- Filter the batch to records that are
-
Create a Batched Background Migration Job that archives and deletes rows from oauth_access_grants
until current date -
Enqueue the BBM using chatops, like mentioned here - Monitor table i/o during and after the run of the background job (how?)
- Run queries in prod to confirm deletions from
oauth_access_tokens
andoauth_access_grant_archived_records
-
Once the BBM is finished completely, turn on feature flag (gitlab.com) to enqueue archive_revoked_access_tokens
Previous description
500 when attempting to delete oauth application... (#393247 - closed) has already added an index to oauth_access_grants.application_id
avoid this issue in future.
D, [2022-10-14T08:57:46.322118 #979760] DEBUG -- : Doorkeeper::AccessGrant Destroy (15003.1ms) /*application:console,db_config_name:main*/ DELETE FROM "oauth_access_grants" WHERE "oauth_access_grants"."application_id" = 218553
D, [2022-10-14T08:57:46.323964 #979760] DEBUG -- : TRANSACTION (1.1ms) /*application:console,db_config_name:main*/ ROLLBACK
Traceback (most recent call last):
16: from app/services/users/migrate_records_to_ghost_user_service.rb:53:in `post_migrate_records'
15: from app/models/concerns/batch_destroy_dependent_associations.rb:23:in `destroy_dependent_associations_in_batches'
14: from app/models/concerns/batch_destroy_dependent_associations.rb:23:in `each'
13: from app/models/concerns/batch_destroy_dependent_associations.rb:27:in `block in destroy_dependent_associations_in_batches'
12: from lib/gitlab/database/load_balancing/connection_proxy.rb:71:in `transaction'
11: from lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
10: from lib/gitlab/database/load_balancing/load_balancer.rb:118:in `read_write'
9: from lib/gitlab/database/load_balancing/load_balancer.rb:200:in `retry_with_backoff'
8: from lib/gitlab/database/load_balancing/load_balancer.rb:122:in `block in read_write'
7: from lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
6: from lib/gitlab/database/load_balancing/connection_proxy.rb:61:in `block (2 levels) in <class:ConnectionProxy>'
5: from lib/gitlab/database/load_balancing/connection_proxy.rb:119:in `write_using_load_balancer'
4: from lib/gitlab/database/load_balancing/load_balancer.rb:118:in `read_write'
3: from lib/gitlab/database/load_balancing/load_balancer.rb:200:in `retry_with_backoff'
2: from lib/gitlab/database/load_balancing/load_balancer.rb:122:in `block in read_write'
1: from lib/gitlab/database/load_balancing/connection_proxy.rb:120:in `block in write_using_load_balancer'
ActiveRecord::QueryCanceled (PG::QueryCanceled: ERROR: canceling statement due to statement timeout)