Skip to content

Implement OAuth access grant archival system and retention policy

What does this MR do and why?

  • Adds oauth_access_grant_archived_records table
    • The table has only organization FKs and minimal constraints because it is meant to be a simple dumping ground for data
  • Records revoked more than 1 month ago in oauth_access_grants will be copied to oauth_access_grant_archived_records and then deleted from oauth_access_grants
    • In this way, we soft delete records in the archive.
  • We'll drop the oauth_access_grant_archived_records table in a few milestones once we've confirmed the retention policy is being enforced with no issues

This will drastically reduce the number of records in the oauth_access_grants and its size (after table swap with a clean table, or pg_repack). A simulation on postgres.ai using the main production DB replica shows that applying cleanup according to the retention policy would reduce table size from 24.5 GB GB to 13 GB (indexes excluded). More details: #377995 (comment 2703854705)

This is the Part 1 of #377995 (closed)
Part 2 will be handled in: !207837 (merged)

Changelog: added

References

Main issue: #377995 (closed)
Plan: #555382
Details: #377995 (comment 2703854705)
Query plan: !204337 (comment 2750544518)

Screenshots or screen recordings

Before After

How to set up and validate locally

  • Run migrations, run tests

Queries in batch

SELECT "oauth_access_grants"."id" FROM "oauth_access_grants" WHERE "oauth_access_grants"."id" BETWEEN 4 AND 80002 AND "oauth_access_grants"."id" >= 19366 ORDER BY "oauth_access_grants"."id" ASC LIMIT 1 OFFSET 1000 
                                                                                                    
WITH deleted AS (
  DELETE FROM oauth_access_grants                                                                                                                                                        
  WHERE id IN (SELECT "oauth_access_grants"."id" FROM "oauth_access_grants" WHERE "oauth_access_grants"."id" BETWEEN 4 AND 80002 AND "oauth_access_grants"."id" >= 19366 AND "oauth_access_grants"."id" < 20366 AND "oauth_access_grants"."revoked_at" < '2025-08-10 00:00:00' LIMIT 1000)                                                                                        
  RETURNING *                                                                                                                                                                            
)                                                                                                                                                                                        
INSERT INTO oauth_access_grant_archived_records                                                                                                                                          
  (id, resource_owner_id, application_id, token, expires_in, redirect_uri,                                                                                                               
   revoked_at, created_at, scopes, organization_id, code_challenge,                                                                                                                      
   code_challenge_method, archived_at)                                                                                                                                                   
SELECT                                                                                                                                                                                   
  id, resource_owner_id, application_id, token, expires_in, redirect_uri,                                                                                                                
  revoked_at, created_at, scopes, organization_id, code_challenge,                                                                                                                       
  code_challenge_method, CURRENT_TIMESTAMP as archived_at                                                                                                                                
FROM deleted   

Query plan: !204337 (comment 2750544518)

Reversibility and Recovery

  • The migration to queue the batched background migration is reversible.
  • The operations performed by the batched background migration is not automatically reversible since it performs DELETE operations.
    • Migration can be stopped at any point with only processed batches affected
    • The CTE approach ensures atomicity per batch: grants are only deleted if successfully archived.
    • Unprocessed grants remain intact

Data recovery options are the following.

Full Recovery (from archive table)

INSERT INTO oauth_access_grants 
(id, resource_owner_id, application_id, token, expires_in, 
          redirect_uri, created_at, revoked_at, scopes, code_challenge,
          code_challenge_method, organization_id)
   SELECT id, resource_owner_id, application_id, token, expires_in, 
          redirect_uri, created_at, revoked_at, scopes, code_challenge,
          code_challenge_method, organization_id
   FROM oauth_access_grant_archived_records
   WHERE archived_at >= [migration_start_timestamp];

Selective Recovery (specific tokens)

INSERT INTO oauth_access_grants 
(id, resource_owner_id, application_id, token, expires_in, 
          redirect_uri, created_at, revoked_at, scopes, code_challenge,
          code_challenge_method, organization_id)
   SELECT id, resource_owner_id, application_id, token, expires_in, 
          redirect_uri, created_at, revoked_at, scopes, code_challenge,
          code_challenge_method, organization_id
   FROM oauth_access_grant_archived_records
   WHERE id BETWEEN 73800000 and 73801000;

Important: Since the archive table preserves original id values, records can be restored with their original IDs, maintaining referential integrity.

User Experience Impact if Error Occurs

  • Minimal authentication impact: Since these access grants are already revoked, they cannot be exchanged for access tokens. No active OAuth flows would be disrupted.
  • Authorization history views: Users reviewing their authorization history or connected applications might see missing entries for revoked grants if the UI expects these records.

Migrations

Migrate down
>>> Executing: bin/rails db:migrate:down:main VERSION=20250908105903
main: == [advisory_lock_connection] object_id: 154460, pg_backend_pid: 17467
main: == 20250908105903 CreateOauthAccessGrantArchivedRecords: reverting ============
main: -- drop_table(:oauth_access_grant_archived_records, {:if_exists=>true})
main:    -> 0.0181s
main: == 20250908105903 CreateOauthAccessGrantArchivedRecords: reverted (0.0383s) ===

main: == [advisory_lock_connection] object_id: 154460, pg_backend_pid: 17467
>>> Executing: bin/rails db:migrate:down:ci VERSION=20250908105903
ci: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17607
ci: == 20250908105903 CreateOauthAccessGrantArchivedRecords: reverting ============
ci: -- drop_table(:oauth_access_grant_archived_records, {:if_exists=>true})
ci:    -> 0.0068s
ci: == 20250908105903 CreateOauthAccessGrantArchivedRecords: reverted (0.0161s) ===

ci: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17607  
Migrate up
>>> Executing: bin/rails db:migrate:up:main VERSION=20250908105903
main: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17847
main: == 20250908105903 CreateOauthAccessGrantArchivedRecords: migrating ============
main: -- create_table(:oauth_access_grant_archived_records, {:id=>false})
main:    -> 0.0046s
main: -- execute("ALTER TABLE oauth_access_grant_archived_records ADD PRIMARY KEY (id)")
main:    -> 0.0006s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE oauth_access_grant_archived_records\nADD CONSTRAINT check_bc69cc7ce0\nCHECK ( char_length(code_challenge) <= 128 )\nNOT VALID;\n")
main:    -> 0.0005s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- execute("ALTER TABLE oauth_access_grant_archived_records VALIDATE CONSTRAINT check_bc69cc7ce0;")
main:    -> 0.0005s
main: -- execute("RESET statement_timeout")
main:    -> 0.0002s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE oauth_access_grant_archived_records\nADD CONSTRAINT check_ce125f5bae\nCHECK ( char_length(code_challenge_method) <= 5 )\nNOT VALID;\n")
main:    -> 0.0004s
main: -- execute("ALTER TABLE oauth_access_grant_archived_records VALIDATE CONSTRAINT check_ce125f5bae;")
main:    -> 0.0006s
main: == 20250908105903 CreateOauthAccessGrantArchivedRecords: migrated (0.0645s) ===

main: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17847
>>> Executing: bin/rails db:migrate:up:ci VERSION=20250908105903
ci: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17951
ci: == 20250908105903 CreateOauthAccessGrantArchivedRecords: migrating ============
ci: -- create_table(:oauth_access_grant_archived_records, {:id=>false})
ci:    -> 0.0066s
ci: -- execute("ALTER TABLE oauth_access_grant_archived_records ADD PRIMARY KEY (id)")
ci:    -> 0.0009s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE oauth_access_grant_archived_records\nADD CONSTRAINT check_bc69cc7ce0\nCHECK ( char_length(code_challenge) <= 128 )\nNOT VALID;\n")
ci:    -> 0.0005s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0002s
ci: -- execute("ALTER TABLE oauth_access_grant_archived_records VALIDATE CONSTRAINT check_bc69cc7ce0;")
ci:    -> 0.0248s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0005s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE oauth_access_grant_archived_records\nADD CONSTRAINT check_ce125f5bae\nCHECK ( char_length(code_challenge_method) <= 5 )\nNOT VALID;\n")
ci:    -> 0.0005s
ci: -- execute("ALTER TABLE oauth_access_grant_archived_records VALIDATE CONSTRAINT check_ce125f5bae;")
ci:    -> 0.0005s
I, [2025-09-11T16:43:01.184325 #17869]  INFO -- : Database: 'ci', Table: 'oauth_access_grant_archived_records': Lock Writes
I, [2025-09-11T16:43:01.185311 #17869]  INFO -- : {:method=>"with_lock_retries", :class=>"gitlab:db:lock_writes", :message=>"Lock timeout is set", :current_iteration=>1, :lock_timeout_in_ms=>100}
I, [2025-09-11T16:43:01.185660 #17869]  INFO -- : {:method=>"with_lock_retries", :class=>"gitlab:db:lock_writes", :message=>"Migration finished", :current_iteration=>1, :lock_timeout_in_ms=>100}
ci: == 20250908105903 CreateOauthAccessGrantArchivedRecords: migrated (0.0851s) ===

ci: == [advisory_lock_connection] object_id: 154440, pg_backend_pid: 17951

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Daniele Bracciani

Merge request reports

Loading