Delete inactive groups and projects access tokens after 30 days
What does this MR do and why?
Related to #471683 (closed)
There are two classes of records stored in personal_access_tokens table
-
groups and projects access tokens, aka resource access tokens - tokens that tied to user records with user_type
6/project_bot. -
personal access tokens - tokens that tied to user records with user_type
0/human.
Inactive token is a token that is expired or revoked.
After personal access tokens become inactive(expired or revoked) they retain in the personal_access_tokens. GitLab does not automatically delete those records, hence we collected millions of inactive tokens in the table, see #471683 (comment 2062242381). This MR is not supposed to implement automatic deletion of personal access tokens, however this information might be helpful to validate the implementation of this MR from the database perspective.
Currently, right after groups and projects access tokens become inactive, they are automatically deleted. #462217 (closed) will disable[1, 2] the existing automatic deletion of groups and projects access tokens because
- we need to retain inactive tokens for 30 days for compliance and monitoring purposes
- the existing approach of automatic deletion of inactive tokens has gaps, it relies on membership expiration that could lead to bypass automatic deletion of inactive tokens and even to early token deletion, see #482845 (closed). Also, token revocation via rails console bypasses automatic token deletion. Because of that there is some amount of inactive groups and projects access tokens in the production DB, see #471683 (comment 2062242381).
This MR implements ResourceAccessTokens::InactiveTokensDeletionCronWorker cron worker that is run daily at 00:00. That worker iterates over User.project_bot users and schedules users deletion(**) if its token is inactive for more than 30 days. Moving automatic deletion of inactive groups and projects access tokens to separate process will also prevent bypass deletion of inactive tokens.
** - note that, as per existing implementation, to delete group and project access token we should delete user tied to the token. However, when deleting personal access token only the token record should be deleted.
DB
Query Plans
NOTE: The query plans use the production DB's id ranges.
each_batch finds the lowest id for the query
old: SELECT "users"."id" FROM "users" WHERE "users"."user_type" = 6 ORDER BY "users"."id" ASC LIMIT 1;
new: SELECT "users"."id" FROM "users" WHERE "users"."user_type" = 6 AND ("users"."id" > 0) ORDER BY "users"."id" ASC LIMIT 1;
old: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31265/commands/97069
new: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31291/commands/97151
each_batch finds the next id for the query based on the batch size configuration - 1000
old: SELECT "users"."id" FROM "users" WHERE "users"."user_type" = 6 AND "users"."id" >= 7416080 ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000;
new: SELECT "users"."id" FROM "users" WHERE "users"."user_type" = 6 AND ("users"."id" > 0) AND "users"."id" >= 7416080 ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000;
old: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31265/commands/97071
new: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31291/commands/97153
each_batch query with the id range for the batch
old: SELECT "users"."id", "users"."username" FROM "users" LEFT OUTER JOIN "personal_access_tokens" ON "personal_access_tokens"."user_id" = "users"."id" WHERE "users"."user_type" = 6 AND "users"."id" >= 7416080 AND "users"."id" < 7710470 AND "users"."id" IN (SELECT "personal_access_tokens"."user_id" FROM "personal_access_tokens" WHERE ("personal_access_tokens"."expires_at" <= '2024-09-03' AND "personal_access_tokens"."expires_at" < '2024-08-04' OR "personal_access_tokens"."revoked" = TRUE AND "personal_access_tokens"."updated_at" < '2024-08-04 08:19:50'));
new: SELECT "users"."id", "users"."username" FROM "users" LEFT OUTER JOIN "personal_access_tokens" ON "personal_access_tokens"."user_id" = "users"."id" WHERE "users"."user_type" = 6 AND ("users"."id" > 0) AND "users"."id" >= 7416080 AND "users"."id" < 7710470 AND "users"."id" IN (SELECT "personal_access_tokens"."user_id" FROM "personal_access_tokens" WHERE ("personal_access_tokens"."expires_at" <= '2024-09-06' AND "personal_access_tokens"."expires_at" < '2024-08-07' OR "personal_access_tokens"."revoked" = TRUE AND "personal_access_tokens"."updated_at" < '2024-08-07 11:40:14'));
old: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31179/commands/96844
new: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31291/commands/97155
each_batch query with the id range for the last batch
old: SELECT "users"."id", "users"."username" FROM "users" LEFT OUTER JOIN "personal_access_tokens" ON "personal_access_tokens"."user_id" = "users"."id" WHERE "users"."user_type" = 6 AND "users"."id" >= 22700839 AND "users"."id" IN (SELECT "personal_access_tokens"."user_id" FROM "personal_access_tokens" WHERE ("personal_access_tokens"."expires_at" <= '2024-09-03' AND "personal_access_tokens"."expires_at" < '2024-08-04' OR "personal_access_tokens"."revoked" = TRUE AND "personal_access_tokens"."updated_at" < '2024-08-04 08:19:50'));
new: SELECT "users"."id", "users"."username" FROM "users" LEFT OUTER JOIN "personal_access_tokens" ON "personal_access_tokens"."user_id" = "users"."id" WHERE "users"."user_type" = 6 AND ("users"."id" > 0) AND "users"."id" >= 22754813 AND "users"."id" IN (SELECT "personal_access_tokens"."user_id" FROM "personal_access_tokens" WHERE ("personal_access_tokens"."expires_at" <= '2024-09-06' AND "personal_access_tokens"."expires_at" < '2024-08-07' OR "personal_access_tokens"."revoked" = TRUE AND "personal_access_tokens"."updated_at" < '2024-08-07 11:40:14'));
old: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31179/commands/96846
new: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31291/commands/97158
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
Screenshots are required for UI changes, and strongly recommended for all other merge requests.
| Before | After |
|---|---|
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.