Add last_used_ips to project_access_tokens and group_access_token
What does this MR do and why?
Add last_used_ips to project_access_tokens and group_access_token
References
addresses: #600347 (closed)
Database
Plans below were run against gitlab-production-main via postgres.ai, with realistic scoping (user IN (resource.bots) matching the actual endpoint behavior) and default pagination LIMIT 20. Replaces the earlier unscoped EXPLAINs.
Base list query — project endpoint (GET /projects/:id/access_tokens)
PersonalAccessTokensFinder.new(user: project.bots, impersonation: false).executeSELECT personal_access_tokens.*
FROM personal_access_tokens
WHERE personal_access_tokens.user_id IN (
SELECT users.id
FROM users
INNER JOIN members ON members.user_id = users.id
AND members.requested_at IS NULL
AND members.type = 'ProjectMember'
AND members.source_type = 'Project'
AND members.source_id = $PROJECT_ID
WHERE users.user_type IN (6)
)
AND personal_access_tokens.impersonation = FALSE
LIMIT 20 OFFSET 0;postgres.ai session — project_id 278964 (gitlab-org/gitlab): 1.4 ms execution, 1077 buffer hits, 0 reads. 240 project members scanned → 16 bot users → 20 tokens returned.
Indexes used: index_non_requested_project_members_on_source_id_and_type, index_users_on_user_type_and_id, index_pat_on_user_id_and_expires_at.
Plan tree
Limit (cost=54.67..61.91 rows=20 width=220) (actual time=1.127..1.250 rows=20 loops=1)
Buffers: shared hit=1077
-> Nested Loop (cost=54.67..63.36 rows=24 width=220) (actual time=1.125..1.246 rows=20 loops=1)
Buffers: shared hit=1077
-> HashAggregate (cost=54.23..54.24 rows=1 width=8) (actual time=1.101..1.105 rows=16 loops=1)
Group Key: users.id
Buffers: shared hit=1009
-> Nested Loop (cost=1.00..54.23 rows=1 width=8) (actual time=0.083..1.068 rows=58 loops=1)
Buffers: shared hit=1009
-> Index Scan using index_non_requested_project_members_on_source_id_and_type on public.members (cost=0.56..23.37 rows=15 width=4) (actual time=0.030..0.330 rows=240 loops=1)
Index Cond: ((members.source_id = 278964) AND ((members.source_type)::text = 'Project'::text))
Buffers: shared hit=228
-> Index Only Scan using index_users_on_user_type_and_id on public.users (cost=0.44..2.06 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=240)
Index Cond: ((users.user_type = 6) AND (users.id = members.user_id))
Heap Fetches: 7
Buffers: shared hit=781
-> Index Scan using index_pat_on_user_id_and_expires_at on public.personal_access_tokens (cost=0.44..8.81 rows=24 width=220) (actual time=0.008..0.008 rows=1 loops=16)
Index Cond: (personal_access_tokens.user_id = users.id)
Filter: (NOT personal_access_tokens.impersonation)
Rows Removed by Filter: 0
Buffers: shared hit=68Base list query — group endpoint (GET /groups/:id/access_tokens)
Same finder chain, group bots resolved via members/GroupMember/Namespace:
SELECT personal_access_tokens.*
FROM personal_access_tokens
WHERE personal_access_tokens.user_id IN (
SELECT users.id
FROM users
INNER JOIN members ON members.user_id = users.id
AND members.requested_at IS NULL
AND members.type = 'GroupMember'
AND members.source_type = 'Namespace'
AND members.source_id = $GROUP_ID
WHERE users.user_type IN (6)
)
AND personal_access_tokens.impersonation = FALSE
LIMIT 20 OFFSET 0;postgres.ai session — group_id 9970 (gitlab-org): 2.4 s execution, 7676 hits + 2931 reads. 2,698 members scanned → 4 bot users → 16 tokens.
Note: this is a pre-existing slow path, not introduced by this MR. The same predicate runs today via the UI (
Groups::Settings::AccessTokensController→RenderAccessTokens#active_access_tokens→PersonalAccessTokensFinder.new({user: group.bots, ...})). The planner can't pushuser_type = 6down efficiently because the bot/membership relationship isn't first-class. The MR'spreload_last_used_ipsadds ~2 ms on top — negligible. Tracking base-query optimization separately: [follow-up issue link].
Plan tree
Limit (cost=478.80..486.20 rows=20 width=220) (actual time=2397.136..2421.218 rows=16 loops=1)
Buffers: shared hit=7676 read=2931 dirtied=27
-> Nested Loop (cost=478.80..551.36 rows=196 width=220) (actual time=2397.133..2421.205 rows=16 loops=1)
-> HashAggregate (cost=478.36..478.44 rows=8 width=8) (actual time=2393.727..2393.741 rows=4 loops=1)
Group Key: users.id
Buffers: shared hit=7667 read=2911 dirtied=21
-> Nested Loop (cost=1.00..478.34 rows=8 width=8) (actual time=159.632..2393.692 rows=4 loops=1)
-> Index Scan using index_members_on_source_and_type_and_access_level on public.members (cost=0.57..196.66 rows=137 width=4) (actual time=5.883..2113.853 rows=2698 loops=1)
Index Cond: ((members.source_id = 9970) AND ((members.source_type)::text = 'Namespace'::text) AND ((members.type)::text = 'GroupMember'::text))
Filter: (members.requested_at IS NULL)
-> Index Only Scan using index_users_on_user_type_and_id on public.users (cost=0.44..2.06 rows=1 width=4) (actual time=0.102..0.102 rows=0 loops=2698)
Index Cond: ((users.user_type = 6) AND (users.id = members.user_id))
Heap Fetches: 3
-> Index Scan using index_pat_on_user_id_and_expires_at on public.personal_access_tokens (cost=0.44..8.81 rows=24 width=220) (actual time=2.983..6.856 rows=4 loops=4)
Index Cond: (personal_access_tokens.user_id = users.id)
Filter: (NOT personal_access_tokens.impersonation)New scope: preload_last_used_ips
scope :preload_last_used_ips, -> { preload(:last_used_ips) }Generated query (N ≤ 20 by default per-page):
SELECT *
FROM personal_access_token_last_used_ips
WHERE personal_access_token_id IN ($id_1, ..., $id_N);Index used: idx_pat_last_used_ips_on_pat_id (existing btree on personal_access_token_id). No new index needed.
postgres.ai session: 2 ms execution, Index Scan using idx_pat_last_used_ips_on_pat_id via = ANY (array). Synthetic IDs were used (rows=0), but the planner's row estimate (rows=77) covers the worst-case 20-token × 5-IP scenario, and the access path is invariant to row count. Same query class already running in production via the GraphQL PAT resolver (app/graphql/resolvers/users/personal_access_tokens_resolver.rb) and the pre-MR /personal_access_tokens REST endpoint.
Plan tree
Index Scan using idx_pat_last_used_ips_on_pat_id on public.personal_access_token_last_used_ips (cost=0.43..92.41 rows=77 width=49) (actual time=2.010..2.010 rows=0 loops=1)
Index Cond: (personal_access_token_last_used_ips.personal_access_token_id = ANY ('{101,...,120}'::bigint[]))
Buffers: shared read=3New scope: preload_bot_user_associations_for_project
scope :preload_bot_user_associations_for_project, -> {
preload(user: [:members, { user_detail: { bot_namespace: :project } }])
}Fires four batched IN-list queries against indexed FKs:
SELECT * FROM members WHERE user_id IN ($user_id_1, ..., $user_id_N);
SELECT * FROM user_details WHERE user_id IN ($user_id_1, ..., $user_id_N);
SELECT * FROM namespaces WHERE id IN ($ns_id_1, ..., $ns_id_N);
SELECT * FROM projects WHERE project_namespace_id IN ($ns_id_1, ..., $ns_id_N);Indexes used: idx_members_on_user_and_source_and_source_type_and_member_role (composite on user_id leading), UNIQUE index_user_details_on_user_id, namespaces_pkey, UNIQUE index_projects_on_project_namespace_id. All sub-millisecond at N ≤ 20.
New scope: preload_bot_user_associations_for_group
scope :preload_bot_user_associations_for_group, -> {
preload(user: [:members, { user_detail: :bot_namespace }])
}Same shape as preload_bot_user_associations_for_project minus the projects lookup (group bots don't traverse to a project). Three IN-list queries against the same indexed FKs.