Add last_used_ips to project_access_tokens and group_access_token

What does this MR do and why?

Add last_used_ips to project_access_tokens and group_access_token

References

addresses: #600347 (closed)

Database

Plans below were run against gitlab-production-main via postgres.ai, with realistic scoping (user IN (resource.bots) matching the actual endpoint behavior) and default pagination LIMIT 20. Replaces the earlier unscoped EXPLAINs.

Base list query — project endpoint (GET /projects/:id/access_tokens)

PersonalAccessTokensFinder.new(user: project.bots, impersonation: false).execute
SELECT personal_access_tokens.*
FROM personal_access_tokens
WHERE personal_access_tokens.user_id IN (
  SELECT users.id
  FROM users
  INNER JOIN members ON members.user_id = users.id
   AND members.requested_at IS NULL
   AND members.type = 'ProjectMember'
   AND members.source_type = 'Project'
   AND members.source_id = $PROJECT_ID
  WHERE users.user_type IN (6)
)
AND personal_access_tokens.impersonation = FALSE
LIMIT 20 OFFSET 0;

postgres.ai session — project_id 278964 (gitlab-org/gitlab): 1.4 ms execution, 1077 buffer hits, 0 reads. 240 project members scanned → 16 bot users → 20 tokens returned.

Indexes used: index_non_requested_project_members_on_source_id_and_type, index_users_on_user_type_and_id, index_pat_on_user_id_and_expires_at.

Plan tree
Limit  (cost=54.67..61.91 rows=20 width=220) (actual time=1.127..1.250 rows=20 loops=1)
  Buffers: shared hit=1077
  ->  Nested Loop  (cost=54.67..63.36 rows=24 width=220) (actual time=1.125..1.246 rows=20 loops=1)
        Buffers: shared hit=1077
        ->  HashAggregate  (cost=54.23..54.24 rows=1 width=8) (actual time=1.101..1.105 rows=16 loops=1)
              Group Key: users.id
              Buffers: shared hit=1009
              ->  Nested Loop  (cost=1.00..54.23 rows=1 width=8) (actual time=0.083..1.068 rows=58 loops=1)
                    Buffers: shared hit=1009
                    ->  Index Scan using index_non_requested_project_members_on_source_id_and_type on public.members  (cost=0.56..23.37 rows=15 width=4) (actual time=0.030..0.330 rows=240 loops=1)
                          Index Cond: ((members.source_id = 278964) AND ((members.source_type)::text = 'Project'::text))
                          Buffers: shared hit=228
                    ->  Index Only Scan using index_users_on_user_type_and_id on public.users  (cost=0.44..2.06 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=240)
                          Index Cond: ((users.user_type = 6) AND (users.id = members.user_id))
                          Heap Fetches: 7
                          Buffers: shared hit=781
        ->  Index Scan using index_pat_on_user_id_and_expires_at on public.personal_access_tokens  (cost=0.44..8.81 rows=24 width=220) (actual time=0.008..0.008 rows=1 loops=16)
              Index Cond: (personal_access_tokens.user_id = users.id)
              Filter: (NOT personal_access_tokens.impersonation)
              Rows Removed by Filter: 0
              Buffers: shared hit=68

Base list query — group endpoint (GET /groups/:id/access_tokens)

Same finder chain, group bots resolved via members/GroupMember/Namespace:

SELECT personal_access_tokens.*
FROM personal_access_tokens
WHERE personal_access_tokens.user_id IN (
  SELECT users.id
  FROM users
  INNER JOIN members ON members.user_id = users.id
   AND members.requested_at IS NULL
   AND members.type = 'GroupMember'
   AND members.source_type = 'Namespace'
   AND members.source_id = $GROUP_ID
  WHERE users.user_type IN (6)
)
AND personal_access_tokens.impersonation = FALSE
LIMIT 20 OFFSET 0;

postgres.ai session — group_id 9970 (gitlab-org): 2.4 s execution, 7676 hits + 2931 reads. 2,698 members scanned → 4 bot users → 16 tokens.

Note: this is a pre-existing slow path, not introduced by this MR. The same predicate runs today via the UI (Groups::Settings::AccessTokensControllerRenderAccessTokens#active_access_tokensPersonalAccessTokensFinder.new({user: group.bots, ...})). The planner can't push user_type = 6 down efficiently because the bot/membership relationship isn't first-class. The MR's preload_last_used_ips adds ~2 ms on top — negligible. Tracking base-query optimization separately: [follow-up issue link].

Plan tree
Limit  (cost=478.80..486.20 rows=20 width=220) (actual time=2397.136..2421.218 rows=16 loops=1)
  Buffers: shared hit=7676 read=2931 dirtied=27
  ->  Nested Loop  (cost=478.80..551.36 rows=196 width=220) (actual time=2397.133..2421.205 rows=16 loops=1)
        ->  HashAggregate  (cost=478.36..478.44 rows=8 width=8) (actual time=2393.727..2393.741 rows=4 loops=1)
              Group Key: users.id
              Buffers: shared hit=7667 read=2911 dirtied=21
              ->  Nested Loop  (cost=1.00..478.34 rows=8 width=8) (actual time=159.632..2393.692 rows=4 loops=1)
                    ->  Index Scan using index_members_on_source_and_type_and_access_level on public.members  (cost=0.57..196.66 rows=137 width=4) (actual time=5.883..2113.853 rows=2698 loops=1)
                          Index Cond: ((members.source_id = 9970) AND ((members.source_type)::text = 'Namespace'::text) AND ((members.type)::text = 'GroupMember'::text))
                          Filter: (members.requested_at IS NULL)
                    ->  Index Only Scan using index_users_on_user_type_and_id on public.users  (cost=0.44..2.06 rows=1 width=4) (actual time=0.102..0.102 rows=0 loops=2698)
                          Index Cond: ((users.user_type = 6) AND (users.id = members.user_id))
                          Heap Fetches: 3
        ->  Index Scan using index_pat_on_user_id_and_expires_at on public.personal_access_tokens  (cost=0.44..8.81 rows=24 width=220) (actual time=2.983..6.856 rows=4 loops=4)
              Index Cond: (personal_access_tokens.user_id = users.id)
              Filter: (NOT personal_access_tokens.impersonation)

New scope: preload_last_used_ips

scope :preload_last_used_ips, -> { preload(:last_used_ips) }

Generated query (N ≤ 20 by default per-page):

SELECT *
FROM personal_access_token_last_used_ips
WHERE personal_access_token_id IN ($id_1, ..., $id_N);

Index used: idx_pat_last_used_ips_on_pat_id (existing btree on personal_access_token_id). No new index needed.

postgres.ai session: 2 ms execution, Index Scan using idx_pat_last_used_ips_on_pat_id via = ANY (array). Synthetic IDs were used (rows=0), but the planner's row estimate (rows=77) covers the worst-case 20-token × 5-IP scenario, and the access path is invariant to row count. Same query class already running in production via the GraphQL PAT resolver (app/graphql/resolvers/users/personal_access_tokens_resolver.rb) and the pre-MR /personal_access_tokens REST endpoint.

Plan tree
Index Scan using idx_pat_last_used_ips_on_pat_id on public.personal_access_token_last_used_ips  (cost=0.43..92.41 rows=77 width=49) (actual time=2.010..2.010 rows=0 loops=1)
  Index Cond: (personal_access_token_last_used_ips.personal_access_token_id = ANY ('{101,...,120}'::bigint[]))
  Buffers: shared read=3

New scope: preload_bot_user_associations_for_project

scope :preload_bot_user_associations_for_project, -> {
  preload(user: [:members, { user_detail: { bot_namespace: :project } }])
}

Fires four batched IN-list queries against indexed FKs:

SELECT * FROM members      WHERE user_id              IN ($user_id_1, ..., $user_id_N);
SELECT * FROM user_details WHERE user_id              IN ($user_id_1, ..., $user_id_N);
SELECT * FROM namespaces   WHERE id                   IN ($ns_id_1,   ..., $ns_id_N);
SELECT * FROM projects     WHERE project_namespace_id IN ($ns_id_1,   ..., $ns_id_N);

Indexes used: idx_members_on_user_and_source_and_source_type_and_member_role (composite on user_id leading), UNIQUE index_user_details_on_user_id, namespaces_pkey, UNIQUE index_projects_on_project_namespace_id. All sub-millisecond at N ≤ 20.

New scope: preload_bot_user_associations_for_group

scope :preload_bot_user_associations_for_group, -> {
  preload(user: [:members, { user_detail: :bot_namespace }])
}

Same shape as preload_bot_user_associations_for_project minus the projects lookup (group bots don't traverse to a project). Three IN-list queries against the same indexed FKs.

Edited by Smriti Garg

Merge request reports

Loading