Refresh project authorizations per-project on group deletion

What does this MR do and why?

Summary

When a group that was shared with other groups or projects is deleted, stale project_authorizations records must be manually cleaned up. This MR replaces the existing per-user authorization refresh with a per-project refresh, reducing the number of background jobs from O(N users) to O(K projects).

Problem

When group G2 invites group G1 (a group share), G1's members gain access to G2's projects through a GroupGroupLink. This access is granted by creating project_authorizations records:

G2 (shared_group)
├── P2 (project)

GroupGroupLink { shared_group: G2, shared_with_group: G1 }

G1 (shared_with_group, 10,000 members)

ProjectAuthorizations: [U1→P2], [U2→P2], ..., [U10000→P2]

When G1 is deleted, GroupGroupLinks associated with it are deleted through foreign keys, so none of G1's members should have access to P2 anymore. However, the project_authorizations records that establish the members' access are not automatically cleaned up or updated. To remove/update them, Groups::DestroyService called UserProjectAccessChangedService with all affected user IDs, which enqueued one AuthorizedProjectsWorker job per member.

This is wasteful because each AuthorizedProjectsWorker job does the following for a given user:

  1. Acquires a Redis exclusive lease
  2. Runs SELECT on project_authorizations for that user
  3. Runs a recursive CTE (Gitlab::ProjectAuthorizations#calculate) that traverses every group the user belongs to, every group share, every subgroup, and every project share, to recompute all of their project authorizations from scratch
  4. Diffs the result and runs a DELETE + User#update!

For G1 with 10,000 members and G2 with 1 project, the old approach queues 10,000 jobs when the actual work should be removing/updating 10,000 project_authorization rows.

This inefficiency is compounded because the nightly AdjournedGroupDeletionWorker cron job can delete many groups at once.
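The per-user refresh described above can be sketched as a simplified plain-Ruby model (not GitLab's actual classes; `current_access_for` stands in for the recursive CTE, and the job/lease machinery is omitted):

```ruby
require 'set'

# Stored state: { user_id => Set of authorized project_ids }.
# User 1 reaches P101 via direct membership and P202 via the group share;
# user 2 only had access through the now-deleted share.
authorizations = {
  1 => Set[101, 202],
  2 => Set[202]
}

# Ground truth after the share is gone: what the full recompute
# (the recursive CTE) would return for each user. Stubbed here.
def current_access_for(user_id)
  { 1 => Set[101], 2 => Set[] }.fetch(user_id, Set[])
end

jobs_run = 0
authorizations.each_key do |user_id|
  jobs_run += 1                            # one AuthorizedProjectsWorker per user
  fresh = current_access_for(user_id)      # recompute this user's access from scratch
  stale = authorizations[user_id] - fresh  # rows the diff would DELETE
  authorizations[user_id] -= stale
end

puts jobs_run  # one job per member: 2 here, 10,000 for G1
```

The cost is dominated by `jobs_run`: every member pays for a full recompute even when only one shared project changed.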

Solution

Refresh per project, not per user.

Instead of "For each of N users: which projects should I have access to?" do "For each of K affected projects: which users should have access?"

Before (O(N) jobs):                     After (O(K) jobs):

G1 deleted, 10,000 members              G1 deleted, G2 has 100 projects
     │                                       │
     ├─ AuthorizedProjectsWorker[U1]         ├─ ProjectRecalculateWorker[P2_1]
     ├─ AuthorizedProjectsWorker[U2]         ├─ ProjectRecalculateWorker[P2_2]
     ├─ ...                                  ├─ ...
     └─ AuthorizedProjectsWorker[U10000]     └─ ProjectRecalculateWorker[P2_100]

10,000 recursive CTEs                   100 project-scoped queries
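What one project-scoped job does can be modelled the same way (a hypothetical sketch, not GitLab's ProjectRecalculateWorker; `users_who_should_access` stands in for the project-scoped query):

```ruby
require 'set'

# Stored state for a single project: project 202 currently
# authorizes users 1 and 2, both via the deleted share.
stored = { 202 => Set[1, 2] }

# Project-scoped recompute: which users should access this project now?
# In GitLab this queries memberships and remaining links for one project;
# stubbed here: after the share is deleted, nobody qualifies.
def users_who_should_access(_project_id)
  Set[]
end

fresh = users_who_should_access(202)
to_delete = stored[202] - fresh   # authorizations to remove
to_insert = fresh - stored[202]   # authorizations to add
stored[202] = fresh

puts to_delete.inspect
```

One job fixes all 10,000 stale rows for the project at once, instead of 10,000 jobs each fixing one.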

How it works

Before group.destroy, while the GroupGroupLink and ProjectGroupLink records still exist, Groups::DestroyService now collects the IDs of all projects accessible to the group's members through group/project sharing:

  • Group shares (group.shared_group_links): all projects under the groups that G1 was invited into (e.g., all projects under G2 and G2's descendants)
  • Project shares (group.project_group_links): projects directly shared with G1

After group.destroy, it calls AuthorizedProjectUpdate::ProjectAccessChangedService with those project IDs, enqueuing one ProjectRecalculateWorker per project.
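The two collection steps can be sketched in plain Ruby (a hypothetical model; the Struct shapes and `affected_project_ids` are illustrative, not the actual service code):

```ruby
# G1 was invited into G2 (a group share) and had one project
# shared with it directly (a project share).
Group = Struct.new(:id, :project_ids, :descendant_project_ids,
                   :shared_group_links, :project_group_links)
GroupGroupLink = Struct.new(:shared_group)    # the group G1 was invited into
ProjectGroupLink = Struct.new(:project_id)    # a project shared directly with G1

g2 = Group.new(2, [201], [301], [], [])       # G2 owns P201; a descendant owns P301
g1 = Group.new(1, [], [],
               [GroupGroupLink.new(g2)],
               [ProjectGroupLink.new(401)])

# Must run BEFORE group.destroy, while the link records still exist.
def affected_project_ids(group)
  via_group_shares = group.shared_group_links.flat_map do |link|
    link.shared_group.project_ids + link.shared_group.descendant_project_ids
  end
  via_project_shares = group.project_group_links.map(&:project_id)
  (via_group_shares + via_project_shares).uniq
end

ids = affected_project_ids(g1)
puts ids.inspect  # => [201, 301, 401]
# ...group.destroy happens here; then one ProjectRecalculateWorker per id.
```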

Precedent

This pattern is already used in:

Operation               Service used
Group transfer          ProjectAccessChangedService
Project share removed   ProjectRecalculateWorker

References

Screenshots or screen recordings


How to set up and validate locally

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Eugie Limpin
