Refresh project authorizations per-project on group deletion
What does this MR do and why?
Summary
When a group that was shared with other groups or projects is deleted, the resulting stale project_authorizations records are not removed automatically and must be cleaned up explicitly. This MR replaces the existing per-user authorization refresh with a per-project refresh, reducing the number of background jobs from O(N users) to O(K projects).
Problem
When group G2 invites group G1 (a group share), G1's members gain access to G2's projects through a GroupGroupLink. This access is granted by creating project_authorizations records:
G2 (shared_group)
├── P2 (project)
│
GroupGroupLink { shared_group: G2, shared_with_group: G1 }
│
G1 (shared_with_group, 10,000 members)
ProjectAuthorizations: [U1→P2], [U2→P2], ..., [U10000→P2]
When G1 is deleted, GroupGroupLinks associated with it are deleted through foreign keys, so none of G1's members should have access to P2 anymore. However, the project_authorizations records that establish the members' access are not automatically cleaned up or updated. To remove/update them, Groups::DestroyService called UserProjectAccessChangedService with all affected user IDs, which enqueued one AuthorizedProjectsWorker job per member.
This is wasteful because each AuthorizedProjectsWorker job does the following for a given user:
- Acquires a Redis exclusive lease
- Runs SELECT on project_authorizations for that user
- Runs a recursive CTE (Gitlab::ProjectAuthorizations#calculate) that traverses every group the user belongs to, every group share, every subgroup, and every project share, to recompute all of their project authorizations from scratch
- Diffs the result and runs a DELETE + User#update!
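The diff step in the last bullet can be pictured with a minimal sketch. It assumes each side is a hash of project ID to access level for a single user; the method name and data shape are hypothetical and are not taken from the GitLab codebase.

```ruby
# Hypothetical illustration; not the actual AuthorizedProjectsWorker code.
# current: rows stored in project_authorizations for one user
# fresh:   rows produced by the full recalculation for that user
# Both are hashes of project_id => access_level (an assumed shape).
def authorization_diff(current, fresh)
  # Rows whose project is gone or whose access level changed must be deleted.
  to_remove = current.keys.reject { |project_id| fresh[project_id] == current[project_id] }
  # Rows that are new or whose access level changed must be written.
  to_add = fresh.reject { |project_id, level| current[project_id] == level }
  [to_remove.sort, to_add]
end

# U1 reached P2 (ID 2) only through the G1 share and P5 (ID 5) by other means.
current = { 2 => 30, 5 => 40 }
fresh   = { 5 => 40 } # after G1 is deleted, the recalculation no longer yields P2

to_remove, to_add = authorization_diff(current, fresh)
puts "remove: #{to_remove.inspect}, add: #{to_add.inspect}"
```

In this scenario the full per-user recalculation ends in a single-row removal, which is the imbalance the Problem section describes.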
For G1 with 10,000 members and G2 with 1 project, the old approach queues 10,000 jobs, each recomputing one user's full set of authorizations, when the actual work is to remove or update 10,000 project_authorizations rows for a single project.
This inefficiency is compounded because the nightly AdjournedGroupDeletionWorker cron job can delete many groups at once.
Solution
Refresh per project, not per user.
Instead of "For each of N users: which projects should I have access to?" do "For each of K affected projects: which users should have access?"
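To make the job counts concrete, here is a small sketch using the numbers from the Problem section (10,000 members in G1, one project in G2). The helper names and the use of ID 2 for P2 are hypothetical, chosen only for illustration; none of them are GitLab APIs.

```ruby
# Hypothetical helpers for illustration only; these are not GitLab APIs.

# One [user_id, project_id] pair per project_authorizations row left stale
# when the shared_with_group is deleted.
def stale_rows(member_ids, project_ids)
  member_ids.product(project_ids)
end

# Per-user refresh: one job for every distinct user in the stale rows.
def per_user_job_count(rows)
  rows.map(&:first).uniq.size
end

# Per-project refresh: one job for every distinct project in the stale rows.
def per_project_job_count(rows)
  rows.map(&:last).uniq.size
end

# G1 has 10,000 members; G2 has a single project (ID 2 here).
rows = stale_rows((1..10_000).to_a, [2])

puts "stale rows:       #{rows.size}"
puts "per-user jobs:    #{per_user_job_count(rows)}"
puts "per-project jobs: #{per_project_job_count(rows)}"
```

The same 10,000 stale rows are covered either way; only the grouping key changes, and with it the number of jobs.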
Before (O(N) jobs):                      After (O(K) jobs):
G1 deleted, 10,000 members               G1 deleted, G2 has 100 projects
│                                        │
├─ AuthorizedProjectsWorker[U1]          ├─ ProjectRecalculateWorker[P2_1]
├─ AuthorizedProjectsWorker[U2]          ├─ ProjectRecalculateWorker[P2_2]
├─ ...                                   ├─ ...
└─ AuthorizedProjectsWorker[U10000]      └─ ProjectRecalculateWorker[P2_100]
10,000 recursive CTEs                    100 project-scoped queries
How it works
Before group.destroy, while the GroupGroupLink and ProjectGroupLink records still exist, Groups::DestroyService now collects the IDs of all projects that the group's members can access through group or project sharing:
- Group shares (group.shared_group_links): all projects under the groups that G1 was invited into (e.g., all projects under G2 and G2's descendants)
- Project shares (group.project_group_links): projects directly shared with G1
After group.destroy, it calls AuthorizedProjectUpdate::ProjectAccessChangedService with those project IDs, enqueuing one ProjectRecalculateWorker per project.
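The ordering matters: the project IDs have to be collected while the links still exist. The sketch below models this with plain in-memory structs. Every class and method name in it (Store, shared_project_ids, destroy_and_refresh, and so on) is hypothetical and stands in for the real models and services; it is not the code in this MR.

```ruby
# Hypothetical in-memory stand-ins for the real records; not GitLab classes.
Group            = Struct.new(:id, :parent_id)
Project          = Struct.new(:id, :group_id)
GroupGroupLink   = Struct.new(:shared_group_id, :shared_with_group_id)
ProjectGroupLink = Struct.new(:project_id, :group_id)

class Store
  attr_reader :groups, :projects, :group_links, :project_links, :enqueued

  def initialize(groups:, projects:, group_links:, project_links:)
    @groups = groups
    @projects = projects
    @group_links = group_links
    @project_links = project_links
    @enqueued = [] # project IDs for which a recalculation job was queued
  end

  # IDs of the given group and all of its descendants.
  def self_and_descendant_ids(group_id)
    ids = [group_id]
    queue = [group_id]
    until queue.empty?
      current = queue.shift
      children = groups.select { |g| g.parent_id == current }.map(&:id)
      ids.concat(children)
      queue.concat(children)
    end
    ids
  end

  # Projects the group's members can reach through group or project shares.
  def shared_project_ids(group_id)
    via_group_shares = group_links
      .select { |l| l.shared_with_group_id == group_id }
      .flat_map { |l| self_and_descendant_ids(l.shared_group_id) }
      .flat_map { |gid| projects.select { |p| p.group_id == gid }.map(&:id) }
    via_project_shares = project_links
      .select { |l| l.group_id == group_id }
      .map(&:project_id)
    (via_group_shares + via_project_shares).uniq.sort
  end

  # Simulates group.destroy, including the link deletion done by foreign keys.
  # Deleting the group's own projects is omitted for brevity.
  def destroy_group(group_id)
    groups.reject! { |g| g.id == group_id }
    group_links.reject! { |l| [l.shared_group_id, l.shared_with_group_id].include?(group_id) }
    project_links.reject! { |l| l.group_id == group_id }
  end

  def destroy_and_refresh(group_id)
    project_ids = shared_project_ids(group_id) # must run before the links vanish
    destroy_group(group_id)
    project_ids.each { |id| enqueued << id }   # one job per affected project
    project_ids
  end
end

store = Store.new(
  groups: [Group.new(1, nil), Group.new(2, nil), Group.new(3, 2), Group.new(4, nil)],
  projects: [Project.new(20, 2), Project.new(30, 3), Project.new(40, 4), Project.new(50, 1)],
  group_links: [GroupGroupLink.new(2, 1)],     # G2 invited G1
  project_links: [ProjectGroupLink.new(40, 1)] # project 40 shared with G1
)

affected = store.destroy_and_refresh(1)
puts affected.inspect # prints [20, 30, 40]
```

Running the same lookup after the deletion returns an empty list in this model, because the links it depends on are already gone.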
Precedent
This pattern is already used in:
| Operation | Service or worker used |
|---|---|
| Group transfer | ProjectAccessChangedService |
| Project share removed | ProjectRecalculateWorker |
References
- For https://gitlab.com/gitlab-com/gl-infra/production-engineering/-/work_items/28694+
- Motivated by Spike: Is it possible to delete project_authori... (#554451 - closed)
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.