Limit ProjectAuthorizations refresh jobs to "distinct" users.
Premise
I wonder if we can change this line: https://gitlab.com/gitlab-org/gitlab/-/blob/3a6c2a41e144a73e209c56933b83b006ee07939f/app/models/group.rb#L374 to members_with_parents.pluck('distinct user_id')?
Reason being:
Imagine your user_id is 1 in the users table, and there is a group named GitLabGroup. Now imagine that you are explicitly added as a member of each of the following groups:
- the ancestor of GitLabGroup
- GitLabGroup itself
- a group that is shared with GitLabGroup
If we execute GitLabGroup.user_ids_for_project_authorizations, it returns an array of user ids that includes duplicates, like [1,1,1...]. When the project authorizations worker uses this array of ids, UserProjectAccessChangedService enqueues multiple jobs for the same ids.
Using distinct would make sure that we enqueue only one job per user_id. That would of course mean enqueuing fewer jobs than we do today when a project authorizations refresh is triggered for a specific group, which reduces the number of jobs in the queue and should lead to better performance, fewer queries, and less time taken for authorization refreshes to complete.
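
To make the effect concrete, here is a minimal plain-Ruby sketch of the scenario above. It does not use GitLab's code or ActiveRecord: the Member struct, the MEMBERS_WITH_PARENTS data, the second user, and the enqueue_refresh_jobs helper are all hypothetical stand-ins, and Array#uniq only approximates what a distinct query would do in the database.

```ruby
# Hypothetical stand-in for membership rows; not GitLab's actual schema.
Member = Struct.new(:user_id, :source)

# User 1 is explicitly a member of the ancestor group, of GitLabGroup itself,
# and of a group shared with GitLabGroup. User 2 (added for contrast) is a
# member of GitLabGroup only.
MEMBERS_WITH_PARENTS = [
  Member.new(1, "ancestor of GitLabGroup"),
  Member.new(1, "GitLabGroup"),
  Member.new(1, "group shared with GitLabGroup"),
  Member.new(2, "GitLabGroup")
].freeze

# Mirrors the shape of plucking user_id without distinct:
# one id per membership row, so repeated users appear repeatedly.
def user_ids_with_duplicates(members)
  members.map(&:user_id)
end

# Approximates plucking distinct user ids: one id per user.
def distinct_user_ids(members)
  members.map(&:user_id).uniq
end

# Hypothetical enqueue step: builds one refresh job per id it is given.
def enqueue_refresh_jobs(user_ids)
  user_ids.map { |id| { job: "refresh_authorizations", user_id: id } }
end

jobs_today = enqueue_refresh_jobs(user_ids_with_duplicates(MEMBERS_WITH_PARENTS))
jobs_proposed = enqueue_refresh_jobs(distinct_user_ids(MEMBERS_WITH_PARENTS))

puts "jobs without distinct: #{jobs_today.size}"
puts "jobs with distinct: #{jobs_proposed.size}"
```

In this toy data set the duplicate list yields four jobs and the distinct list yields two, one per user. The real saving would depend on how many users hold overlapping memberships across the group hierarchy.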