Exclude group-covered projects from search authorization to reduce redundant payload

What does this MR do and why?

When a project is shared with a group via a group link, Search::ProjectsFinder can return that project as an individual entry even when the project's namespace is already covered by the user's group membership. This produces redundant project IDs in both Advanced Search (Elasticsearch) and Exact Code Search (Zoekt) queries.

For example, given this hierarchy where the user is a member of the top-level group:

top-level group (user has access)
├── sub group 1
│   └── sub group 2
│       └── project 1  ← shared with sub group 3
└── sub group 3

project 1 is already accessible via the user's top-level group membership, but because it is also shared with sub group 3, it gets returned as a separate project ID. For large namespaces (10k+ projects), this produces ~180 KB of redundant payload per access branch in Zoekt, potentially causing HTTP 413 errors when the query is sent to multiple nodes.

Approach

This MR adds redundant-project filtering into Search::ProjectsFinder#linked_through_group_projects. When the user has direct group memberships, a TrieNode is built from those memberships and used to check whether each shared project's namespace is already covered by the user's group hierarchy.

Projects whose namespaces are already covered are excluded — unless the group share link grants a higher access level than the user's existing group membership, in which case the project is still included (since the share link provides genuinely elevated permissions).

To avoid N+1 queries and handle large group memberships, group links are fetched in batches of 5,000 using raw SQL with ANY($1::bigint[]) binds rather than an ActiveRecord subquery. group_membership_source_ids was also changed from .select(:id) (AR subquery) to .pluck_primary_key (materialized array) to support this batching approach.

Example Zoekt query result

Before (project 1 enumerated redundantly):

OR(
  traversal_ids: ^top_level-sub1-sub2-,
  traversal_ids: ^top_level-sub3-,
  repo_ids: [project 1, ...]              # redundant — already covered by traversal_ids
)

After (only genuinely uncovered projects):

OR(
  traversal_ids: ^top_level-sub1-sub2-,
  traversal_ids: ^top_level-sub3-,
  repo_ids: [only uncovered IDs]          # project 1 excluded
)

Changes

    • Rewrite Search::ProjectsFinder#linked_through_group_projects to use a TrieNode built from the user's direct group memberships to skip projects whose namespaces are already covered
  • Introduce BATCH_SIZE = 5000 and batch group link fetching via raw SQL (find_links) to avoid large subqueries
  • Change group_membership_source_ids from .select(:id) to .pluck_primary_key to materialize IDs for batching
  • Preserve projects where the group share link grants elevated access above the user's existing group membership
  • Add specs for the covered/elevated-access cases in Search::ProjectsFinder and Search::AuthorizationContext
  • Add integration tests in Search::Zoekt::AccessBranchBuilder spec covering group-level access, partial group coverage, project-only access, and global search scenarios

How to set up and validate locally

Top Level Group *(Owner)*
└── 📁 My Product Group *(Owner)*
│   └── 📋 Rails Monolith
└── 📁 Developers Group *(Owner)*
└── 📋 Apple Project 
  1. test with a non-admin user
  2. login with that user
  3. create the hierarchy above
  4. in Rails Monolith project, invite the Developers Group with DEVELOPER access
  5. open rails console
  6. look up the user, and run the projects finder
    user = User.find XX
    ::Search::ProjectsFinder.new(user: user).execute
  7. ensure that the project Rails Monolith is not returned in the results
  8. checkout master branch and re-run the projects finder
  9. the project Rails Monolith will be returned in the results

References

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu

Merge request reports

Loading