Skip to content

Refactor ProjectsFinder#init_collection and GroupProjectsFinder#init_collection

Yorick Peterse requested to merge refactor-projects-finder-init-collection into master

What does this MR do?

This refactors ProjectsFinder#init_collection so it doesn't rely on a UNION any more, producing simpler queries in the process. This also refactors GroupProjectsFinder#init_collection in a similar fashion, though it still uses a UNION as this produced better queries compared to using OR statements.

This fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/33632.

Query Difference

Code such as this:

ProjectsFinder.new(current_user: User.find(1), params: { starred: true }).execute

Would produce SQL such as this:

SELECT projects.*
FROM projects
WHERE projects.pending_delete = 'f'
AND (
    projects.id IN (
        SELECT projects.id
        FROM projects
        INNER JOIN users_star_projects ON users_star_projects.project_id = projects.id
        INNER JOIN project_authorizations ON projects.id = project_authorizations.project_id
        WHERE projects.pending_delete = 'f'
        AND project_authorizations.user_id = 1
        AND users_star_projects.user_id = 1

        UNION

        SELECT projects.id
        FROM projects
        INNER JOIN users_star_projects ON users_star_projects.project_id = projects.id
        WHERE projects.visibility_level IN (20, 10)
        AND users_star_projects.user_id = 1
    )
)
ORDER BY projects.id DESC;

With these changes the above query is reduced down to this instead:

SELECT projects.*
FROM projects
INNER JOIN users_star_projects
    ON users_star_projects.project_id = projects.id
WHERE projects.pending_delete = 'f'
AND (
    EXISTS (
        SELECT 1
        FROM project_authorizations
        WHERE project_authorizations.user_id = 1
        AND (project_id = projects.id)
    )
    OR projects.visibility_level IN (20,10)
)
AND users_star_projects.user_id = 1
ORDER BY projects.id DESC;

The plan of this query (on GitLab.com) is:

                                                                                                 QUERY PLAN                                                                                                                                                                                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                 
 Nested Loop  (cost=0.72..42.15 rows=1 width=510) (actual time=0.039..0.059 rows=2 loops=1)                                                                   
   ->  Index Only Scan Backward using index_users_star_projects_on_user_id_and_project_id on users_star_projects  (cost=0.29..8.33 rows=2 width=4) (actual time=0.011..0.012 rows=2 loops=1)                                                                                                                                 
         Index Cond: (user_id = 1)                                             
         Heap Fetches: 0                                                       
   ->  Index Scan using projects_pkey on projects  (cost=0.43..16.90 rows=1 width=510) (actual time=0.020..0.021 rows=1 loops=2)                              
         Index Cond: (id = users_star_projects.project_id)                     
         Filter: ((NOT pending_delete) AND ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (visibility_level = ANY ('{20,10}'::integer[]))))                
         SubPlan 1                                                             
           ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations  (cost=0.43..8.45 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=2)                                                                                                                   
                 Index Cond: ((user_id = 1) AND (project_id = projects.id))    
                 Heap Fetches: 0                                               
         SubPlan 2                                                             
           ->  Index Only Scan using index_project_authorizations_on_user_id_project_id_access_level on project_authorizations project_authorizations_1  (cost=0.43..9.62 rows=68 width=4) (never executed)                                                                                                                  
                 Index Cond: (user_id = 1)                                     
                 Heap Fetches: 0                                               
 Planning time: 0.575 ms                                                       
 Execution time: 0.211 ms                            

I am a little bit surprised about the sub plans here so I'll see if I can somehow get rid of those.

Does this MR meet the acceptance criteria?

Edited by Yorick Peterse

Merge request reports