Skip to content

Use Group#all_projects in Packages:FinderHelper

What does this MR do?

This is mainly motivated by performance issues we were seeing with WHERE ID IN(...) queries https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3894 .

Those were addressed by updating the #all_projects method to use INNER JOIN instead !56078 (merged) so it's better if we move as much code to using #all_projects as possible . It happens that #all_projects is actually less code to write in all these places.

Query plans

Even though this is a generic helper I only found it used in ::Packages::Nuget::SearchService.new(current_user, group, 'package') so I've captured the Before/After for that.

Before

Query
WITH "project_ids" AS
(
       SELECT "projects"."id"
       FROM   "projects"
       WHERE  "projects"."namespace_id" IN (WITH recursive "base_and_descendants" AS (
                                            (
                                                   SELECT "namespaces".*
                                                   FROM   "namespaces"
                                                   WHERE  "namespaces"."type" = 'Group'
                                                   AND    "namespaces"."id" = 9970)
                                     UNION
                                           (
                                                  SELECT "namespaces".*
                                                  FROM   "namespaces",
                                                         "base_and_descendants"
                                                  WHERE  "namespaces"."type" = 'Group'
                                                  AND    "namespaces"."parent_id" = "base_and_descendants"."id"))
                                      SELECT "id"
                                      FROM   "base_and_descendants" AS "namespaces")
       AND    (
                     EXISTS
                     (
                            SELECT 1
                            FROM   "project_authorizations"
                            WHERE  "project_authorizations"."user_id" = 120073
                            AND    (
                                          project_authorizations.project_id = projects.id)
                            AND    (
                                          project_authorizations.access_level >= 20))
              OR     projects.visibility_level IN (10,20)))
SELECT partition_subquery.id,
       partition_subquery.project_id,
       partition_subquery.created_at,
       partition_subquery.updated_at,
       partition_subquery.NAME,
       partition_subquery.version,
       partition_subquery.package_type,
       partition_subquery.creator_id,
       partition_subquery.status
FROM   (
                SELECT   row_number() OVER (partition BY packages_packages.NAME ORDER BY packages_packages.created_at DESC) AS row_number,
                         packages_packages.*
                FROM     "packages_packages"
                WHERE    "packages_packages"."package_type" = 4
                AND      "packages_packages"."version" IS NOT NULL
                AND      "packages_packages"."name" != 'NuGet.Temporary.Package'
                AND      "packages_packages"."name" IN
                                                        (
                                                        SELECT DISTINCT "packages_packages"."name"
                                                        FROM            "packages_packages"
                                                        WHERE           "packages_packages"."package_type" = 4
                                                        AND             "packages_packages"."version" IS NOT NULL
                                                        AND             "packages_packages"."name" != 'NuGet.Temporary.Package'
                                                        AND             "packages_packages"."project_id" IN
                                                                        (
                                                                               SELECT "id"
                                                                               FROM   "project_ids")
                                                        AND             "packages_packages"."name" ilike '%package%'
                                                        ORDER BY        NAME ASC limit 20 offset 0)
                AND      "packages_packages"."project_id" IN
                         (
                                SELECT "id"
                                FROM   "project_ids")) partition_subquery
WHERE  "partition_subquery"."row_number" <= 10
Plan
 Subquery Scan on partition_subquery  (cost=20479.46..20480.17 rows=7 width=84) (actual time=20.793..20.802 rows=1 loops=1)
   Filter: (partition_subquery.row_number <= 10)
   Rows Removed by Filter: 0
   Buffers: shared hit=13289
   CTE project_ids
     ->  Nested Loop  (cost=1582.13..19335.38 rows=1996 width=4) (actual time=3.718..15.663 rows=1298 loops=1)
           Buffers: shared hit=9377
           ->  HashAggregate  (cost=1581.69..1583.50 rows=181 width=4) (actual time=3.621..3.686 rows=234 loops=1)
                 Group Key: namespaces_2.id
                 Buffers: shared hit=1166
                 ->  CTE Scan on base_and_descendants namespaces_2  (cost=1575.81..1579.43 rows=181 width=4) (actual time=0.071..3.525 rows=234 loops=1)
                       Buffers: shared hit=1166
                       CTE base_and_descendants
                         ->  Recursive Union  (cost=0.43..1575.81 rows=181 width=344) (actual time=0.069..3.108 rows=234 loops=1)
                               Buffers: shared hit=1166
                               ->  Index Scan using index_namespaces_on_type_and_id_partial on public.namespaces  (cost=0.43..3.45 rows=1 width=344) (actual time=0.055..0.056 rows=1 loops=1)
                                     Index Cond: (((namespaces.type)::text = 'Group'::text) AND (namespaces.id = 9970))
                                     Buffers: shared hit=4
                               ->  Nested Loop  (cost=0.56..156.87 rows=18 width=344) (actual time=0.029..0.401 rows=39 loops=6)
                                     Buffers: shared hit=1162
                                     ->  WorkTable Scan on base_and_descendants  (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.005 rows=39 loops=6)
                                     ->  Index Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_1  (cost=0.56..15.65 rows=2 width=344) (actual time=0.007..0.009 rows=1 loops=234)
                                           Index Cond: (namespaces_1.parent_id = base_and_descendants.id)
                                           Filter: ((namespaces_1.type)::text = 'Group'::text)
                                           Rows Removed by Filter: 0
                                           Buffers: shared hit=1162
           ->  Index Scan using index_projects_on_namespace_id_and_id on public.projects  (cost=0.44..97.97 rows=11 width=8) (actual time=0.014..0.050 rows=6 loops=234)
                 Index Cond: (projects.namespace_id = namespaces_2.id)
                 Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (projects.visibility_level = ANY ('{10,20}'::integer[])))
                 Rows Removed by Filter: 0
                 Buffers: shared hit=8211
                 SubPlan 1
                   ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations  (cost=0.57..3.59 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1298)
                         Index Cond: ((project_authorizations.user_id = 120073) AND (project_authorizations.project_id = projects.id) AND (project_authorizations.access_level >= 20))
                         Heap Fetches: 266
                         Buffers: shared hit=6205
                 SubPlan 2
                   ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations project_authorizations_1  (cost=0.57..887.19 rows=3108 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                         Index Cond: ((project_authorizations_1.user_id = 120073) AND (project_authorizations_1.access_level >= 20))
                         Heap Fetches: 0
   ->  WindowAgg  (cost=1144.08..1144.52 rows=22 width=92) (actual time=20.791..20.794 rows=1 loops=1)
         Buffers: shared hit=13289
         ->  Sort  (cost=1144.08..1144.13 rows=22 width=84) (actual time=20.782..20.785 rows=1 loops=1)
               Sort Key: packages_packages.name, packages_packages.created_at DESC
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=13289
               ->  Hash Join  (cost=504.22..1143.59 rows=22 width=84) (actual time=20.728..20.732 rows=1 loops=1)
                     Hash Cond: (packages_packages.project_id = project_ids.id)
                     Buffers: shared hit=13283
                     ->  Nested Loop  (cost=454.81..1093.82 rows=44 width=84) (actual time=3.417..3.419 rows=1 loops=1)
                           Buffers: shared hit=3906
                           ->  Limit  (cost=454.38..454.43 rows=20 width=34) (actual time=3.378..3.379 rows=1 loops=1)
                                 Buffers: shared hit=3902
                                 ->  Sort  (cost=454.38..455.71 rows=532 width=34) (actual time=3.376..3.377 rows=1 loops=1)
                                       Sort Key: packages_packages_1.name
                                       Sort Method: quicksort  Memory: 25kB
                                       Buffers: shared hit=3902
                                       ->  HashAggregate  (cost=434.91..440.23 rows=532 width=34) (actual time=3.350..3.352 rows=1 loops=1)
                                             Group Key: packages_packages_1.name
                                             Buffers: shared hit=3902
                                             ->  Nested Loop  (cost=45.33..433.58 rows=532 width=34) (actual time=2.013..3.344 rows=1 loops=1)
                                                   Buffers: shared hit=3902
                                                   ->  HashAggregate  (cost=44.91..46.91 rows=200 width=4) (actual time=0.642..0.929 rows=1298 loops=1)
                                                         Group Key: project_ids_1.id
                                                         ->  CTE Scan on project_ids project_ids_1  (cost=0.00..39.92 rows=1996 width=4) (actual time=0.001..0.176 rows=1298 loops=1)
                                                   ->  Index Only Scan using index_packages_project_id_name_partial_for_nuget on public.packages_packages packages_packages_1  (cost=0.42..1.92 rows=1 width=38) (actual time=0.002..0.002 rows=0 loops=1298)
                                                         Index Cond: (packages_packages_1.project_id = project_ids_1.id)
                                                         Heap Fetches: 0
                                                         Filter: ((packages_packages_1.name)::text ~~* '%package%'::text)
                                                         Rows Removed by Filter: 0
                                                         Buffers: shared hit=3902
                           ->  Index Scan using package_name_index on public.packages_packages  (cost=0.42..31.94 rows=2 width=84) (actual time=0.036..0.036 rows=1 loops=1)
                                 Index Cond: ((packages_packages.name)::text = (packages_packages_1.name)::text)
                                 Filter: ((packages_packages.version IS NOT NULL) AND ((packages_packages.name)::text <> 'NuGet.Temporary.Package'::text) AND (packages_packages.package_type = 4))
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=4
                     ->  Hash  (cost=46.91..46.91 rows=200 width=4) (actual time=17.292..17.293 rows=1298 loops=1)
                           Buckets: 2048  Batches: 1  Memory Usage: 62kB
                           Buffers: shared hit=9377
                           ->  HashAggregate  (cost=44.91..46.91 rows=200 width=4) (actual time=16.779..17.005 rows=1298 loops=1)
                                 Group Key: project_ids.id
                                 Buffers: shared hit=9377
                                 ->  CTE Scan on project_ids  (cost=0.00..39.92 rows=1996 width=4) (actual time=3.720..16.151 rows=1298 loops=1)
                                       Buffers: shared hit=9377

After

Query
WITH "project_ids" AS
(
           SELECT     "projects"."id"
           FROM       "projects"
           INNER JOIN (WITH recursive "base_and_descendants" AS (
                      (
                             SELECT "namespaces".*
                             FROM   "namespaces"
                             WHERE  "namespaces"."type" = 'Group'
                             AND    "namespaces"."id" = 9970)
               UNION
                     (
                            SELECT "namespaces".*
                            FROM   "namespaces",
                                   "base_and_descendants"
                            WHERE  "namespaces"."type" = 'Group'
                            AND    "namespaces"."parent_id" = "base_and_descendants"."id"))
                SELECT "id"
                FROM   "base_and_descendants" AS "namespaces") namespaces
               ON     namespaces.id=projects.namespace_id
               WHERE  (
                             EXISTS
                             (
                                    SELECT 1
                                    FROM   "project_authorizations"
                                    WHERE  "project_authorizations"."user_id" = 120073
                                    AND    (
                                                  project_authorizations.project_id = projects.id)
                                    AND    (
                                                  project_authorizations.access_level >= 20))
                      OR     projects.visibility_level IN (10,20)))
SELECT partition_subquery.id,
       partition_subquery.project_id,
       partition_subquery.created_at,
       partition_subquery.updated_at,
       partition_subquery.NAME,
       partition_subquery.version,
       partition_subquery.package_type,
       partition_subquery.creator_id,
       partition_subquery.status
FROM   (
                SELECT   row_number() OVER (partition BY packages_packages.NAME ORDER BY packages_packages.created_at DESC) AS row_number,
                         packages_packages.*
                FROM     "packages_packages"
                WHERE    "packages_packages"."package_type" = 4
                AND      "packages_packages"."version" IS NOT NULL
                AND      "packages_packages"."name" != 'NuGet.Temporary.Package'
                AND      "packages_packages"."name" IN
                                                        (
                                                        SELECT DISTINCT "packages_packages"."name"
                                                        FROM            "packages_packages"
                                                        WHERE           "packages_packages"."package_type" = 4
                                                        AND             "packages_packages"."version" IS NOT NULL
                                                        AND             "packages_packages"."name" != 'NuGet.Temporary.Package'
                                                        AND             "packages_packages"."project_id" IN
                                                                        (
                                                                               SELECT "id"
                                                                               FROM   "project_ids")
                                                        AND             "packages_packages"."name" ilike '%package%'
                                                        ORDER BY        NAME ASC limit 20 offset 0)
                AND      "packages_packages"."project_id" IN
                         (
                                SELECT "id"
                                FROM   "project_ids")) partition_subquery
WHERE  "partition_subquery"."row_number" <= 10
Plan
 Subquery Scan on partition_subquery  (cost=20477.20..20477.91 rows=7 width=84) (actual time=22.758..22.769 rows=1 loops=1)
   Filter: (partition_subquery.row_number <= 10)
   Rows Removed by Filter: 0
   Buffers: shared hit=13269
   CTE project_ids
     ->  Nested Loop  (cost=1576.25..19333.12 rows=1996 width=4) (actual time=0.201..17.304 rows=1298 loops=1)
           Buffers: shared hit=9357
           ->  CTE Scan on base_and_descendants namespaces_2  (cost=1575.81..1579.43 rows=181 width=4) (actual time=0.072..3.799 rows=234 loops=1)
                 Buffers: shared hit=1166
                 CTE base_and_descendants
                   ->  Recursive Union  (cost=0.43..1575.81 rows=181 width=344) (actual time=0.070..3.370 rows=234 loops=1)
                         Buffers: shared hit=1166
                         ->  Index Scan using index_namespaces_on_type_and_id_partial on public.namespaces  (cost=0.43..3.45 rows=1 width=344) (actual time=0.061..0.063 rows=1 loops=1)
                               Index Cond: (((namespaces.type)::text = 'Group'::text) AND (namespaces.id = 9970))
                               Buffers: shared hit=4
                         ->  Nested Loop  (cost=0.56..156.87 rows=18 width=344) (actual time=0.034..0.430 rows=39 loops=6)
                               Buffers: shared hit=1162
                               ->  WorkTable Scan on base_and_descendants  (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.006 rows=39 loops=6)
                               ->  Index Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_1  (cost=0.56..15.65 rows=2 width=344) (actual time=0.008..0.010 rows=1 loops=234)
                                     Index Cond: (namespaces_1.parent_id = base_and_descendants.id)
                                     Filter: ((namespaces_1.type)::text = 'Group'::text)
                                     Rows Removed by Filter: 0
                                     Buffers: shared hit=1162
           ->  Index Scan using index_projects_on_namespace_id_and_id on public.projects  (cost=0.44..97.97 rows=11 width=8) (actual time=0.015..0.056 rows=6 loops=234)
                 Index Cond: (projects.namespace_id = namespaces_2.id)
                 Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (projects.visibility_level = ANY ('{10,20}'::integer[])))
                 Rows Removed by Filter: 0
                 Buffers: shared hit=8191
                 SubPlan 1
                   ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations  (cost=0.57..3.59 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=1298)
                         Index Cond: ((project_authorizations.user_id = 120073) AND (project_authorizations.project_id = projects.id) AND (project_authorizations.access_level >= 20))
                         Heap Fetches: 266
                         Buffers: shared hit=6185
                 SubPlan 2
                   ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations project_authorizations_1  (cost=0.57..887.19 rows=3108 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                         Index Cond: ((project_authorizations_1.user_id = 120073) AND (project_authorizations_1.access_level >= 20))
                         Heap Fetches: 0
   ->  WindowAgg  (cost=1144.08..1144.52 rows=22 width=92) (actual time=22.757..22.761 rows=1 loops=1)
         Buffers: shared hit=13269
         ->  Sort  (cost=1144.08..1144.13 rows=22 width=84) (actual time=22.748..22.751 rows=1 loops=1)
               Sort Key: packages_packages.name, packages_packages.created_at DESC
               Sort Method: quicksort  Memory: 25kB
               Buffers: shared hit=13269
               ->  Hash Join  (cost=504.22..1143.59 rows=22 width=84) (actual time=22.692..22.697 rows=1 loops=1)
                     Hash Cond: (packages_packages.project_id = project_ids.id)
                     Buffers: shared hit=13263
                     ->  Nested Loop  (cost=454.81..1093.82 rows=44 width=84) (actual time=3.672..3.675 rows=1 loops=1)
                           Buffers: shared hit=3906
                           ->  Limit  (cost=454.38..454.43 rows=20 width=34) (actual time=3.627..3.630 rows=1 loops=1)
                                 Buffers: shared hit=3902
                                 ->  Sort  (cost=454.38..455.71 rows=532 width=34) (actual time=3.625..3.627 rows=1 loops=1)
                                       Sort Key: packages_packages_1.name
                                       Sort Method: quicksort  Memory: 25kB
                                       Buffers: shared hit=3902
                                       ->  HashAggregate  (cost=434.91..440.23 rows=532 width=34) (actual time=3.601..3.604 rows=1 loops=1)
                                             Group Key: packages_packages_1.name
                                             Buffers: shared hit=3902
                                             ->  Nested Loop  (cost=45.33..433.58 rows=532 width=34) (actual time=2.206..3.595 rows=1 loops=1)
                                                   Buffers: shared hit=3902
                                                   ->  HashAggregate  (cost=44.91..46.91 rows=200 width=4) (actual time=0.779..1.051 rows=1298 loops=1)
                                                         Group Key: project_ids_1.id
                                                         ->  CTE Scan on project_ids project_ids_1  (cost=0.00..39.92 rows=1996 width=4) (actual time=0.001..0.234 rows=1298 loops=1)
                                                   ->  Index Only Scan using index_packages_project_id_name_partial_for_nuget on public.packages_packages packages_packages_1  (cost=0.42..1.92 rows=1 width=38) (actual time=0.002..0.002 rows=0 loops=1298)
                                                         Index Cond: (packages_packages_1.project_id = project_ids_1.id)
                                                         Heap Fetches: 0
                                                         Filter: ((packages_packages_1.name)::text ~~* '%package%'::text)
                                                         Rows Removed by Filter: 0
                                                         Buffers: shared hit=3902
                           ->  Index Scan using package_name_index on public.packages_packages  (cost=0.42..31.94 rows=2 width=84) (actual time=0.041..0.042 rows=1 loops=1)
                                 Index Cond: ((packages_packages.name)::text = (packages_packages_1.name)::text)
                                 Filter: ((packages_packages.version IS NOT NULL) AND ((packages_packages.name)::text <> 'NuGet.Temporary.Package'::text) AND (packages_packages.package_type = 4))
                                 Rows Removed by Filter: 0
                                 Buffers: shared hit=4
                     ->  Hash  (cost=46.91..46.91 rows=200 width=4) (actual time=19.001..19.001 rows=1298 loops=1)
                           Buckets: 2048  Batches: 1  Memory Usage: 62kB
                           Buffers: shared hit=9357
                           ->  HashAggregate  (cost=44.91..46.91 rows=200 width=4) (actual time=18.478..18.704 rows=1298 loops=1)
                                 Group Key: project_ids.id
                                 Buffers: shared hit=9357
                                 ->  CTE Scan on project_ids  (cost=0.00..39.92 rows=1996 width=4) (actual time=0.202..17.816 rows=1298 loops=1)
                                       Buffers: shared hit=9357

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #324220 (closed)

Edited by Dylan Griffith

Merge request reports