Skip to content

Create a service to generate npm packages metadata on the group and instance level

Context

Performance can be slow when installing an npm package at the namespace level, when there are many projects and packages under the namespace. The DB query can be complex because it has to pull in data from multiple tables: namespaces, project_authorizations, and of course packages. The slowness happens here

Solution

One solution suggested in this thread is to move the expensive filters into the inner query and do name filtering only in the outer query. This MR implements that.

Implementation Details

  • Introduce Packages::Npm::GenerateMetadataForNamespaceService inheriting from GenerateMetadataService
  • For packages in a namespace (group or instance) - run a simpler query that does not join against user authorizations to retrieve the packages inside the namespace. Get the IDs of these packages and use these IDs to filter the full query

💾 Database Review

Original queries on master branch, with npm_allow_packages_in_multiple_projects feature flag enabled

(1) Querying for packages inside the namespace that match the name - .each_batch lower bound

Query
SELECT
    "packages_packages"."id"
FROM
    "packages_packages"
WHERE
    "packages_packages"."package_type" = 2
    AND "packages_packages"."project_id" IN (
        SELECT
            "projects"."id"
        FROM
            "projects"
            INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
            LEFT JOIN project_features ON projects.id = project_features.project_id
        WHERE
            (
                EXISTS (
                    SELECT
                        1
                    FROM
                        "project_authorizations"
                    WHERE
                        "project_authorizations"."user_id" = redacted-user-id
                        AND (project_authorizations.project_id = projects.id)
                )
                OR projects.visibility_level IN (0, 10, 20)
            )
            AND (
                namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id]::bigint[]
            )
            AND (
                "project_features"."package_registry_access_level" > 0
                OR "project_features"."package_registry_access_level" IS NULL
            )
    )
    AND "packages_packages"."status" IN (0, 1, 5)
    AND "packages_packages"."name" = 'redacted-package-name'
ORDER BY
    "packages_packages"."id" ASC
LIMIT
    1

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38346/commands/117613

(2) Querying for packages inside the namespace that match the name - .each_batch upper bound

pg.ai setup
reset
Query
SELECT
  "packages_packages"."id"
FROM
  "packages_packages"
WHERE
  "packages_packages"."package_type" = 2
  AND "packages_packages"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
      INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
      LEFT JOIN project_features ON projects.id = project_features.project_id
    WHERE
      (
        EXISTS (
          SELECT
            1
          FROM
            "project_authorizations"
          WHERE
            "project_authorizations"."user_id" = redacted-user-id
            AND (project_authorizations.project_id = projects.id)
        )
        OR projects.visibility_level IN (0, 10, 20)
      )
      AND (
        namespaces.traversal_ids::bigint[] && ARRAY[redacted-package-namespace-id]::bigint[]
      )
      AND (
        "project_features"."package_registry_access_level" > 0
        OR "project_features"."package_registry_access_level" IS NULL
      )
  )
  AND "packages_packages"."status" IN (0, 1, 5)
  AND "packages_packages"."name" = 'redacted-package-name'
  AND "packages_packages"."id" >= redacted-package-id
ORDER BY
  "packages_packages"."id" ASC
LIMIT
  1 OFFSET 1000

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38397/commands/117860

Queries on MR branch

(1) Querying for packages inside the namespace that match the name - .each_batch lower bound

pg.ai setup
reset;
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_packages"."id"
FROM
  "packages_packages"
WHERE
  "packages_packages"."package_type" = 2
  AND "packages_packages"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
      INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    WHERE
      (
        namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id]::bigint[]
      )
  )
  AND "packages_packages"."status" IN (0, 1, 5)
  AND "packages_packages"."name" = 'redacted-package-name'
ORDER BY
  "packages_packages"."id" ASC
LIMIT
  1
Explain plan
 Limit  (cost=1.69..358.52 rows=1 width=8) (actual time=3.664..3.666 rows=1 loops=1)
   Buffers: shared hit=6 read=12
   I/O Timings: read=3.517 write=0.000
   ->  Nested Loop Semi Join  (cost=1.69..358.52 rows=1 width=8) (actual time=3.663..3.664 rows=1 loops=1)
         Buffers: shared hit=6 read=12
         I/O Timings: read=3.517 write=0.000
         ->  Index Scan using package_name_id_index on public.packages_packages  (cost=0.56..238.98 rows=28 width=12) (actual time=1.640..1.641 rows=1 loops=1)
               Index Cond: ((packages_packages.name)::text = 'redacted-package-name'::text)
               Filter: ((packages_packages.package_type = 2) AND (packages_packages.status = ANY ('{0,1,5}'::integer[])))
               Rows Removed by Filter: 0
               Buffers: shared read=5
               I/O Timings: read=1.586 write=0.000
         ->  Nested Loop  (cost=1.13..4.26 rows=1 width=4) (actual time=2.019..2.020 rows=1 loops=1)
               Buffers: shared hit=6 read=7
               I/O Timings: read=1.930 write=0.000
               ->  Index Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects  (cost=0.56..3.58 rows=1 width=8) (actual time=0.314..0.315 rows=1 loops=1)
                     Index Cond: (projects.id = packages_packages.project_id)
                     Buffers: shared hit=1 read=4
                     I/O Timings: read=0.283 write=0.000
               ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.57..0.68 rows=1 width=4) (actual time=1.701..1.701 rows=1 loops=1)
                     Index Cond: (namespaces.id = projects.namespace_id)
                     Filter: ((namespaces.traversal_ids)::bigint[] && '{redacted-namespace-id}'::bigint[])
                     Rows Removed by Filter: 0
                     Buffers: shared hit=5 read=3
                     I/O Timings: read=1.647 write=0.000
Settings: effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116384

(2) Querying for packages inside the namespace that match the name - .each_batch upper bound

pg.ai setup
reset;
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_packages"."id"
FROM
  "packages_packages"
WHERE
  "packages_packages"."package_type" = 2
  AND "packages_packages"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
      INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    WHERE
      (
        namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id]::bigint[]
      )
  )
  AND "packages_packages"."status" IN (0, 1, 5)
  AND "packages_packages"."name" = 'redacted-package-name'
ORDER BY
  "packages_packages"."id" ASC
LIMIT
  1 OFFSET 1000
Explain plan
 Limit  (cost=358.52..715.34 rows=1 width=8) (actual time=231.018..231.020 rows=1 loops=1)
   Buffers: shared hit=10014 read=1009 dirtied=58
   WAL: records=59 fpi=58 bytes=470863
   I/O Timings: read=222.018 write=0.000
   ->  Nested Loop Semi Join  (cost=1.69..358.52 rows=1 width=8) (actual time=12.310..230.949 rows=1001 loops=1)
         Buffers: shared hit=10014 read=1009 dirtied=58
         WAL: records=59 fpi=58 bytes=470863
         I/O Timings: read=222.018 write=0.000
         ->  Index Scan using package_name_id_index on public.packages_packages  (cost=0.56..238.98 rows=28 width=12) (actual time=1.694..214.947 rows=1001 loops=1)
               Index Cond: ((packages_packages.name)::text = 'redacted-package-name'::text)
               Filter: ((packages_packages.package_type = 2) AND (packages_packages.status = ANY ('{0,1,5}'::integer[])))
               Rows Removed by Filter: 0
               Buffers: shared hit=8 read=1002 dirtied=58
               WAL: records=59 fpi=58 bytes=470863
               I/O Timings: read=211.533 write=0.000
         ->  Nested Loop  (cost=1.13..4.26 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=1001)
               Buffers: shared hit=10006 read=7
               I/O Timings: read=10.485 write=0.000
               ->  Index Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects  (cost=0.56..3.58 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1001)
                     Index Cond: (projects.id = packages_packages.project_id)
                     Buffers: shared hit=5001 read=4
                     I/O Timings: read=7.465 write=0.000
               ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.57..0.68 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1001)
                     Index Cond: (namespaces.id = projects.namespace_id)
                     Filter: ((namespaces.traversal_ids)::bigint[] && '{redacted-namespace-id}'::bigint[])
                     Rows Removed by Filter: 0
                     Buffers: shared hit=5005 read=3
                     I/O Timings: read=3.019 write=0.000
Settings: random_page_cost = '1.5', seq_page_cost = '4', effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116388

(3) Loading package_ids for the outer loop query

pg.ai setup
reset;
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_packages"."id"
FROM
  "packages_packages"
WHERE
  "packages_packages"."package_type" = 2
  AND "packages_packages"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
      INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    WHERE
      (
        namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id]::bigint[]
      )
  )
  AND "packages_packages"."status" IN (0, 1, 5)
  AND "packages_packages"."name" = 'redacted-package-name'
  AND "packages_packages"."id" >= redacted-package-id
Explain plan
 Nested Loop Semi Join  (cost=1.69..66.00 rows=1 width=8) (actual time=15.782..230.638 rows=417 loops=1)
   Buffers: shared hit=4217 read=383 dirtied=83
   WAL: records=106 fpi=83 bytes=668564
   I/O Timings: read=213.794 write=0.000
   ->  Index Scan using package_name_id_index on public.packages_packages  (cost=0.56..44.66 rows=5 width=12) (actual time=0.991..206.496 rows=417 loops=1)
         Index Cond: (((packages_packages.name)::text = 'redacted-package-name'::text) AND (packages_packages.id >= redacted-package-id))
         Filter: ((packages_packages.package_type = 2) AND (packages_packages.status = ANY ('{0,1,5}'::integer[])))
         Rows Removed by Filter: 0
         Buffers: shared hit=51 read=376 dirtied=83
         WAL: records=106 fpi=83 bytes=668564
         I/O Timings: read=199.231 write=0.000
   ->  Nested Loop  (cost=1.13..4.26 rows=1 width=4) (actual time=0.056..0.056 rows=1 loops=417)
         Buffers: shared hit=4166 read=7
         I/O Timings: read=14.562 write=0.000
         ->  Index Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects  (cost=0.56..3.58 rows=1 width=8) (actual time=0.034..0.034 rows=1 loops=417)
               Index Cond: (projects.id = packages_packages.project_id)
               Buffers: shared hit=2081 read=4
               I/O Timings: read=9.957 write=0.000
         ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.57..0.68 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=417)
               Index Cond: (namespaces.id = projects.namespace_id)
               Filter: ((namespaces.traversal_ids)::bigint[] && '{redacted-namespace-id}'::bigint[])
               Rows Removed by Filter: 0
               Buffers: shared hit=2085 read=3
               I/O Timings: read=4.606 write=0.000
Settings: effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116394

(4) Preload dependencies

pg.ai setup
reset;
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_dependencies"."id"
FROM
  "packages_dependencies"
WHERE
  "packages_dependencies"."id" IN (
    SELECT
      "packages_dependency_links"."dependency_id"
    FROM
      "packages_dependency_links"
    WHERE
      "packages_dependency_links"."package_id" IN (
        SELECT
          "packages_packages"."id"
        FROM
          "packages_packages"
        WHERE
          "packages_packages"."package_type" = 2
          AND "packages_packages"."project_id" IN (
            SELECT
              "projects"."id"
            FROM
              "projects"
              INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
              LEFT JOIN project_features ON projects.id = project_features.project_id
            WHERE
              (
                EXISTS (
                  SELECT
                    1
                  FROM
                    "project_authorizations"
                  WHERE
                    "project_authorizations"."user_id" = redacted-user-id
                    AND (project_authorizations.project_id = projects.id)
                )
                OR projects.visibility_level IN (0, 10, 20)
              )
              AND (
                namespaces.traversal_ids::bigint[] &&ARRAY[redacted-namespace-id]::bigint[]
              )
              AND (
                "project_features"."package_registry_access_level" > 0
                OR "project_features"."package_registry_access_level" IS NULL
              )
          )
          AND "packages_packages"."status" IN (0, 1, 5)
          AND "packages_packages"."name" = 'redacted-package-name'
          AND "packages_packages"."id" IN (redacted-package-ids)
      )
  )
  AND 1 = 1
ORDER BY
  "packages_dependencies"."id" ASC
LIMIT
  1
Explain plan
 Limit  (cost=134.05..134.06 rows=1 width=8) (actual time=136.809..136.818 rows=1 loops=1)
   Buffers: shared hit=1280 read=250 dirtied=8
   WAL: records=8 fpi=8 bytes=63708
   I/O Timings: read=130.181 write=0.000
   ->  Sort  (cost=134.05..134.07 rows=5 width=8) (actual time=136.807..136.815 rows=1 loops=1)
         Sort Key: packages_dependencies.id
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=1280 read=250 dirtied=8
         WAL: records=8 fpi=8 bytes=63708
         I/O Timings: read=130.181 write=0.000
         ->  Nested Loop  (cost=131.75..134.03 rows=5 width=8) (actual time=135.047..136.789 rows=3 loops=1)
               Buffers: shared hit=1277 read=250 dirtied=8
               WAL: records=8 fpi=8 bytes=63708
               I/O Timings: read=130.181 write=0.000
               ->  HashAggregate  (cost=131.32..131.37 rows=5 width=8) (actual time=133.177..133.189 rows=3 loops=1)
                     Group Key: packages_dependency_links.dependency_id
                     Buffers: shared hit=1271 read=246 dirtied=8
                     WAL: records=8 fpi=8 bytes=63708
                     I/O Timings: read=126.644 write=0.000
                     ->  Nested Loop  (cost=120.78..131.31 rows=5 width=8) (actual time=67.413..133.034 rows=162 loops=1)
                           Buffers: shared hit=1271 read=246 dirtied=8
                           WAL: records=8 fpi=8 bytes=63708
                           I/O Timings: read=126.644 write=0.000
                           ->  HashAggregate  (cost=120.21..120.22 rows=1 width=8) (actual time=62.179..62.257 rows=54 loops=1)
                                 Group Key: packages_packages.id
                                 Buffers: shared hit=1115 read=150 dirtied=4
                                 WAL: records=4 fpi=4 bytes=32332
                                 I/O Timings: read=57.230 write=0.000
                                 ->  Nested Loop Semi Join  (cost=111.84..120.21 rows=1 width=8) (actual time=58.252..62.142 rows=54 loops=1)
                                       Buffers: shared hit=1115 read=150 dirtied=4
                                       WAL: records=4 fpi=4 bytes=32332
                                       I/O Timings: read=57.230 write=0.000
                                       ->  Bitmap Heap Scan on public.packages_packages  (cost=110.14..111.66 rows=1 width=12) (actual time=39.821..43.130 rows=54 loops=1)
                                             Filter: ((packages_packages.package_type = 2) AND (packages_packages.status = ANY ('{0,1,5}'::integer[])))
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=103 read=136 dirtied=4
                                             WAL: records=4 fpi=4 bytes=32332
                                             I/O Timings: read=39.002 write=0.000
                                             ->  BitmapAnd  (cost=110.01..110.01 rows=1 width=0) (actual time=39.647..39.649 rows=0 loops=1)
                                                   Buffers: shared hit=103 read=82
                                                   I/O Timings: read=38.404 write=0.000
                                                   ->  Bitmap Index Scan using package_name_id_index  (cost=0.00..4.72 rows=155 width=0) (actual time=1.035..1.035 rows=2313 loops=1)
                                                         Index Cond: ((packages_packages.name)::text = 'redacted-package-name'::text)
                                                         Buffers: shared read=23
                                                         I/O Timings: read=0.469 write=0.000
                                                   ->  Bitmap Index Scan using index_packages_packages_on_id_and_created_at  (cost=0.00..105.03 rows=54 width=0) (actual time=38.513..38.514 rows=54 loops=1)
                                                         Index Cond: (packages_packages.id = ANY ('redacted-package-ids'::bigint[]))
                                                         Buffers: shared hit=103 read=59
                                                         I/O Timings: read=37.934 write=0.000
                                       ->  Nested Loop Left Join  (cost=1.70..8.54 rows=1 width=4) (actual time=0.351..0.351 rows=1 loops=54)
                                             Filter: ((project_features.package_registry_access_level > 0) OR (project_features.package_registry_access_level IS NULL))
                                             Rows Removed by Filter: 0
                                             Buffers: shared hit=1012 read=14
                                             I/O Timings: read=18.227 write=0.000
                                             ->  Nested Loop  (cost=1.13..7.86 rows=1 width=4) (actual time=0.279..0.279 rows=1 loops=54)
                                                   Buffers: shared hit=745 read=11
                                                   I/O Timings: read=14.459 write=0.000
                                                   ->  Index Scan using idx_projects_on_repository_storage_last_repository_updated_at on public.projects  (cost=0.56..7.18 rows=1 width=8) (actual time=0.204..0.204 rows=1 loops=54)
                                                         Index Cond: (projects.id = packages_packages.project_id)
                                                         Filter: ((SubPlan 1) OR (projects.visibility_level = ANY ('{0,10,20}'::integer[])))
                                                         Rows Removed by Filter: 0
                                                         Buffers: shared hit=478 read=8
                                                         I/O Timings: read=10.652 write=0.000
                                                         SubPlan 1
                                                           ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations  (cost=0.58..3.60 rows=1 width=0) (actual time=0.089..0.089 rows=0 loops=54)
                                                                 Index Cond: ((project_authorizations.user_id = redacted-user-id) AND (project_authorizations.project_id = projects.id))
                                                                 Heap Fetches: 0
                                                                 Buffers: shared hit=212 read=4
                                                                 I/O Timings: read=4.692 write=0.000
                                                   ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.57..0.68 rows=1 width=4) (actual time=0.074..0.074 rows=1 loops=54)
                                                         Index Cond: (namespaces.id = projects.namespace_id)
                                                         Filter: ((namespaces.traversal_ids)::bigint[] && '{redacted-namespace-id}'::bigint[])
                                                         Rows Removed by Filter: 0
                                                         Buffers: shared hit=267 read=3
                                                         I/O Timings: read=3.807 write=0.000
                                             ->  Index Scan using index_project_features_on_project_id on public.project_features  (cost=0.56..0.66 rows=1 width=8) (actual time=0.072..0.072 rows=1 loops=54)
                                                   Index Cond: (project_features.project_id = projects.id)
                                                   Buffers: shared hit=267 read=3
                                                   I/O Timings: read=3.769 write=0.000
                           ->  Index Only Scan using idx_pkgs_dep_links_on_pkg_id_dependency_id_dependency_type on public.packages_dependency_links  (cost=0.57..9.90 rows=118 width=16) (actual time=1.306..1.308 rows=3 loops=54)
                                 Index Cond: (packages_dependency_links.package_id = packages_packages.id)
                                 Heap Fetches: 12
                                 Buffers: shared hit=156 read=96 dirtied=4
                                 WAL: records=4 fpi=4 bytes=31376
                                 I/O Timings: read=69.414 write=0.000
               ->  Index Only Scan using packages_dependencies_pkey on public.packages_dependencies  (cost=0.43..0.53 rows=1 width=8) (actual time=1.196..1.196 rows=1 loops=3)
                     Index Cond: (packages_dependencies.id = packages_dependency_links.dependency_id)
                     Heap Fetches: 0
                     Buffers: shared hit=6 read=4
                     I/O Timings: read=3.538 write=0.000
Settings: seq_page_cost = '4', effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB', random_page_cost = '1.5'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116409

(5) Preload dependency links

pg.ai setup
reset;
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
WITH "dependency_links_cte" AS MATERIALIZED (
    SELECT
        package_id,
        dependency_type,
        ARRAY_AGG(dependency_id) as dependency_ids
    FROM
        "packages_dependency_links"
    WHERE
        "packages_dependency_links"."package_id" IN (
            SELECT
                "packages_packages"."id"
            FROM
                "packages_packages"
            WHERE
                "packages_packages"."package_type" = 2
                AND "packages_packages"."project_id" IN (
                    SELECT
                        "projects"."id"
                    FROM
                        "projects"
                        INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
                        LEFT JOIN project_features ON projects.id = project_features.project_id
                    WHERE
                        (
                            EXISTS (
                                SELECT
                                    1
                                FROM
                                    "project_authorizations"
                                WHERE
                                    "project_authorizations"."user_id" = redacted-user-id
                                    AND (project_authorizations.project_id = projects.id)
                            )
                            OR projects.visibility_level IN (0, 10, 20)
                        )
                        AND (
                            namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id]::bigint[]
                        )
                        AND (
                            "project_features"."package_registry_access_level" > 0
                            OR "project_features"."package_registry_access_level" IS NULL
                        )
                )
                AND "packages_packages"."status" IN (0, 1, 5)
                AND "packages_packages"."name" = 'redacted-package-name'
                AND "packages_packages"."id" IN (redacted-package-ids)
    GROUP BY
        "packages_dependency_links"."package_id",
        "packages_dependency_links"."dependency_type"
)
SELECT
    "package_id"
FROM
    "dependency_links_cte" "packages_dependency_links"
GROUP BY
    "package_id"
ORDER BY
    "package_id" ASC
LIMIT
    1

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116414

(6) Load packages

pg.ai setup
reset
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_packages".*
FROM
  "packages_packages"
WHERE
  "packages_packages"."package_type" = 2
  AND "packages_packages"."project_id" IN (
    SELECT
      "projects"."id"
    FROM
      "projects"
      INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
      LEFT JOIN project_features ON projects.id = project_features.project_id
    WHERE
      (
        EXISTS (
          SELECT
            1
          FROM
            "project_authorizations"
          WHERE
            "project_authorizations"."user_id" = redacted-user-id
            AND (project_authorizations.project_id = projects.id)
        )
        OR projects.visibility_level IN (0, 10, 20)
      )
      AND (
        namespaces.traversal_ids::bigint[] && ARRAY[redacted-namespace-id] :: bigint []
      )
      AND (
        "project_features"."package_registry_access_level" > 0
        OR "project_features"."package_registry_access_level" IS NULL
      )
  )
  AND "packages_packages"."status" IN (0, 1, 5)
  AND "packages_packages"."name" = 'redacted-package-name'
  AND "packages_packages"."id" IN (redacted-package-ids)

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116418

(7) Load package tags

pg.ai setup
reset
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_tags".*
FROM
  "packages_tags"
WHERE
  "packages_tags"."package_id" IN (redacted-package-ids)
Explain plan
 Index Scan using index_packages_tags_on_package_id_and_updated_at on public.packages_tags  (cost=0.43..639.31 rows=729 width=45) (actual time=28.368..28.369 rows=0 loops=1)
   Index Cond: (packages_tags.package_id = ANY ('{redacted-package-ids}'::integer[]))
   Buffers: shared hit=115 read=50
   I/O Timings: read=27.856 write=0.000
Settings: work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4', effective_cache_size = '472585MB', jit = 'off'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116422

(8) Load package files

pg.ai setup
reset
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_package_files".*
FROM
  "packages_package_files"
WHERE
  "packages_package_files"."status" = 0
  AND "packages_package_files"."package_id" IN (redacted-package-ids)
Explain plan
 Index Scan using index_packages_package_files_on_package_id_id on public.packages_package_files  (cost=0.57..11573.75 rows=10619 width=995) (actual time=8.899..135.231 rows=54 loops=1)
   Index Cond: (packages_package_files.package_id = ANY ('{redacted-package-ids}'::bigint[]))
   Filter: (packages_package_files.status = 0)
   Rows Removed by Filter: 0
   Buffers: shared hit=146 read=127
   I/O Timings: read=133.647 write=0.000
Settings: jit = 'off', work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4', effective_cache_size = '472585MB'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116426

(9) Load package metadata

pg.ai setup
reset
EXEC DROP INDEX package_name_index;
EXEC CREATE INDEX package_name_id_index ON packages_packages(name, id);
Query
SELECT
  "packages_npm_metadata".*
FROM
  "packages_npm_metadata"
WHERE
  "packages_npm_metadata"."package_id" IN (redacted-package-ids)
Explain plan
 Index Scan using packages_npm_metadata_pkey on public.packages_npm_metadata  (cost=0.43..154.01 rows=54 width=1194) (actual time=7.270..136.542 rows=54 loops=1)
   Index Cond: (packages_npm_metadata.package_id = ANY ('{redacted-package-ids}'::bigint[]))
   Buffers: shared hit=122 read=97 dirtied=1
   WAL: records=1 fpi=1 bytes=8161
   I/O Timings: read=135.364 write=0.000
Settings: effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4'

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/38035/commands/116431

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

NA. No UI changes 🌈

How to set up and validate locally

Installing a package should work just like before, for instance, group and project levels.

Related to #463822

Edited by Radamanthus Batnag

Merge request reports

Loading