Skip to content

Update the maven package finder

David Fernandez requested to merge 10io-update-maven-finder into master

Follow up of https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3894#note_526959002

What does this MR do?

Screenshots (strongly suggested)

n / a

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

Database review

Old explain plan

https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/2653/commands/8146

SELECT "packages_packages".*
   FROM "packages_packages"
  INNER JOIN "packages_maven_metadata"
     ON "packages_maven_metadata"."package_id" = "packages_packages"."id"
  INNER JOIN "packages_package_files"
     ON "packages_package_files"."package_id" = "packages_packages"."id"
  WHERE "packages_packages"."project_id" IN (
         SELECT "projects"."id"
           FROM "projects"
          WHERE "projects"."namespace_id" IN (
                 WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = XXX) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
                   FROM "base_and_descendants" AS "namespaces"
                )
            AND "projects"."namespace_id" IN (
                 WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = XXX) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id
                   FROM "base_and_descendants" AS "namespaces"
                )
        )
    AND "packages_maven_metadata"."path" = 'XXX'
  ORDER BY packages_package_files.created_at DESC
  LIMIT 1
Limit  (cost=3867.81..3867.81 rows=1 width=92) (actual time=2337.710..2337.729 rows=0 loops=1)
   Buffers: shared hit=1637569
   ->  Sort  (cost=3867.81..3867.82 rows=7 width=92) (actual time=2337.708..2337.727 rows=0 loops=1)
         Sort Key: packages_package_files.created_at DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=1637569
         ->  Nested Loop  (cost=3839.46..3867.77 rows=7 width=92) (actual time=2337.651..2337.670 rows=0 loops=1)
               Buffers: shared hit=1637566
               ->  Nested Loop  (cost=3838.90..3865.61 rows=1 width=92) (actual time=2337.650..2337.668 rows=0 loops=1)
                     Buffers: shared hit=1637566
                     ->  Nested Loop  (cost=3838.35..3839.57 rows=44 width=84) (actual time=283.276..1054.226 rows=303519 loops=1)
                           Buffers: shared hit=419917
                           ->  HashAggregate  (cost=3837.92..3837.93 rows=1 width=4) (actual time=283.160..301.036 rows=28498 loops=1)
                                 Group Key: projects.id
                                 Buffers: shared hit=87619
                                 ->  Hash Semi Join  (cost=3165.64..3837.92 rows=1 width=4) (actual time=163.883..259.036 rows=28498 loops=1)
                                       Hash Cond: (projects.namespace_id = namespaces_1.id)
                                       Buffers: shared hit=87619
                                       ->  Nested Loop  (cost=1582.13..2245.33 rows=3459 width=12) (actual time=78.116..159.199 rows=28498 loops=1)
                                             Buffers: shared hit=64027
                                             ->  HashAggregate  (cost=1581.69..1583.50 rows=181 width=4) (actual time=78.067..80.861 rows=4844 loops=1)
                                                   Group Key: namespaces.id
                                                   Buffers: shared hit=23592
                                                   ->  CTE Scan on base_and_descendants namespaces  (cost=1575.81..1579.43 rows=181 width=4) (actual time=0.041..73.901 rows=4844 loops=1)
                                                         Buffers: shared hit=23592
                                                         CTE base_and_descendants
                                                           ->  Recursive Union  (cost=0.43..1575.81 rows=181 width=344) (actual time=0.037..62.420 rows=4844 loops=1)
                                                                 Buffers: shared hit=23592
                                                                 ->  Index Scan using index_namespaces_on_type_and_id_partial on public.namespaces namespaces_2  (cost=0.43..3.45 rows=1 width=344) (actual time=0.026..0.027 rows=1 loops=1)
                                                                       Index Cond: (((namespaces_2.type)::text = 'Group'::text) AND (namespaces_2.id = XXXX))
                                                                       Buffers: shared hit=4
                                                                 ->  Nested Loop  (cost=0.56..156.87 rows=18 width=344) (actual time=0.047..5.271 rows=605 loops=8)
                                                                       Buffers: shared hit=23588
                                                                       ->  WorkTable Scan on base_and_descendants  (cost=0.00..0.20 rows=10 width=4) (actual time=0.001..0.198 rows=606 loops=8)
                                                                       ->  Index Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_3  (cost=0.56..15.65 rows=2 width=344) (actual time=0.005..0.007 rows=1 loops=4844)
                                                                             Index Cond: (namespaces_3.parent_id = base_and_descendants.id)
                                                                             Filter: ((namespaces_3.type)::text = 'Group'::text)
                                                                             Rows Removed by Filter: 0
                                                                             Buffers: shared hit=23588
                                             ->  Index Only Scan using index_projects_on_namespace_id_and_id on public.projects  (cost=0.44..3.47 rows=19 width=8) (actual time=0.008..0.013 rows=6 loops=4844)
                                                   Index Cond: (projects.namespace_id = namespaces.id)
                                                   Heap Fetches: 3322
                                                   Buffers: shared hit=40435
                                       ->  Hash  (cost=1581.24..1581.24 rows=181 width=4) (actual time=85.741..85.747 rows=4844 loops=1)
                                             Buckets: 8192  Batches: 1  Memory Usage: 235kB
                                             Buffers: shared hit=23592
                                             ->  CTE Scan on base_and_descendants namespaces_1  (cost=1575.81..1579.43 rows=181 width=4) (actual time=0.075..83.104 rows=4844 loops=1)
                                                   Buffers: shared hit=23592
                                                   CTE base_and_descendants
                                                     ->  Recursive Union  (cost=0.43..1575.81 rows=181 width=344) (actual time=0.071..70.855 rows=4844 loops=1)
                                                           Buffers: shared hit=23592
                                                           ->  Index Scan using index_namespaces_on_type_and_id_partial on public.namespaces namespaces_4  (cost=0.43..3.45 rows=1 width=344) (actual time=0.054..0.056 rows=1 loops=1)
                                                                 Index Cond: (((namespaces_4.type)::text = 'Group'::text) AND (namespaces_4.id = XXXX))
                                                                 Buffers: shared hit=4
                                                           ->  Nested Loop  (cost=0.56..156.87 rows=18 width=344) (actual time=0.057..6.277 rows=605 loops=8)
                                                                 Buffers: shared hit=23588
                                                                 ->  WorkTable Scan on base_and_descendants base_and_descendants_1  (cost=0.00..0.20 rows=10 width=4) (actual time=0.001..0.178 rows=606 loops=8)
                                                                 ->  Index Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_5  (cost=0.56..15.65 rows=2 width=344) (actual time=0.005..0.009 rows=1 loops=4844)
                                                                       Index Cond: (namespaces_5.parent_id = base_and_descendants_1.id)
                                                                       Filter: ((namespaces_5.type)::text = 'Group'::text)
                                                                       Rows Removed by Filter: 0
                                                                       Buffers: shared hit=23588
                           ->  Index Scan using index_packages_packages_on_project_id_and_status on public.packages_packages  (cost=0.42..1.20 rows=44 width=84) (actual time=0.005..0.023 rows=11 loops=28498)
                                 Index Cond: (packages_packages.project_id = projects.id)
                                 Buffers: shared hit=332298
                     ->  Index Only Scan using index_packages_maven_metadata_on_package_id_and_path on public.packages_maven_metadata  (cost=0.55..0.58 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=303519)
                           Index Cond: ((packages_maven_metadata.package_id = packages_packages.id) AND (packages_maven_metadata.path = 'XXXX'::text))
                           Heap Fetches: 0
                           Buffers: shared hit=1217649
               ->  Index Scan using index_packages_package_files_on_package_id_and_file_name on public.packages_package_files  (cost=0.56..1.68 rows=39 width=16) (actual time=0.000..0.000 rows=0 loops=0)
                     Index Cond: (packages_package_files.package_id = packages_packages.id)

New explain plan (without the feature flag of !56078 (merged))

SELECT "packages_packages".*
  FROM "packages_packages"
 INNER JOIN "packages_maven_metadata"
    ON "packages_maven_metadata"."package_id" = "packages_packages"."id"
 INNER JOIN "packages_package_files"
    ON "packages_package_files"."package_id" = "packages_packages"."id"
 WHERE "packages_packages"."project_id" IN (
        SELECT "projects"."id"
          FROM "projects"
         WHERE "projects"."namespace_id" IN (
                WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = XXX) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id
                  FROM "base_and_descendants" AS "namespaces"
               )
           AND "projects"."namespace_id" IN (
                WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = XXX) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id
                  FROM "base_and_descendants" AS "namespaces"
               )
       )
   AND "packages_maven_metadata"."path" = 'XXX'
 ORDER BY packages_package_files.created_at DESC
 LIMIT 1

No explain plan because postgres.ai is not working 🤷

New explain plan (with the feature flag of !56078 (merged))

See https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3894#note_526959002

https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/2660/commands/8147

SELECT "packages_packages".*
  FROM "packages_packages"
 INNER JOIN "packages_maven_metadata"
    ON "packages_maven_metadata"."package_id" = "packages_packages"."id"
 INNER JOIN "packages_package_files"
    ON "packages_package_files"."package_id" = "packages_packages"."id"
 WHERE "packages_packages"."project_id" IN (
        SELECT "projects"."id"
          FROM "projects"
         INNER JOIN (
                WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = XXX) UNION (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id"
                  FROM "base_and_descendants" AS "namespaces"
               ) namespaces
            ON namespaces.id=projects.namespace_id
       )
   AND "packages_maven_metadata"."path" = 'XXX'
 ORDER BY packages_package_files.created_at DESC
 LIMIT 1
Limit  (cost=17101.09..17101.09 rows=1 width=92) (actual time=3248.655..3248.662 rows=0 loops=1)
   Buffers: shared hit=8 read=7203
   I/O Timings: read=3200.287
   ->  Sort  (cost=17101.09..17101.11 rows=7 width=92) (actual time=3248.653..3248.659 rows=0 loops=1)
         Sort Key: packages_package_files.created_at DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=8 read=7203
         I/O Timings: read=3200.287
         ->  Nested Loop  (cost=1578.98..17101.05 rows=7 width=92) (actual time=3248.616..3248.622 rows=0 loops=1)
               Buffers: shared hit=5 read=7203
               I/O Timings: read=3200.287
               ->  Nested Loop Semi Join  (cost=1578.42..17098.89 rows=1 width=92) (actual time=3248.615..3248.621 rows=0 loops=1)
                     Buffers: shared hit=5 read=7203
                     I/O Timings: read=3200.287
                     ->  Nested Loop  (cost=0.98..15515.31 rows=1 width=92) (actual time=3248.615..3248.616 rows=0 loops=1)
                           Buffers: shared hit=5 read=7203
                           I/O Timings: read=3200.287
                           ->  Index Only Scan using index_packages_maven_metadata_on_package_id_and_path on public.packages_maven_metadata  (cost=0.55..15511.86 rows=1 width=8) (actual time=3248.614..3248.614 rows=0 loops=1)
                                 Index Cond: (packages_maven_metadata.path = 'XXX'::text)
                                 Heap Fetches: 0
                                 Buffers: shared hit=5 read=7203
                                 I/O Timings: read=3200.287
                           ->  Index Scan using index_packages_packages_on_id_and_created_at on public.packages_packages  (cost=0.42..3.44 rows=1 width=84) (actual time=0.000..0.000 rows=0 loops=0)
                                 Index Cond: (packages_packages.id = packages_maven_metadata.package_id)
                     ->  Hash Join  (cost=1577.45..1583.57 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                           Hash Cond: (namespaces.id = projects.namespace_id)
                           ->  CTE Scan on base_and_descendants namespaces  (cost=1575.81..1579.43 rows=181 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                                 CTE base_and_descendants
                                   ->  Recursive Union  (cost=0.43..1575.81 rows=181 width=344) (actual time=0.000..0.000 rows=0 loops=0)
                                         ->  Index Scan using index_namespaces_on_type_and_id_partial on public.namespaces namespaces_1  (cost=0.43..3.45 rows=1 width=344) (actual time=0.000..0.000 rows=0 loops=0)
                                               Index Cond: (((namespaces_1.type)::text = 'Group'::text) AND (namespaces_1.id = XXX))
                                         ->  Nested Loop  (cost=0.56..156.87 rows=18 width=344) (actual time=0.000..0.000 rows=0 loops=0)
                                               ->  WorkTable Scan on base_and_descendants  (cost=0.00..0.20 rows=10 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                                               ->  Index Scan using index_namespaces_on_parent_id_and_id on public.namespaces namespaces_2  (cost=0.56..15.65 rows=2 width=344) (actual time=0.000..0.000 rows=0 loops=0)
                                                     Index Cond: (namespaces_2.parent_id = base_and_descendants.id)
                                                     Filter: ((namespaces_2.type)::text = 'Group'::text)
                                                     Rows Removed by Filter: 0
                           ->  Hash  (cost=1.63..1.63 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=0)
                                 ->  Index Scan using projects_pkey on public.projects  (cost=0.56..1.63 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=0)
                                       Index Cond: (projects.id = packages_packages.project_id)
               ->  Index Scan using index_packages_package_files_on_package_id_and_file_name on public.packages_package_files  (cost=0.56..1.68 rows=39 width=16) (actual time=0.000..0.000 rows=0 loops=0)
                     Index Cond: (packages_package_files.package_id = packages_packages.id)

Merge request reports