Skip to content

Add exact_name support to GroupPackagesFinder

Matt Kasa requested to merge 333790-introduce-a-single-package-finder into master

What does this MR do and why?

This adds an exact_name param to the GroupPackagesFinder, defaulted to false in order to preserve existing behavior. When set to true, the with_name scope is used instead of the fuzzy search_by_name scope to match package_name.

This will be useful for package types such as Terraform modules, which need this exact name functionality (see ~bug #333791 (closed)).

Database

Explain: https://console.postgres.ai/shared/df6fd7c9-bf83-422b-9112-791f67a3a03d

Query:

SELECT
    "packages_packages".*
FROM
    "packages_packages"
WHERE
    "packages_packages"."project_id" IN (
        SELECT
            "projects"."id"
        FROM
            "projects"
        LEFT JOIN project_features ON projects.id = project_features.project_id
    WHERE
        "projects"."namespace_id" IN (
            SELECT
                "namespaces"."id"
            FROM
                "namespaces"
            WHERE (traversal_ids @> ('{20}')))
        AND (EXISTS (
                SELECT
                    1
                FROM
                    "project_authorizations"
                WHERE
                    "project_authorizations"."user_id" = 17
                    AND (project_authorizations.project_id = projects.id)
                    AND (project_authorizations.access_level >= 20))
                OR projects.visibility_level IN (10, 20))
            AND ("project_features"."repository_access_level" IS NULL
                OR "project_features"."repository_access_level" IN (20, 30)
                OR ("project_features"."repository_access_level" = 10
                    AND EXISTS (
                        SELECT
                            1
                        FROM
                            "project_authorizations"
                        WHERE
                            "project_authorizations"."user_id" = 17
                            AND (project_authorizations.project_id = projects.id)
                            AND (project_authorizations.access_level >= 10)))))
        AND "packages_packages"."version" IS NOT NULL
        AND "packages_packages"."package_type" != 12
        AND "packages_packages"."name" = 'maven'
        AND "packages_packages"."status" IN (0, 3)
ORDER BY
    "packages_packages"."created_at" DESC

Statistics:

Time: 39.077 ms
  - planning: 9.648 ms
  - execution: 29.429 ms
    - I/O read: 28.760 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 42 (~336.00 KiB) from the buffer pool
  - reads: 34 (~272.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 1 (~8.00 KiB)
  - writes: 0

Plan:

 Sort  (cost=346.96..346.96 rows=1 width=84) (actual time=29.224..29.227 rows=0 loops=1)
   Sort Key: packages_packages.created_at DESC
   Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=42 read=34 dirtied=1
   I/O Timings: read=28.760 write=0.000
   ->  Nested Loop Semi Join  (cost=1.87..346.95 rows=1 width=84) (actual time=29.189..29.192 rows=0 loops=1)
         Buffers: shared hit=39 read=34 dirtied=1
         I/O Timings: read=28.760 write=0.000
         ->  Index Scan using package_name_index on public.packages_packages  (cost=0.56..43.04 rows=26 width=84) (actual time=2.775..5.777 rows=5 loops=1)
               Index Cond: ((packages_packages.name)::text = 'maven'::text)
               Filter: ((packages_packages.version IS NOT NULL) AND (packages_packages.package_type <> 12) AND (packages_packages.status = ANY ('{0,3}'::integer[])))
               Rows Removed by Filter: 0
               Buffers: shared read=9 dirtied=1
               I/O Timings: read=5.711 write=0.000
         ->  Nested Loop  (cost=1.31..11.68 rows=1 width=4) (actual time=4.680..4.681 rows=0 loops=5)
               Buffers: shared hit=39 read=25
               I/O Timings: read=23.050 write=0.000
               ->  Nested Loop Left Join  (cost=0.88..11.17 rows=1 width=8) (actual time=4.009..4.009 rows=1 loops=5)
                     Filter: ((project_features.repository_access_level IS NULL) OR (project_features.repository_access_level = ANY ('{20,30}'::integer[])) OR ((project_features.repository_access_level = 10) AND (alternatives: SubPlan 3 or hashed SubPlan 4)))
                     Rows Removed by Filter: 0
                     Buffers: shared hit=30 read=22
                     I/O Timings: read=19.788 write=0.000
                     ->  Index Scan using projects_pkey on public.projects  (cost=0.44..7.05 rows=1 width=8) (actual time=3.302..3.303 rows=1 loops=5)
                           Index Cond: (projects.id = packages_packages.project_id)
                           Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (projects.visibility_level = ANY ('{10,20}'::integer[])))
                           Rows Removed by Filter: 0
                           Buffers: shared hit=21 read=19
                           I/O Timings: read=16.308 write=0.000
                           SubPlan 1
                             ->  Index Scan using index_project_authorizations_on_project_id_user_id on public.project_authorizations  (cost=0.57..3.60 rows=1 width=0) (actual time=2.305..2.306 rows=0 loops=5)
                                   Index Cond: ((project_authorizations.project_id = projects.id) AND (project_authorizations.user_id = 17))
                                   Filter: (project_authorizations.access_level >= 20)
                                   Rows Removed by Filter: 0
                                   Buffers: shared hit=11 read=9
                                   I/O Timings: read=11.461 write=0.000
                           SubPlan 2
                             ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations project_authorizations_1  (cost=0.57..1117.62 rows=5151 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                                   Index Cond: ((project_authorizations_1.user_id = 17) AND (project_authorizations_1.access_level >= 20))
                                   Heap Fetches: 0
                                   I/O Timings: read=0.000 write=0.000
                     ->  Index Scan using index_project_features_on_project_id on public.project_features  (cost=0.44..0.50 rows=1 width=8) (actual time=1.172..1.172 rows=1 loops=3)
                           Index Cond: (project_features.project_id = projects.id)
                           Buffers: shared hit=9 read=3
                           I/O Timings: read=3.480 write=0.000
                     SubPlan 3
                       ->  Index Scan using index_project_authorizations_on_project_id_user_id on public.project_authorizations project_authorizations_2  (cost=0.57..3.60 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=0)
                             Index Cond: ((project_authorizations_2.project_id = projects.id) AND (project_authorizations_2.user_id = 17))
                             Filter: (project_authorizations_2.access_level >= 10)
                             Rows Removed by Filter: 0
                             I/O Timings: read=0.000 write=0.000
                     SubPlan 4
                       ->  Index Only Scan using project_authorizations_pkey on public.project_authorizations project_authorizations_3  (cost=0.57..1325.74 rows=6222 width=4) (actual time=0.000..0.000 rows=0 loops=0)
                             Index Cond: ((project_authorizations_3.user_id = 17) AND (project_authorizations_3.access_level >= 10))
                             Heap Fetches: 0
                             I/O Timings: read=0.000 write=0.000
               ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.43..0.51 rows=1 width=4) (actual time=1.116..1.116 rows=0 loops=3)
                     Index Cond: (namespaces.id = projects.namespace_id)
                     Filter: (namespaces.traversal_ids @> '{20}'::integer[])
                     Rows Removed by Filter: 1
                     Buffers: shared hit=9 read=3
                     I/O Timings: read=3.261 write=0.000

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #333790

Edited by Matt Kasa

Merge request reports