Search pypi packages by normalized name
What does this MR do?
PyPI packages can be named using characters like .
and _
, however, when the pip
tool is used to install packages, it normalizes the package name, replacing those characters with hyphens -
in accordance with the python normalized names standard.
This means if a user uploads a package my.package
, then when pip tries to install it, it will request my-package
. GitLab will try to find a package named my-package
belonging to the requested project and fail.
This MR updates the finding of matching pypi packages to look at the normalized version of the name saved in GitLab's database when comparing to the requested name.
Database
These are the queries and plans for the section of code that uses the new scope .with_normalized_pypi_name
. Note that only the last conditional is being changed as the rest of the query is generated from other scopes.
Old query
SELECT "packages_packages".* FROM "packages_packages" WHERE "packages_packages"."project_id" = 20760965 AND "packages_packages"."package_type" = 5 AND "packages_packages"."version" IS NOT NULL AND ("packages_packages"."package_type" != 4 OR "packages_packages"."name" != 'NuGet.Temporary.Package') AND "packages_packages"."name" = 'mypypi.package'; -------------------------------------------------- Query Plan -------------------------------------------------- Index Scan using index_packages_packages_on_project_id_and_package_type on public.packages_packages (cost=0.42..3.45 rows=1 width=84) (actual time=0.025..0.026 rows=1 loops=1) Index Cond: ((packages_packages.project_id = 20760965) AND (packages_packages.package_type = 5)) Filter: ((packages_packages.version IS NOT NULL) AND ((packages_packages.name)::text = 'mypypi.package'::text)) Rows Removed by Filter: 0 Buffers: shared hit=4 Time: 0.713 ms planning: 0.651 ms execution: 0.062 ms I/O read: 0.000 ms I/O write: 0.000 ms Shared buffers: hits: 4 (~32.00 KiB) from the buffer pool reads: 0 from the OS file cache, including disk I/O dirtied: 0 writes: 0https://explain.depesz.com/s/1RZW
New query
SELECT "packages_packages".* FROM "packages_packages" WHERE "packages_packages"."project_id" = 20760965 AND "packages_packages"."package_type" = 5 AND "packages_packages"."version" IS NOT NULL AND ("packages_packages"."package_type" != 4 OR "packages_packages"."name" != 'NuGet.Temporary.Package') AND (LOWER(regexp_replace(name, '[-_.]+', '-', 'g')) = 'mypypi.package'); -------------------------------------------------- Query Plan -------------------------------------------------- Index Scan using index_packages_packages_on_project_id_and_package_type on public.packages_packages (cost=0.42..3.46 rows=1 width=84) (actual time=9.021..9.021 rows=0 loops=1) Index Cond: ((packages_packages.project_id = 20760965) AND (packages_packages.package_type = 5)) Filter: ((packages_packages.version IS NOT NULL) AND (lower(regexp_replace((packages_packages.name)::text, '[-_.]+'::text, '-'::text, 'g'::text)) = 'mypypi.package'::text)) Rows Removed by Filter: 1 Buffers: shared hit=3 read=4 I/O Timings: read=8.513 Time: 0.505 ms planning: 0.456 ms execution: 0.049 ms I/O read: 0.000 ms I/O write: 0.000 ms Shared buffers: hits: 4 (~32.00 KiB) from the buffer pool reads: 0 from the OS file cache, including disk I/O dirtied: 0 writes: 0https://explain.depesz.com/s/XTuj
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides - [-] Database guides
- [-] Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec
- [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team
Related to #241678 (closed)