Skip to content

Remove last_of_each_version scope and add unique index for npm packages

Context

We take measures to eliminate duplicate npm packages in the database.

  • Firstly the root cause of duplicate packages was identified and removed: #390097 (closed)
  • Secondary we removed existing duplicate packages using background migration: #390098 (closed)

Now is the last step - add database constraint: a unique index to ensure a better data coherence.

Note: the validation on the application level already exists (details)

Additionally this MR is doing a cleanup and removes last_of_each_version and last_of_each_version_ids scopes from Packages::Package model, that produce a subquery like: SELECT MAX(id) AS id from packages_packages and we don't need that anymore.

Why we have to combine the code removal and a new index in one MR?

We have several places in the different code areas were we ensure correct behavior with last_of_each_version or last_of_each_version_ids scopes using tests tests where we create duplicate packages. With a new unique database index we won't be able to do that anymore, therefore we agreed to merge changes in one MR.

What does this MR do and why?

  1. This MR adds a new unique index for npm packages.
  2. Removes last_of_each_version and last_of_each_version_ids scopes from Packages#Package

Screenshots or screen recordings

No.

Migrations output

Migration Up

$ rails db:migrate:main 
main: == [advisory_lock_connection] object_id: 228820, pg_backend_pid: 60831
main: == 20230616082958 AddUniqueIndexForNpmPackagesOnProjectIdNameVersion: migrating 
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.1014s
main: -- index_exists?(:packages_packages, [:project_id, :name, :version], {:name=>"idx_packages_on_project_id_name_version_unique_when_npm", :unique=>true, :where=>"package_type = 2 AND status <> 4", :algorithm=>:concurrently})
main:    -> 0.0079s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0002s
main: -- add_index(:packages_packages, [:project_id, :name, :version], {:name=>"idx_packages_on_project_id_name_version_unique_when_npm", :unique=>true, :where=>"package_type = 2 AND status <> 4", :algorithm=>:concurrently})
main:    -> 0.0017s
main: -- execute("RESET statement_timeout")
main:    -> 0.0002s
main: == 20230616082958 AddUniqueIndexForNpmPackagesOnProjectIdNameVersion: migrated (0.1238s) 

main: == [advisory_lock_connection] object_id: 228820, pg_backend_pid: 60831

Migration Down

$ rails db:rollback:main
main: == [advisory_lock_connection] object_id: 228540, pg_backend_pid: 60291
main: == 20230616082958 AddUniqueIndexForNpmPackagesOnProjectIdNameVersion: reverting 
main: -- transaction_open?()
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.1232s
main: -- indexes(:packages_packages)
main:    -> 0.0081s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- remove_index(:packages_packages, {:algorithm=>:concurrently, :name=>"idx_packages_on_project_id_name_version_unique_when_npm"})
main:    -> 0.0010s
main: -- execute("RESET statement_timeout")
main:    -> 0.0003s
main: == 20230616082958 AddUniqueIndexForNpmPackagesOnProjectIdNameVersion: reverted (0.1461s) 

main: == [advisory_lock_connection] object_id: 228540, pg_backend_pid: 60291

How to set up and validate locally

No.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #390099 (closed)

Edited by Dzmitry (Dima) Meshcharakou

Merge request reports