Skip to content

Properly exclude pending_destruction packages when creating one

👻 Context

In changes for https://gitlab.com/gitlab-org/gitlab/-/issues/348166, we added a new status for packages: pending_destruction. This status is our way to mark packages as logically deleted. This can be called a soft delete.

At that time, we thought that the existing scopes would automatically remove pending_destruction packages from the UI and APIs.

The above is true for reading the state of the Package Registry or pulling operations but unfortunately we didn't think about other locations that access packages without any scope:

  • database unique indexes
  • rails validations
  • check for existences in services, finders or workers

Because of that, we introduced a pretty deep UX typebug: republishing a just deleted package will not make it available within the Package Registry. That is because, the pending_destruction package is still taken into account and so, new files are appended to that package but that package is not available in the UI nor it is available for pulling. This is exactly issue #351795 (closed).

Even worse, we have a background job that is our packages vacuum cleaner: it will collect all pending_destruction packages and completely destroy them. With the scenario above, the file pushed on the second upload will be removed.

pending_destruction packages must be ghosts within the Package Registry 👻 : no operation (push or pull) should be impacted by a single pending_destruction package.

This MR attempts to exclude those packages from indexes, validations and checks for existence.

🔬 What does this MR do and why?

  • Replace existing UNIQUE indexes with UNIQUE indexes that exclude pending_destruction packages
  • Update all validation functions in app/models/packages/package.rb to exclude pending_destruction packages
  • Update all services, finders, workers to not take pending_destruction packages in their checks for existence or functions that locate existing package if any.
    • We decided to not use any existing scope as those scopes are selecting specific status values and this MR aims to simply exclude pending_destruction. Using a scope instead of .not_pending_destruction is left as a follow up.

🖥 Screenshots or screen recordings

See next section

How to set up and validate locally

The main scenario to test this is always the same:

  1. Create a package with the considered format.
  2. Open the package lists UI and remove the package.
  3. Create the exact same package with exact same name and exact same version.

To help with steps (1.) and (3.), we're going to use https://gitlab.com/10io/gl_pru. All the $ bundle exec commands we see below are executed within that project.

We have a few formats to go through. Package formats in alpha are skipped here.

Make sure to disable background jobs to make sure that pending_destruction packages are left behind.

Maven

  1. $ bundle exec thor package:push --package-type=maven --user=root --token="<pat>" --url=<base_url>/api/v4/projects/<project_id>/packages/maven --name=my-bananas --version=1.2.3
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

NPM

  1. $ bundle exec thor package:push --package-type=npm --user=root --token="<pat>" --url=<base_url>/api/v4/projects/<project_id>/packages/npm/ --name=bananas_npm --version=1.2.13
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Nuget

  1. Enable background jobs
  2. $ bundle exec thor package:push --package-type=nuget --user=root --token="<pat>" --url=<base_url>/api/v4/projects/<project_id>/packages/nuget/index.json --name=bananas_nuget --version=1.3.25
  3. Make sure that the package is in the UI
  4. Remove the package using the UI
  5. Repeat step (2.)
    • Push operation successful
    • Package shown in the UI
  6. Disable background jobs

PyPI

  1. $ bundle exec thor package:push --package-type=pypi --user=root --token="<pat>" --url=<base_url>/api/v4/projects/<project_id>/packages/pypi --name=bananas_pypi
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Generic Packages

  1. $ curl --header "PRIVATE-TOKEN: <pat>" --upload-file ./dummy.txt "<base_url>/api/v4/projects/<project_id>/packages/generic/bug/1.3.7/file.txt".
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Composer

Pushing a composer package is not done by a manager. See https://docs.gitlab.com/ee/user/packages/composer_repository/#publish-a-composer-package-by-using-the-api.

  1. $ curl --data branch=main "http://__token__:<path>@<base_url>/api/v4/projects/<project_id>/packages/composer"
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Conan

  1. $ bundle exec thor package:push --package-type=conan --user=root --token="<pat>" --url=<base_url>/api/v4/packages/conan --name=bananas_conan --version=1.3.7 --conan-project-path=<project_full_path with + instead of />
  2. Remove the package using the UI
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Helm

helm packages are not supported by gl_pru yet. Create a package following https://helm.sh/docs/intro/using_helm/#creating-your-own-charts.

  1. $ curl --request POST --form 'chart=@bananas-0.1.0.tgz' --user "root:<pat>" <base_url>/api/v4/projects/<project_id>/packages/helm/api/<channel>/charts
  2. Remove the package
  3. Repeat step (1.)
    • Push operation successful
    • Package shown in the UI

Conclusions

The packages registry behaved properly with all the tested package formats.

pending_destruction packages are properly totally ignored during push operations and so we can re-publish a package with same name+version than a package we just deleted.

💾 Database review

🔼 Migration up

== 20220203091304 FixUniquePackagesIndexExcludingPendingDestructionStatus: migrating 
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_golang", :where=>"package_type = 8 AND status != 4", :algorithm=>:concurrently})
   -> 0.0082s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- add_index(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_golang", :where=>"package_type = 8 AND status != 4", :algorithm=>:concurrently})
   -> 0.0052s
-- execute("RESET statement_timeout")
   -> 0.0008s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_generic", :where=>"package_type = 7 AND status != 4", :algorithm=>:concurrently})
   -> 0.0053s
-- add_index(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_generic", :where=>"package_type = 7 AND status != 4", :algorithm=>:concurrently})
   -> 0.0024s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_helm", :where=>"package_type = 11 AND status != 4", :algorithm=>:concurrently})
   -> 0.0051s
-- add_index(:packages_packages, [:project_id, :name, :version], {:unique=>true, :name=>"idx_packages_on_project_id_name_version_unique_when_helm", :where=>"package_type = 11 AND status != 4", :algorithm=>:concurrently})
   -> 0.0029s
-- transaction_open?()
   -> 0.0000s
-- indexes(:packages_packages)
   -> 0.0063s
-- remove_index(:packages_packages, {:algorithm=>:concurrently, :name=>"index_packages_on_project_id_name_version_unique_when_golang"})
   -> 0.0025s
-- transaction_open?()
   -> 0.0000s
-- indexes(:packages_packages)
   -> 0.0059s
-- remove_index(:packages_packages, {:algorithm=>:concurrently, :name=>"index_packages_on_project_id_name_version_unique_when_generic"})
   -> 0.0017s
-- transaction_open?()
   -> 0.0000s
-- indexes(:packages_packages)
   -> 0.0047s
-- remove_index(:packages_packages, {:algorithm=>:concurrently, :name=>"index_packages_on_project_id_name_version_unique_when_helm"})
   -> 0.0017s
== 20220203091304 FixUniquePackagesIndexExcludingPendingDestructionStatus: migrated (0.0625s) 

🔽 Migration down

== 20220203091304 FixUniquePackagesIndexExcludingPendingDestructionStatus: reverting 
== 20220203091304 FixUniquePackagesIndexExcludingPendingDestructionStatus: reverted (0.0000s) 

🍰 Queries

Since this change is done for all package formats, we have quite a few queries to check. Having said that, we can see several similarities such as:

  • Checking if a package with a given type+name+versions exists within a given project.
  • Checking if a package with a given type+name+versions exists within the root namespace of a given project.
  • Checking if a package with a given type+name exists within the whole instance.

Here are all the queries:

  1. model uniqueness validation
  2. composer uniqueness validation
  3. conan uniqueness validation
  4. debian uniqueness validation
  5. npm uniqueness validation at the root namespace
  6. locate existing debian package
  7. locate existing helm package
  8. locate existing npm package
  9. locate existing nuget package
  10. locate existing rubygem package
  11. locate existing terraform package within root namespace
  12. locate existing terraform package within project
  13. locate existing package withing project (used by several package formats)
  14. locate existing conan package

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #351795 (closed)

Edited by David Fernandez

Merge request reports