Add bulk destroy mutation for Packages

David Fernandez requested to merge 342437-graphql-destroy-packages into master

🌱 Context

With object storage usage being a main subject in the ~"group::package", we are expanding how users can use the UI to quickly clean up packages from their Package Registry. In today's MR, we have &8022 (closed) in mind: let users delete objects in bulk.

In short, we want users to be able to select a set of packages and delete all of them by clicking a single button.

Given that the frontend is mainly powered by GraphQL, this MR will focus on adding a GraphQL mutation to delete packages in bulk. We will also need a service that will do that for us. This covers issues #342437 (closed) and #361819 (closed).

Deleting packages

A word on how packages are deleted. We don't use package.destroy directly. Long story short, we can't do that because a package is linked to n package_files and each package file is actually a physical file on object storage. As such, deleting a package could snowball into a large number of DELETE requests to object storage, and those could take time.

So, instead of destroying a package, we simply mark it as pending_destruction. For that, we use the status attribute.

We do have background jobs that will hunt down pending_destruction objects and destroy them one by one.

In summary, the service we're introducing here will not really destroy the packages but instead, update their status to pending_destruction.
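The idea above can be illustrated with a minimal plain-Ruby sketch. It is not the code of this MR: the `Package` struct, `mark_pending_destruction` and `cleanup_one` are hypothetical stand-ins for the real model, service and background job.

```ruby
# Hypothetical stand-in for a package record (the real model is an ActiveRecord class).
Package = Struct.new(:id, :status, :files)

# Fast path: only flips the status, no object storage request is made.
def mark_pending_destruction(packages)
  packages.each { |package| package.status = :pending_destruction }
end

# Simulates one run of a background job: destroys a single pending package.
# Returns true if a package was destroyed, false if nothing was pending.
def cleanup_one(packages)
  target = packages.find { |package| package.status == :pending_destruction }
  return false unless target

  target.files.clear # stands in for the slow DELETE requests to object storage
  packages.delete(target)
  true
end

packages = [
  Package.new(1, :default, %w[a.tgz b.tgz]),
  Package.new(2, :default, %w[c.tgz])
]
mark_pending_destruction(packages)
puts packages.map(&:status).uniq.inspect
```

The point of the split is that the user-facing request stays cheap, while the expensive work is spread over later job runs.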

🔮 Follow ups for the service

This MR will use hard limits to cater to the frontend's needs. Given that users browse packages in the UI in pages of 20, we're going to accept a maximum of 20 packages in the GraphQL mutation. In turn, this means that the service itself will receive a set of at most 20 packages.

Having said that, we'd like to implement the service as one capable of handling a larger scale. That's because of cleanup policies.

Cleanup policies are basically automated package destructions. Users input what they want to keep and everything else is destroyed. Cleanup policies work with rules. Currently, we only have one rule that works on package files. Within the next 2-3 milestones, we have a high probability of working on rules that work on packages. As such, executing those rules will give us a set of packages to destroy. Since cleanup policies work on the whole project, that set could be large. To delete those packages, we're going to re-use the service introduced here.

Having said that, we don't want to get ahead of ourselves so we're going to limit this "scale handling" aspect to: use a batched loop on packages in the service instead of working on the set received directly.
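As a rough illustration of the batched loop, here is a plain-Ruby sketch using `each_slice`. It is an assumption-laden simplification: the real service works on database records, while this example uses an array of hashes, and `mark_in_batches` is a hypothetical name. The batch size of 500 is the one mentioned in this MR.

```ruby
BATCH_SIZE = 500 # same batch size as the similar package files service

# Hypothetical helper: marks packages batch by batch instead of in one pass.
# Returns the number of batches processed.
def mark_in_batches(packages, batch_size: BATCH_SIZE)
  batches = 0
  packages.each_slice(batch_size) do |batch|
    # In a real service, this would be one bulk update per batch.
    batch.each { |package| package[:status] = "pending_destruction" }
    batches += 1
  end
  batches
end

small_set = Array.new(20) { |i| { id: i, status: "default" } }
large_set = Array.new(1200) { |i| { id: i, status: "default" } }

puts mark_in_batches(small_set) # the 20 packages sent by the mutation fit in one batch
puts mark_in_batches(large_set) # a larger set is split into several batches
```

With this shape, the caller does not need to care about the size of the set: the 20 packages coming from the mutation and a much larger set go through the same code path.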

🔬 What does this MR do and why?

  • Adds a new GraphQL mutation: delete packages in bulk.
    • Packages can be deleted from a project or a group, so the mutation simply accepts an array of package gids (no project path or group path).
    • We will not allow a multi mutation request.
  • Adds a service to delete packages in bulk.
    • Authorization is performed there.
    • Uses a batched loop to handle scale. We have a similar service for package files and there we used a batch size of 500. We use the same here but again, this MR will call the service with max 20 packages.
  • Updates the related specs.
  • Updates the related documentation.
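The input constraints of the mutation (an array of package gids, 20 at most) can be sketched in plain Ruby as follows. This is only an illustration: `extract_package_ids`, `MAX_PACKAGES` and `GID_PATTERN` are hypothetical names, and the actual mutation relies on the GraphQL layer for gid handling rather than on a regular expression.

```ruby
MAX_PACKAGES = 20 # limit stated in this MR, matching the UI page size

# Matches gids of the form used in the validation steps of this MR.
GID_PATTERN = %r{\Agid://gitlab/Packages::Package/(\d+)\z}

# Hypothetical validator: returns the numeric package ids,
# or raises ArgumentError when the input breaks a constraint.
def extract_package_ids(gids)
  raise ArgumentError, "at most #{MAX_PACKAGES} packages allowed" if gids.size > MAX_PACKAGES

  gids.map do |gid|
    match = GID_PATTERN.match(gid)
    raise ArgumentError, "invalid package gid: #{gid}" unless match

    match[1].to_i
  end
end

p extract_package_ids(["gid://gitlab/Packages::Package/1", "gid://gitlab/Packages::Package/2"])
```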

🖥 Screenshots or screen recordings

Destroying 2 packages:

Screenshot_2022-10-09_at_22.54.22

Trying to do a multi mutation:

Screenshot_2022-10-09_at_22.55.43

How to set up and validate locally

  1. Have GDK ready with a project.

  2. Let's create packages on gitlab-org/gitlab-test. In a rails console:

    def fixture_file_upload(*args, **kwargs)
      Rack::Test::UploadedFile.new(*args, **kwargs)
    end
    20.times { FactoryBot.create(:npm_package, project: Project.first) }
  3. Get the package ids:

    Packages::Package.all.last(20).map(&:id)
  4. Browse http://gdk.test:8000/-/graphql-explorer and let's try to delete a few packages:

    mutation {
      destroyPackages(input: { ids: ["gid://gitlab/Packages::Package/<package1_id>", "gid://gitlab/Packages::Package/<package2_id>"]}) {
        errors
      }
    }
  5. Check the packages status with the rails console:

    Packages::Package.id_in([<package1_id>, <package2_id>]).map(&:status).uniq
    # => ["pending_destruction"]

You can try:

  • a multi mutation request.
  • sending more than 20 gids to the mutation.
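To try the second case without typing 21 gids by hand, a small Ruby helper can build the mutation document to paste into the GraphQL explorer. This is a convenience sketch only: `destroy_packages_mutation` is a hypothetical helper, and the ids `1..21` are placeholders to replace with real package ids from your GDK.

```ruby
# Builds a destroyPackages mutation document for the given package ids.
def destroy_packages_mutation(package_ids)
  gids = package_ids.map { |id| %("gid://gitlab/Packages::Package/#{id}") }.join(", ")

  <<~GRAPHQL
    mutation {
      destroyPackages(input: { ids: [#{gids}] }) {
        errors
      }
    }
  GRAPHQL
end

# Placeholder ids: 21 of them, one more than the accepted maximum.
puts destroy_packages_mutation((1..21).to_a)
```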

🚥 MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

💾 Database review

For the database review, I used public projects/packages from https://gitlab.com/issue-reproduce.

Please note that even though the service class is implemented to handle large sets of packages, the database review was done strictly under the scope of this MR. As such, in this MR, the service will be called with a set of 20 packages max (see above).

🙏 Thanks

This MR was started during a 🍐 programming session that we did with @rchanila and @dmeshcharakou. As such, they have been credited as co-authors on the commit of this MR.

🍿 Session recording.

Many thanks!

Edited by David Fernandez
