Add bulk destroy mutation for Packages
## 🌱 Context

With object storage usage being a key focus for ~"group::package", we are expanding how users can use the UI to quickly clean up packages from their Package Registry. This MR has &8022 (closed) in mind: let users delete objects in bulk.

In short, we want users to be able to select a set of packages and delete all of them with a single button click.

Given that the frontend is mainly powered by GraphQL, this MR focuses on adding a GraphQL mutation to delete packages in bulk, along with a service that does the actual work. This addresses issues #342437 (closed) and #361819 (closed).
## ♻ Deleting packages

A quick word on how packages are deleted. We don't call `package.destroy` directly. Long story short, we can't do that because a package is linked to `n` `package_files`, and each package file is an actual physical file in object storage. Deleting a package could therefore snowball into a large number of `DELETE` requests to object storage, and those can take time.

So, instead of destroying a package, we simply mark it as `pending_destruction` using its `status` attribute. Background jobs then hunt down `pending_destruction` objects and destroy them one by one.

In summary, the service introduced here will not actually destroy the packages; it will update their `status` to `pending_destruction`.
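As a plain-Ruby sketch of this idea (using a hypothetical `Package` struct as a stand-in for the real `Packages::Package` model), the "destroy" operation is nothing more than a status flip:

```ruby
# Hypothetical stand-in for the Packages::Package model (illustration only).
Package = Struct.new(:id, :status)

# Instead of destroying the records (and their physical files), we only flip
# the status; a background worker later picks up pending_destruction objects
# and deletes them one by one.
def mark_pending_destruction(packages)
  packages.each { |pkg| pkg.status = :pending_destruction }
end

packages = [Package.new(1, :default), Package.new(2, :default)]
mark_pending_destruction(packages)
packages.map(&:status) # => [:pending_destruction, :pending_destruction]
```

The records stay in the database until the background jobs get to them, which keeps the user-facing request fast.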
## 🔮 Follow-ups for the service

This MR uses hard limits to cater to the frontend's needs. Given that users browse packages in the UI in pages of `20`, we're going to accept a maximum of `20` packages in the GraphQL mutation. This means the service itself will receive a set of at most `20` packages.

Having said that, we'd like to implement the service so that it can handle a larger scale. That's because of cleanup policies.

Cleanup policies are basically automated package destructions: users specify what they want to keep, and everything else is destroyed. Cleanup policies work with rules. Currently, we only have one rule, which operates on package files. Within the next 2-3 milestones, we will very likely work on rules that operate on packages. Executing those rules will give us a set of packages to destroy, and since cleanup policies work on the whole project, that set could be large. To delete those packages, we're going to reuse the service introduced here.

That said, we don't want to get ahead of ourselves, so we're limiting this "scale handling" aspect to one thing: using a batched loop over packages in the service instead of working on the received set directly.
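The batched loop can be sketched in plain Ruby like this (using `each_slice` as a stand-in for ActiveRecord's `each_batch`; the batch size and the inline "real service" comment are assumptions based on the description above):

```ruby
BATCH_SIZE = 500 # same batch size as the existing package files service

# Splits the incoming set into batches and processes each batch in one go,
# as the real service would with each_batch. Returns the batch count.
def process_in_batches(package_ids, batch_size: BATCH_SIZE)
  batches = 0
  package_ids.each_slice(batch_size) do |batch|
    # Real service (roughly): one UPDATE per batch, e.g.
    #   Packages::Package.id_in(batch).update_all(status: :pending_destruction)
    batches += 1
  end
  batches
end

process_in_batches((1..20).to_a)   # => 1  (the UI sends at most 20 ids today)
process_in_batches((1..1200).to_a) # => 3  (a future cleanup-policy-sized set)
```

With the UI's hard limit of `20`, the loop runs exactly once, so the batching adds no overhead today while leaving room for cleanup policies later.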
## 🔬 What does this MR do and why?

- Adds a new GraphQL mutation to delete packages in bulk.
  - Packages can be deleted from a project or a group: the mutation simply accepts an array of package `gids` (no project path or group path).
  - We will not allow a multi-mutation request.
- Adds a service to delete packages in bulk.
  - Authorization is performed there.
  - Uses a batched loop to handle scale. We have a similar service for package files that uses a batch size of `500`. We use the same here, but again, this MR calls the service with at most `20` packages.
- Updates the related specs.
- Updates the related documentation.
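The `20`-package cap can be illustrated with a small hypothetical pre-check (the real mutation class, method name, and error message are not shown in this MR description, so everything below is an assumption):

```ruby
MAX_PACKAGES = 20 # hard limit matching the UI page size

# Hypothetical guard mirroring the mutation's argument cap; the real
# GraphQL mutation likely surfaces this as a top-level error instead.
def validate_package_ids!(ids)
  if ids.size > MAX_PACKAGES
    raise ArgumentError, "cannot delete more than #{MAX_PACKAGES} packages at once"
  end

  ids
end

validate_package_ids!((1..20).to_a).size # => 20
```

Sending `21` or more `gids` would be rejected before the service is ever invoked.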
## 🖥 Screenshots or screen recordings
Destroying 2 packages:
Trying to do a multi mutation:
## ⚙ How to set up and validate locally

- Have GDK ready with a project.

- Let's create packages on `gitlab-org/gitlab-test`. In a rails console:

  ```ruby
  def fixture_file_upload(*args, **kwargs)
    Rack::Test::UploadedFile.new(*args, **kwargs)
  end

  20.times { |i| FactoryBot.create(:npm_package, project: Project.first) }
  ```

- Get the package ids:

  ```ruby
  Packages::Package.all.last(20).map(&:id)
  ```

- Browse http://gdk.test:8000/-/graphql-explorer and let's try to delete a few packages:

  ```graphql
  mutation {
    destroyPackages(input: { ids: ["gid://gitlab/Packages::Package/<package1_id>", "gid://gitlab/Packages::Package/<package2_id>"] }) {
      errors
    }
  }
  ```

- Check the packages' `status` in the rails console:

  ```ruby
  Packages::Package.id_in([<package1_id>, <package2_id>]).map(&:status).uniq
  # => ["pending_destruction"]
  ```
You can try:

- a multi-mutation request.
- sending more than `20` `gids` to the mutation.
## 🚥 MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

- I have evaluated the MR acceptance checklist for this MR.
## 💾 Database review

For the database review, I used public projects/packages from https://gitlab.com/issue-reproduce.

Please note that even though the service class is implemented to handle large sets of packages, the database review was done strictly within the scope of this MR. As such, in this MR, the service will be called with a set of at most `20` packages (see above).

- `.each_batch` lower and upper bounds.
- Loading the batch of packages, projects, and routes.
- Updating package statuses.
## 🙏 Thanks
This MR was started during a
Many thanks!