Use pending_destruction status for Debian models linked to Object Storage
🎛️ Context
The Debian Repository handles packages (with package files) and some files that we can call auxiliary files or metadata files.
Both kinds of files are linked to one or more physical files in Object Storage.
🔥 Problem
We would like to apply the same approach when deleting an object that is linked to the package registry:
- Have a `status` field in those objects. Similar to this definition.
- When an object should be destroyed, it's actually updated to `pending_destruction`.
- Cleanup background jobs will take care of the actual destruction.
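The approach above can be sketched in plain Ruby, without Rails. This is only an illustration: `DebianFileRecord` and its methods are hypothetical names, and the in-memory status stands in for the real database column.

```ruby
# Hypothetical in-memory stand-in for a Debian record; not GitLab's actual model.
class DebianFileRecord
  attr_reader :name, :status

  def initialize(name)
    @name = name
    @status = :default
  end

  # "Destroying" only flips the status; the physical file is left untouched
  # until a cleanup job handles it.
  def mark_pending_destruction!
    @status = :pending_destruction
  end

  def pending_destruction?
    status == :pending_destruction
  end

  # Read paths skip anything already marked for destruction.
  def self.visible(records)
    records.reject(&:pending_destruction?)
  end

  # Cleanup paths look only at marked records.
  def self.pending_destruction(records)
    records.select(&:pending_destruction?)
  end
end
```

With this shape, the user-facing delete returns quickly because it only changes a status, and the slow Object Storage deletion happens later in the background.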
🚒 Solution
- Update all logic that destroys those objects so that the objects are updated to `pending_destruction`.
- Update all logic that reads those objects so that objects in `pending_destruction` are ignored.
- Update the cleanup package registry worker so that `pending_destruction` objects are detected and the proper job (see next point) is enqueued.
- Have a limited capacity job that will process all `pending_destruction` objects and destroy them, one by one.
  - ⚠️ We have 2 object kinds and those can be defined at 2 different levels (project / group) = we have 4 tables that could contain those `pending_destruction` objects.
  - Even if it's more classes and work, I think it's worth having 4 different jobs where each one works on its own table.
  - There is a concern that centralizes most of the logic of those jobs: https://gitlab.com/gitlab-org/gitlab/-/blob/6327c2a50ec0d8cea92c70d30cb75a0771e35f3a/app/workers/concerns/packages/cleanup_artifact_worker.rb
  - Implementing this concern reduces the number of lines of code in each job. Example
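As a rough sketch of the "4 jobs sharing one concern" idea, the plain Ruby below uses a module for the shared logic and in-memory arrays in place of the 4 tables. All names here (`PendingDestructionCleanup`, `TABLES`, the `KindA`/`KindB` job classes, `CleanupDispatcher`) are hypothetical. This is not the API of the linked concern, and the capacity limit itself is not modeled.

```ruby
# Hypothetical shared logic, playing the role of the worker concern.
module PendingDestructionCleanup
  # Destroys at most one pending_destruction row per run.
  # Returns the destroyed row, or nil when nothing is pending.
  def perform_work
    index = table.index { |row| row[:status] == :pending_destruction }
    return nil if index.nil?

    table.delete_at(index)
  end

  # Tells the caller whether another run would have work to do.
  def remaining_work?
    table.any? { |row| row[:status] == :pending_destruction }
  end
end

# In-memory arrays standing in for the 4 tables (2 object kinds x 2 levels).
TABLES = {
  project_kind_a: [],
  project_kind_b: [],
  group_kind_a: [],
  group_kind_b: []
}

# Each job only declares which table it owns; the module does the rest.
class ProjectKindACleanupJob
  include PendingDestructionCleanup

  private

  def table
    TABLES[:project_kind_a]
  end
end

class ProjectKindBCleanupJob
  include PendingDestructionCleanup

  private

  def table
    TABLES[:project_kind_b]
  end
end

class GroupKindACleanupJob
  include PendingDestructionCleanup

  private

  def table
    TABLES[:group_kind_a]
  end
end

class GroupKindBCleanupJob
  include PendingDestructionCleanup

  private

  def table
    TABLES[:group_kind_b]
  end
end

# Hypothetical stand-in for the cleanup package registry worker:
# it detects which jobs have pending rows and would enqueue only those.
module CleanupDispatcher
  JOBS = [
    ProjectKindACleanupJob, ProjectKindBCleanupJob,
    GroupKindACleanupJob, GroupKindBCleanupJob
  ].freeze

  def self.jobs_to_enqueue
    JOBS.select { |job| job.new.remaining_work? }
  end
end
```

Keeping one job per table means each class stays a few lines long, and a backlog in one table does not hold up cleanup in the other three.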
Given the amount of work here, I would suggest splitting this into multiple MRs:
- MR1: database changes (column to add) + perhaps the first useful scopes.
- MR2: update all logic that reads and destroys those objects so that reads properly exclude `pending_destruction` objects and destroys update objects to `pending_destruction`.
  - If this becomes too big, split it further into 2 MRs.
- MR3: the 4 background jobs.
- MR4: the updates on the cleanup package registry worker.