
Reduce cleanup package registry worker cadence to each hour

Context

Destroying objects in the GitLab package registry is not trivial because it involves physical files hosted in object storage.

More details in https://gitlab.com/gitlab-org/gitlab/-/issues/348166.

To mitigate that, we introduced delayed destruction. Objects in the package registry are:

  1. marked as pending_destruction.
  2. actually destroyed.

(1.) can be executed from different locations such as web requests or background jobs. (2.) is then done by a dedicated background job that will destroy objects slowly, one by one.

(2.) is backed by a limited capacity job. As such, the job will check the backlog of objects pending_destruction, take the first one and destroy it. At the end of the execution, the job will re-enqueue itself if the backlog is not empty.

In short, (2.) is handled by a background job that loops nonstop until the backlog is processed.
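The two-step flow above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `PackageFile`, `DestructionLoop` and their methods are hypothetical names, not GitLab's actual classes, and a plain Ruby loop stands in for the job re-enqueueing itself.

```ruby
# Hypothetical in-memory model; not GitLab's actual classes.
PackageFile = Struct.new(:name, :status)

class DestructionLoop
  attr_reader :runs

  def initialize(objects)
    @objects = objects
    @runs = 0
  end

  # Step (1.): only flag the object; cheap enough for a web request.
  def mark_pending_destruction(name)
    object = @objects.find { |o| o.name == name }
    object.status = :pending_destruction if object
  end

  def backlog
    @objects.select { |o| o.status == :pending_destruction }
  end

  # Step (2.): one execution destroys at most one object.
  # Returns true when the job should re-enqueue itself.
  def perform_work
    @runs += 1
    target = backlog.first
    @objects.delete(target) if target
    !backlog.empty?
  end

  # Stand-in for re-enqueueing: keep running until the backlog is empty.
  def run_until_empty
    loop { break unless perform_work }
  end
end

objects = %w[a b c d].map { |n| PackageFile.new(n, :default) }
registry = DestructionLoop.new(objects)
registry.mark_pending_destruction("a")
registry.mark_pending_destruction("c")
registry.run_until_empty

puts objects.map(&:name).inspect # remaining objects
puts registry.runs               # number of job executions
```

Each execution removes a single object, so the number of executions grows with the backlog size; that is the "slowly, one by one" behavior described above.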

That's nice but how do we kickstart the loop?

For that, we use a cron job. Its main goal is to check whether there is something in the backlog. If there is, it enqueues the limited capacity job (i.e. it starts the loop).
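A minimal sketch of that kickstart check, assuming a plain array as the backlog and another array as the job queue. `CleanupCron` and `:destruction_job` are hypothetical names used only for illustration, not the real worker code.

```ruby
# Hypothetical stand-ins: the backlog and the queue are plain arrays.
class CleanupCron
  def initialize(backlog:, queue:)
    @backlog = backlog
    @queue = queue
  end

  # Runs on each cron tick: enqueue the loop job only if there is work.
  def perform
    return false if @backlog.empty?

    @queue << :destruction_job
    true
  end
end

queue = []
CleanupCron.new(backlog: [], queue: queue).perform          # nothing to do
CleanupCron.new(backlog: [:pkg_file], queue: queue).perform # starts the loop
puts queue.inspect
```

Because the check does so little work on each tick, running it more often costs very little.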

The problem is that this cron job runs every 12.hours, so the delay between (1.) and (2.) can be up to 12.hours.

Given that this cron job is very lightweight, as seen on this chart, we can reduce this cadence to every 1.hour.

Why such a reduction? Well, when an object is actually destroyed in the package registry, the project statistics will be updated (usage quota). By using a shorter cadence, that statistics update will be carried out closer to (1.).

See #379740 (closed).

🔬 What does this MR do and why?

  • Update the cron expression of Packages::CleanupPackageRegistryWorker to 20 * * * *, that is, minute 20 of every hour (see here for what that means)
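To make the new schedule concrete, here is a small sketch that computes the next run for `20 * * * *` and the resulting wait after an object is marked. `next_run` is a hypothetical helper written for this illustration, and it assumes the schedule is evaluated in UTC.

```ruby
# Next run time for the cron expression "20 * * * *" (minute 20 of every hour).
# Assumption: the schedule is evaluated in UTC.
def next_run(now)
  # Minute 20 of the current hour; move to the next hour if already passed.
  candidate = Time.utc(now.year, now.month, now.day, now.hour, 20, 0)
  candidate += 3600 if candidate <= now
  candidate
end

# Worst case: an object is marked just after the cron job ran.
marked_at = Time.utc(2022, 11, 2, 10, 20, 1)
run_at = next_run(marked_at)
wait_seconds = (run_at - marked_at).to_i

puts run_at
puts wait_seconds
```

With this expression the wait between (1.) and the start of the loop stays under one hour, instead of up to 12 hours with the previous cadence.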

📺 Screenshots or screen recordings

None

How to set up and validate locally

I'm not sure that we have a way to test this other than:

  1. Start the GitLab background jobs.
  2. Check that each hour (on minute 20) a log entry like this one appears:
{"severity":"INFO","time":"2022-11-02T10:20:07.888Z","retry":0,"queue":"default","backtrace":true,"version":0,"queue_namespace":"cronjob","args":[],"class":"Packages::CleanupPackageRegistryWorker","jid":"1c97a11a7bdde31bda1a5dbf","created_at":"2022-11-02T10:20:06.519Z","meta.caller_id":"Cronjob","correlation_id":"0e3560d1379a51f1736c52dd596e110d","meta.root_caller_id":"Cronjob","meta.feature_category":"package_registry","worker_data_consistency":"always","idempotency_key":"resque:gitlab:duplicate:default:31163b2078644913f89feb2245c5fa62ac9118d29e04cf14ce0b5def57bba8ea","size_limiter":"validated","enqueued_at":"2022-11-02T10:20:06.521Z","job_size_bytes":2,"pid":75760,"message":"Packages::CleanupPackageRegistryWorker JID-1c97a11a7bdde31bda1a5dbf: start","job_status":"start","scheduling_latency_s":1.366737}
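If checking the log by eye is tedious, a small script along these lines could help. It is only a sketch: `cleanup_cron_start?` is a hypothetical helper, and the sample line is a shortened copy of the entry above that keeps only the three fields being checked.

```ruby
require "json"
require "time"

# Returns true if the JSON log line is a "start" event of the cleanup
# cron worker whose job was created on minute 20.
def cleanup_cron_start?(line)
  entry = JSON.parse(line)
  entry["class"] == "Packages::CleanupPackageRegistryWorker" &&
    entry["job_status"] == "start" &&
    Time.iso8601(entry["created_at"]).min == 20
end

# Shortened copy of the log entry shown above.
line = '{"class":"Packages::CleanupPackageRegistryWorker",' \
       '"created_at":"2022-11-02T10:20:06.519Z","job_status":"start"}'

puts cleanup_cron_start?(line)
```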

🚥 MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by David Fernandez
