Optimize how RecordDataRepairDetailWorker fetches projects

Context

In Restore missing container repositories under ex... (#390842 - closed), we introduced a worker that checks for container repositories with tags that are missing between Rails and the Container Registry. The worker fetches a project that does not yet have a record in the container_registry_data_repair_details table and tries to obtain a lease on it before starting work. Candidate projects are selected with this scope:

scope :pending_data_repair_analysis, -> do
  left_outer_joins(:container_registry_data_repair_detail)
  .where(container_registry_data_repair_details: { project_id: nil })
end

In !125304 (merged), we revisited the worker and randomized how it fetches the next project. The worker currently runs with a concurrency of 4, so it is important that concurrent workers do not pick the same project, because a worker that cannot get a lock on its project fails.
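To illustrate why the pick order matters under concurrency, here is a minimal plain-Ruby sketch. It is not the worker's actual code: `LeaseStore`, `pick_and_lease`, and the in-memory lease set are hypothetical stand-ins for the real lease mechanism and database query.

```ruby
require 'set'

# Hypothetical in-memory stand-in for an exclusive lease store.
class LeaseStore
  def initialize
    @held = Set.new
    @mutex = Mutex.new
  end

  # Returns true if the lease was obtained, false if it is already held.
  def try_obtain(project_id)
    # Set#add? returns nil when the element is already present.
    @mutex.synchronize { !@held.add?(project_id).nil? }
  end
end

# Picks one candidate from the pending IDs using the given block and tries
# to lease it. Returns the project ID on success, nil otherwise.
def pick_and_lease(pending_ids, leases)
  candidate = yield(pending_ids)
  return nil if candidate.nil?

  leases.try_obtain(candidate) ? candidate : nil
end

pending = [1, 2, 3, 4, 5]

# Deterministic pick: every worker targets the same project, so only the
# first one gets the lease.
leases = LeaseStore.new
same_pick = 4.times.map { pick_and_lease(pending, leases) { |ids| ids.first } }
# same_pick => [1, nil, nil, nil]

# Randomized pick: workers are likely (not guaranteed) to target
# different projects.
leases = LeaseStore.new
random_pick = 4.times.map { pick_and_lease(pending, leases) { |ids| ids.sample } }
```

Randomization only lowers the chance of a collision. Two workers can still pick the same project, which is one reason further optimization was discussed.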

When we added the randomization, there was also a discussion on how to better optimize the worker.

🙌 Suggestions given on how to optimize

  1. Use of Slow Iteration (!125304 (comment 1459987925))

  2. Use of Background Migration to create placeholder container_registry_data_repair_details records and a worker to process them (!125304 (comment 1462821980))

Implementation Plan

As a first step in further optimizing the worker, we will go with the Slow Iteration approach (no. 1 in the suggestions above). More information can be found here: !125304 (comment 1459987925)

Reference: https://docs.gitlab.com/ee/development/database/iterating_tables_in_batches.html#slow-iteration
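The general idea behind Slow Iteration can be sketched in plain Ruby as below. This is an illustration under stated assumptions, not the implementation proposed in the linked comment: `CursorStore` and `next_batch` are hypothetical names, the cursor lives in memory here (a real worker would need to persist it between runs), and the array of IDs stands in for a database query ordered by ID.

```ruby
# Hypothetical in-memory cursor; a real worker would persist this value
# somewhere that survives between invocations.
class CursorStore
  attr_accessor :last_id

  def initialize
    @last_id = 0
  end
end

# Returns the next batch of IDs greater than the stored cursor, in
# ascending order, and advances the cursor to the last ID returned.
# When no IDs remain, the cursor is reset to 0 so that the next call
# starts over from the beginning.
def next_batch(all_ids, cursor, batch_size:)
  batch = all_ids.select { |id| id > cursor.last_id }.sort.first(batch_size)
  cursor.last_id = batch.empty? ? 0 : batch.last
  batch
end

cursor = CursorStore.new
project_ids = [1, 2, 5, 8, 9]

next_batch(project_ids, cursor, batch_size: 2) # => [1, 2]
next_batch(project_ids, cursor, batch_size: 2) # => [5, 8]
next_batch(project_ids, cursor, batch_size: 2) # => [9]
next_batch(project_ids, cursor, batch_size: 2) # => [] (cursor reset)
```

Because each invocation continues from the stored cursor instead of rescanning from the start, the work per run stays small and bounded.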

Edited by Adie (she/her)