
Backfill LfsObjectsProject records for forks

What does this MR do?

Adds a background migration that links all LFS objects of a project. It is scheduled by a post-deploy migration that queries all forks with LFS enabled.

For each fork, the background migration fetches all LFS pointers via Gitaly to obtain the OIDs of the LFS objects that need to be linked.
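A rough sketch of the linking step, using plain Ruby structures in place of the real models and the Gitaly call. All names here (`LfsPointer`, `rows_to_link`, the hash and set arguments) are hypothetical and only illustrate the idea. Skipping already-linked objects is an assumption about how a backfill would behave, not something confirmed by this MR.

```ruby
require 'set'

# Hypothetical stand-in for an LFS pointer returned by Gitaly.
LfsPointer = Struct.new(:oid)

# Builds the rows that would link LFS objects to a project.
#
# pointers              - LFS pointers found in the fork's repository
# lfs_object_ids_by_oid - Hash mapping OID => id of an existing LFS object
# already_linked        - Set of LFS object ids already linked to the project
def rows_to_link(project_id, pointers, lfs_object_ids_by_oid, already_linked)
  rows = pointers.map(&:oid).uniq.map do |oid|
    lfs_object_id = lfs_object_ids_by_oid[oid]

    # Skip OIDs with no known LFS object and objects that are already linked
    # (assumed behavior).
    next if lfs_object_id.nil? || already_linked.include?(lfs_object_id)

    { project_id: project_id, lfs_object_id: lfs_object_id }
  end

  rows.compact
end

pointers = [LfsPointer.new('a'), LfsPointer.new('b'), LfsPointer.new('a')]
p rows_to_link(7, pointers, { 'a' => 1, 'b' => 2 }, Set[2])
```

In the real migration the resulting rows would be inserted into the LFS objects/projects join table; here they are only printed.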

Post migration query

Sample (has LIMIT because we're using each_batch):

SELECT projects.id
FROM projects
INNER JOIN fork_network_members ON fork_network_members.project_id = projects.id
WHERE (projects.lfs_enabled = TRUE OR projects.lfs_enabled IS NULL)
AND (fork_network_members.forked_from_project_id IS NOT NULL)
LIMIT 1000

Query plan: https://explain.depesz.com/s/KaAC

Background migration schedule

Background migration jobs will be enqueued with 1000 projects each. The delay for each job is computed with the formula:

index (zero based) * batch size * interval

Where:

  • index - the zero-based index of each batch.
  • batch size - the size of the batch divided by the concurrency rate (BATCH_SIZE is 1000 and CONCURRENCY is 4, so this is at most 250).
  • interval - 30 seconds, based on the 99th percentile latency of GetAllLfsPointers calls.
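The formula can be sketched in plain Ruby as follows. This is only an illustration: `BATCH_SIZE` and `CONCURRENCY` come from the description above, while `INTERVAL` and the helper `delay_for` are hypothetical names and not necessarily what the migration uses.

```ruby
BATCH_SIZE  = 1000 # projects per each_batch batch
CONCURRENCY = 4    # concurrency rate from the description
INTERVAL    = 30   # seconds; hypothetical constant name

# Hypothetical helper: delay in seconds for the batch at a zero-based index,
# i.e. index * (batch size / concurrency) * interval.
def delay_for(index)
  index * (BATCH_SIZE / CONCURRENCY) * INTERVAL
end

(0..2).each do |index|
  puts "batch #{index}: delay #{delay_for(index)}s"
end
```

With these values each step of the index adds 250 * 30 = 7,500 seconds, a little over 2 hours.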

The first 4 jobs are enqueued and worked on immediately, while the next 4 jobs are enqueued but only worked on after about 2 hours (250 * 30 seconds = 7,500 seconds). Each job processes its projects sequentially. This approach is based on the discussion in !24164 (comment 282672404).
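To make the resulting schedule concrete, here is a small sketch that prints the start delay of the first 8 jobs. It assumes one reading of the description: jobs are grouped in sets of CONCURRENCY, and every job in a group shares the same delay. The helper `job_delay` and the constant `INTERVAL` are hypothetical names.

```ruby
BATCH_SIZE  = 1000
CONCURRENCY = 4
INTERVAL    = 30 # seconds; hypothetical constant name

# Hypothetical helper: delay in seconds for a zero-based job number, assuming
# jobs are grouped in sets of CONCURRENCY that share one delay.
def job_delay(job_number)
  group_index = job_number / CONCURRENCY # integer division
  group_index * (BATCH_SIZE / CONCURRENCY) * INTERVAL
end

(0...8).each do |job|
  delay = job_delay(job)
  puts format('job %d: starts after %ds (%.2f hours)', job, delay, delay / 3600.0)
end
```

Under this assumption jobs 0 to 3 start immediately and jobs 4 to 7 start after 7,500 seconds, which matches the "about 2 hours" gap described above.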

I wanted to bulk enqueue smaller individual jobs (with #push_bulk) on different schedules instead of big jobs, but our version of Sidekiq doesn't support that (support was added in 6.0.1 and we're using 5.2.7). We could enqueue multiple jobs using #perform_in, but that would be N+1 requests to Redis, so I opted for this approach.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

#55487 (closed)

Edited by 🤖 GitLab Bot 🤖
