GitLab Migration: throttle the number of concurrently importing entities

Problem statement

GitLab Migration by direct transfer (aka BulkImports) enqueues a lot of background processing jobs in order to import groups & projects. Each entity (which can be either a group or a project) enqueues 10-30 jobs to perform its import.

If there's a surge in import requests, a lot of jobs can pile up in the imports shard queue, which can trigger Infra alerts.

One of the suggestions for reducing the number of enqueued jobs, as well as spreading the load a bit, is to throttle the number of concurrently running entities per import.

This is something that we used to do, but it was removed in !84208 (merged) in favour of letting Sidekiq manage that without extra functionality of our own.

Proposed solution

Perhaps we should put throttling back in order to reduce the number of enqueued jobs. Since the imports shard has limited concurrency to begin with, I imagine this should not affect performance much? It'd be good to provide some numbers to back it up.
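
To make the idea concrete, here is a minimal sketch of per-import throttling, written in Python purely for illustration. It is not the code removed in !84208 and not how BulkImports is implemented; `ImportThrottle`, `limit`, and `MAX_JOBS_PER_ENTITY` are hypothetical names, and the value 30 is just the upper end of the 10-30 jobs per entity mentioned above. The point it shows: with a limit of N running entities, the jobs belonging to in-progress entities are bounded by roughly N × 30 per import, regardless of how many entities the import contains.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumption: upper end of the "10-30 jobs per entity" range from this issue.
MAX_JOBS_PER_ENTITY = 30


@dataclass
class ImportThrottle:
    """Hypothetical per-import throttle: at most `limit` entities run at once."""

    limit: int
    pending: deque = field(default_factory=deque)
    running: set = field(default_factory=set)

    def add(self, entity_id):
        # Entities wait here instead of enqueueing their jobs immediately.
        self.pending.append(entity_id)

    def start_next(self):
        # Promote pending entities until the limit is reached;
        # return the ids that were just started.
        started = []
        while self.pending and len(self.running) < self.limit:
            entity_id = self.pending.popleft()
            self.running.add(entity_id)
            started.append(entity_id)
        return started

    def finish(self, entity_id):
        # A finished entity frees a slot, so try to start another one.
        self.running.discard(entity_id)
        return self.start_next()

    def max_jobs_in_flight(self):
        # Worst-case number of jobs attributable to running entities.
        return len(self.running) * MAX_JOBS_PER_ENTITY


# Usage: an import with 100 entities and a limit of 5.
throttle = ImportThrottle(limit=5)
for entity_id in range(100):
    throttle.add(entity_id)

first_batch = throttle.start_next()   # entities 0..4 start
next_batch = throttle.finish(0)       # entity 0 finishes, entity 5 starts
```

Without the throttle, the same import could have up to 100 × 30 = 3000 jobs attributable to it at once; with a limit of 5, the bound in this sketch is 150. Whether that costs noticeable import time depends on the shard's actual concurrency, which is the number that still needs measuring.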