Parallelize LooseForeignKeys::CleanupWorker

This is one way we may be able to improve the performance of cleaning up loose foreign keys if we find ourselves reaching the MAX_RUNTIME=30s limit too often. It is an alternative to the approach described in #343550 (closed).

The idea is to shard the LooseForeignKeys::CleanupWorker by parent_table so that every minute we run several of these workers in parallel, each processing a different set of parent tables. This is safe because the parent tables are all independent of each other. We could shard by taking a checksum of the table name and splitting the tables into N groups for N workers. Since we have ~60 parent tables, we should be able to split easily into N=10 workers and get a substantial reduction in the total time of all workers.
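A minimal sketch of the checksum-based split, to make the idea concrete. Everything here is an assumption for illustration: the helper names `shard_for` and `tables_for_shard` are hypothetical, CRC32 is just one possible checksum, and the table list is a made-up sample rather than the real list of parent tables.

```ruby
require 'zlib'

# Hypothetical helper: map a parent table name to a shard index in 0...shard_count.
# CRC32 is an assumed choice of checksum; any stable hash would work, as long as
# every worker computes the same value for the same table name.
def shard_for(table_name, shard_count)
  Zlib.crc32(table_name) % shard_count
end

# Hypothetical helper: the subset of parent tables that a given shard should process.
def tables_for_shard(parent_tables, shard_index, shard_count)
  parent_tables.select { |table| shard_for(table, shard_count) == shard_index }
end

# Illustrative sample only, not the actual list of parent tables.
parent_tables = %w[projects namespaces users ci_pipelines ci_builds merge_requests]
shard_count = 3

# Each entry is the list of tables that one worker would be responsible for.
shards = (0...shard_count).map do |shard_index|
  tables_for_shard(parent_tables, shard_index, shard_count)
end

shards.each_with_index do |tables, index|
  puts "shard #{index}: #{tables.join(', ')}"
end
```

Because the shard index depends only on the table name and N, each worker could derive its own table list without any coordination. One trade-off to keep in mind is that a checksum does not balance load: the groups can be uneven in size, and a single busy parent table still lands entirely on one worker.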

We may only need to invest time in this if we see scaling problems in:

  1. https://dashboards.gitlab.net/d/sidekiq-loose-foreign-keys/sidekiq-loose-foreign-keys-processing?orgId=1&from=now-24h&to=now
  2. https://log.gprd.gitlab.net/app/lens#/edit/0281c170-6eca-11ec-9a7e-a5446680816e?_g=h@13ca26d