Sharding key for ci_job_artifact_states

Problem

#550697 (comment 2708317606)

If the table is large, open an issue to add a "sharding key" anyway, to support performant deletions. This does not block replication work, only deletion on the legacy cell, so it can be delayed until we start moving real orgs.

ci_job_artifact_states is a large table. There is a chance that deletion of an organization's ci_job_artifact_states will be unacceptably slow.
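
To illustrate why a sharding key could matter here, below is a minimal sketch using an in-memory SQLite database. It is not GitLab code: the schema is heavily simplified, the `project_id` column on `ci_job_artifact_states` is a hypothetical sharding key, and the helper names (`build_demo_db`, `delete_via_join`, `delete_via_sharding_key`) are invented for illustration. The real tables live in PostgreSQL and should be measured there.

```python
import sqlite3


def build_demo_db():
    """Create a toy schema loosely modelled on the tables discussed above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE projects (
            id INTEGER PRIMARY KEY,
            organization_id INTEGER NOT NULL
        );
        CREATE TABLE ci_job_artifacts (
            id INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL
        );
        CREATE TABLE ci_job_artifact_states (
            id INTEGER PRIMARY KEY,
            job_artifact_id INTEGER NOT NULL,
            project_id INTEGER  -- hypothetical sharding key (assumption)
        );
        CREATE INDEX idx_states_project_id
            ON ci_job_artifact_states (project_id);
        """
    )
    conn.executemany("INSERT INTO projects VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20)])
    conn.executemany("INSERT INTO ci_job_artifacts VALUES (?, ?)",
                     [(100, 1), (101, 2), (102, 3)])
    conn.executemany(
        "INSERT INTO ci_job_artifact_states (id, job_artifact_id) VALUES (?, ?)",
        [(1, 100), (2, 101), (3, 102)],
    )
    # Backfill the hypothetical sharding key from the parent table.
    conn.execute(
        """
        UPDATE ci_job_artifact_states
        SET project_id = (
            SELECT a.project_id FROM ci_job_artifacts a
            WHERE a.id = ci_job_artifact_states.job_artifact_id
        )
        """
    )
    return conn


def delete_via_join(conn, organization_id):
    """Without a sharding key: reach the organization through two parent tables."""
    cur = conn.execute(
        """
        DELETE FROM ci_job_artifact_states
        WHERE job_artifact_id IN (
            SELECT a.id
            FROM ci_job_artifacts a
            JOIN projects p ON p.id = a.project_id
            WHERE p.organization_id = ?
        )
        """,
        (organization_id,),
    )
    return cur.rowcount


def delete_via_sharding_key(conn, organization_id):
    """With a sharding key: filter on an indexed column of the table itself."""
    cur = conn.execute(
        """
        DELETE FROM ci_job_artifact_states
        WHERE project_id IN (
            SELECT id FROM projects WHERE organization_id = ?
        )
        """,
        (organization_id,),
    )
    return cur.rowcount
```

Both functions delete the same rows. The difference is that the second one never touches `ci_job_artifacts`, which is the kind of saving a sharding key is meant to provide on a large table. Whether the join-based deletion is actually too slow in production is exactly what the proposal below sets out to confirm.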

Proposal

  1. Confirm that deleting an organization's data from ci_job_artifact_states is unacceptably slow. For reference: example use of Database Lab to check Geo query performance.
  2. Also confirm that a loose foreign key won't perform the deletion for us asynchronously.
  3. If deletion is too slow, add a sharding key to ci_job_artifact_states:

Org Data Migration Zoom Call, Google Doc Notes (internal)

Is it still easy enough to leverage the sharding key backfill tooling?

Yes, it should be. We can remove the issue URL and still use the “desired sharding key”: https://docs.gitlab.com/development/organization/#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key

Edited by 🤖 GitLab Bot 🤖