Cells: Classify: Make uploads table attributable to an org
Problem
The uploads table holds a record of all files uploaded into GitLab. This table is attached to many models (users, projects, groups, etc.).
This table is not clearly attributable as either clusterwide or cell-local.
There was some investigation into the problem in [Feature] Cells 1.0 impact for file uploads (#443573 - closed)
Geo
The same applies to upload_states, which is used by Geo to track uploaded records that need verification.
Dependencies
We need the tables backing the models that use uploads to have their sharding keys so that we can use them.
- abuse_reports
- achievements
- ai_vectorizable_files
- alert_management_alert_metric_images
- appearances
- bulk_import_export_uploads
- dependency_list_export_parts
- dependency_list_exports
- design_management_designs_versions
- import_export_uploads
- issuable_metric_images
- namespaces
- organization_details
- project_relation_export_uploads
- topics
- projects
- snippets
- user_permission_export_uploads
- users
- vulnerability_archive_exports
- vulnerability_export_parts
- vulnerability_exports
- vulnerability_remediations
https://docs.google.com/spreadsheets/d/19CcPaUGxOaT1rwjSdRvLkhu_-91RUBOdjDFGVxOonVs/edit?usp=sharing
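The dependency requirement above can be read as a simple gate: the uploads work can only proceed once every table in the list has a sharding key. The following is a minimal sketch of that check, assuming a hypothetical mapping from table name to sharding key column. The function name, the column names, and the `None` entries are illustrative only and do not reflect the real state of these tables.

```python
from typing import Mapping, Optional


def tables_missing_sharding_key(
    sharding_keys: Mapping[str, Optional[str]],
) -> list[str]:
    """Return the tables that do not yet have a sharding key, sorted by name."""
    return sorted(table for table, key in sharding_keys.items() if not key)


# Hypothetical snapshot: column names and None entries are illustrative only.
example_state = {
    "projects": "organization_id",
    "snippets": "organization_id",
    "achievements": None,
    "topics": None,
}

missing = tables_missing_sharding_key(example_state)
# The gate opens only when no dependency is missing its sharding key.
ready_for_backfill = not missing
print(missing, ready_for_backfill)
```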
Solution
We should introduce a new table that is either clusterwide or cell-local, and split this table into two, each with a clear purpose.
Proposal
Based on the discussion here - #398199 (comment 2101029924).
- Milestone 17.7:
  - Add new sharding key columns to uploads (!168003 (merged))
  - Update the app to populate sharding key columns for new uploads when available (!168003 (merged))
- Milestone 17.11:
  - Create new uploads_9ba88c4165 table (like uploads) partitioned by model_type, mark it as exempt_from_sharding: true (!175203 (merged))
  - Create partition for each model_type in the public schema (!175203 (merged))
  - For each partition create FK referencing the sharding key table (!175203 (merged))
  - Start syncing uploads -> uploads_9ba88c4165 (!175203 (merged))
- Milestone 18.2 (required stop):
  - Backfill uploads_9ba88c4165 when every related model has its sharding key ready (!181349 (merged))
- Milestone 18.3:
  - Finalize back-fill migration !198033 (merged)
- Milestone 18.5:
  - Clean up note_uploads (no longer needed after !185893 (merged)) (!206764 (merged))
- Milestone N (work on all dependencies is completed)
  - Add database triggers for all partitions to set sharding key if missing.
  - Truncate partitions (to remove orphaned uploads)
  - Re-run back-fill (updated to set new sharding keys)
    Tables Prepare for sharding achievements !207893
- Milestone M (after a required stop)
  - Finalize back-fill
  - For each partition create NOT NULL constraint !199513 (closed)
  - Define sharding key for each partition
  - Switch the app to use the new partitioned table by swapping the table names
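The overall sequence in the proposal (sync new writes, backfill existing rows, then swap table names) can be illustrated with a small, self-contained sketch. This is not the actual migration code: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, omits partitioning, foreign keys, and sharding key columns, and the column set, trigger name, and `uploads_archived` name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE uploads (
        id INTEGER PRIMARY KEY,
        model_type TEXT NOT NULL,
        model_id INTEGER NOT NULL,
        path TEXT NOT NULL
    );
    -- Stand-in for the new table; the real one is partitioned by model_type.
    CREATE TABLE uploads_9ba88c4165 (
        id INTEGER PRIMARY KEY,
        model_type TEXT NOT NULL,
        model_id INTEGER NOT NULL,
        path TEXT NOT NULL
    );

    -- A row that exists before syncing starts; only the backfill copies it.
    INSERT INTO uploads (id, model_type, model_id, path)
    VALUES (1, 'Project', 10, 'old.png');

    -- Step 1: mirror every new upload into the new table.
    CREATE TRIGGER sync_uploads_insert AFTER INSERT ON uploads
    BEGIN
        INSERT OR REPLACE INTO uploads_9ba88c4165 (id, model_type, model_id, path)
        VALUES (NEW.id, NEW.model_type, NEW.model_id, NEW.path);
    END;
    """
)

# A write that happens while syncing is active is copied by the trigger.
conn.execute(
    "INSERT INTO uploads (id, model_type, model_id, path) "
    "VALUES (2, 'User', 20, 'new.png')"
)

# Step 2: backfill the rows that predate the trigger.
conn.execute(
    """
    INSERT INTO uploads_9ba88c4165 (id, model_type, model_id, path)
    SELECT id, model_type, model_id, path FROM uploads
    WHERE id NOT IN (SELECT id FROM uploads_9ba88c4165)
    """
)
conn.commit()

# Step 3: swap the table names so the app reads from the new table.
conn.executescript(
    """
    DROP TRIGGER sync_uploads_insert;
    ALTER TABLE uploads RENAME TO uploads_archived;
    ALTER TABLE uploads_9ba88c4165 RENAME TO uploads;
    """
)

synced_ids = [row[0] for row in conn.execute("SELECT id FROM uploads ORDER BY id")]
print(synced_ids)
```

The ordering matters in this sketch: starting the sync before the backfill means no write can fall between the two steps, which is consistent with the proposal placing the backfill in a later milestone than the sync.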