Cells: Classify: Make the uploads table attributable to an organization

Problem

The uploads table holds a record of every file uploaded to GitLab. It is attached to many models (users, projects, groups, etc.).

This table cannot be clearly classified as either cluster-wide or cell-local.

There was some investigation into the problem in [Feature] Cells 1.0 impact for file uploads (#443573 - closed)

Geo

The same applies to upload_states, which Geo uses to track uploaded records that need verification.

Dependencies

The tables backing the models that use uploads need to have their own sharding keys first, so that uploads can reuse them.

  • abuse_reports (!210550 (merged))
  • achievements
  • ai_vectorizable_files
  • alert_management_alert_metric_images
  • appearances
  • bulk_import_export_uploads
  • dependency_list_export_parts
  • dependency_list_exports
  • design_management_designs_versions
  • import_export_uploads
  • issuable_metric_images
  • namespaces
  • organization_details
  • project_relation_export_uploads
  • topics
  • projects
  • snippets
  • user_permission_export_uploads
  • users
  • vulnerability_archive_exports
  • vulnerability_export_parts
  • vulnerability_exports
  • vulnerability_remediations

https://docs.google.com/spreadsheets/d/19CcPaUGxOaT1rwjSdRvLkhu_-91RUBOdjDFGVxOonVs/edit?usp=sharing
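
The dependency can be illustrated with a minimal Python sketch of how an upload might inherit its sharding key from the row of the model it belongs to. This is an illustration only, not GitLab code: the column names in `SHARDING_KEYS`, the "first non-null column wins" rule, and both function names are assumptions that do not come from this issue.

```python
from typing import Optional, Tuple

# Hypothetical sharding key columns; the issue does not name them.
SHARDING_KEYS = ("organization_id", "namespace_id", "project_id")


def sharding_key_for(parent_row: dict) -> Optional[Tuple[str, int]]:
    """Return (column, value) for the first sharding key set on the parent row."""
    for column in SHARDING_KEYS:
        value = parent_row.get(column)
        if value is not None:
            return column, value
    return None


def assign_sharding_key(upload: dict, parent_row: dict) -> dict:
    """Copy the parent's sharding key onto the upload, or fail if none is set."""
    key = sharding_key_for(parent_row)
    if key is None:
        # The parent table has no sharding key yet, so the upload is blocked.
        raise ValueError(
            f"{upload['model_type']} {upload['model_id']} has no sharding key yet"
        )
    column, value = key
    return {**upload, column: value}
```

An upload whose parent row has no sharding key cannot be attributed at all, which is why every table in the list above has to be done before the backfill can complete.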

Solution

We should introduce a new table that is either cluster-wide or cell-local, and split this table into two tables, each with a clear purpose.

Proposal

Based on the discussion here - #398199 (comment 2101029924).

  • Milestone 17.7:
    • Add new sharding key columns to uploads (!168003 (merged))
    • Update the app to populate sharding key columns for new uploads when available (!168003 (merged))
  • Milestone 17.11:
    • Create new uploads_9ba88c4165 table (like uploads) partitioned by model_type, mark it as exempt_from_sharding: true (!175203 (merged))
    • Create partition for each model_type in the public schema (!175203 (merged))
    • For each partition create FK referencing the sharding key table (!175203 (merged))
    • Start syncing uploads -> uploads_9ba88c4165 (!175203 (merged))
  • Milestone 18.2 (required stop):
    • Backfill uploads_9ba88c4165 when every related model has its sharding key ready (!181349 (merged))
  • Milestone 18.3:
    • Finalize back-fill migration (!198033 (merged))
  • Milestone %18.5:
    • Clean up note_uploads (no longer needed after !185893 (merged)) (!206764 (merged))
  • Milestone %18.6:
    • NOT NULL constraint on appearance_uploads (!209290 (merged))
  • Milestone %18.7 (work on all dependencies is completed):
    • Add database triggers for all partitions to set sharding key if missing (!208858 (merged))
    • Truncate partitions (to remove orphaned uploads) && create NOT NULL constraint for each partition (!213237 (merged))
    • Re-run back-fill (updated to set new sharding keys) (!214675 (merged))
  • Milestone %18.9 (after a required stop):
    • Verify the backfill to set the new sharding keys completed successfully
    • Finalize back-fill && switch the app to use the new partitioned table by swapping the table names (!218827)
    • Mark the uploads table as cell-local (!216517)
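
The sync, backfill, and swap steps in the plan above can be sketched roughly as follows. This is a simplified illustration using Python's built-in `sqlite3` module, not the actual PostgreSQL migrations: SQLite has no declarative partitioning, so the per-`model_type` partitions, foreign keys, and NOT NULL constraints are omitted, only inserts are synced, and the column list, the trigger name, and the `uploads_archived` name are assumptions.

```python
import sqlite3

NEW_TABLE = "uploads_9ba88c4165"
COLUMNS = "id, model_type, model_id, path"  # illustrative subset of columns


def start_sync(conn: sqlite3.Connection) -> None:
    """Create the new table and mirror every new uploads row into it."""
    conn.execute(
        f"CREATE TABLE {NEW_TABLE} (id INTEGER PRIMARY KEY, "
        "model_type TEXT NOT NULL, model_id INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    conn.execute(f"""
        CREATE TRIGGER sync_uploads_insert AFTER INSERT ON uploads
        BEGIN
            INSERT INTO {NEW_TABLE} ({COLUMNS})
            VALUES (NEW.id, NEW.model_type, NEW.model_id, NEW.path);
        END
    """)


def backfill(conn: sqlite3.Connection) -> None:
    """Copy rows that existed before the sync started; skip rows already synced."""
    conn.execute(
        f"INSERT OR IGNORE INTO {NEW_TABLE} ({COLUMNS}) SELECT {COLUMNS} FROM uploads"
    )


def swap(conn: sqlite3.Connection) -> None:
    """Stop syncing and swap the table names so the app reads the new table."""
    conn.execute("DROP TRIGGER sync_uploads_insert")
    conn.execute("ALTER TABLE uploads RENAME TO uploads_archived")
    conn.execute(f"ALTER TABLE {NEW_TABLE} RENAME TO uploads")
    conn.commit()


def demo() -> list:
    """Run the three steps against an in-memory database and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE uploads (id INTEGER PRIMARY KEY, "
        "model_type TEXT NOT NULL, model_id INTEGER NOT NULL, path TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO uploads VALUES (1, 'Project', 10, 'a.png')")
    start_sync(conn)
    # This row is written after the sync started, so the trigger copies it.
    conn.execute("INSERT INTO uploads VALUES (2, 'User', 20, 'b.png')")
    backfill(conn)
    swap(conn)
    return conn.execute("SELECT id, model_type FROM uploads ORDER BY id").fetchall()
```

Starting the sync before the backfill means rows written during the backfill are not lost, and the backfill only has to cover rows that predate the trigger.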
Edited Jan 14, 2026 by Tomasz Skorupa