Require sharding_key_issue_url for tables with desired_sharding_key until sharding_key is implemented
Problem to Solve
Currently, we manually find tables that need sharding. There's no systematic way to track all tables that have a desired_sharding_key but haven't yet been fully sharded with a sharding_key.
This was identified in the discussion on #579972 where we're working on fully sharding bulk_import_batch_trackers.
Proposal
Require a sharding_key_issue_url field for all database tables that have a desired_sharding_key defined but don't yet have a sharding_key implemented.
This would:
- Provide systematic tracking of all tables that need sharding work
- Link each table to its corresponding sharding issue
- Make it easier to find and prioritize sharding work
- Prevent tables from being forgotten in the sharding process
Implementation Ideas
The requirement could be enforced at the schema level or through database linting/validation, ensuring that:
- If desired_sharding_key is present AND sharding_key is not present
- Then sharding_key_issue_url must be provided
Once a table gets its sharding_key implemented, the sharding_key_issue_url would no longer be required.
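As a rough illustration of the rule above, here is a minimal Python sketch of a lint-style check. It assumes each table's metadata has already been loaded into a dictionary. The function name, the dictionary shape, and the example URL are hypothetical and are not taken from the existing tooling; only the three field names come from this proposal.

```python
from typing import Any, List, Mapping


def sharding_issue_url_errors(table_name: str, entry: Mapping[str, Any]) -> List[str]:
    """Return validation errors for one table's metadata entry.

    Hypothetical helper: `entry` is assumed to be the parsed metadata
    for a single table, keyed by field name.
    """
    has_desired_key = bool(entry.get("desired_sharding_key"))
    has_sharding_key = bool(entry.get("sharding_key"))
    has_issue_url = bool(entry.get("sharding_key_issue_url"))

    # The URL is only required while the table is in the intermediate state:
    # a desired_sharding_key exists but no sharding_key has been implemented.
    if has_desired_key and not has_sharding_key and not has_issue_url:
        return [
            f"{table_name}: sharding_key_issue_url is required while "
            "desired_sharding_key is set and sharding_key is not"
        ]
    return []


# Illustrative entries only; the values are placeholders, not real metadata.
entries = {
    "table_missing_url": {"desired_sharding_key": {"project_id": {}}},
    "table_with_url": {
        "desired_sharding_key": {"project_id": {}},
        "sharding_key_issue_url": "https://example.com/issues/1",
    },
    "table_already_sharded": {"sharding_key": {"project_id": "projects"}},
}

for name, entry in entries.items():
    for error in sharding_issue_url_errors(name, entry):
        print(error)
```

With these sample entries, only table_missing_url is reported. The other two pass because one already links an issue and the other already has a sharding_key.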
Benefits
- Systematic tracking: No more manual discovery of tables needing sharding
- Visibility: Clear view of all pending sharding work
- Accountability: Each table has a linked issue for tracking progress
- Better planning: Easier to prioritize and plan sharding efforts
Related Issues
- #579972 - Fully shard bulk_import_batch_trackers (where this idea originated)
- Epic: &11670 - Database table sharding
Original Discussion
From #579972 (comment 2877179977):
Thanks for opening this @tskorupa-gl - I was chatting with @chen-gitlab about systematically tracking to all such tables (have desired_sharding_key). Right now, it seems we find such tables manually. My idea would be to require sharding_key_issue_url until there is a sharding_key. WDYT?