The analytics_cycle_analytics_stage_event_hashes table contains global metadata associated with stage records (group-level). To make this table work with the org mover, we need to alter the database schema.
Task: Add organization_id to the table
Add organization_id to the analytics_cycle_analytics_stage_event_hashes table with a default value of 1.
Change the unique index on hash_sha256 so that it is scoped to the organization_id column.
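A minimal migration sketch for these two steps, assuming GitLab's usual migration helpers; the class name, migration version, and index name are illustrative, and the follow-up steps (backfill, dropping the old hash_sha256-only index) are omitted:

```ruby
class AddOrganizationIdToStageEventHashes < Gitlab::Database::Migration[2.2]
  disable_ddl_transaction!

  INDEX_NAME = 'index_stage_event_hashes_on_organization_id_and_hash_sha256'

  def up
    # New sharding key column, defaulting to the default organization (id: 1).
    add_column :analytics_cycle_analytics_stage_event_hashes, :organization_id,
      :bigint, null: false, default: 1

    # Unique per organization instead of globally unique on hash_sha256.
    add_concurrent_index :analytics_cycle_analytics_stage_event_hashes,
      [:organization_id, :hash_sha256], unique: true, name: INDEX_NAME
  end

  def down
    remove_column :analytics_cycle_analytics_stage_event_hashes, :organization_id
  end
end
```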
When a new record is created, set the organization ID using the association which creates the hash record.
```ruby
# in the ensure_stage_event_hash_id method
stage = Stage.build(...)
hash = Hash.create(organization_id: stage.group.organization_id, ...)
stage.save!
```
Update all queries that load the hash records so they are scoped to the associated namespace.organization_id.
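A rough sketch of that scoping change, assuming a hypothetical StageEventHash model and a stage whose namespace exposes organization_id (the real lookup lives wherever the hash records are loaded today):

```ruby
# Before (illustrative): the hash record was looked up globally.
# StageEventHash.find_by(hash_sha256: hash_value)

# After: scope the lookup to the organization of the associated namespace.
StageEventHash.find_by(
  organization_id: stage.namespace.organization_id,
  hash_sha256: hash_value
)
```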
Configure the sharding key in the YML for analytics_cycle_analytics_stage_event_hashes.
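Assuming the db/docs convention where each table's YML file declares its sharding key as a column-to-table mapping, the relevant excerpt would look roughly like this:

```yaml
# db/docs/analytics_cycle_analytics_stage_event_hashes.yml (excerpt, sketch)
table_name: analytics_cycle_analytics_stage_event_hashes
sharding_key:
  organization_id: organizations
```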
I also wanted to discuss the idea of sharding analytics_cycle_analytics_stage_event_hashes with a new organization_id column instead of exempting it.
Pro: No need to work on this "cell-movement" problem. When an org is moved, all analytics_cycle_analytics_stage_event_hashes records associated with the org will be moved along with it.
Con: Duplicate records for the same hash across organizations. Within the same organization, there won't be duplicates.
Which one do you think is more effort? Can we expect a lot of duplication if we take the organization_id route?
@manojmj, it's very likely that we'll have a lot of duplication, and this can also affect other tables (_stage_events). The id column of analytics_cycle_analytics_stage_event_hashes is used as a partitioning key, which means that when moving records from the analytics_cycle_analytics_issue_stage_events and analytics_cycle_analytics_merge_request_stage_events tables, we'll need to rewrite the stage_event_hash_id column (find or create).
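For illustration, a hedged sketch of that rewrite during a move; the StageEventHash model name, the moved_stage_events relation, and destination_organization_id are assumptions, and batching over the partitioned tables plus error handling are omitted:

```ruby
moved_stage_events.find_each do |event|
  old_hash = StageEventHash.find(event.stage_event_hash_id)

  # Find or create the equivalent hash record under the destination
  # organization, then point the stage event at it.
  new_hash = StageEventHash.find_or_create_by!(
    organization_id: destination_organization_id,
    hash_sha256: old_hash.hash_sha256
  )

  event.update!(stage_event_hash_id: new_hash.id)
end
```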
Another table with similar characteristics: topics. Added a note in !152834 (comment 1917855323). We decided to use organization_id as the sharding key there, given the plans for the /explore page.
In the very long term, we want to rebalance cells transparently. In the shorter term, we have to convince orgs to let us move them, since it will require downtime. Anyway, we desperately want to minimize downtime for many reasons.
The hope is that we can achieve continuous, async replication in the style of Geo replication, so cutover downtime consists mostly of waiting for replication to catch up on a small portion of the data.