Add sharding key for events table
Sharding keys need to be set for the tables:
- events
This involves choosing one of the following, based on the intended behaviour of the table:
- The table is not cell-local
  - Set gitlab_schema to gitlab_main_clusterwide.
- The table is cell-local and requires a sharding key
  - Set gitlab_schema to gitlab_main_cell.
  - Add a sharding_key or desired_sharding_key configuration. If the configuration is known but the chosen key doesn't yet meet not-null and foreign key requirements, you can add an exception to allowed_to_be_missing_not_null or allowed_to_be_missing_foreign_key to get the pipeline passing. Please link to a follow-up issue in a code comment next to the exception.
  - You may also need to set allow_cross_joins, allow_cross_transactions and allow_cross_foreign_keys if changing the schema causes pipeline failures. See db/docs/epics.yml for an example.
- The table is cell-local and does not require a sharding key
  - Set gitlab_schema to gitlab_main_cell, and
  - Set exempt_from_sharding to true.
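To make the three options above concrete, here is a minimal sketch of what the dictionary entry for events might look like under the second option, together with a small check of the rules as listed. The YAML layout (a sharding_key map from column to referenced table) and the helper name sharding_config_problems are assumptions for illustration, not the project's actual validation code.

```ruby
require "yaml"

# Hypothetical contents of db/docs/events.yml after this change.
# The key names come from the issue text; the exact layout of
# sharding_key (column => referenced table) is an assumption.
EVENTS_ENTRY = YAML.safe_load(<<~YAML)
  table_name: events
  gitlab_schema: gitlab_main_cell
  sharding_key:
    organization_id: organizations
YAML

# Hypothetical helper: returns a list of problems for a dictionary entry,
# following the three options described in this issue.
def sharding_config_problems(entry)
  case entry["gitlab_schema"]
  when "gitlab_main_clusterwide"
    [] # not cell-local, so no sharding key is required
  when "gitlab_main_cell"
    has_key = entry.key?("sharding_key") || entry.key?("desired_sharding_key")
    exempt = entry["exempt_from_sharding"] == true
    if has_key || exempt
      []
    else
      ["cell-local table needs sharding_key, desired_sharding_key, or exempt_from_sharding: true"]
    end
  else
    ["gitlab_schema must be gitlab_main_cell or gitlab_main_clusterwide"]
  end
end
```

Calling sharding_config_problems(EVENTS_ENTRY) returns an empty list for the entry above; an entry with only gitlab_schema: gitlab_main_cell returns one problem.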
Documentation
- Choosing either the gitlab_main_cell or gitlab_main_clusterwide schema
- Defining a sharding key for all cell-local tables
- Defining a desired_sharding_key to automatically backfill a sharding_key
Proposal
Extracted from #462801 (comment 1954947766).
- Step 1: Add a migration for a new NOT NULL column events.organization_id with DEFAULT 1 (1 being the id of the default organization, denoted by the constant Organizations::Organization::DEFAULT_ORGANIZATION_ID). This migration will also add the necessary index.
  Setting DEFAULT 1 lets us avoid backfilling, at the cost of events in non-default organizations also having their organization_id set to 1 for a short period of time (that is, until we execute step 2). This is acceptable because the organizations feature is currently behind a feature flag and only used by internal employees. For this reason, we also do not set up a foreign key for the new column during this step, and defer it until step 3.
  Make application code changes so that any newly created event has its organization_id set correctly. In pseudocode it will look like:
  Event.create!(organization_id: current_organization_id, ...)
  For most events, this would be the same as event&.group&.organization or event&.project&.organization. For personal snippet events, this would be event&.author&.namespace&.organization.
- Step 2: Correct the organization_id of events records that do not belong to the default organization.
  Create a change request that updates the organization_id of events belonging to non-default organizations from 1 to the correct organization_id.
- Step 3: Set up the FOREIGN KEY, establishing the relationship between events.organization_id and organizations.id.
- Step 4 (?): Remove the DEFAULT 1 for events.organization_id.
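The application-side part of step 1 and the correction in step 2 can be sketched in plain Ruby as follows. This is an in-memory illustration only: the Struct classes are hypothetical stand-ins for the real models, the lookup order (group, then project, then the author's namespace) is an assumption, and the real correction would run against the database through a change request rather than over Ruby objects.

```ruby
# Hypothetical in-memory stand-ins for the models named in the proposal.
Organization = Struct.new(:id)
Group = Struct.new(:organization)
Project = Struct.new(:organization)
Namespace = Struct.new(:organization)
User = Struct.new(:namespace)
Event = Struct.new(:id, :organization_id, :group, :project, :author, keyword_init: true)

# Mirrors Organizations::Organization::DEFAULT_ORGANIZATION_ID from the proposal.
DEFAULT_ORGANIZATION_ID = 1

# Step 1 (application code): resolve the organization id for an event.
# Assumed order: group, then project, then the author's namespace
# (the last one covers personal snippet events). Returns nil if none is set.
def organization_id_for(event)
  organization = event.group&.organization ||
    event.project&.organization ||
    event.author&.namespace&.organization
  organization&.id
end

# Step 2 (data correction): update rows that still carry the default id
# although they resolve to a different organization.
# Returns the ids of the corrected events.
def correct_organization_ids(events, batch_size: 2)
  stale = events.select do |event|
    resolved = organization_id_for(event)
    event.organization_id == DEFAULT_ORGANIZATION_ID &&
      resolved && resolved != DEFAULT_ORGANIZATION_ID
  end

  # Process in small batches, as a real correction would do to limit load.
  stale.each_slice(batch_size).flat_map do |batch|
    batch.map do |event|
      event.organization_id = organization_id_for(event)
      event.id
    end
  end
end
```

Running correct_organization_ids a second time over the same events returns an empty list, since only rows still holding the default id while resolving to another organization are touched.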