Remove temp_source_id column from various tables
Background
During table migration processes, temp_source_id columns were added to several tables to facilitate data migration from legacy tables to new table structures. These columns served as temporary references to maintain relationships during the migration process but are no longer needed and should be removed to maintain database cleanliness.
Objective
Remove the temp_source_id column from all relevant tables where it was temporarily added during the migration process.
Tables Requiring Cleanup
Based on codebase analysis, the following tables contain temp_source_id columns that need to be removed:
1. group_scim_identities
- Migration: db/migrate/20241024145113_create_group_scim_identities.rb (Milestone 17.6)
- Purpose: Temporary column to store scim_identity id during SCIM identity migration
- Index: Unique index on temp_source_id
- Background Migration: lib/gitlab/background_migration/migrate_scim_identities.rb
2. group_scim_auth_access_tokens
- Migration: db/migrate/20241024145500_create_group_scim_auth_tokens.rb (Milestone 17.6)
- Purpose: Temporary column to store scim_tokens id during SCIM token migration
- Index: Unique index on temp_source_id
- Background Migration: lib/gitlab/background_migration/migrate_scim_tokens.rb
3. system_access_group_microsoft_applications
- Migration: db/migrate/20241106114853_create_system_access_group_microsoft_applications.rb (Milestone 17.7)
- Purpose: Temporary column to store graph access tokens id during Microsoft applications table split
- Index: Unique index index_group_microsoft_applications_on_temp_source_id
- Background Migration: lib/gitlab/background_migration/split_microsoft_applications_table.rb
4. system_access_group_microsoft_graph_access_tokens
- Migration: db/migrate/20241106115015_create_system_access_group_microsoft_graph_access_tokens.rb (Milestone 17.7)
- Purpose: Temporary column to store graph access tokens id during Microsoft applications table split
- Index: Unique index index_source_id_microsoft_access_tokens
- Background Migration: lib/gitlab/background_migration/split_microsoft_applications_table.rb
Current Usage
These columns are currently used by:
- Sync workers: Various ee/app/workers/authn/sync_*_worker.rb files for bidirectional synchronization
- Background migrations: For data migration processes
- Schema specs: Explicitly excluded from foreign key validation in spec/db/schema_spec.rb
Tasks
- Verify all background migrations using these columns have completed
- Confirm sync workers are no longer needed or update them to use proper foreign keys
- Create migration scripts to remove the columns and their associated indexes:
  - Remove temp_source_id from group_scim_identities
  - Remove temp_source_id from group_scim_auth_access_tokens
  - Remove temp_source_id from system_access_group_microsoft_applications
  - Remove temp_source_id from system_access_group_microsoft_graph_access_tokens
- Update schema specs to remove exclusions for these columns
- Run tests to ensure removal doesn't impact existing functionality
- Execute cleanup in production environment
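
As a rough illustration of the column-removal task, the sketch below generates the SQL that the cleanup ultimately amounts to for the four tables listed in this issue. It is plain Ruby rather than an actual GitLab migration, and the helper name drop_column_statements and the constants are hypothetical. It relies on PostgreSQL's documented behavior that dropping a column also drops any index involving that column. Real migrations would go through the project's own migration helpers and may still drop the indexes explicitly.

```ruby
# Tables named in this issue that carry a temporary temp_source_id column.
TABLES = %w[
  group_scim_identities
  group_scim_auth_access_tokens
  system_access_group_microsoft_applications
  system_access_group_microsoft_graph_access_tokens
].freeze

COLUMN = 'temp_source_id'

# Hypothetical helper: builds one ALTER TABLE statement per table.
# In PostgreSQL, DROP COLUMN also removes indexes that involve the column,
# so the unique indexes on temp_source_id need no separate statement here.
def drop_column_statements(tables = TABLES, column = COLUMN)
  tables.map do |table|
    %(ALTER TABLE "#{table}" DROP COLUMN IF EXISTS "#{column}";)
  end
end

# Print the statements when the file is run directly.
puts drop_column_statements if __FILE__ == $PROGRAM_NAME
```

The IF EXISTS clause keeps the statements safe to re-run, which may help if the cleanup is rolled out table by table.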
Expected Outcome
All temporary source ID columns will be removed, resulting in:
- Cleaner database schema
- Reduced storage overhead
- Simplified table structure
- Removal of temporary indexes
Migration Considerations
- These columns have unique indexes that will also need to be dropped
- Verify that any remaining sync workers or background jobs are not actively using these columns
- Consider the impact on any ongoing data synchronization processes
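
One way to check whether workers or background jobs still reference these columns is to scan the relevant directories for the column name. The sketch below is a minimal standalone Ruby script. The function name find_references and the choice to look only at .rb files are assumptions, and a plain text match will also report comments and specs, so the results need manual review.

```ruby
require 'find'

# Hypothetical helper: returns [path, line_number, stripped_line] triples for
# every line in a .rb file under `root` that mentions `needle`.
def find_references(root, needle = 'temp_source_id')
  hits = []
  Find.find(root) do |path|
    next unless File.file?(path) && path.end_with?('.rb')

    File.foreach(path).with_index(1) do |line, number|
      hits << [path, number, line.strip] if line.include?(needle)
    end
  end
  hits
end

if __FILE__ == $PROGRAM_NAME
  # Pass the directories to check as arguments, for example ee/app/workers.
  ARGV.each do |root|
    find_references(root).each do |path, number, line|
      puts "#{path}:#{number}: #{line}"
    end
  end
end
```

An empty result for the worker and background migration directories would be one signal, though not proof, that the columns are no longer in active use.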
Analysis completed by examining migration files, background migration classes, worker implementations, and schema specifications.