Remove temp_source_id column from various tables

Background

During table migrations, temp_source_id columns were added to several tables to help move data from legacy tables into new table structures. Each column holds the id of the corresponding source row so that relationships are preserved while the data is migrated. Once the migrations have finished the columns serve no further purpose, and they should be removed to keep the database schema clean.

Objective

Remove the temp_source_id column from all relevant tables where it was temporarily added during the migration process.

Tables Requiring Cleanup

Based on codebase analysis, the following tables contain temp_source_id columns that need to be removed:

1. group_scim_identities

  • Migration: db/migrate/20241024145113_create_group_scim_identities.rb (Milestone 17.6)
  • Purpose: Temporary column to store scim_identity id during SCIM identity migration
  • Index: Unique index on temp_source_id
  • Background Migration: lib/gitlab/background_migration/migrate_scim_identities.rb

2. group_scim_auth_access_tokens

  • Migration: db/migrate/20241024145500_create_group_scim_auth_tokens.rb (Milestone 17.6)
  • Purpose: Temporary column to store scim_tokens id during SCIM token migration
  • Index: Unique index on temp_source_id
  • Background Migration: lib/gitlab/background_migration/migrate_scim_tokens.rb

3. system_access_group_microsoft_applications

  • Migration: db/migrate/20241106114853_create_system_access_group_microsoft_applications.rb (Milestone 17.7)
  • Purpose: Temporary column to store graph access tokens id during Microsoft applications table split
  • Index: Unique index index_group_microsoft_applications_on_temp_source_id
  • Background Migration: lib/gitlab/background_migration/split_microsoft_applications_table.rb

4. system_access_group_microsoft_graph_access_tokens

  • Migration: db/migrate/20241106115015_create_system_access_group_microsoft_graph_access_tokens.rb (Milestone 17.7)
  • Purpose: Temporary column to store graph access tokens id during Microsoft applications table split
  • Index: Unique index index_source_id_microsoft_access_tokens
  • Background Migration: lib/gitlab/background_migration/split_microsoft_applications_table.rb
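The inventory above lists four tables but only three distinct background migration files, because both Microsoft tables share one. A minimal sketch of collapsing the list to the jobs that need a completion check follows. The helper names are hypothetical, and the class names are only derived from the file names by the usual Rails naming convention, which is an assumption rather than something this issue states.

```ruby
# Background migration file per table, copied from the inventory above.
def background_migration_files
  {
    'group_scim_identities' =>
      'lib/gitlab/background_migration/migrate_scim_identities.rb',
    'group_scim_auth_access_tokens' =>
      'lib/gitlab/background_migration/migrate_scim_tokens.rb',
    'system_access_group_microsoft_applications' =>
      'lib/gitlab/background_migration/split_microsoft_applications_table.rb',
    'system_access_group_microsoft_graph_access_tokens' =>
      'lib/gitlab/background_migration/split_microsoft_applications_table.rb'
  }
end

# Hypothetical helper: turns "migrate_scim_tokens.rb" into "MigrateScimTokens".
# Assumes the class name follows the file name, as is conventional in Rails.
def job_class_name(path)
  File.basename(path, '.rb').split('_').map(&:capitalize).join
end

# Distinct background migrations whose completion should be verified.
def migrations_to_verify
  background_migration_files.values.uniq.map { |path| job_class_name(path) }
end

puts migrations_to_verify
```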

Current Usage

These columns are currently used by:

  • Sync workers: Various ee/app/workers/authn/sync_*_worker.rb files for bidirectional synchronization
  • Background migrations: For data migration processes
  • Schema specs: Explicitly excluded from foreign key validation in spec/db/schema_spec.rb

Tasks

  • Verify all background migrations using these columns have completed
  • Confirm sync workers are no longer needed or update them to use proper foreign keys
  • Create migration scripts to remove the columns and their associated indexes:
    • Remove temp_source_id from group_scim_identities
    • Remove temp_source_id from group_scim_auth_access_tokens
    • Remove temp_source_id from system_access_group_microsoft_applications
    • Remove temp_source_id from system_access_group_microsoft_graph_access_tokens
  • Update schema specs to remove exclusions for these columns
  • Run tests to ensure removal doesn't impact existing functionality
  • Execute cleanup in production environment
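As a rough illustration of the removal task, the sketch below builds the SQL that the cleanup amounts to for each table. It is not the project's migration code: the helper names are hypothetical, and a real change would presumably go through the project's own migration tooling rather than raw SQL strings. This issue does not name the unique indexes on the two SCIM tables, so they are left unset here. In PostgreSQL, dropping a column also drops any index that involves it, so the explicit DROP INDEX statements only mirror the wording of the task.

```ruby
# Column to remove, as named throughout this issue.
def temp_column
  'temp_source_id'
end

# Table => unique index name, taken from this issue.
# nil means the issue does not state the index name.
def temp_source_id_tables
  {
    'group_scim_identities' => nil,
    'group_scim_auth_access_tokens' => nil,
    'system_access_group_microsoft_applications' =>
      'index_group_microsoft_applications_on_temp_source_id',
    'system_access_group_microsoft_graph_access_tokens' =>
      'index_source_id_microsoft_access_tokens'
  }
end

# Hypothetical helper: SQL statements that clean up one table.
def cleanup_statements(table, index_name)
  statements = []
  # Drop the index explicitly only when its name is known.
  statements << "DROP INDEX IF EXISTS #{index_name};" if index_name
  statements << "ALTER TABLE #{table} DROP COLUMN IF EXISTS #{temp_column};"
  statements
end

# Statements for all four tables, in the order listed above.
def all_cleanup_statements
  temp_source_id_tables.flat_map do |table, index_name|
    cleanup_statements(table, index_name)
  end
end

puts all_cleanup_statements
```

Keeping the table list in one place makes it straightforward to review the generated statements against the inventory before anything is run.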

Expected Outcome

All temporary source ID columns will be removed, resulting in:

  • Cleaner database schema
  • Reduced storage overhead
  • Simplified table structure
  • Removal of temporary indexes

Migration Considerations

  • These columns have unique indexes that will also need to be dropped
  • Verify that any remaining sync workers or background jobs are not actively using these columns
  • Consider the impact on any ongoing data synchronization processes

Analysis completed by examining migration files, background migration classes, worker implementations, and schema specifications.

Edited by 🤖 GitLab Bot 🤖