Backfill organization_id on pool_repositories

What does this MR do and why?

Backfill organization_id on pool_repositories

This commit provides the batched background migration to backfill organization_id on pool_repositories.

Additionally it provides the following:

  • database trigger to set organization_id to cover scenarios where model hooks are skipped
  • concurrent index on organization_id
  • foreign key constraint (NOT VALID) on organization_id
  • NOT NULL (NOT VALID) constraint on organization_id
  • specs

Changelog: added

See also: Support org id for pool repositories (!207640 - merged)

database notes

I'm using the same approach as this similar, recently merged MR: Set sharding key on `todos` (!204559 - merged)

index creation

postgres ai link: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/44340/commands/135910

Command: 
exec

Query:
CREATE INDEX CONCURRENTLY index_pool_repositories_on_organization_id ON pool_repositories USING btree (organization_id);

Response:
The query has been executed. Duration: 346.385 ms

migration output

Click to expand
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16895
main: == 20250929180950 AddOrganizationIdToPoolRepositories: migrating ==============
main: -- add_column(:pool_repositories, :organization_id, :bigint, {:null=>true})
main:    -> 0.0325s
main: == 20250929180950 AddOrganizationIdToPoolRepositories: migrated (0.0576s) =====

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16895
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16896
ci: == 20250929180950 AddOrganizationIdToPoolRepositories: migrating ==============
ci: -- add_column(:pool_repositories, :organization_id, :bigint, {:null=>true})
ci:    -> 0.0060s
ci: == 20250929180950 AddOrganizationIdToPoolRepositories: migrated (0.0141s) =====

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16896
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16902
main: == 20251006152354 AddShardingKeyTriggerToPoolRepositories: migrating ==========
main: -- execute("CREATE OR REPLACE FUNCTION pool_repositories_sharding_key() RETURNS TRIGGER AS $$\nBEGIN\n  IF NEW.organization_id IS NOT NULL THEN\n    RETURN NEW;\n  END IF;\n\n  IF NEW.source_project_id IS NOT NULL THEN\n    SELECT p.organization_id\n    INTO NEW.organization_id\n    FROM projects p\n    WHERE p.id = NEW.source_project_id;\n\n    IF NEW.organization_id IS NOT NULL THEN\n      RETURN NEW;\n    END IF;\n  END IF;\n\n  NEW.organization_id := 1;\n  RETURN NEW;\nEND;\n$$ LANGUAGE plpgsql;\n")
main:    -> 0.0059s
main: -- execute("CREATE TRIGGER trigger_pool_repositories_sharding_key\nBEFORE INSERT OR UPDATE ON pool_repositories\nFOR EACH ROW\n\nEXECUTE FUNCTION pool_repositories_sharding_key()\n")
main:    -> 0.0029s
main: == 20251006152354 AddShardingKeyTriggerToPoolRepositories: migrated (0.0136s) =

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16902
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16927
ci: == 20251006152354 AddShardingKeyTriggerToPoolRepositories: migrating ==========
ci: -- execute("CREATE OR REPLACE FUNCTION pool_repositories_sharding_key() RETURNS TRIGGER AS $$\nBEGIN\n  IF NEW.organization_id IS NOT NULL THEN\n    RETURN NEW;\n  END IF;\n\n  IF NEW.source_project_id IS NOT NULL THEN\n    SELECT p.organization_id\n    INTO NEW.organization_id\n    FROM projects p\n    WHERE p.id = NEW.source_project_id;\n\n    IF NEW.organization_id IS NOT NULL THEN\n      RETURN NEW;\n    END IF;\n  END IF;\n\n  NEW.organization_id := 1;\n  RETURN NEW;\nEND;\n$$ LANGUAGE plpgsql;\n")
ci:    -> 0.0062s
ci: -- execute("CREATE TRIGGER trigger_pool_repositories_sharding_key\nBEFORE INSERT OR UPDATE ON pool_repositories\nFOR EACH ROW\n\nEXECUTE FUNCTION pool_repositories_sharding_key()\n")
ci:    -> 0.0033s
ci: == 20251006152354 AddShardingKeyTriggerToPoolRepositories: migrated (0.0213s) =

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16927
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16930
main: == 20250917231903 FinalizeHkFixStringConfigHashesGroupStreamingDestinations: migrating
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: == 20250917231903 FinalizeHkFixStringConfigHashesGroupStreamingDestinations: migrated (0.1160s)

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16930
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16932
ci: == 20250917231903 FinalizeHkFixStringConfigHashesGroupStreamingDestinations: migrating
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_ci_cell_local, :gitlab_internal, :gitlab_shared, :gitlab_shared_cell_local, :gitlab_shared_org].
ci: == 20250917231903 FinalizeHkFixStringConfigHashesGroupStreamingDestinations: migrated (0.0081s)

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16932
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16934
main: == 20251008204550 QueueBackfillPoolRepositoriesOrganizationId: migrating ======
main: == 20251008204550 QueueBackfillPoolRepositoriesOrganizationId: migrated (0.0475s)

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16934
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16935
ci: == 20251008204550 QueueBackfillPoolRepositoriesOrganizationId: migrating ======
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main_org].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_ci_cell_local, :gitlab_internal, :gitlab_shared, :gitlab_shared_cell_local, :gitlab_shared_org].
ci: == 20251008204550 QueueBackfillPoolRepositoriesOrganizationId: migrated (0.0083s)

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16935
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16938
main: == 20251008204839 AddConcurrentIndexToPoolRepositoriesOnOrganizationId: migrating
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- view_exists?(:postgres_partitions)
main:    -> 0.0009s
main: -- index_exists?(:pool_repositories, :organization_id, {:name=>"index_pool_repositories_on_organization_id", :algorithm=>:concurrently})
main:    -> 0.0022s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- add_index(:pool_repositories, :organization_id, {:name=>"index_pool_repositories_on_organization_id", :algorithm=>:concurrently})
main:    -> 0.0023s
main: -- execute("RESET statement_timeout")
main:    -> 0.0004s
main: == 20251008204839 AddConcurrentIndexToPoolRepositoriesOnOrganizationId: migrated (0.0274s)

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16938
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16941
ci: == 20251008204839 AddConcurrentIndexToPoolRepositoriesOnOrganizationId: migrating
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- view_exists?(:postgres_partitions)
ci:    -> 0.0005s
ci: -- index_exists?(:pool_repositories, :organization_id, {:name=>"index_pool_repositories_on_organization_id", :algorithm=>:concurrently})
ci:    -> 0.0057s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0003s
ci: -- add_index(:pool_repositories, :organization_id, {:name=>"index_pool_repositories_on_organization_id", :algorithm=>:concurrently})
ci:    -> 0.0022s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0004s
ci: == 20251008204839 AddConcurrentIndexToPoolRepositoriesOnOrganizationId: migrated (0.0329s)

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16941
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16943
main: == 20251008205011 AddNotValidForeignKeyConstraintToPoolRepositoriesOnOrganizationId: migrating
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE pool_repositories ADD CONSTRAINT fk_775c554d89 FOREIGN KEY (organization_id) REFERENCES organizations (id) ON DELETE CASCADE NOT VALID;")
main:    -> 0.0053s
main: == 20251008205011 AddNotValidForeignKeyConstraintToPoolRepositoriesOnOrganizationId: migrated (0.0370s)

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16943
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16944
ci: == 20251008205011 AddNotValidForeignKeyConstraintToPoolRepositoriesOnOrganizationId: migrating
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE pool_repositories ADD CONSTRAINT fk_775c554d89 FOREIGN KEY (organization_id) REFERENCES organizations (id) ON DELETE CASCADE NOT VALID;")
ci:    -> 0.0061s
ci: == 20251008205011 AddNotValidForeignKeyConstraintToPoolRepositoriesOnOrganizationId: migrated (0.0334s)

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16944
main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16946
main: == 20251008205112 AddNotValidNotNullConstraintToPoolRepositoriesShardingKey: migrating
main: -- current_schema(nil)
main:    -> 0.0002s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE pool_repositories\nADD CONSTRAINT check_96233d37c0\nCHECK ( organization_id IS NOT NULL )\nNOT VALID;\n")
main:    -> 0.0012s
main: == 20251008205112 AddNotValidNotNullConstraintToPoolRepositoriesShardingKey: migrated (0.0154s)

main: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16946
ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16947
ci: == 20251008205112 AddNotValidNotNullConstraintToPoolRepositoriesShardingKey: migrating
ci: -- current_schema(nil)
ci:    -> 0.0002s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE pool_repositories\nADD CONSTRAINT check_96233d37c0\nCHECK ( organization_id IS NOT NULL )\nNOT VALID;\n")
ci:    -> 0.0011s
ci: == 20251008205112 AddNotValidNotNullConstraintToPoolRepositoriesShardingKey: migrated (0.0258s)

ci: == [advisory_lock_connection] object_id: 129580, pg_backend_pid: 16947

Related to #519617

Edited by Hunter Stewart

Merge request reports

Loading