Skip to content

Backfill GroupMember's member_namespace_id with value of source_id

What does this MR do and why?

For GroupMember rows in the members table, the value of member_namespace_id should be equal to source_id. So this migration simply copies that value.

New records in the members table already have this column set: Set member_namespace_id when creating new proje... (!78491 - merged)

ProjectMember records will be migrated in !79204 (merged) (#350822 (closed)), as it requires all project namespaces to exist.

Database

See !80891 (closed) for DB testing.

up

> rake db:migrate
== 20220120211831 TempIndexForGroupNamespaceMemberBackfill: migrating =========
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:members, :id, {:where=>"members.member_namespace_id IS NULL and members.type = 'GroupMember'", :name=>"tmp_index_for_namespace_id_migration_on_group_members", :algorithm=>:concurrently})
   -> 0.0041s
-- execute("SET statement_timeout TO 0")
   -> 0.0003s
-- add_index(:members, :id, {:where=>"members.member_namespace_id IS NULL and members.type = 'GroupMember'", :name=>"tmp_index_for_namespace_id_migration_on_group_members", :algorithm=>:concurrently})
   -> 0.0037s
-- execute("RESET statement_timeout")
   -> 0.0004s
== 20220120211831 TempIndexForGroupNamespaceMemberBackfill: migrated (0.0107s) 

== 20220120211832 BackfillMemberNamespaceIdForGroupMembers: migrating =========
== 20220120211832 BackfillMemberNamespaceIdForGroupMembers: migrated (0.0298s) 

down

> rake db:rollback
== 20220120211832 BackfillMemberNamespaceIdForGroupMembers: reverting =========
== 20220120211832 BackfillMemberNamespaceIdForGroupMembers: reverted (0.0179s)

== 20220120211831 TempIndexForGroupNamespaceMemberBackfill: reverting =========
-- transaction_open?()
   -> 0.0000s
-- indexes(:members)
   -> 0.0041s
-- execute("SET statement_timeout TO 0")
   -> 0.0003s
-- remove_index(:members, {:algorithm=>:concurrently, :name=>"tmp_index_for_namespace_id_migration_on_group_members"})
   -> 0.0026s
-- execute("RESET statement_timeout")
   -> 0.0003s
== 20220120211831 TempIndexForGroupNamespaceMemberBackfill: reverted (0.0097s) 

dblab stats

See https://gitlab.com/gitlab-org/gitlab/-/issues/352561

queries

The queries are (from the local log)

   Load (1.1ms)  SELECT "members"."id" FROM "members" WHERE "members"."id" BETWEEN 1 AND 10 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL ORDER BY "members"."id" ASC, "members"."type" ASC LIMIT 1 /*application:test,correlation_id:58e373698929a27e95db0c60f2477471,db_config_name:main*/
  ↳ app/models/concerns/each_batch.rb:61:in `each_batch'
   Load (0.4ms)  SELECT "members"."id" FROM "members" WHERE "members"."id" BETWEEN 1 AND 10 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL AND "members"."id" >= 1 ORDER BY "members"."id" ASC, "members"."type" ASC LIMIT 1 OFFSET 100 /*application:test,correlation_id:58e373698929a27e95db0c60f2477471,db_config_name:main*/
  ↳ app/models/concerns/each_batch.rb:80:in `block in each_batch'
  #<Class:0x00007f99b6b156f0> Update All (0.9ms)  UPDATE "members" SET member_namespace_id=source_id WHERE "members"."id" BETWEEN 1 AND 10 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL AND "members"."id" >= 1 /*application:test,correlation_id:58e373698929a27e95db0c60f2477471,db_config_name:main*/
  ↳ lib/gitlab/background_migration/backfill_member_namespace_for_group_members.rb:14:in `block (2 levels) in perform'

explain

all with cold cache

WITHOUT INDEX

explain SELECT "members"."id" FROM "members" WHERE "members"."id" BETWEEN 6000000 AND 6002000 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL ORDER BY "members"."id" ASC, "members"."type" ASC LIMIT 1

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/8786/commands/31225

explain UPDATE members SET member_namespace_id=source_id WHERE members.id IN ( SELECT members.id FROM members INNER JOIN namespaces ON members.source_id = namespaces.id WHERE members.id BETWEEN 6000000 AND 6002000 AND members.type = "GroupMember" AND members.source_type = "Namespace" AND members.member_namespace_id IS NULL AND members.id >= 50)

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/8786/commands/31229

WITH INDEX

explain SELECT "members"."id" FROM "members" WHERE "members"."id" BETWEEN 6000000 AND 6002000 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL ORDER BY "members"."id" ASC, "members"."type" ASC LIMIT 1

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/8787/commands/31234

explain UPDATE "members" SET member_namespace_id=source_id WHERE "members"."id" BETWEEN 6000000 AND 6002000 AND "members"."type" = 'GroupMember' AND "members"."member_namespace_id" IS NULL AND "members"."id" >= 1

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/8786/commands/31232

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #349575 (closed)

Edited by charlie ablett

Merge request reports