Remove ci_namespace_mirrors sync_children_namespaces

What does this MR do and why?

We mirror namespaces and projects from the main database to the tables ci_namespace_mirrors and ci_project_mirrors. If you are not familiar with this topic, it is explained in detail here: https://docs.gitlab.com/ee/development/database/ci_mirrored_tables.html

This MR addresses issue #347541 (closed), where you will find more context about the problem.

Namespace traversal IDs

Each namespace has an attribute called traversal_ids, which is an array of integers. This attribute represents the hierarchy of the namespace from top to bottom. For example, consider this simple hierarchy of namespaces: A(1) -> B(2) -> C(3), plus another namespace D(4) which has A as its parent.

  • A.traversal_ids = [1]
  • B.traversal_ids = [1, 2]
  • C.traversal_ids = [1, 2, 3]
  • D.traversal_ids = [1, 4]
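The values above can be reproduced by walking up the parent chain. The following is a minimal sketch in plain Ruby, not GitLab code: the PARENTS map and the traversal_ids_for helper are hypothetical names introduced only for illustration.

```ruby
# Hypothetical parent map for the example hierarchy:
# namespace id => parent id (nil for a top-level namespace).
PARENTS = { 1 => nil, 2 => 1, 3 => 2, 4 => 1 }.freeze

# Walk up the parent chain and return the ids ordered from the root down.
def traversal_ids_for(id, parents = PARENTS)
  ids = []
  while id
    ids.unshift(id)        # prepend, so the root ends up first
    id = parents.fetch(id) # nil parent ends the loop
  end
  ids
end

p traversal_ids_for(3) # C
p traversal_ids_for(4) # D
```

Storing the full path on every row is what makes a move expensive: when a namespace changes parent, the array of each of its descendants has to be rewritten too.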

The traversal_ids attribute is also mirrored to the ci_namespace_mirrors table, which holds one record per namespace with only the attributes namespace_id and traversal_ids.

The old sync method

Currently, every time the parent_id of a namespace changes, or a new namespace is created, a Namespaces::EventSync record is created to mirror the namespace to the CI database. On the ci_namespace_mirror model, we sync the traversal_ids of the namespace and of all its children. See the implementation here.

This MR removes this functionality, because a new sync method is already in place.

The new sync method

  • Every time a new namespace is created, or the parent_id of a namespace has changed, we update the traversal_ids of the namespace and all its children in the hierarchy: https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/namespaces/traversal/linear.rb#L46-49
  • When we update the namespace traversal_ids, the trigger trigger_namespaces_traversal_ids_on_update is responsible for creating a record in namespaces_sync_events, so that the worker Namespaces::ProcessSyncEventsWorker picks up the job and syncs the namespace to ci_namespace_mirrors.
  • sync_traversal_ids, which is called on both create and update, is also responsible for scheduling the worker Namespaces::ProcessSyncEventsWorker to pick up the jobs, but only after the commit has happened, to make sure that the namespaces_sync_events record has already been created in the table.

With this sync method in place, we don't need the old sync method. On the CI side, we no longer need to handle the children of a namespace, because each child gets its own sync event from the main side.
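The flow described above can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not GitLab's implementation: ToyNamespaceStore, set_parent and process_sync_events are hypothetical names, the sync_events array stands in for the namespaces_sync_events table, and the "changed" check only mimics the IS DISTINCT FROM condition of the trigger.

```ruby
# Toy in-memory model; none of these names exist in GitLab.
class ToyNamespaceStore
  attr_reader :sync_events, :mirrors

  def initialize
    @parents = {}       # namespace id => parent id (nil for top-level)
    @traversal_ids = {} # namespace id => array of ids, root first
    @sync_events = []   # stands in for namespaces_sync_events
    @mirrors = {}       # stands in for ci_namespace_mirrors
  end

  # Create a namespace or move it under a new parent, then recompute
  # traversal_ids for it and for every descendant.
  def set_parent(id, parent_id)
    @parents[id] = parent_id
    resync(id)
  end

  # Stands in for the worker: drain queued events into the mirror.
  def process_sync_events
    while (id = @sync_events.shift)
      @mirrors[id] = @traversal_ids.fetch(id)
    end
  end

  private

  def resync(id)
    parent_id = @parents[id]
    new_ids = (parent_id ? @traversal_ids.fetch(parent_id) : []) + [id]
    # Mimics the trigger condition: enqueue only when the value changed.
    @sync_events << id if @traversal_ids[id] != new_ids
    @traversal_ids[id] = new_ids
    children_of(id).each { |child| resync(child) }
  end

  def children_of(id)
    @parents.select { |_, parent| parent == id }.keys
  end
end

store = ToyNamespaceStore.new
store.set_parent(1, nil) # top-level namespace 1
store.set_parent(2, 1)   # namespace 2 under 1
store.set_parent(4, nil) # top-level namespace 4
store.set_parent(1, 4)   # move 1 (and its child 2) under 4
store.process_sync_events
p store.mirrors
```

In this model the mirror side never walks the hierarchy: moving namespace 1 queues one event for 1 and one for its child 2, so each mirror record is refreshed from its own event.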

How to set up and validate locally

  1. Make sure that all the GDK services are up and running with the newest code from this branch: gdk restart
  2. Enter the Rails console: ./bin/rails c

A new top-level namespace

Test creating a new top-level namespace, and check the resulting traversal_ids on both the namespace and its ci_namespace_mirrors record.

a = FactoryBot.create(:group, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
puts a.reload.traversal_ids.inspect # it should print [a.id]
puts Ci::NamespaceMirror.where(namespace_id: a.id).first.traversal_ids.inspect # it should print [a.id] as well

A new child namespace

b = FactoryBot.create(:group, parent: a, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
puts b.reload.traversal_ids.inspect # it should print [a.id, b.id]
puts Ci::NamespaceMirror.where(namespace_id: b.id).first.traversal_ids.inspect # it should print [a.id, b.id]

Updating the parent of a namespace that has children

d = FactoryBot.create(:group, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
a.update!(parent: d)
puts a.reload.traversal_ids.inspect # it should print [d.id, a.id]
puts Ci::NamespaceMirror.where(namespace_id: a.id).first.traversal_ids.inspect # same as above
puts b.reload.traversal_ids.inspect # it should print [d.id, a.id, b.id]
puts Ci::NamespaceMirror.where(namespace_id: b.id).first.traversal_ids.inspect # same as above

Database Migrations

Up

./bin/rails db:migrate:up:main VERSION=20220824175648
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: migrating ==
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_parent_id_on_insert ON namespaces")
main:    -> 0.0024s
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_parent_id_on_update ON namespaces")
main:    -> 0.0004s
main: -- execute("CREATE TRIGGER trigger_namespaces_traversal_ids_on_update\nAFTER UPDATE ON namespaces\nFOR EACH ROW\nWHEN (OLD.traversal_ids IS DISTINCT FROM NEW.traversal_ids)\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main:    -> 0.0014s
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: migrated (0.0045s)

Down

./bin/rails db:migrate:down:main VERSION=20220824175648
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: reverting ==
main: -- execute("CREATE TRIGGER trigger_namespaces_parent_id_on_insert\nAFTER INSERT ON namespaces\nFOR EACH ROW\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main:    -> 0.0025s
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_traversal_ids_on_update ON namespaces")
main:    -> 0.0008s
main: -- execute("CREATE TRIGGER trigger_namespaces_parent_id_on_update\nAFTER UPDATE ON namespaces\nFOR EACH ROW\nWHEN (OLD.parent_id IS DISTINCT FROM NEW.parent_id)\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main:    -> 0.0009s
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: reverted (0.0046s)

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Omar Qunsul
