Remove ci_namespace_mirrors sync_children_namespaces
What does this MR do and why?
We mirror namespaces and projects from the main database to the ci_namespace_mirrors and ci_project_mirrors tables. If you are not familiar with this topic, it is explained in detail here: https://docs.gitlab.com/ee/development/database/ci_mirrored_tables.html
This MR addresses issue #347541 (closed). You will find more context about the problem there as well.
Namespace traversal IDs
Each namespace has an attribute called traversal_ids, which is an array of integers. This attribute represents the hierarchy of the namespace from top to bottom. For example, consider this simple hierarchy of namespaces: A(1) -> B(2) -> C(3), plus another namespace D(4) which has A as its parent.
- A.traversal_ids = [1]
- B.traversal_ids = [1, 2]
- C.traversal_ids = [1, 2, 3]
- D.traversal_ids = [1, 4]
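To make the example concrete, here is a minimal sketch in plain Ruby that derives the same arrays from a parent map. The PARENTS hash and the traversal_ids_for helper are hypothetical names used only for illustration; this is not how GitLab computes the attribute.

```ruby
# Hypothetical parent map for the example hierarchy: id => parent_id (nil for a root).
PARENTS = { 1 => nil, 2 => 1, 3 => 2, 4 => 1 }.freeze

# Walks up the parent chain and returns the ids ordered from the root
# down to the given namespace.
def traversal_ids_for(id, parents = PARENTS)
  ids = []
  while id
    ids.unshift(id)
    id = parents.fetch(id)
  end
  ids
end

PARENTS.each_key { |id| puts "#{id}: #{traversal_ids_for(id).inspect}" }
```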
This attribute is also mirrored to the ci_namespace_mirrors table, where we have a record for each namespace with only the attributes namespace_id and traversal_ids.
The old sync method
Currently, every time the parent_id of a namespace changes, or a new namespace is created, a Namespaces::SyncEvent record is created to mirror the namespace to the CI database. On the Ci::NamespaceMirror model, we sync the traversal_ids of the namespace and all its children. See the implementation here.
This MR removes this functionality, because a new sync method is already in place.
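For illustration only, a children sync of this kind can be pictured as a prefix rewrite on the mirror rows. The sketch below is an assumption about the general idea, not the removed implementation: the mirror is a plain hash of namespace_id => traversal_ids, and sync_with_children is a hypothetical helper.

```ruby
# Hypothetical in-memory mirror: namespace_id => traversal_ids.
# Returns a new mirror where the moved namespace and every row below it
# carry the new ancestry.
def sync_with_children(mirrors, namespace_id, new_traversal_ids)
  updated = mirrors.to_h do |id, ids|
    position = ids.index(namespace_id)
    # Rows that do not contain the namespace are unrelated and stay untouched.
    next [id, ids] unless position

    # Keep everything below the moved namespace and swap in its new ancestry.
    [id, new_traversal_ids + ids[(position + 1)..]]
  end
  # Also covers a namespace that has no mirror row yet.
  updated.merge(namespace_id => new_traversal_ids)
end

mirrors = { 1 => [1], 2 => [1, 2], 3 => [1, 2, 3], 4 => [1, 4] }
# Move B(2) under D(4): B and its child C are rewritten, A and D are not.
p sync_with_children(mirrors, 2, [1, 4, 2])
```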
The new sync method
- Every time a new namespace is created, or the parent_id of a namespace changes, we update the traversal_ids of the namespace and all its children in the hierarchy: https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/namespaces/traversal/linear.rb#L46-49
- When we update the namespace traversal_ids, the trigger trigger_namespaces_traversal_ids_on_update is responsible for creating a record in the namespaces_sync_events table, so that the worker Namespaces::ProcessSyncEventsWorker picks up the job and syncs the namespace to ci_namespace_mirrors.
- The sync_traversal_ids method, which is called on both create and update, is also responsible for scheduling the worker Namespaces::ProcessSyncEventsWorker to pick up the jobs, but only after the commit has happened, to make sure that the namespaces_sync_events record has been created in the table.
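The three steps above can be pictured with a small in-memory simulation. This is only a sketch under stated assumptions: FakeSyncPipeline and all of its methods are hypothetical stand-ins for the models, the database trigger, and the worker, and none of it is GitLab code.

```ruby
# Hypothetical in-memory stand-ins; none of these names exist in GitLab.
class FakeSyncPipeline
  attr_reader :sync_events, :mirrors

  def initialize
    @namespaces = {}  # id => { parent_id:, traversal_ids: }
    @sync_events = [] # stands in for the namespaces_sync_events table
    @mirrors = {}     # stands in for ci_namespace_mirrors: namespace_id => traversal_ids
  end

  # Creates or re-parents a namespace, then recomputes traversal_ids
  # for it and for all of its descendants.
  def save_namespace(id, parent_id: nil)
    existing = @namespaces[id]
    @namespaces[id] = { parent_id: parent_id, traversal_ids: existing ? existing[:traversal_ids] : [] }
    sync_traversal_ids(id)
  end

  # Stands in for the worker: drains the events and copies each
  # namespace's current traversal_ids to the mirror.
  def process_sync_events
    while (namespace_id = @sync_events.shift)
      @mirrors[namespace_id] = @namespaces.fetch(namespace_id)[:traversal_ids].dup
    end
  end

  private

  def sync_traversal_ids(id)
    row = @namespaces.fetch(id)
    parent_ids = row[:parent_id] ? @namespaces.fetch(row[:parent_id])[:traversal_ids] : []
    write_traversal_ids(id, parent_ids + [id])
    children_of(id).each { |child_id| sync_traversal_ids(child_id) }
  end

  # Stands in for the trigger: an event is recorded only when the
  # traversal_ids value actually changes.
  def write_traversal_ids(id, new_ids)
    row = @namespaces.fetch(id)
    return if row[:traversal_ids] == new_ids

    row[:traversal_ids] = new_ids
    @sync_events << id
  end

  def children_of(id)
    @namespaces.select { |_, row| row[:parent_id] == id }.keys
  end
end

pipeline = FakeSyncPipeline.new
pipeline.save_namespace(1)               # A
pipeline.save_namespace(2, parent_id: 1) # B under A
pipeline.save_namespace(4)               # D
pipeline.save_namespace(1, parent_id: 4) # move A under D; B follows
pipeline.process_sync_events
p pipeline.mirrors
```

Because the children are rewritten on the main side and each rewrite records its own event, the mirror side only ever copies one namespace at a time.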
With this sync method in place, we don't need the old sync method. On the ci side, we no longer care about the children of the namespaces, because they are synced from the main side.
How to set up and validate locally
- Make sure that all the GDK services are up and running with the newest code from this branch
gdk restart
- Enter the Rails console
./bin/rails c
A new top-level namespace
Test creating a new top-level namespace, and check the added traversal_ids on both the namespace and its ci_namespace_mirrors record
a = FactoryBot.create(:group, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
puts a.reload.traversal_ids.inspect # it should print [a.id]
puts Ci::NamespaceMirror.where(namespace_id: a.id).first.traversal_ids.inspect # it should print [a.id] as well
A new child namespace
b = FactoryBot.create(:group, parent: a, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
puts b.reload.traversal_ids.inspect # it should print [a.id, b.id]
puts Ci::NamespaceMirror.where(namespace_id: b.id).first.traversal_ids.inspect # it should print [a.id, b.id]
Updating the parent of a namespace that has children
d = FactoryBot.create(:group, name: "test" + (0...5).map { (65 + rand(26)).chr }.join)
a.update!(parent: d)
puts a.reload.traversal_ids.inspect # it should print [d.id, a.id]
puts Ci::NamespaceMirror.where(namespace_id: a.id).first.traversal_ids.inspect # same as above
puts b.reload.traversal_ids.inspect # it should print [d.id, a.id, b.id]
puts Ci::NamespaceMirror.where(namespace_id: b.id).first.traversal_ids.inspect # same as above
Database Migrations
Up
./bin/rails db:migrate:up:main VERSION=20220824175648
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: migrating ==
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_parent_id_on_insert ON namespaces")
main: -> 0.0024s
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_parent_id_on_update ON namespaces")
main: -> 0.0004s
main: -- execute("CREATE TRIGGER trigger_namespaces_traversal_ids_on_update\nAFTER UPDATE ON namespaces\nFOR EACH ROW\nWHEN (OLD.traversal_ids IS DISTINCT FROM NEW.traversal_ids)\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main: -> 0.0014s
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: migrated (0.0045s)
Down
./bin/rails db:migrate:down:main VERSION=20220824175648
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: reverting ==
main: -- execute("CREATE TRIGGER trigger_namespaces_parent_id_on_insert\nAFTER INSERT ON namespaces\nFOR EACH ROW\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main: -> 0.0025s
main: -- execute("DROP TRIGGER IF EXISTS trigger_namespaces_traversal_ids_on_update ON namespaces")
main: -> 0.0008s
main: -- execute("CREATE TRIGGER trigger_namespaces_parent_id_on_update\nAFTER UPDATE ON namespaces\nFOR EACH ROW\nWHEN (OLD.parent_id IS DISTINCT FROM NEW.parent_id)\n\nEXECUTE FUNCTION insert_namespaces_sync_event()\n")
main: -> 0.0009s
main: == 20220824175648 LimitNamespacesSyncTriggersToTraversalIdsUpdate: reverted (0.0046s)
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.