
Define a namespace traversal cache

Alex Pooley requested to merge traversal-hierarchy into master

What does this MR do?

Cache the traversal path from a top-level namespace down to the current namespace. The path is stored in an indexed array column called traversal_ids on the namespaces table.

The Traversal::Hierarchy class provides methods to initialize the traversal_ids column for each namespace hierarchy.

This MR is part of a sequence of MRs to replace the current recursive namespace search with a faster linear search.


The issue with namespaces right now is that we query them recursively: starting from the current namespace, we walk the parent-child tree. This is slow and complicated. We can query the namespace hierarchy much faster and more simply if we store the path from the root ancestor to each namespace as an attribute on the namespace.

Say we have this Namespace hierarchy...

graph TD;
    gitlab-->backend;
    backend-->create;
    backend-->manage;
    create-->source;
    manage-->access;
    gitlab-->frontend;

Then the path from gitlab to access is gitlab / backend / manage / access. The path from gitlab to source is gitlab / backend / create / source.
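As a rough illustration of how such a path is derived, the sketch below walks parent links up to the root and reverses the result. The PARENTS map and the traversal_path helper are hypothetical names made up for this example; they are not part of the MR.

```ruby
# Hypothetical parent map for the example hierarchy (child => parent).
PARENTS = {
  "gitlab"   => nil,
  "backend"  => "gitlab",
  "frontend" => "gitlab",
  "create"   => "backend",
  "manage"   => "backend",
  "source"   => "create",
  "access"   => "manage"
}.freeze

# Walk parent links up to the root, then reverse to get a root-first path.
def traversal_path(name, parents = PARENTS)
  path = []
  current = name
  until current.nil?
    path << current
    current = parents.fetch(current)
  end
  path.reverse
end

puts traversal_path("access").join(" / ") # gitlab / backend / manage / access
puts traversal_path("source").join(" / ") # gitlab / backend / create / source
```

This walk is what the cached column avoids repeating at query time: it is done once per namespace and the result is stored.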

We can use this structure for fast queries. If the current namespace is backend, then all our descendants match the path gitlab / backend / *, all our ancestors are the leading elements of our own path (whatever * stands for in * / backend, here just gitlab), and everyone in the same Namespace hierarchy matches gitlab / *.

We store that path using an array of Namespace ids in a new column on Namespace called traversal_ids.
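With ids in place of names, the path patterns above reduce to simple prefix checks on arrays. The following is a minimal in-memory sketch, assuming made-up ids for the example hierarchy and hypothetical helper names (descendant_ids, ancestor_ids, hierarchy_ids); the actual queries would run in the database against the indexed column and are not shown in this description.

```ruby
# Hypothetical ids for the example hierarchy; real ids are database keys.
TRAVERSAL_IDS = {
  1 => [1],          # gitlab
  2 => [1, 2],       # backend
  3 => [1, 3],       # frontend
  4 => [1, 2, 4],    # create
  5 => [1, 2, 5],    # manage
  6 => [1, 2, 4, 6], # source
  7 => [1, 2, 5, 7]  # access
}.freeze

# Descendants: every other namespace whose path starts with ours.
def descendant_ids(id, table = TRAVERSAL_IDS)
  prefix = table.fetch(id)
  table.select { |other, path| other != id && path.first(prefix.length) == prefix }.keys
end

# Ancestors: everything on our own path except ourselves.
def ancestor_ids(id, table = TRAVERSAL_IDS)
  table.fetch(id)[0...-1]
end

# Same hierarchy: every namespace that shares our root id.
def hierarchy_ids(id, table = TRAVERSAL_IDS)
  root = table.fetch(id).first
  table.select { |_, path| path.first == root }.keys
end

p descendant_ids(2) # [4, 5, 6, 7]
p ancestor_ids(7)   # [1, 2, 5]
p hierarchy_ids(2)  # [1, 2, 3, 4, 5, 6, 7]
```

Note that none of these lookups follows a parent pointer: each one is a single pass over stored paths, which is the linear search this sequence of MRs is aiming for.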

This MR creates the traversal_ids column on Namespace. It also provides a class that can tell us which Namespaces in a hierarchy don't have traversal_ids synchronized yet, and a method to set traversal_ids to their correct value.
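To make the two responsibilities concrete, here is a small in-memory sketch of "find what is out of sync" and "set it to the expected value". It is illustrative only: NamespaceRow, HierarchySketch, expected_ids, unsynced, and sync! are hypothetical names chosen for this example and should not be read as the interface of Traversal::Hierarchy.

```ruby
# Hypothetical in-memory stand-in for rows of the namespaces table.
NamespaceRow = Struct.new(:id, :parent_id, :traversal_ids)

# Illustrative only: not the Traversal::Hierarchy class from this MR.
class HierarchySketch
  def initialize(rows)
    @by_id = rows.to_h { |row| [row.id, row] }
  end

  # The value traversal_ids should have: ids from the root down to this row.
  def expected_ids(row)
    ids = []
    current = row
    while current
      ids.unshift(current.id)
      current = @by_id[current.parent_id]
    end
    ids
  end

  # Rows whose stored traversal_ids differ from the expected value.
  def unsynced
    @by_id.values.reject { |row| row.traversal_ids == expected_ids(row) }
  end

  # Set traversal_ids to the expected value for every unsynced row.
  def sync!
    unsynced.each { |row| row.traversal_ids = expected_ids(row) }
  end
end

rows = [
  NamespaceRow.new(1, nil, [1]), # gitlab, already synced
  NamespaceRow.new(2, 1, nil),   # backend, never synced
  NamespaceRow.new(5, 2, [5])    # manage, stale value
]

hierarchy = HierarchySketch.new(rows)
p hierarchy.unsynced.map(&:id) # [2, 5]
hierarchy.sync!
p rows.map(&:traversal_ids)    # [[1], [1, 2], [1, 2, 5]]
```

Keeping detection separate from the update lets a caller inspect how much of a hierarchy is stale before writing anything.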

A follow-up MR will use these features in a background worker to synchronize selected groups for dogfooding/canary purposes.

There is also a WIP MR to supersede our current Namespace querying scheme.


Further information on the core issue is available at #195423 (comment 344833120)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Alex Pooley
