Skip to content

Linear group queries

Summary

Provide a set of linear Namespace querying methods to replace recursive CTE queries. The linear methods are faster and more simple to work with.

Currently Groups are nested in a parent-child hierarchy. To find groups relative to another group we have to walk the tree recursively. This is implemented through a recursive CTE with generally acceptable query performance. However building queries around the recursive CTE can be complex and result in slow queries.

Some related issues include:

We can query the namespace hierarchy a lot faster and easier if we store the path from the root ancestor to each namespace as an attribute on the namespace. We will store this on a new traversal_id attribute on Namespaces.

Say we have this Namespace hierarchy...

graph TD;
    gitlab-->backend;
    backend-->create;
    backend-->manage;
    create-->source;
    manage-->access;
    gitlab-->frontend;

Then the path from gitlab to access is gitlab / backend / manage / access. The path from gitlab to source is gitlab / backend / create / source.

We can use this structure for fast queries. If the current namespace is backend then all our descendants match the path gitlab / backend / *, all our ancestors match the path * / backend, and everyone in the same Namespace hierarchy matches gitlab / *.

We store that path using an array of Namespace ids in a new column on Namespace called traversal_ids.

Improvements

Faster group hierarchy querying. Simpler queries. Easier to work with.

Risks

Failing to keep the traversal_id column in sync with the hierarchy.

Involved components

app/models/namespace.rb

Edited by Alex Pooley