Skip to content

Namespace.self_and_hierarchy scope

Alex Pooley requested to merge 352119-namespace-self_and_hierarchy-scope into master

What does this MR do and why?

Provide a linear and recursive query to find all ancestors and descendants for a set of groups. This is part of the linear namespace queries work to replace recursive queries with linear queries.

The query is packaged as a scope and defaults to the recursive version, with a feature flag to enable the linear version.

Specifically, this functionality will replace all single argument calls to Gitlab::ObjectHierarchy#all_objects. The all_objects calls are currently being replaced.

This work is behind the use_traversal_ids_for_self_and_hierarchy_scopes feature flag.

How to set up and validate locally

Feature.enable :use_traversal_ids
Feature.enable :traversal_ids_btree
Feature.enable :use_traversal_ids_for_ancestor_scopes
Feature.enable :use_traversal_ids_for_descendants_scopes
Feature.enable :use_traversal_ids_for_self_and_hierarchy_scopes

# Make a sample group hierarchy
groups = FactoryBot.create(:group, :with_hierarchy)

groups.last.self_and_hierarchy

SQL Performance

Using group: https://gitlab.com/gitlab-org/incubation-engineering (12892338)

self_and_hierarchy SQL 👇

SELECT 
  "namespaces"."id", 
  "namespaces"."name", 
  "namespaces"."path", 
  "namespaces"."owner_id", 
  "namespaces"."created_at", 
  "namespaces"."updated_at", 
  "namespaces"."type", 
  "namespaces"."description", 
  "namespaces"."avatar", 
  "namespaces"."membership_lock", 
  "namespaces"."share_with_group_lock", 
  "namespaces"."visibility_level", 
  "namespaces"."request_access_enabled", 
  "namespaces"."ldap_sync_status", 
  "namespaces"."ldap_sync_error", 
  "namespaces"."ldap_sync_last_update_at", 
  "namespaces"."ldap_sync_last_successful_update_at", 
  "namespaces"."ldap_sync_last_sync_at", 
  "namespaces"."description_html", 
  "namespaces"."lfs_enabled", 
  "namespaces"."parent_id", 
  "namespaces"."shared_runners_minutes_limit", 
  "namespaces"."repository_size_limit", 
  "namespaces"."require_two_factor_authentication", 
  "namespaces"."two_factor_grace_period", 
  "namespaces"."cached_markdown_version", 
  "namespaces"."project_creation_level", 
  "namespaces"."runners_token", 
  "namespaces"."file_template_project_id", 
  "namespaces"."saml_discovery_token", 
  "namespaces"."runners_token_encrypted", 
  "namespaces"."custom_project_templates_group_id", 
  "namespaces"."auto_devops_enabled", 
  "namespaces"."extra_shared_runners_minutes_limit", 
  "namespaces"."last_ci_minutes_notification_at", 
  "namespaces"."last_ci_minutes_usage_notification_level", 
  "namespaces"."subgroup_creation_level", 
  "namespaces"."emails_disabled", 
  "namespaces"."max_pages_size", 
  "namespaces"."max_artifacts_size", 
  "namespaces"."mentions_disabled", 
  "namespaces"."default_branch_protection", 
  "namespaces"."unlock_membership_to_ldap", 
  "namespaces"."max_personal_access_token_lifetime", 
  "namespaces"."push_rule_id", 
  "namespaces"."shared_runners_enabled", 
  "namespaces"."allow_descendants_override_disabled_shared_runners", 
  "namespaces"."traversal_ids" 
FROM 
  (
    (
      WITH "base_ancestors_cte" AS MATERIALIZED (
        SELECT 
          "namespaces"."id", 
          "namespaces"."traversal_ids" 
        FROM 
          "namespaces" 
        WHERE 
          "namespaces"."type" = 'Group' 
          AND "namespaces"."id" = 12892338
      ), 
      "ancestors_cte" AS MATERIALIZED (
        SELECT 
          id as base_id, 
          unnest(traversal_ids) as ancestor_id 
        FROM 
          "base_ancestors_cte"
      ) 
      SELECT 
        DISTINCT "namespaces"."id", 
        "namespaces"."name", 
        "namespaces"."path", 
        "namespaces"."owner_id", 
        "namespaces"."created_at", 
        "namespaces"."updated_at", 
        "namespaces"."type", 
        "namespaces"."description", 
        "namespaces"."avatar", 
        "namespaces"."membership_lock", 
        "namespaces"."share_with_group_lock", 
        "namespaces"."visibility_level", 
        "namespaces"."request_access_enabled", 
        "namespaces"."ldap_sync_status", 
        "namespaces"."ldap_sync_error", 
        "namespaces"."ldap_sync_last_update_at", 
        "namespaces"."ldap_sync_last_successful_update_at", 
        "namespaces"."ldap_sync_last_sync_at", 
        "namespaces"."description_html", 
        "namespaces"."lfs_enabled", 
        "namespaces"."parent_id", 
        "namespaces"."shared_runners_minutes_limit", 
        "namespaces"."repository_size_limit", 
        "namespaces"."require_two_factor_authentication", 
        "namespaces"."two_factor_grace_period", 
        "namespaces"."cached_markdown_version", 
        "namespaces"."project_creation_level", 
        "namespaces"."runners_token", 
        "namespaces"."file_template_project_id", 
        "namespaces"."saml_discovery_token", 
        "namespaces"."runners_token_encrypted", 
        "namespaces"."custom_project_templates_group_id", 
        "namespaces"."auto_devops_enabled", 
        "namespaces"."extra_shared_runners_minutes_limit", 
        "namespaces"."last_ci_minutes_notification_at", 
        "namespaces"."last_ci_minutes_usage_notification_level", 
        "namespaces"."subgroup_creation_level", 
        "namespaces"."emails_disabled", 
        "namespaces"."max_pages_size", 
        "namespaces"."max_artifacts_size", 
        "namespaces"."mentions_disabled", 
        "namespaces"."default_branch_protection", 
        "namespaces"."unlock_membership_to_ldap", 
        "namespaces"."max_personal_access_token_lifetime", 
        "namespaces"."push_rule_id", 
        "namespaces"."shared_runners_enabled", 
        "namespaces"."allow_descendants_override_disabled_shared_runners", 
        "namespaces"."traversal_ids" 
      FROM 
        "ancestors_cte", 
        "namespaces" 
      WHERE 
        "namespaces"."type" = 'Group' 
        AND "namespaces"."id" = "ancestors_cte"."ancestor_id"
    ) 
    UNION 
      (
        WITH "descendants_base_cte" AS MATERIALIZED (
          SELECT 
            "namespaces"."traversal_ids", 
            LEAD (namespaces.traversal_ids, 1) OVER (
              ORDER BY 
                namespaces.traversal_ids ASC
            ) next_traversal_ids 
          FROM 
            "namespaces" 
          WHERE 
            "namespaces"."type" = 'Group' 
            AND "namespaces"."id" = 35
        ), 
        "descendants_cte" AS MATERIALIZED (
          SELECT 
            "namespaces"."id", 
            "namespaces"."name", 
            "namespaces"."path", 
            "namespaces"."owner_id", 
            "namespaces"."created_at", 
            "namespaces"."updated_at", 
            "namespaces"."type", 
            "namespaces"."description", 
            "namespaces"."avatar", 
            "namespaces"."membership_lock", 
            "namespaces"."share_with_group_lock", 
            "namespaces"."visibility_level", 
            "namespaces"."request_access_enabled", 
            "namespaces"."ldap_sync_status", 
            "namespaces"."ldap_sync_error", 
            "namespaces"."ldap_sync_last_update_at", 
            "namespaces"."ldap_sync_last_successful_update_at", 
            "namespaces"."ldap_sync_last_sync_at", 
            "namespaces"."description_html", 
            "namespaces"."lfs_enabled", 
            "namespaces"."parent_id", 
            "namespaces"."shared_runners_minutes_limit", 
            "namespaces"."repository_size_limit", 
            "namespaces"."require_two_factor_authentication", 
            "namespaces"."two_factor_grace_period", 
            "namespaces"."cached_markdown_version", 
            "namespaces"."project_creation_level", 
            "namespaces"."runners_token", 
            "namespaces"."file_template_project_id", 
            "namespaces"."saml_discovery_token", 
            "namespaces"."runners_token_encrypted", 
            "namespaces"."custom_project_templates_group_id", 
            "namespaces"."auto_devops_enabled", 
            "namespaces"."extra_shared_runners_minutes_limit", 
            "namespaces"."last_ci_minutes_notification_at", 
            "namespaces"."last_ci_minutes_usage_notification_level", 
            "namespaces"."subgroup_creation_level", 
            "namespaces"."emails_disabled", 
            "namespaces"."max_pages_size", 
            "namespaces"."max_artifacts_size", 
            "namespaces"."mentions_disabled", 
            "namespaces"."default_branch_protection", 
            "namespaces"."unlock_membership_to_ldap", 
            "namespaces"."max_personal_access_token_lifetime", 
            "namespaces"."push_rule_id", 
            "namespaces"."shared_runners_enabled", 
            "namespaces"."allow_descendants_override_disabled_shared_runners", 
            "namespaces"."traversal_ids" 
          FROM 
            "descendants_base_cte", 
            "namespaces" 
          WHERE 
            "namespaces"."type" = 'Group' 
            AND (
              "descendants_base_cte"."next_traversal_ids" IS NULL 
              OR "descendants_base_cte"."next_traversal_ids" > "namespaces"."traversal_ids"
            ) 
            AND next_traversal_ids_sibling(
              "descendants_base_cte"."traversal_ids"
            ) > "namespaces"."traversal_ids" 
            AND "descendants_base_cte"."traversal_ids" <= "namespaces"."traversal_ids"
        ) 
        SELECT 
          "namespaces"."id", 
          "namespaces"."name", 
          "namespaces"."path", 
          "namespaces"."owner_id", 
          "namespaces"."created_at", 
          "namespaces"."updated_at", 
          "namespaces"."type", 
          "namespaces"."description", 
          "namespaces"."avatar", 
          "namespaces"."membership_lock", 
          "namespaces"."share_with_group_lock", 
          "namespaces"."visibility_level", 
          "namespaces"."request_access_enabled", 
          "namespaces"."ldap_sync_status", 
          "namespaces"."ldap_sync_error", 
          "namespaces"."ldap_sync_last_update_at", 
          "namespaces"."ldap_sync_last_successful_update_at", 
          "namespaces"."ldap_sync_last_sync_at", 
          "namespaces"."description_html", 
          "namespaces"."lfs_enabled", 
          "namespaces"."parent_id", 
          "namespaces"."shared_runners_minutes_limit", 
          "namespaces"."repository_size_limit", 
          "namespaces"."require_two_factor_authentication", 
          "namespaces"."two_factor_grace_period", 
          "namespaces"."cached_markdown_version", 
          "namespaces"."project_creation_level", 
          "namespaces"."runners_token", 
          "namespaces"."file_template_project_id", 
          "namespaces"."saml_discovery_token", 
          "namespaces"."runners_token_encrypted", 
          "namespaces"."custom_project_templates_group_id", 
          "namespaces"."auto_devops_enabled", 
          "namespaces"."extra_shared_runners_minutes_limit", 
          "namespaces"."last_ci_minutes_notification_at", 
          "namespaces"."last_ci_minutes_usage_notification_level", 
          "namespaces"."subgroup_creation_level", 
          "namespaces"."emails_disabled", 
          "namespaces"."max_pages_size", 
          "namespaces"."max_artifacts_size", 
          "namespaces"."mentions_disabled", 
          "namespaces"."default_branch_protection", 
          "namespaces"."unlock_membership_to_ldap", 
          "namespaces"."max_personal_access_token_lifetime", 
          "namespaces"."push_rule_id", 
          "namespaces"."shared_runners_enabled", 
          "namespaces"."allow_descendants_override_disabled_shared_runners", 
          "namespaces"."traversal_ids" 
        FROM 
          "descendants_cte" AS "namespaces"
      )
  ) namespaces 
WHERE 
  "namespaces"."type" = 'Group'

Cold

Time: 20.773 ms  
  - planning: 5.596 ms  
  - execution: 15.177 ms  
    - I/O read: 12.945 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 35 (~280.00 KiB) from the buffer pool  
  - reads: 9 (~72.00 KiB) from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  

https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8537/commands/30275

Warm

Time: 6.047 ms  
  - planning: 4.303 ms  
  - execution: 1.744 ms  
    - I/O read: 0.000 ms  
    - I/O write: 0.000 ms  
  
Shared buffers:  
  - hits: 44 (~352.00 KiB) from the buffer pool  
  - reads: 0 from the OS file cache, including disk I/O  
  - dirtied: 0  
  - writes: 0  

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/8537/commands/30276

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Closes #352119 (closed)

Edited by Alex Pooley

Merge request reports