Skip to content

Clusters hierarchy CTE

Thong Kuah requested to merge clusters-group-cte into master

Part of https://gitlab.com/gitlab-org/gitlab-ce/issues/63475. EE MR is https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/14480

What does this MR do?

Improve query performance for DeploymentPlatform

Problems:

  1. ancestor_clusters_for_clusterable should return a relation instead of an array
  2. Partly because of this, we could not use a single query to search for the most relevant cluster

This MR creates a Common Table Expression (CTE) query to return the whole cluster hierarchy in one query:

  1. The query is a relation so we can chain scopes as we see fit
  2. We only use one query instead of several queries. We use this inside DeploymentPlatform behind a feature flag
DeploymentPlatform
  legacy implementation
    when group has configured kubernetes cluster
      when child group has configured kubernetes cluster
        deeply nested group
1) SELECT  "clusters".* FROM "clusters" INNER JOIN "cluster_projects" ON "clusters"."id" = "cluster_projects"."cluster_id" WHERE "cluster_projects"."project_id" = $1 AND "clusters"."enabled" = $2 AND "clusters"."environment_scope" = $3 ORDER BY "clusters"."id" DESC LIMIT $4
2) WITH RECURSIVE "base_and_ancestors" AS (SELECT 1 as depth, ARRAY[id] AS tree_path, false AS tree_cycle, "namespaces".* FROM "namespaces" WHERE "namespaces"."type" IN ('Group') AND "namespaces"."id" = 13
UNION
SELECT ("base_and_ancestors"."depth" + 1), tree_path || "namespaces".id, "namespaces".id = ANY(tree_path), "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" IN ('Group') AND "namespaces"."id" = "base_and_ancestors"."parent_id" AND "base_and_ancestors"."tree_cycle" = 'f') SELECT "namespaces"."id" AS t0_r0, "namespaces"."name" AS t0_r1, "namespaces"."path" AS t0_r2, "namespaces"."owner_id" AS t0_r3, "namespaces"."created_at" AS t0_r4, "namespaces"."updated_at" AS t0_r5, "namespaces"."type" AS t0_r6, "namespaces"."description" AS t0_r7, "namespaces"."avatar" AS t0_r8, "namespaces"."membership_lock" AS t0_r9, "namespaces"."share_with_group_lock" AS t0_r10, "namespaces"."visibility_level" AS t0_r11, "namespaces"."request_access_enabled" AS t0_r12, "namespaces"."ldap_sync_status" AS t0_r13, "namespaces"."ldap_sync_error" AS t0_r14, "namespaces"."ldap_sync_last_update_at" AS t0_r15, "namespaces"."ldap_sync_last_successful_update_at" AS t0_r16, "namespaces"."ldap_sync_last_sync_at" AS t0_r17, "namespaces"."description_html" AS t0_r18, "namespaces"."lfs_enabled" AS t0_r19, "namespaces"."parent_id" AS t0_r20, "namespaces"."shared_runners_minutes_limit" AS t0_r21, "namespaces"."repository_size_limit" AS t0_r22, "namespaces"."require_two_factor_authentication" AS t0_r23, "namespaces"."two_factor_grace_period" AS t0_r24, "namespaces"."cached_markdown_version" AS t0_r25, "namespaces"."plan_id" AS t0_r26, "namespaces"."project_creation_level" AS t0_r27, "namespaces"."runners_token" AS t0_r28, "namespaces"."trial_ends_on" AS t0_r29, "namespaces"."file_template_project_id" AS t0_r30, "namespaces"."saml_discovery_token" AS t0_r31, "namespaces"."runners_token_encrypted" AS t0_r32, "namespaces"."custom_project_templates_group_id" AS t0_r33, "namespaces"."auto_devops_enabled" AS t0_r34, "namespaces"."extra_shared_runners_minutes_limit" AS t0_r35, "namespaces"."last_ci_minutes_notification_at" AS t0_r36, "clusters"."id" AS t1_r0, "clusters"."user_id" AS t1_r1, "clusters"."provider_type" AS t1_r2, "clusters"."platform_type" AS t1_r3, "clusters"."created_at" AS t1_r4, "clusters"."updated_at" AS t1_r5, "clusters"."enabled" AS t1_r6, "clusters"."name" AS t1_r7, "clusters"."environment_scope" AS t1_r8, "clusters"."cluster_type" AS t1_r9, "clusters"."domain" AS t1_r10, "clusters"."managed" AS t1_r11 FROM "base_and_ancestors" AS "namespaces" LEFT OUTER JOIN "cluster_groups" ON "cluster_groups"."group_id" = "namespaces"."id" LEFT OUTER JOIN "clusters" ON "clusters"."id" = "cluster_groups"."cluster_id" WHERE "clusters"."enabled" = $1 AND "clusters"."environment_scope" = $2 ORDER BY "namespaces"."depth" ASC
3) SELECT "clusters".* FROM "clusters" WHERE "clusters"."enabled" = $1 AND "clusters"."environment_scope" = $2 AND "clusters"."cluster_type" = $3
4) SELECT  "cluster_platforms_kubernetes".* FROM "cluster_platforms_kubernetes" WHERE "cluster_platforms_kubernetes"."cluster_id" = $1 LIMIT $2
          returns most nested group cluster Kubernetes platform


  CTE implementation
    when group has configured kubernetes cluster
      when child group has configured kubernetes cluster
        deeply nested group
1) WITH RECURSIVE "clusters_cte" AS (SELECT "clusters".*, "projects"."namespace_id" AS group_parent_id, 1 AS depth FROM "projects" LEFT OUTER JOIN "cluster_projects" ON "cluster_projects"."project_id" = "projects"."id" LEFT OUTER JOIN "clusters" ON "clusters"."id" = "cluster_projects"."cluster_id" WHERE "projects"."id" = 2
UNION
SELECT "clusters".*, "namespaces"."parent_id" AS group_parent_id, ("clusters_cte"."depth" + 1) FROM "clusters_cte", "namespaces" LEFT OUTER JOIN cluster_groups ON cluster_groups.group_id = namespaces.id LEFT OUTER JOIN clusters ON cluster_groups.cluster_id = clusters.id WHERE "namespaces"."id" = "clusters_cte"."group_parent_id") SELECT  "clusters".* FROM "clusters_cte" "clusters" WHERE (clusters.id IS NOT NULL) AND "clusters"."enabled" = $1 AND "clusters"."environment_scope" = $2 ORDER BY "clusters"."depth" ASC LIMIT $3
2) SELECT  "cluster_platforms_kubernetes".* FROM "cluster_platforms_kubernetes" WHERE "cluster_platforms_kubernetes"."cluster_id" = $1 LIMIT $2
          returns most nested group cluster Kubernetes platform

NB: I have not incorporated instance clusters into the single query but because it is so cheap I think we can do this later

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by 🤖 GitLab Bot 🤖

Merge request reports