Skip to content

Fix maven package query performance [RUN ALL RSPEC] [RUN AS-IF-FOSS]

Adam Hegyi requested to merge 325274-use-cte-to-optimize-packages-query into master

What does this MR do?

This MR adds an CTE optimization fence to the group level maven package finder query to improve its performance.

The change is behind a feature flag: maven_metadata_by_path_with_optimization_fence

Query

Find package on the group level by path.

Old query: https://explain.depesz.com/s/wnRb

Raw SQL
SELECT "packages_packages".*
FROM "packages_packages"
INNER JOIN "packages_maven_metadata" ON "packages_maven_metadata"."package_id" = "packages_packages"."id"
WHERE "packages_packages"."project_id" IN
    (SELECT "projects"."id"
     FROM "projects"
     INNER JOIN
       (SELECT "id"
        FROM
          (SELECT "namespaces".*
           FROM "namespaces"
           INNER JOIN
             (WITH RECURSIVE "base_and_descendants" AS (
                                                          (SELECT 1 AS depth, ARRAY[namespaces.id] AS tree_path,
                                                                                   FALSE AS tree_cycle,
                                                                                            "namespaces".*
                                                           FROM "namespaces"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."id" = 4249178)
                                                        UNION
                                                          (SELECT ("base_and_descendants"."depth" + 1), tree_path || "namespaces".id,
                                                                                                        "namespaces".id = ANY(tree_path),
                                                                                                        "namespaces".*
                                                           FROM "namespaces",
                                                                "base_and_descendants"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."parent_id" = "base_and_descendants"."id"
                                                             AND "base_and_descendants"."tree_cycle" = FALSE)) SELECT DISTINCT "id",
                                                                                                                               "depth"
              FROM "base_and_descendants" AS "namespaces") namespaces_join_table ON namespaces_join_table.id = namespaces.id
           WHERE "namespaces"."type" = 'Group'
           ORDER BY "namespaces_join_table"."depth" ASC) AS "namespaces"
        WHERE "namespaces"."type" = 'Group') namespaces ON namespaces.id=projects.namespace_id
     WHERE (EXISTS
              (SELECT 1
               FROM "project_authorizations"
               WHERE "project_authorizations"."user_id" = 3278538
                 AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (0,
                                             10,
                                             20)))
  AND "packages_maven_metadata"."path" = 'some_package'
ORDER BY "packages_packages"."id" DESC
LIMIT 1

New query: https://explain.depesz.com/s/8yok

Raw SQL
WITH "maven_metadata_by_path" AS
  (SELECT "packages_maven_metadata"."id",
          "packages_maven_metadata"."package_id"
   FROM "packages_maven_metadata"
   WHERE "packages_maven_metadata"."path" = 'some_package')
SELECT "packages_packages".*
FROM "packages_packages"
INNER JOIN maven_metadata_by_path ON maven_metadata_by_path.package_id=packages_packages.id
WHERE "packages_packages"."project_id" IN
    (SELECT "projects"."id"
     FROM "projects"
     INNER JOIN
       (SELECT "id"
        FROM
          (SELECT "namespaces".*
           FROM "namespaces"
           INNER JOIN
             (WITH RECURSIVE "base_and_descendants" AS (
                                                          (SELECT 1 AS depth, ARRAY[namespaces.id] AS tree_path,
                                                                                   FALSE AS tree_cycle,
                                                                                            "namespaces".*
                                                           FROM "namespaces"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."id" = 4249178)
                                                        UNION
                                                          (SELECT ("base_and_descendants"."depth" + 1), tree_path || "namespaces".id,
                                                                                                        "namespaces".id = ANY(tree_path),
                                                                                                        "namespaces".*
                                                           FROM "namespaces",
                                                                "base_and_descendants"
                                                           WHERE "namespaces"."type" = 'Group'
                                                             AND "namespaces"."parent_id" = "base_and_descendants"."id"
                                                             AND "base_and_descendants"."tree_cycle" = FALSE)) SELECT DISTINCT "id",
                                                                                                                               "depth"
              FROM "base_and_descendants" AS "namespaces") namespaces_join_table ON namespaces_join_table.id = namespaces.id
           WHERE "namespaces"."type" = 'Group'
           ORDER BY "namespaces_join_table"."depth" ASC) AS "namespaces"
        WHERE "namespaces"."type" = 'Group') namespaces ON namespaces.id=projects.namespace_id
     WHERE (EXISTS
              (SELECT 1
               FROM "project_authorizations"
               WHERE "project_authorizations"."user_id" = 3278538
                 AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (0,
                                             10,
                                             20)))
ORDER BY "packages_packages"."id" DESC
LIMIT 1

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #325274 (closed)

Edited by Toon Claes

Merge request reports