Skip to content

WIP: Improve snippet search performance

What does this MR do?

Improve performance of snippets search. The original plan is https://explain.depesz.com/s/JXHr and the original query is:

Original Query
SELECT "snippets".*
FROM
  (SELECT "snippets".*
   FROM "snippets"
   WHERE (snippets.visibility_level IN (0,
                                        10,
                                        20)
          OR snippets.author_id = 955795)
     AND "snippets"."project_id" IS NULL
   UNION SELECT "snippets".*
   FROM "snippets"
   INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
   INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
   WHERE (snippets.visibility_level IN (0,
                                        10,
                                        20)
          OR snippets.author_id = 955795)
     AND (projects.visibility_level IN (0,
                                        10,
                                        20))
     AND "project_features"."snippets_access_level" IN (20,
                                                        30)
   UNION SELECT "snippets".*
   FROM "snippets"
   INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
   INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
   WHERE "project_features"."snippets_access_level" IN (20,
                                                        30,
                                                        10)
     AND (EXISTS
            (SELECT 1
             FROM "project_authorizations"
             WHERE (project_id = snippets.project_id)
               AND "project_authorizations"."user_id" = 1))) snippets
WHERE "snippets"."content" ILIKE '%asd%'
ORDER BY created_at DESC,
         updated_at DESC
LIMIT 1000

Admin user

When the user is an admin, we include several subqueries that are not really necessary. For example:

SELECT "snippets".*
   FROM "snippets"
   WHERE (snippets.visibility_level IN (0,
                                        10,
                                        20)
          OR snippets.author_id = 955795)
     AND "snippets"."project_id" IS NULL

We're fetching here snippets with any visibility or if the author_id matches the current user. If we retrieve any record no matter the visibility, we doesn't need to check for the author, because the snippet will be already fetched from the first condition.

Therefore we can reduce the query to:

SELECT "snippets".*
   FROM "snippets"
   WHERE "snippets"."project_id" IS NULL

The new plan is https://explain.depesz.com/s/5pMx. The execution time is not representative compared to the original plan and is almost the same because underneath we're making the same request. Nevertheless, the planning time has been reduced.

Based on the new query, we have a similar problem with project visibility:

SELECT "snippets".*
   FROM "snippets"
   INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
   INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
   WHERE (projects.visibility_level IN (0,
                                        10,
                                        20))
     AND "project_features"."snippets_access_level" IN (20,
                                                        30)

Here we retrieve all projects no matter the visibility_level and the check the snippet_access_level. We can remove the first condition since it retrieves all the projets basically:

SELECT "snippets".*
   FROM "snippets"
   INNER JOIN "projects" ON "projects"."id" = "snippets"."project_id"
   INNER JOIN "project_features" ON "project_features"."project_id" = "projects"."id"
   WHERE "project_features"."snippets_access_level" IN (20,
                                                        30)

The new plan is https://explain.depesz.com/s/oOh7

Refs https://gitlab.com/gitlab-org/gitlab-ee/issues/26123

Does this MR meet the acceptance criteria?

Conformity

Edited by 🤖 GitLab Bot 🤖

Merge request reports