N+1 SQL issue in the tree view when the path lock feature is enabled

The following discussion from !7234 (merged) should be addressed:

  • @nick.thomas started a discussion: (+1 comment)

    We use the same Gitlab::PathLocksFinder instance for every lookup, and it does seem to perform some result caching, but I'm not convinced it's as efficient as it could be in this case, given we know the full set of paths we want to check upfront.

    I'll do some performance testing on this specific case, and make a decision about how to proceed based on that. The most likely outcome is that we open a follow-up issue to improve file lock performance in a separate MR; I wouldn't like the existing improvements to miss %11.3 while I tackle this detached case.

We're talking about this page: https://gitlab.com/gitlab-org/gitlab-ce/tree/master

With path locks enabled, here's an overview of the SQL queries we do according to pb:

1.612ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
1.555ms SELECT "licenses".* FROM "licenses" ORDER BY "licenses"."id" DESC LIMIT 1
1.511ms SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."id" = $1 LIMIT 1
1.508ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
1.442ms SELECT "routes".* FROM "routes" WHERE "routes"."source_id" = $1 AND "routes"."source_type" = $2 LIMIT 1
1.337ms SELECT 1 AS one FROM "identities" WHERE "identities"."user_id" = $1 AND (provider LIKE 'ldap%' AND extern_uid IS NOT NULL) LIMIT 1
1.062ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.982ms SELECT "project_features".* FROM "project_features" WHERE "project_features"."project_id" = $1 LIMIT 1
0.906ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.825ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.816ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.809ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.786ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.725ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.696ms SELECT "routes".* FROM "routes" WHERE "routes"."source_id" = $1 AND "routes"."source_type" = $2 LIMIT 1
0.689ms SELECT "features"."key" FROM "features"
0.665ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.639ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.603ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.421ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.361ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.345ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.342ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.328ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.314ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.306ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.304ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.291ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.278ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.275ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.272ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1
0.252ms SELECT "path_locks".* FROM "path_locks" WHERE "path_locks"."project_id" = $1 AND "path_locks"."path" = $2 LIMIT 1

(updated to remove some queries that a bug in a WIP MR introduced)

The path locks feature now dominates, although the total number of queries per call is in budget, and the total time is low.

pb-with-patch

Since we know all the paths we want to check for up-front, we should be able to reduce this to a single SQL query.

Edited by Nick Thomas