Rewrite SnippetsFinder to improve performance
What does this MR do?
Summary: this MR rewrites SnippetsFinder
from the ground up, improving its
performance by a factor of about 1500 (in the best case scenario).
This completely rewrites the SnippetsFinder class from the ground up in order to improve its performance. The old code was beyond salvaging. It was complex, included various Rails 5 workarounds, comments that shouldn't be necessary, and most important of all: it produced a really poorly performing database query.
As a result, I opted for rewriting the finder from scratch, instead of trying to patch the existing code. Instead of trying to reuse as many existing methods as possible, I opted for defining new methods specifically meant for the SnippetsFinder. This requires some extra code here and there, but allows us to have much more control over the resulting SQL queries. It is these changes that then allow us to produce a much more efficient query.
To illustrate how bad the old query was, we will use my own snippets as an example. Currently I have 52 snippets, most of which are global ones. To retrieve these, you would run the following Ruby code:
user = User.find_by(username: 'yorickpeterse')
SnippetsFinder.new(user, author: user).execute
On GitLab.com the resulting query will take between 10 and 15 seconds to run, producing the query plan found at https://explain.depesz.com/s/Y5IX. Apart from the long execution time, the total number of buffers (the sum of all shared hits) is around 185 GB, though the real number is probably (hopefully) much lower as I doubt simply summing these numbers produces the true total number of buffers used.
The new query's plan can be found at https://explain.depesz.com/s/wHdN, and this query takes between 10 and 100-ish milliseconds to run. The total number of buffers used is only about 30 MB.
EE MR: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/8105
Query differences
Old SQL query for getting my snippets (ID 209240):
SELECT snippets.*
FROM snippets
WHERE snippets.project_id IN (
SELECT id
FROM
(
SELECT projects.*
FROM projects
LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE EXISTS (
SELECT 1
FROM project_authorizations
WHERE project_authorizations.user_id = 209240
AND project_authorizations.project_id = projects.id
)
AND (
project_features.snippets_access_level IN (NULL, 20, 30)
OR (
project_features.snippets_access_level = 10
AND EXISTS (
SELECT 1
FROM project_authorizations
WHERE project_authorizations.user_id = 209240
AND project_authorizations.project_id = projects.id
)
)
)
UNION
SELECT projects.*
FROM projects
LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE projects.visibility_level IN (10, 20)
AND (
project_features.snippets_access_level IN (NULL, 20, 30)
OR (
project_features.snippets_access_level = 10
AND EXISTS
(
SELECT 1
FROM project_authorizations
WHERE project_authorizations.user_id = 209240
AND project_authorizations.project_id = projects.id
)
)
)
) projects
)
OR snippets.project_id IS NULL
AND (
EXISTS (
SELECT 1
FROM project_authorizations
WHERE project_authorizations.user_id = 209240
AND project_authorizations.project_id = snippets.project_id
)
OR snippets.visibility_level IN (10, 20)
OR snippets.author_id = (209240)
)
AND snippets.author_id = 209240
ORDER BY created_at DESC
And the new SQL query:
SELECT snippets.*
FROM (
SELECT snippets.*
FROM snippets
WHERE snippets.author_id = 209240
AND snippets.project_id IS NULL
UNION
SELECT snippets.*
FROM snippets
INNER JOIN projects ON projects.id = snippets.project_id
INNER JOIN project_features ON project_features.project_id = projects.id
WHERE snippets.author_id = 209240
AND projects.visibility_level IN ( 10,20)
AND project_features.snippets_access_level IN (NULL,20,30)
UNION
SELECT snippets.*
FROM snippets
INNER JOIN projects ON projects.id = snippets.project_id
INNER JOIN project_features ON project_features.project_id = projects.id
WHERE snippets.author_id = 209240
AND project_features.snippets_access_level IN (NULL,20,30,10)
AND EXISTS (
SELECT 1
FROM project_authorizations
WHERE project_id = snippets.project_id
AND project_authorizations.user_id = 209240
)
)
snippets
ORDER BY created_at DESC
Migration output
From staging:
[ gstg ] production> MigrateSnippetsAccessLevelDefaultValue.new.migrate(:up)
== MigrateSnippetsAccessLevelDefaultValue: migrating =========================
-- change_column_default(:project_features, :snippets_access_level, 20)
-> 0.0875s
-- change_column_null(:project_features, :snippets_access_level, false)
-> 0.3264s
== MigrateSnippetsAccessLevelDefaultValue: migrated (6.9050s) ================
=> nil
[ gstg ] production> MigrateSnippetsAccessLevelDefaultValue.new.migrate(:down)
== MigrateSnippetsAccessLevelDefaultValue: reverting =========================
-- change_column_null(:project_features, :snippets_access_level, true)
-> 1.4426s
-- change_column_default(:project_features, :snippets_access_level, nil)
-> 0.0153s
== MigrateSnippetsAccessLevelDefaultValue: reverted (1.4582s) ================
What are the relevant issue numbers?
https://gitlab.com/gitlab-org/gitlab-ce/issues/52639
Does this MR meet the acceptance criteria?
-
Changelog entry added, if necessary -
Tests added for this feature/bug -
Conforms to the code review guidelines -
Conforms to the merge request performance guidelines -
Conforms to the style guides -
Conforms to the database guides -
Link to e2e tests MR added if this MR has Requires e2e tests label. See the Test Planning Process.