Improvements to the Maven API files endpoints [RUN ALL RSPEC] [RUN AS-IF-FOSS]
MR 57600
🐿 Context
The GitLab maven package registry exposes a number of endpoints at different levels:
- Project, under ``
- Group, under ``
- Instance, under ``
In this MR, we're going to tackle the performance of the Group level file endpoint. The endpoint logic is quite simple:
- locate the package within the group
- check the permissions
- locate the package file requested
- return it
- Well actually, on gitlab.com a redirect to the object storage location is returned.
Here is the p95 for the response time. Well... I think it's obvious, this chart does not bring joy.
Looking at a particularly long request, we see this:
json.duration_s: 15.407
json.db_duration_s: 15.35
So clearly, we have an issue with the queries triggered with that endpoint.
We conducted a small analysis on this part. We identified 3 possible improvements:
- Avoid accessing the
projects
table in this scope - Avoid accessing the
packages_packages
table 3 times- That's a principle from the "Rails performance" course:
Don't access a table more than once per web request
- That's a principle from the "Rails performance" course:
- Implement #287638 (closed) which brings an additional condition when selecting projects.
The above is exactly #325869 (closed) and this MR will implement these 3 changes.
The maven package registry being heavily used, those changes will be gated behind a feature flag.
🔬 What does this MR do?
- feature flag tracking issue: #326099 (closed)
projects
table in this scope
Improvement 1: avoid accessing the - Put the
#any?
call behind the feature flag - Add specs for
Packages::Packages.for_projects
packages_packages
table 3 times
Improvement 2: Avoid accessing the - These access were done to check if the finder was dealing with versionless packages (a package with a
nil
version). When it was the case, the finder had to override the default order (most recent package created first) to a specific order (package with most recent package file created first) - The client of that finder can already tell if we're dealing with versionless packages or not. As such, the finder only needs to implement an option (
order_by_package_file
) which will use the custom ordering or not.- This option is gated behind the feature flag.
- Updated the related specs
- Updated
maven_packages.rb
to detect versionless packages and use the custom order in that case.- Note that the finder is used on the file endpoint on all levels but the change will only happen when the finder works with a
group
(Group level)
- Note that the finder is used on the file endpoint on all levels but the change will only happen when the finder works with a
- Updated the
maven_packages.rb
related specs- Add more examples for the group level context to verify that the versionless package detection was working as expected
#287638 (closed)
Improvement 3: Implement- The package finder helper was simply used for this. It's usage is gated behind the feature flag
- Update the related specs
📸 Screenshots (strongly suggested)
I can't really benchmark the endpoint to show a reduction in response time. This is due to the conditions needed to have slow requests on the Group level: we need many, many, many subgroups and sub projects with many, many packages.
Instead, I will dump the requests triggered by the endpoint and hopefully, we should have less queries with this MR.
Before this MR
User Load (0.6ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1 ORDER BY "users"."id" ASC LIMIT 1
Group Load (0.6ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252 LIMIT 1
IpRestriction Load (0.3ms) SELECT "ip_restrictions".* FROM "ip_restrictions" WHERE "ip_restrictions"."group_id" = 252
Project Exists? (2.4ms) SELECT 1 AS one FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces") AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE "project_authorizations"."user_id" = 1 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN (0,10,20)) LIMIT 1
Packages::Package Exists? (2.4ms) SELECT 1 AS one FROM "packages_packages" INNER JOIN "packages_maven_metadata" ON "packages_maven_metadata"."package_id" = "packages_packages"."id" WHERE "packages_packages"."project_id" IN (SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces") AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE "project_authorizations"."user_id" = 1 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN (0,10,20))) AND "packages_maven_metadata"."path" = 'gl/pru/maven_pkg_01/9.2.1' LIMIT 1
Packages::Package Load (2.1ms) SELECT "packages_packages".* FROM "packages_packages" INNER JOIN "packages_maven_metadata" ON "packages_maven_metadata"."package_id" = "packages_packages"."id" WHERE "packages_packages"."project_id" IN (SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces") AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE "project_authorizations"."user_id" = 1 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN (0,10,20))) AND "packages_maven_metadata"."path" = 'gl/pru/maven_pkg_01/9.2.1' ORDER BY "packages_packages"."id" ASC LIMIT 1
Packages::Package Load (2.3ms) SELECT "packages_packages".* FROM "packages_packages" INNER JOIN "packages_maven_metadata" ON "packages_maven_metadata"."package_id" = "packages_packages"."id" WHERE "packages_packages"."project_id" IN (SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT id FROM "base_and_descendants" AS "namespaces") AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE "project_authorizations"."user_id" = 1 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN (0,10,20))) AND "packages_maven_metadata"."path" = 'gl/pru/maven_pkg_01/9.2.1' ORDER BY "packages_packages"."id" DESC LIMIT 1
Project Load (0.8ms) SELECT "projects".* FROM "projects" WHERE "projects"."id" = 289 LIMIT 1
ProjectFeature Load (0.5ms) SELECT "project_features".* FROM "project_features" WHERE "project_features"."project_id" = 289 LIMIT 1
Group Load (0.5ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 253 AND "namespaces"."type" = 'Group' LIMIT 1
Namespace Load (0.5ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."id" = 252 LIMIT 1
Group Load (1.3ms) WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 253)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces".* FROM "base_and_ancestors" AS "namespaces" WHERE "namespaces"."parent_id" IS NULL LIMIT 1
SamlProvider Load (0.3ms) SELECT "saml_providers".* FROM "saml_providers" WHERE "saml_providers"."group_id" = 252 LIMIT 1
Packages::PackageFile Load (0.5ms) SELECT "packages_package_files".* FROM "packages_package_files" WHERE "packages_package_files"."package_id" = 488 AND "packages_package_files"."file_name" = 'maven_pkg_01-9.2.1.pom' ORDER BY "packages_package_files"."id" DESC LIMIT 1
14 SQL queries
With this MR
User Load (2.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1 ORDER BY "users"."id" ASC LIMIT 1
Group Load (2.0ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252 LIMIT 1
IpRestriction Load (0.4ms) SELECT "ip_restrictions".* FROM "ip_restrictions" WHERE "ip_restrictions"."group_id" = 252
Packages::Package Load (5.9ms) SELECT "packages_packages".* FROM "packages_packages" INNER JOIN "packages_maven_metadata" ON "packages_maven_metadata"."package_id" = "packages_packages"."id" WHERE "packages_packages"."project_id" IN (SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (WITH RECURSIVE "base_and_descendants" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 252)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "id" FROM "base_and_descendants" AS "namespaces") AND (EXISTS (SELECT 1 FROM "project_authorizations" WHERE "project_authorizations"."user_id" = 1 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN (0,10,20))) AND "packages_maven_metadata"."path" = 'gl/pru/maven_pkg_01/9.2.1' ORDER BY "packages_packages"."id" DESC LIMIT 1
Project Load (0.9ms) SELECT "projects".* FROM "projects" WHERE "projects"."id" = 289 LIMIT 1
ProjectFeature Load (0.5ms) SELECT "project_features".* FROM "project_features" WHERE "project_features"."project_id" = 289 LIMIT 1
Group Load (0.5ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 253 AND "namespaces"."type" = 'Group' LIMIT 1
Namespace Load (0.7ms) SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."id" = 252 LIMIT 1
Group Load (1.0ms) WITH RECURSIVE "base_and_ancestors" AS ((SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 253)
UNION
(SELECT "namespaces".* FROM "namespaces", "base_and_ancestors" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = "base_and_ancestors"."parent_id")) SELECT "namespaces".* FROM "base_and_ancestors" AS "namespaces" WHERE "namespaces"."parent_id" IS NULL LIMIT 1
SamlProvider Load (0.4ms) SELECT "saml_providers".* FROM "saml_providers" WHERE "saml_providers"."group_id" = 252 LIMIT 1
Packages::PackageFile Load (0.9ms) SELECT "packages_package_files".* FROM "packages_package_files" WHERE "packages_package_files"."package_id" = 488 AND "packages_package_files"."file_name" = 'maven_pkg_01-9.2.1.pom' ORDER BY "packages_package_files"."id" DESC LIMIT 1
11 SQL queries.
From the original 14 queries:
-
- 2 queries on
packages_packages
table
- 2 queries on
-
- 1 query for the
Project exists?
- 1 query for the
🔩 Does this MR meet the acceptance criteria?
Conformity
-
📋 Does this MR need a changelog?-
I have included a changelog entry. -
I have not included a changelog entry because feature flag
-
- [-] Documentation (if required)
-
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides - [-] Separation of EE specific content
Availability and Testing
- [-] Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- [-] Tested in all supported browsers
- [-] Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
-
Label as security and @ mention @gitlab-com/gl-security/appsec
-
The MR includes necessary changes to maintain consistency between UI, API, email, or other methods -
Security reports checked/validated by a reviewer from the AppSec team
💿 Database review
The changes in this MR will:
- remove 3 queries
- update an existing query:
- The queries and the explain plans for this existing query are here