Replace API order by created_at with order by id
In #334667 (comment 632358325) we found a simple API query of https://gitlab.com/api/v4/projects?membership=true&page=1&per_page=10
would result in a 500 Internal Server Error
. The generated query required minutes to complete.
The problem with the query was the order by created_at
portion:
with `order by created_at`
SELECT
MAX(
"project_authorizations"."access_level"
) AS maximum_access_level,
"project_authorizations"."project_id" AS project_authorizations_project_id
FROM
"project_authorizations"
WHERE
"project_authorizations"."user_id" = 8513236
AND "project_authorizations"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE
"project_authorizations"."user_id" = 8513236
AND "projects"."pending_delete" = FALSE
AND "projects"."hidden" = FALSE
ORDER BY
"projects"."created_at" DESC,
"projects"."id" DESC
LIMIT
10 OFFSET 0
)
GROUP BY
"project_authorizations"."project_id"
Time: 52.816 s
- planning: 1.196 ms
- execution: 52.815 s
- I/O read: 2.209 min
- I/O write: 9.693 ms
Shared buffers:
- hits: 3522368 (~26.90 GiB) from the buffer pool
- reads: 588581 (~4.50 GiB) from the OS file cache, including disk I/O
- dirtied: 5178 (~40.50 MiB)
- writes: 481 (~3.80 MiB)
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/9860/commands/34915.
without `order by created_at`
SELECT
MAX(
"project_authorizations"."access_level"
) AS maximum_access_level,
"project_authorizations"."project_id" AS project_authorizations_project_id
FROM
"project_authorizations"
WHERE
"project_authorizations"."user_id" = 8513236
AND "project_authorizations"."project_id" IN (
SELECT
"projects"."id"
FROM
"projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE
"project_authorizations"."user_id" = 8513236
AND "projects"."pending_delete" = FALSE
AND "projects"."hidden" = FALSE
ORDER BY
"projects"."id" DESC
LIMIT
10 OFFSET 0
)
GROUP BY
"project_authorizations"."project_id"
Time: 35.900 ms
- planning: 4.964 ms
- execution: 30.936 ms
- I/O read: 30.439 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 75 (~600.00 KiB) from the buffer pool
- reads: 42 (~336.00 KiB) from the OS file cache, including disk I/O
- dirtied: 1 (~8.00 KiB)
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/9860/commands/34914.
Other endpoints
Merge Requests
'GET /merge_requests' 'when unauthenticated'
New
Time: 16.720 ms
- planning: 3.980 ms
- execution: 12.740 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 6501 (~50.80 MiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10030/commands/35533.
Old
Did not finish?!
SELECT
"merge_requests".*
FROM
"merge_requests"
INNER JOIN "projects" ON "projects"."id" = "merge_requests"."target_project_id"
LEFT JOIN project_features ON projects.id = project_features.project_id
WHERE
(
EXISTS (
SELECT
1
FROM
"project_authorizations"
WHERE
"project_authorizations"."user_id" = 4092171
AND (
project_authorizations.project_id = projects.id
)
AND (
project_authorizations.access_level >= 20
)
)
OR projects.visibility_level IN (10, 20)
)
AND (
"project_features"."merge_requests_access_level" IS NULL
OR "project_features"."merge_requests_access_level" IN (20, 30)
OR (
"project_features"."merge_requests_access_level" = 10
AND EXISTS (
SELECT
1
FROM
"project_authorizations"
WHERE
"project_authorizations"."user_id" = 4092171
AND (
project_authorizations.project_id = project_features.project_id
)
AND (
project_authorizations.access_level >= 20
)
)
)
)
ORDER BY
"merge_requests"."created_at" DESC,
"merge_requests"."id" DESC
LIMIT
20 OFFSET 0
Notes
"when noteable is an Issue" "GET /projects/:id/noteable/:noteable_id/notes" "current user can view the note"
There are also endpoints for snippets and merge requests which I haven't profiled.
New
Time: 0.376 ms
- planning: 0.270 ms
- execution: 0.106 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 8 (~64.00 KiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10030/commands/35548.
Old
Time: 29.192 ms
- planning: 13.971 ms
- execution: 15.221 ms
- I/O read: 14.763 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 3 (~24.00 KiB) from the buffer pool
- reads: 5 (~40.00 KiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10030/commands/35546.
Users
Amended spec with created_at
instead of id
: 'GET /users' "when authenticated" 'when authenticate as a regular user' 'returns the correct order when sorted by id'
New
Time: 13.760 ms
- planning: 0.473 ms
- execution: 13.287 ms
- I/O read: 12.952 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 21 (~168.00 KiB) from the buffer pool
- reads: 2 (~16.00 KiB) from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10030/commands/35561.
Old
Did not finish?!
SELECT
"users".*
FROM
"users"
ORDER BY
"users"."created_at" ASC,
"users"."id" ASC
LIMIT
20 OFFSET 0
When I remove the order by id
so we only order by created_at
the query finishes fine.
Time: 53.867 ms
- planning: 0.412 ms
- execution: 53.455 ms
- I/O read: 53.007 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 0 from the buffer pool
- reads: 23 (~184.00 KiB) from the OS file cache, including disk I/O
- dirtied: 1 (~8.00 KiB)
- writes: 0
Details and visualization: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10030/commands/35560.
A further complication is that the projects API endpoint default sorts by created_at
and the code then adds a further redundant sort of id
.
There is a recommendation from the database team to sort by id
rather than created_at
.
In this MR we replace any API sort by created_at
to a sort by id
.
Closes #334667 (closed), #361607 (closed)