API: Keyset pagination support
What does this MR do?
This MR introduces keyset pagination for API calls. More background can be found in https://gitlab.com/gitlab-org/gitlab-ce/issues/45756 and these great explanations: https://use-the-index-luke.com/no-offset
Specifically, with this change, keyset pagination can optionally be used instead of normal (offset) pagination. That is, the client needs to explicitly enable keyset pagination by appending the GET parameter pagination=keyset.
The first GET request doesn't need any extra information beyond this parameter. Page size defaults to the standards we already use for normal pagination.
Responses to GET requests include specific headers that clients use to retrieve the next page. Let's look at an example request:
> GET /api/v4/projects?pagination=keyset HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.47.0
> Accept: */*
> PRIVATE-TOKEN: ...
>
< HTTP/1.1 200 OK
< Link: <http://localhost:3000/api/v4/projects?archived=false&ks_prev_created_at=2018-05-07+10%3A26%3A13+UTC&ks_prev_id=7&membership=false&order_by=created_at&owned=false&page=1&pagination=keyset&per_page=2&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="next", <http://localhost:3000/api/v4/projects?archived=false&membership=false&order_by=created_at&owned=false&page=1&pagination=keyset&per_page=2&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false>; rel="first"
< X-Next-Page: http://localhost:3000/api/v4/projects?archived=false&ks_prev_created_at=2018-05-07+10%3A26%3A13+UTC&ks_prev_id=7&membership=false&order_by=created_at&owned=false&page=1&pagination=keyset&per_page=2&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false
< X-Per-Page: 20
Let's look at the X-Next-Page header with filter params removed:
http://localhost:3000/api/v4/projects?ks_prev_created_at=2018-05-07+10%3A26%3A13+UTC&ks_prev_id=7&order_by=created_at&pagination=keyset&per_page=2
These ks_* parameters are automatically added (ks for "keyset") and will be used to exclude records we already got. That is, those values correspond to the created_at and id of the last record in this page.
In this example, created_at is used because the default order is created_at desc. If no order were specified, we'd just use ks_prev_id.
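As a rough illustration of how the ks_* parameters relate to the last record of a page, here is a minimal Python sketch. The function name next_page_query and the dict-shaped record are hypothetical and only serve the example; the actual server-side code is not shown in this description. Parameters are sorted alphabetically, which happens to match the order in the URL above.

```python
from urllib.parse import urlencode


def next_page_query(last_record, order_by=None, per_page=20):
    """Build the query string for the page that follows `last_record`.

    Hypothetical helper: it mirrors the ks_prev_* naming described above.
    """
    params = {"pagination": "keyset", "per_page": per_page}
    if order_by is not None and order_by != "id":
        # The order_by column is included in addition to the primary key.
        params["order_by"] = order_by
        params[f"ks_prev_{order_by}"] = last_record[order_by]
    # The primary key is always included.
    params["ks_prev_id"] = last_record["id"]
    # Keys are unique, so sorting the items only ever compares the keys.
    return urlencode(sorted(params.items()))


last = {"id": 7, "created_at": "2018-05-07 10:26:13 UTC"}
print(next_page_query(last, order_by="created_at", per_page=2))
```

With the last record from the example response, this reproduces the query string of the stripped-down X-Next-Page URL shown earlier.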
Note that the client may specify a (single) custom order_by field with ascending or descending order. We always include the primary key and additionally the order_by field to exclude records we've already seen. The reason to always include the primary key is to guarantee a deterministic order (for example, created_at may have duplicates).
Internally, we use the mentioned ks_* parameters as follows:
- To exclude records we've already seen earlier, i.e. WHERE (created_at, id) < (X, Y) for descending order (the comparison would be > for ascending order)
- To order the result relation, i.e. ORDER BY created_at, id or ORDER BY created_at DESC, id DESC.
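The two points above can be tried out in isolation. The following sketch uses Python's built-in sqlite3 module with an in-memory table; the table layout, the sample rows, and the fetch_page helper are assumptions made for the demonstration and are not the code in this MR. It relies on row-value comparisons, which SQLite supports since version 3.15. Two rows deliberately share the same created_at to show why the primary key is part of the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [
        (1, "2018-05-01 09:00:00"),
        (2, "2018-05-02 09:00:00"),
        (3, "2018-05-02 09:00:00"),  # same created_at as id 2
        (4, "2018-05-03 09:00:00"),
        (5, "2018-05-04 09:00:00"),
    ],
)


def fetch_page(conn, per_page, prev=None):
    """Return one page ordered by created_at DESC, id DESC.

    `prev` is the last record of the previous page (or None for the first page).
    """
    sql = "SELECT id, created_at FROM projects"
    args = []
    if prev is not None:
        # Exclude everything up to and including the last record already seen.
        sql += " WHERE (created_at, id) < (?, ?)"
        args += [prev["created_at"], prev["id"]]
    sql += " ORDER BY created_at DESC, id DESC LIMIT ?"
    args.append(per_page)
    return [{"id": r[0], "created_at": r[1]} for r in conn.execute(sql, args)]


pages = []
page = fetch_page(conn, 2)
while page:  # an empty page means there is nothing left
    pages.append([r["id"] for r in page])
    page = fetch_page(conn, 2, prev=page[-1])

print(pages)
```

Every row shows up exactly once across the pages, including the two rows with identical created_at values, because the id breaks the tie.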
Missing features (compared to offset pagination) or things to note:
- No way to jump to a certain page
- Client needs to stop asking for more once an empty response is received
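From the client's perspective, the notes above boil down to: follow the X-Next-Page header and stop at the first empty response. Below is a minimal sketch of such a loop. To keep it runnable without a GitLab instance, the HTTP call is injected as a fetch callable returning (records, headers); fetch, iter_records, make_fake_fetch, and the example.test host are hypothetical names for this illustration. A real client would wrap its HTTP library of choice and should treat header names case-insensitively.

```python
from urllib.parse import parse_qs, urlsplit


def iter_records(fetch, first_url):
    """Yield records page by page, following X-Next-Page until a page is empty."""
    url = first_url
    while url:
        records, headers = fetch(url)
        if not records:
            return  # empty response: no more pages
        yield from records
        url = headers.get("X-Next-Page")


def make_fake_fetch(ids_desc, per_page, calls):
    """Stand-in for the API: serves ids in descending order, keyed by ks_prev_id."""
    def fetch(url):
        calls.append(url)
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        prev_id = int(query["ks_prev_id"][0]) if "ks_prev_id" in query else None
        remaining = [i for i in ids_desc if prev_id is None or i < prev_id]
        page = remaining[:per_page]
        headers = {}
        if page:
            base = f"{parts.scheme}://{parts.netloc}{parts.path}"
            headers["X-Next-Page"] = (
                f"{base}?pagination=keyset&ks_prev_id={page[-1]}"
            )
        return [{"id": i} for i in page], headers
    return fetch


calls = []
fetch = make_fake_fetch([5, 4, 3, 2, 1], per_page=2, calls=calls)
first = "http://example.test/api/v4/projects?pagination=keyset"
ids = [record["id"] for record in iter_records(fetch, first)]
print(ids, len(calls))
```

Note that the loop issues one extra request at the end: the last non-empty page still carries a next link, and only the following empty response tells the client to stop.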
Are there points in the code the reviewer needs to double check?
Why was this MR needed?
Screenshots (if relevant)
#note_74065760
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary (omitted because we're stealth-releasing this). This is a stealth release so we can first test things.
- Documentation created/updated (omitted because we're stealth-releasing this)
- API support added
- Tests added for this feature/bug
- Review
  - Has been reviewed by Backend
  - Has been reviewed by Database
- Conforms to the merge request performance guides
- Conforms to the style guides
- Squashed related commits together
- End-to-end tests pass (package-and-qa manual pipeline job)