Skip to content

Investigate slow /api/:version/users query

Summary

Slow API requests due to offset-based pagination. A switch to keyset-based pagination woulp help.

More details on approach to fix

  • We should change this to not use offset. Since offset inhibits paginating via an index and thus puts a lot of load on postgres.
  • We should use keyset pagination
  • Examples of where we use keyset pagination for the rest api:
  • The remaining challenge here is that it would be a breaking change if we removed offset pagination so we will need to support both for some period of time and hope that customers move over to keyset pagination.
  • In the meantime, there is a way to automatically "switching" from offset pagination to keyset pagination if the following conditions are present:
    • The integration uses the next page headers for loading the next page. (https://docs.gitlab.com/ee/api/rest/#pagination-link-header)
    • The integration doesn't use the count and the current-page headers.
    • Integration requests a page.
      • In the next page LINK header generate add an extra parameter to the next page query: ?uses_link_header=true
      • If an API call has the uses_link_header=true query parameter, we know that they take the next page url from the link headers so we can generate the keyset-paginated url and provide it to the integration: ?cursor=xyz
      • The next request will start using keyset pagination

Details on problem

Query:

SELECT "users".* FROM "users"
WHERE "users"."state" NOT IN ($1, $2, $3)
AND ("users"."state" IN ($4))
AND "users"."user_type" IN ($5, $6, $7, $8)
AND "users"."user_type" IN ($9, $10, $11, $12)
AND "users"."user_type" IN ($13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24, $25, $26)
ORDER BY "users"."id" DESC
LIMIT $27
OFFSET $28

Example of a single user scraping /api/v4/users and paginating a lot:

Screenshot_2023-07-24_at_16.27.03

source

Edited by Jessie Young