Currently the per_page parameter for API pagination defaults to 20 and can be set to a maximum of 100. The spirit of this limit was to help with performance by restricting very large queries. However, there are many specific cases in which requesting a very large number of items is actually more performant than making many small queries.
Proposal
In the Admin settings, add a setting that reflects the per_page setting, with a configurable value
In the Admin settings, add option to allow a per_page flag to be passed as a parameter on individual requests
UX should determine the right place in the admin settings hierarchy to put this. Potentially under /admin/application_settings/network/#User and IP Rate Limits
It depends on how the OS on the server is configured...
It may well not be a recipe for timeouts...
@wojciechlisik Then it will be a recipe for huge memory consumption, since all the Ruby objects will need to be created before the response is sent back. :)
Make per_page default and maximum value configurable
@athar I'd be ok with that but I'd also like to have a rationale on why this is needed: if you want more than 100 results, you can just issue multiple requests?
Keeping pagination on by default is a reasonable way to prevent people from unknowingly causing themselves problems. However, it also seems reasonable to at least let a knowledgeable user pass a parameter to turn off pagination for a specific request.
The arbitrary limit of 100 records on our self-hosted Enterprise Edition of GitLab adds unnecessary complexity to the automation of tasks we intend to execute across groups with 100+ projects via the API using CLI tools and shell scripting.
I understand that a change like this could blur expectations and accountability around performance but, ceteris paribus, available server resources/performance are our problem and not GitLab's.
Allowing the user to decide the limits (or even to turn pagination off) is greatly preferable to writing any number of workarounds that then need to be maintained/remembered when upgrading to a new version and the workarounds stop working.
What workarounds have you had to use? 2.5 years ago there was an issue where you asked for a page and GitLab returned everything (causing infinite loops in improperly error-checked code), but since then it's been rock-solid for us (from both Python and Rust). We made our GitLab API callers do the depagination for us, so that when something asks for all projects it gets all projects, rather than each API user having to depaginate manually.
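For reference, a minimal depagination helper along those lines might look like the sketch below. It uses Ruby's standard library and GitLab's documented `X-Next-Page` response header; the instance URL and token handling are placeholders.

```ruby
require "net/http"
require "json"
require "uri"

# Walk every page of a REST endpoint by following the X-Next-Page header,
# so callers get the full collection back from a single call.
def fetch_all(path, token:, base: "https://gitlab.example.com/api/v4", per_page: 100)
  items = []
  page = 1

  loop do
    uri = URI("#{base}#{path}")
    uri.query = URI.encode_www_form(per_page: per_page, page: page)

    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      request = Net::HTTP::Get.new(uri)
      request["PRIVATE-TOKEN"] = token
      http.request(request)
    end

    items.concat(JSON.parse(response.body))

    # GitLab leaves X-Next-Page empty on the last page.
    next_page = response["X-Next-Page"]
    break if next_page.nil? || next_page.empty?

    page = next_page.to_i
  end

  items
end

all_projects = fetch_all("/projects", token: ENV.fetch("GITLAB_TOKEN"))
```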
This issue has been around for a year, and I'm surprised it hasn't gotten more traction. My company actually had to roll back the version we are using because the upgrade broke most of our interactions with GitLab's API.
As has been mentioned in this thread already: GitLab is source control, not the guardian of my system's resources.
Here's an example. I do not feel it is unreasonable to want to query a list of a project's files in its entirety. A "recursive" parameter even exists to help this scenario. Previously, a project with 10,000 files would return in a single sub-second request. Now it takes a minimum of 100 requests, which could take many seconds (even when run in parallel). When compared to this forced alternative, arguments relating to GitLab's performance carry less weight.
Timeouts can be configured. The only workaround for this seems to be an excessive number of API requests, which is an objectively terrible solution to a problem that doesn't seem necessary to begin with.
Thanks for the ping. Let's see if there is more widespread interest in this change. One of the requests was around improving the performance of an integration with GitLab by making fewer larger requests, but that use case wouldn't benefit most users unless we lifted the limit on GitLab.com.
What parts of this issue are important to you and why? – We would like to be able to configure the per_page default and maximum values in the admin panel, and also to be able to go past the current 100 per_page limit to a value that suits our needs.
Have you tried any workarounds? – There are no workarounds for our use case. Having more items per_page would reduce the total number of API calls on some expensive endpoints.
What is the priority of this issue to your organization? – Increasingly useful and needed.
We don't want the API to randomly time out or reduce the performance of GitLab for other users. Allowing very large or unbounded response sizes is a recipe for all sorts of failures that the client would need to handle. Pagination, unlike slow responses and poor performance, is predictable and a common restriction in many other APIs.
I don't care @jramsay; let me assume the risk. Add a knob to disable it. By all means, keep the current settings, but allow me to live dangerously when I want. This "safety" feature effectively turns a quick curl into a script.
This is for my 3000 seat instance (which is edging very close to being a 0 seat license).
@jramsay Since most of us run our own installation/setup in our own infrastructure, I would say that it is our risk to take when increasing the per_page item limits. The proposed change is to add a setting in the admin panel so each customer can tailor API calls as needed; this way GitLab.com and other customers would not be impacted if they keep the default of 100 per page, while the customers that need more items per page get the ability to do so at their own risk. As you said, there are ways to mitigate slow responses, overall worker queues, timeouts, etc.
We're discussing this now, and someone pointed out that if we had an SDK to make it easy to handle pagination, maybe it would make this less of an issue.
Not sure what you have in mind with the SDK @markpundsack, but unless I can craft a quick curl that can disable pagination, it's still broken in my mind.
Not that it matters to me any more. We've decided to move to a different platform.
They would like to see a "max api pagination size" configurable in the Admin panel. Let end users use whatever pagination size they want, but allow the GitLab admins to set an upper limit to prevent folks from getting out of hand.
@nhxnguyen I'm dropping this into %"Next 3-4 releases", but I'd like to spend some time weighing this out. I could see it being either relatively straightforward or, alternatively, a huge lift. Can we sort out (broad strokes) which it is?
Additionally, I know we've been doing some other limit-configuration work in #34634 (closed); are there any tie-ins to those efforts that we can leverage?
Adding something like an api_per_page_limit field to application settings, plus the UI to update it in the Admin panel, seems straightforward. We use Kaminari for pagination and should be able to use max_paginates_per to apply the application setting as an override when we call paginate from API::Helpers::Pagination. What do you think about this @abrandl? I see you recently extracted pagination code from API::Helpers::Pagination.
Global defaults are configured here. We would have to figure out how to set different defaults for the API. But I agree, the rest should be straightforward.
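To make the idea concrete, here is a rough sketch, not the actual GitLab helper, of capping per_page where the REST API paginates with Kaminari; api_per_page_limit is the hypothetical setting name from above:

```ruby
# Hypothetical sketch: clamp the requested per_page to an admin-configurable
# ceiling before handing the relation to Kaminari.
module API
  module Helpers
    module Pagination
      DEFAULT_PER_PAGE = 20
      FALLBACK_MAX_PER_PAGE = 100

      def paginate(relation)
        # api_per_page_limit is the proposed (not yet existing) application setting.
        max_per_page = Gitlab::CurrentSettings.api_per_page_limit || FALLBACK_MAX_PER_PAGE
        per_page = (params[:per_page] || DEFAULT_PER_PAGE).to_i.clamp(1, max_per_page)

        relation.page(params[:page]).per(per_page)
      end
    end
  end
end
```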
Can I have this for the normal GUI, please? I'm tired of clicking on the page number buttons after having scrolled through a short page of just 20 items. The page could load just as fast with 50 items and I could just keep on scrolling through without the need to position the mouse, click and wait so often. It's a little annoying.
I'm surprised this hasn't been mentioned in the 3 years this issue has existed, but pagination causes race conditions.
If I list the files of a project with 10,000 files, it requires 100 requests (with the limit). But suppose someone else removes a file that has already been returned to me while I'm partway through those requests. The server regenerates the list for each page (there is no state, which is essentially what pagination implies), so the remaining files shift across pages: a file can move from a page I haven't fetched yet to a page I already have, where it wasn't when I fetched it. That file is then never returned to me, so the result I assemble is wrong (for any point in time).
Keyset pagination would alleviate the risk of races as described, but it still means we have to make many requests and do some client-side merging, so I still favour disabling pagination altogether.
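For endpoints that support it (e.g. the Projects API ordered by id), keyset pagination is driven by the Link response header rather than page numbers. A rough client sketch, with the instance URL as a placeholder:

```ruby
require "net/http"
require "json"
require "uri"

token = ENV.fetch("GITLAB_TOKEN")
url = URI("https://gitlab.example.com/api/v4/projects?pagination=keyset&per_page=100&order_by=id&sort=asc")
projects = []

while url
  response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == "https") do |http|
    request = Net::HTTP::Get.new(url)
    request["PRIVATE-TOKEN"] = token
    http.request(request)
  end

  projects.concat(JSON.parse(response.body))

  # The cursor for the next page is advertised via the Link header;
  # the last page has no rel="next" entry.
  next_link = response["Link"]&.split(",")&.find { |part| part.include?('rel="next"') }
  url = next_link && URI(next_link[/<([^>]+)>/, 1])
end
```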
I don't know that that is advisable for many endpoints due to the database and server load this could cause (user lists, project lists, commit listings, etc.). I personally think disabling pagination client-side is too big of a foot-gun for such things. An option for admins to turn it off would be fine, but everyone will have to support pagination anyways (unless they know they're talking to a given instance with support for a single fetch).
@ben.boeckel I've personally not seen a load issue, even with thousands of users and projects. I'm sure it's possible, but probably unlikely unless someone is running an overloaded all-in-one system. Be that as it may, this should still be an option IMO. I don't need Gitlab to protect me from a boogeyman. Limit it to admin users being able to disable pagination, at least.
Although to be fair, this doesn't really affect me any more as I've moved on from Gitlab.
Nobody thinks about lazy loading a.k.a. endless scrolling? Even with classic pagination, what's the use of having 100 pages? If you get that as a user, you know "that was too broad", but you certainly won't go through all 100 pages. So with or without pagination, too many results are unusable. If performance is an issue then, a result limit should be used.
Oops, sorry, forgot about the API. But what, pagination in an API? That sounds weird. Wouldn't you use configurable limits for that? Or is the API designed to be used primarily as a UI backend?
That's the gist of this request; to be able to disable pagination. You can set records per page, but that number is capped at (I think) 100. Probably it's there to make the UI cleaner/faster, but at the cost of complexity when writing quick scripts.
@hcgrove I don't quite get it. Why are you so against a configurable result page size? It could be done in many ways: with lazy loading (the non-API way), or with a parameter added to the API request.
@devorgint Although not directed at me, I'm ok with whatever...so long as it can be disabled on a per-call basis. The reasons have been made clear throughout this thread.
I'm not against configurable sizes; that's what I tried to clarify by emphasising the possibility. The only thing I'm against is a limit on that configurability (configuring the limit to give a page of 1,000,000,000 results is almost always equivalent to disabling pagination), and I find &no_pagination a more natural way to express what I want than &per_page=1000000000.
(I'm also somewhat against people using features they don't understand the limitations of, like most uses of pagination.)
@troyengel Thanks so much for your valuable feedback. I agree this could be better documented, and we received the same feedback in the past, so feel free to follow/comment on this dedicated issue: #22976.
The word "all" has connotations which indicate that by accessing that API URL, I will receive a JSON of all snippets, not an invisible pagination limit that isn't documented here. The per_page stuff is also four-fifths of the way down this doc page, not up top where it should be, telling me I'm only getting 20 results.
Pagination is a common practice for APIs (e.g. what if you had 1 million snippets?) but I agree this can be better documented (#22976).
Hey @.luke, depending on the weight of this feature, I would argue that due to requests by very large customers, this item should still have some priority. I only removed the milestone because it had been in %"Next 3-4 releases" for roughly a year. Not sure if you want to reconsider, @deuley?
@luke @mnohr This really has been languishing for a long time, and I've talked to a number of customers who are affected by it.
Can we get a proper weight on this and get it queued up for a future milestone? I know we don't tend to put a ton of time into Category:API needs, but this is a major issue and well worth doing.
The max_per_page is set in an initializer, but it looks like we can dynamically update the configuration without the app needing to be restarted by calling .configure again on it. E.g.:
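A plausible illustration (assuming the hypothetical api_per_page_limit application setting discussed earlier in this thread) would be re-running the Kaminari configuration whenever the setting changes:

```ruby
# Re-apply Kaminari's global configuration using the (hypothetical)
# admin-configurable limit, e.g. from an after-update hook on the setting.
Kaminari.configure do |config|
  config.default_per_page = 20
  config.max_per_page = Gitlab::CurrentSettings.api_per_page_limit
end
```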
The frontend might use our official APIs without specifying a pagination size, implicitly relying on a page size of 20. If the API defaulted to larger sizes, certain pages could start looking weird, take longer to load, or even break.
Do we have a complete list of places where we are relying on a hard coded upper limit?
@wortschi No, and I think the list would grow over time. If we count our GraphQL API too, we set a max_page_size: for some GraphQL connections. I find 12 hard-coded limits in our GraphQL API, ranging from 2000 down to 20 (the default is 100).
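For context, these GraphQL caps are declared per connection field, roughly like this with graphql-ruby (illustrative type and field names, not copied from the GitLab schema):

```ruby
# Illustrative graphql-ruby connection field with a per-connection cap.
module Types
  class ProjectType < BaseObject
    field :issues, Types::IssueType.connection_type,
          null: true,
          max_page_size: 100,
          description: "Issues of the project, at most 100 per page"
  end
end
```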
There's a variety of reasons why we set these values in GraphQL, and not just for performance. For example, the 2000 limit was set for UI reasons !27467 (merged). As @leipert mentioned in #17329 (comment 549722900), overriding all our paging could make the GitLab UI buggy, or it might not; it's a bit unpredictable and depends on the limit that was set. For example, if a customer set a new API page limit of 250 for their instance and we applied it to the hard-coded limits set in !27467 (merged), it sounds like the UI would fail to display boards when there were >250 of them.
That makes it tricky to predict what bugs might happen when overriding these values.
Perhaps we either:
Ignore the places where the max page size is hard-coded, so the limit would only apply when the API is using the defaults.
Or, apply logic like: allow the setting to be null, which means always use GitLab's limits. When set: where we hard-code a page limit, take the max of either the setting or our override (so, if an admin set it to 500, it would override the max_page_size: 20 but not the max_page_size: 2000; see the sketch after this list). Generally, instances would want to keep it null in this case.
We could perhaps have separate REST and GraphQL limits?
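A tiny sketch of the second option's rule (hypothetical setting and helper names):

```ruby
# nil means "always use GitLab's own limits"; otherwise the larger of the
# admin setting and the hard-coded override wins.
def effective_max_page_size(hard_coded_limit)
  admin_limit = Gitlab::CurrentSettings.api_per_page_limit # hypothetical setting
  return hard_coded_limit if admin_limit.nil?

  [hard_coded_limit, admin_limit].max
end

effective_max_page_size(20)   # => 500 when the admin setting is 500
effective_max_page_size(2000) # => 2000; the larger hard-coded UI limit still applies
```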
Is the weight for this issue still accurate?
At a guess I think:
If we want to tackle all the hard-coded page limits in our API, the weight is probably about a 4-5.
If we just apply it to the defaults and not to the hard-coded page limits, it's probably about a 3.