Support pagination for `order_by=repository_size` or else support `exclude=1,2,...,X`
Problem
When fetching a list of projects ordered by repository size, there must be a way to retrieve more than a single page of results.
Currently, when requesting a keyset paginated result set for projects ordered by repository size, an error is produced:
< HTTP/2 405
{"error":"Keyset pagination is not yet available for this type of request"}
This is because keyset pagination is not available when ordering by repository size: https://docs.gitlab.com/ee/api/README.html#keyset-based-pagination
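For illustration, a request of the kind that produces this error can be written as the query string built below. This is a sketch: the `pagination=keyset`, `per_page`, and `sort` parameters follow the keyset pagination documentation linked above, and the page size of 20 is an arbitrary assumption.

```python
from urllib.parse import urlencode

# Parameters for a keyset-paginated project listing ordered by repository size.
# The page size (20) is an arbitrary value chosen for this example.
params = {
    "pagination": "keyset",
    "per_page": 20,
    "order_by": "repository_size",
    "sort": "desc",
}

keyset_url = "/api/v4/projects?" + urlencode(params)
print(keyset_url)
```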
An example use case for the current method ordered by repository size is repository migration. After a repository moves from one shard to another, its `repository_storage` field is updated. This means that if I re-execute the request for `/api/v4/projects?order_by=repository_size&repository_storage=nfs-file45`, the recently migrated project repository no longer appears in the returned results. However, if the migration fails, the project is still included.
Eventually, if enough migrations fail, then the returned results for /api/v4/projects will contain no projects which are capable of being migrated, and it will be impossible to use this method as a feeder to automated migration and re-balancing tools.
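The starvation effect described above can be sketched with a client-side filter. This is a minimal illustration, not an actual tool: the function name `next_candidates`, the project IDs, and the three-item page are all hypothetical.

```python
def next_candidates(first_page, failed_ids):
    """Return the projects on the first page that have not already failed to migrate."""
    return [project for project in first_page if project["id"] not in failed_ids]


# Hypothetical first page of results, largest repositories first.
first_page = [{"id": 11}, {"id": 12}, {"id": 13}]

# One failed migration: two candidates remain on the page.
some_left = next_candidates(first_page, failed_ids={11})

# Every project on the page has failed: nothing is left to migrate,
# and without pagination there is no way to reach the next page.
none_left = next_candidates(first_page, failed_ids={11, 12, 13})

print(some_left)
print(none_left)
```

Because failed projects keep their `repository_storage` value, they keep occupying the first page, so the filtered list shrinks with every failure.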
Intended users
Sidney (Systems Administrator)
User experience goal
Support the operational task of administratively replicating and migrating project repositories from one Gitaly storage shard to another.
Proposal
Implement an admin API method to:
- Page beyond the first result set of projects ordered by repository size on a particular repository storage.
- Additionally, exclude projects from such a result set by project ID.
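As a rough sketch of what the second item might look like from a client's point of view, the snippet below builds a query string using the `exclude=1,2,...,X` form from the issue title. The `exclude` parameter does not exist in the current API; it and the helper name `projects_query` are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def projects_query(storage, exclude_ids):
    """Build a project listing URL; the `exclude` parameter is hypothetical (proposed here)."""
    params = {"order_by": "repository_size", "repository_storage": storage}
    if exclude_ids:
        # Comma-separated project IDs, matching the form used in the issue title.
        params["exclude"] = ",".join(str(project_id) for project_id in exclude_ids)
    # safe="," keeps the commas readable instead of percent-encoding them.
    return "/api/v4/projects?" + urlencode(params, safe=",")


print(projects_query("nfs-file45", [1, 2, 3]))
```

A migration tool could pass the IDs of projects whose migration has already failed, so that they no longer crowd out the remaining candidates.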
Further details
Use cases include repository migration and statistics reporting. Benefits include automated repository balancing across configured shards.
Permissions and Security
The permission level here should be admin, and a `Private-Token` header should be required. Its value should be a token with the lowest access level that still allows read-only access to metadata for all repositories in the GitLab-managed inventory.
Availability & Testing
- What risks does this change pose to our availability?
  - No known risks to availability.
- How might it affect the quality of the product?
  - Adverse effects are unlikely.
- What additional test coverage or changes to tests will be needed?
  - Unit tests for the backend controller implementations.
- Will it require cross-browser testing?
  - No.
Test areas (unit, integration and end-to-end) that need to be added or updated to ensure that this feature will work as intended:
- Unit test changes
- End-to-end test changes
What does success look like, and how can we measure that?
Success means that keyset pagination, which is already implemented for other `order_by` values, also works for `order_by=repository_size`.