fix(s3): handle pagination of parts in s3

Suleimi Ahmed requested to merge 1062-max-layer-size-of-10GB into master

What does this MR do?

When a layer blob is uploaded, we only receive (at most) the first 1000 parts of the multipart upload. With each part holding a maximum of 10485760 bytes, a single page of parts covers roughly 10GB (1000 * 10485760 bytes). If a layer exceeds 10GB, its parts spill into the next page, which is currently not accounted for. This results in errors when calculating the total size of a multipart upload that exceeds 10GB (see https://gitlab.com/gitlab-org/container-registry/-/blob/master/registry/handlers/blobupload.go?ref_type=heads#L388).
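
For reference, the size covered by a single page of parts can be checked with a minimal Go sketch. The constant and function names here are illustrative assumptions and are not taken from the registry code; the values are the ones quoted above.

```go
package main

import "fmt"

const (
	// maxPartsPerPage is the number of parts returned in one page (value quoted above).
	maxPartsPerPage = 1000
	// maxPartBytes is the maximum size of each part in bytes (value quoted above).
	maxPartBytes = 10485760
)

// firstPageBytes returns how many bytes the first page of parts can account for.
// The name is hypothetical and only used for this illustration.
func firstPageBytes() int64 {
	return int64(maxPartsPerPage) * int64(maxPartBytes)
}

func main() {
	fmt.Println(firstPageBytes()) // prints 10485760000
}
```

Any layer larger than this needs more than 1000 parts, so its remaining parts land on a second page.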

This MR adds code to handle pagination of parts, which in turn fixes the 10GB maximum layer size bug.
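
The general shape of such a pagination loop can be sketched as follows. This is not the code from this MR or the cherry-picked commit: `partLister`, `page`, `allParts`, and `fakeLister` are hypothetical names used so the example runs without the AWS SDK. The only behaviour assumed from S3 is that a parts listing reports whether it is truncated and provides a marker for the next page.

```go
package main

import "fmt"

// part holds the fields of an uploaded part that matter for size calculation.
type part struct {
	Number int64
	Size   int64
}

// page is a hypothetical stand-in for one page of a parts listing.
type page struct {
	Parts       []part
	IsTruncated bool  // true when more parts remain after this page
	NextMarker  int64 // marker to pass when requesting the next page
}

// partLister is a hypothetical interface; a real implementation would query S3.
type partLister interface {
	ListParts(marker int64) (page, error)
}

// allParts follows the marker until the listing is no longer truncated,
// so parts beyond the first page are included.
func allParts(l partLister) ([]part, error) {
	var parts []part
	var marker int64
	for {
		pg, err := l.ListParts(marker)
		if err != nil {
			return nil, err
		}
		parts = append(parts, pg.Parts...)
		if !pg.IsTruncated {
			return parts, nil
		}
		marker = pg.NextMarker
	}
}

// totalSize sums the sizes of all collected parts.
func totalSize(parts []part) int64 {
	var sum int64
	for _, p := range parts {
		sum += p.Size
	}
	return sum
}

// fakeLister serves n equally sized parts, at most pageSize per call.
type fakeLister struct {
	n, pageSize, partSize int64
}

func (f fakeLister) ListParts(marker int64) (page, error) {
	var pg page
	for num := marker + 1; num <= f.n && int64(len(pg.Parts)) < f.pageSize; num++ {
		pg.Parts = append(pg.Parts, part{Number: num, Size: f.partSize})
	}
	if len(pg.Parts) > 0 {
		last := pg.Parts[len(pg.Parts)-1].Number
		pg.NextMarker = last
		pg.IsTruncated = last < f.n
	}
	return pg, nil
}

func main() {
	// 1500 parts of 10485760 bytes span two pages of 1000.
	parts, err := allParts(fakeLister{n: 1500, pageSize: 1000, partSize: 10485760})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(parts), totalSize(parts)) // prints 1500 15728640000
}
```

Reading only the first page in this example would report 1000 parts and undercount the upload size, which is the failure mode described above.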

Cherry picked from https://github.com/distribution/distribution/pull/2815/commits/bda79219b2be81d8748499a00afb94bb5f67261d

Signed-off-by: Jack Baines jack.baines@uk.ibm.com

Related to #1062 (closed)

How to test

  • Build and run the registry using an S3 storage backend
  • Push an image greater than 10GB:

docker pull vad1mo/10gb-random-file:latest

docker tag vad1mo/10gb-random-file:latest registry:5000/vad1mo/10gb-random-file:latest

docker push registry:5000/vad1mo/10gb-random-file:latest

  • Verify push succeeds

Author checklist

  • Feature flags
    • Added feature flag:
    • This feature does not require a feature flag
  • I added unit tests or they are not required
  • I added documentation (or it's not required)
  • I followed code review guidelines
  • I followed Go Style guidelines
  • For database changes including schema migrations:
    • Manually run up and down migrations in a postgres.ai production database clone and post a screenshot of the result here.
    • If adding new queries, extract a query plan from postgres.ai and post the link here. If changing existing queries, also extract a query plan for the current version for comparison.
    • Do not include code that depends on the schema migrations in the same commit. Split the MR into two or more.
  • Ensured this change is safe to deploy to individual stages in the same environment (cny -> prod). State-related changes can be troublesome due to having parts of the fleet processing (possibly related) requests in different ways.

Reviewer checklist

  • Ensure the commit and MR title are still accurate.
  • If the change contains a breaking change, apply the breaking change label.
  • If the change is considered high risk, apply the label high-risk-change.
  • Identify if the change can be rolled back safely. (note: all other reasons for not being able to rollback will be sufficiently captured by major version changes).

If the MR introduces database schema migrations:

  • Ensure the commit and MR title start with fix:, feat:, or perf: so that the change appears on the Changelog

If the changes cannot be rolled back, follow these steps:
  • If not, apply the label cannot-rollback.
  • Add a section to the MR description that includes the following details:
    • The reasoning behind why a release containing the presented MR can not be rolled back (e.g. schema migrations or changes to the FS structure)
    • Detailed steps to revert/disable a feature introduced by the same change where a migration cannot be rolled back. (note: ideally MRs containing schema migrations should not contain feature changes.)
    • Ensure this MR does not add code that depends on these changes that cannot be rolled back.
Edited by Suleimi Ahmed