The Stat call in s3 storage drivers should not rely on lexicographical sort only

This issue is meant as a central point for coordinating the fix for this bug in the s3 storage drivers.

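The problem is that the List call from aws-sdk-go (v1/v2) was limited to returning at most 1 element matching the given prefix. This usually works because AWS returns objects in lexicographical order, so `foo` comes before `foo/`, but it breaks down for characters that sort before `/`: space, `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, `*`, `+`, `,`, `-`, `.`. A sibling such as `foo-bar/` is then returned ahead of `foo/` and hides it. For storage driver paths, which only allow `.`, `_`, and `-` as special characters, this narrows down to `.` and `-`; `_` sorts after `/` and is not affected.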
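The failure mode can be reproduced without S3 at all. The following is a minimal sketch, assuming an in-memory stand-in for the listing: `listPrefix` and `statFirstOnly` are hypothetical names for illustration, not the driver's or the SDK's API, and the keys are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listPrefix is a hypothetical stand-in for an S3 List call: it returns at
// most maxKeys keys that start with prefix, in byte-wise lexicographical order.
func listPrefix(keys []string, prefix string, maxKeys int) []string {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted)
	var out []string
	for _, k := range sorted {
		if len(out) == maxKeys {
			break
		}
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

// statFirstOnly mirrors the buggy behaviour: only the first listed key is
// inspected, so path is reported as existing only when that key is path
// itself or lives under path + "/".
func statFirstOnly(keys []string, path string) bool {
	res := listPrefix(keys, path, 1)
	if len(res) == 0 {
		return false
	}
	return res[0] == path || strings.HasPrefix(res[0], path+"/")
}

func main() {
	// No sibling sorts before "repo/foo/", so the first result is a match.
	fmt.Println(statFirstOnly([]string{"repo/foo/tag", "repo/other/tag"}, "repo/foo")) // true

	// "repo/foo-bar/tag" sorts before "repo/foo/tag" because '-' < '/',
	// so the single returned key is not a match and repo/foo looks missing.
	fmt.Println(statFirstOnly([]string{"repo/foo/tag", "repo/foo-bar/tag"}, "repo/foo")) // false
}
```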

The fix is simple - instead of limiting the List call to just the first entry returned, we ask the SDK to return the maximum (1000) entries and iterate over them until we find the first match. There should be no difference cost-wise, as we were doing a List call anyway. The comparison loop will stop as soon as we get past the place where / should have been.
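The scanning logic described above can be sketched as follows. This is an illustration under stated assumptions, not the merged fix: `statScan` is a hypothetical name, the listing is simulated with a sorted in-memory slice instead of an SDK call, and only a single page of up to `maxKeys` entries is inspected.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// statScan is a hypothetical stand-in for the fixed Stat: it walks one page
// of listing results (up to maxKeys) instead of looking at the first only.
func statScan(keys []string, path string, maxKeys int) bool {
	sorted := append([]string(nil), keys...)
	sort.Strings(sorted) // simulate byte-wise lexicographical listing order
	dir := path + "/"
	seen := 0
	for _, k := range sorted {
		if !strings.HasPrefix(k, path) {
			continue // would not be part of the listing for this prefix
		}
		if seen == maxKeys {
			break // end of the single page that was requested
		}
		seen++
		if k == path || strings.HasPrefix(k, dir) {
			return true // first match: a file at path, or something under path/
		}
		if k > dir {
			break // sorted past where path/ would have been; nothing can match
		}
	}
	return false
}

func main() {
	keys := []string{
		"repo/foo-bar/tag", // sorts before repo/foo/ ('-' < '/')
		"repo/foo.v2/tag",  // sorts before repo/foo/ ('.' < '/')
		"repo/foo/tag",
		"repo/foobar/tag", // sorts after repo/foo/ ('b' > '/')
	}
	fmt.Println(statScan(keys, "repo/foo", 1000)) // true
	fmt.Println(statScan(keys, "repo/foo", 1))    // false: the old one-entry limit
}
```

With the page size set to 1 the sketch reproduces the original bug, which makes the effect of raising the limit easy to see.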

The bug itself is present in v1, v2, and upstream. It does not manifest upstream, though, because the trigger is an extra check that we have in tag deletion.

https://gitlab.com/gitlab-org/container-registry/-/blob/aadc35b1922888ea671f9f34e7fafc1db885943b/registry/storage/tagstore.go#L100-111

https://github.com/distribution/distribution/blob/main/registry/storage/tagstore.go#L108-L118

I do not know at the moment why the bug surfaced now and not earlier. If I have some time, I will investigate it more deeply.

RFHs:

Fix for master branch: !2348 (merged)

Backports:

Backport exception issue: https://gitlab.com/gitlab-org/release/tasks/-/issues/20053

Edited by Pawel Rozlach