Infinite loop using Minio S3 as registry backend when removing manifests
Hi there everyone,
I'm terribly sorry for cross-posting this. I also opened issue gitlab#386845 (closed) and Minio issue 16362 to try to track this problem down.
In summary, using Minio as an S3 backend for the container registry in Gitlab and removing a manifest results in an infinite loop, which seems to persist across restarts.
This partly looks like a regression introduced in 3.62.0 (I think): using 3.61.0 doesn't give an infinite loop, but it does return manifest errors that prevent manifests from being removed.
The Minio issue referenced above appears to point towards versioning of the bucket, which works with AWS but not with Minio. As far as I can tell (and I'm probably wrong), it seems that Minio returns a key count including versions while AWS doesn't.
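To make that hypothesis concrete, here is a toy Python simulation. It is not the registry's actual code and it uses no real S3 client: `FakeBucket`, `count_versions`, and `delete_all` are hypothetical names made up for illustration. It only shows how a "delete until the listing reports zero keys" loop could spin forever if the reported key count includes noncurrent versions, under the assumption that this is roughly what happens.

```python
class FakeBucket:
    """In-memory stand-in for a versioned bucket (hypothetical, not a real S3 API)."""

    def __init__(self, keys, count_versions):
        self.live = set(keys)        # current object keys
        self.noncurrent = 0          # versions / delete markers left behind by deletes
        self.count_versions = count_versions  # assumed backend difference

    def list(self):
        """Return (key_count, contents) the way this toy backend reports them."""
        key_count = len(self.live)
        if self.count_versions:
            # Assumed Minio-like behaviour: leftover versions inflate the count.
            key_count += self.noncurrent
        return key_count, sorted(self.live)

    def delete(self, keys):
        """Remove current keys; each delete leaves one noncurrent version behind."""
        for key in keys:
            if key in self.live:
                self.live.remove(key)
                self.noncurrent += 1


def delete_all(bucket, max_rounds=5):
    """Delete until the listing reports zero keys.

    Returns the number of rounds needed, or None if max_rounds is hit
    (the cap stands in for what would otherwise be an infinite loop).
    """
    for rounds in range(1, max_rounds + 1):
        key_count, contents = bucket.list()
        if key_count == 0:
            return rounds
        bucket.delete(contents)
    return None


aws_like = delete_all(FakeBucket(["a", "b"], count_versions=False))
minio_like = delete_all(FakeBucket(["a", "b"], count_versions=True))
print(aws_like, minio_like)
```

In this sketch the AWS-like bucket finishes on the second listing, while the Minio-like one keeps reporting a non-zero count with nothing left to delete, so the loop only stops because of the artificial cap.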
Versioning in Minio is mandatory when used with replication. The setup I originally discovered this on had 3 nodes with replication between them. Replication is used by a number of providers, and an infinite loop like this could cause excessive bills if billing is (even partly) based on the number of requests.
If anyone could kindly point me in the right direction, either for extra debugging or for where to look to contribute a fix, I'm more than willing to spend more time on this.
There is a full minimal working example in the gitlab-org/gitlab issue referenced above.