A customer with more than 10,000 tags in a registry is attempting to use the API to delete tags in bulk, but no images are ever deleted. Running the service manually also hangs.
Digging into our service, it looks like we load all of the tags and then order them when the service runs. We could process them in batches instead of all at once.
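As a rough illustration of the batching idea, here is a client-side sketch against the Container Registry API that pages through the tags and deletes them one batch at a time instead of loading everything up front. The instance URL, token, and project/repository IDs are placeholders:

```python
import requests

GITLAB_URL = "https://gitlab.example.com"       # placeholder instance URL
HEADERS = {"PRIVATE-TOKEN": "<private-token>"}  # placeholder token with api scope
PROJECT_ID = 123      # placeholder
REPOSITORY_ID = 456   # placeholder


def delete_tags_in_batches(batch_size=100):
    """Delete registry tags one page at a time instead of loading all 60k+ at once."""
    base = (f"{GITLAB_URL}/api/v4/projects/{PROJECT_ID}"
            f"/registry/repositories/{REPOSITORY_ID}/tags")
    while True:
        # Always fetch page 1: tags deleted in the previous pass are gone,
        # so the first page keeps advancing through the repository.
        resp = requests.get(base, headers=HEADERS,
                            params={"per_page": batch_size, "page": 1})
        resp.raise_for_status()
        tags = resp.json()
        if not tags:
            break
        deleted = 0
        for tag in tags:
            r = requests.delete(f"{base}/{tag['name']}", headers=HEADERS)
            if r.ok:
                deleted += 1
        if deleted == 0:
            # Nothing on this page could be removed (e.g. tags we want to keep),
            # so stop instead of looping forever.
            break
```

This sketch assumes every listed tag should go; any keep rules (protected tags, most recent N, etc.) would need to filter each page before deleting.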
Hi @ayufan, in the worst case we have 60k+ tags.
We are going to migrate our infrastructure this weekend and at this point I'm thinking about just deleting the tags from S3 directly.
I'm not sure if it's something that we should even consider, but I can't think of a better solution right now.
Any advice regarding this? The main issue is that we need to keep some images, and we don't have an easy way of getting all the images currently running in production so that we could filter them out when cleaning up the registry; that's why we wanted to keep something like ~3 months of images.
I'm using exactly that API. The Sidekiq job starts, but it never ends up unlinking any image from the registry. Trying to do the same via the gitlab-rails console gives the same result: after one SQL query there is no other output on the console, and it still doesn't unlink any of the images.
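For context, the bulk delete call looks roughly like this (the project and repository IDs and the token are placeholders, and the `older_than` value just reflects the ~3 months mentioned above):

```python
import requests

GITLAB_URL = "https://gitlab.example.com"       # placeholder
HEADERS = {"PRIVATE-TOKEN": "<private-token>"}  # placeholder

# Bulk delete registry tags older than ~3 months.
# The request only schedules the cleanup; the actual unlinking happens
# asynchronously in the Sidekiq job that hangs here.
resp = requests.delete(
    f"{GITLAB_URL}/api/v4/projects/123/registry/repositories/456/tags",
    headers=HEADERS,
    params={
        "name_regex": ".*",      # match every tag name
        "older_than": "3month",  # only tags older than ~3 months
    },
)
print(resp.status_code)  # a 2xx response only means the cleanup job was enqueued
```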
What about deleting the references directly from S3? Is that something we should consider? Is there an easy way to filter out the most recent tags?
We solved this problem by manually unlinking the images in AWS. We used the MinIO client and removed all the files under /docker/registry/v2/repositories/path/name/_manifests older than, in our case, 3 months.
The tags seem to have been unlinked correctly and they are no longer shown in GitLab, although we haven't run the GC yet.
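For anyone in the same situation, a rough boto3 equivalent of that MinIO-client cleanup is sketched below. The bucket name and repository path are placeholders, and note that this only removes the manifest links, not the underlying blobs, so the registry GC still has to run afterwards:

```python
from datetime import datetime, timedelta, timezone

import boto3

BUCKET = "my-registry-bucket"  # placeholder bucket
PREFIX = "docker/registry/v2/repositories/path/name/_manifests"  # placeholder repo path
CUTOFF = datetime.now(timezone.utc) - timedelta(days=90)  # ~3 months

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

# Collect every manifest object under the prefix that is older than the cutoff.
to_delete = []
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        if obj["LastModified"] < CUTOFF:
            to_delete.append({"Key": obj["Key"]})

# delete_objects accepts at most 1000 keys per request.
for i in range(0, len(to_delete), 1000):
    s3.delete_objects(Bucket=BUCKET, Delete={"Objects": to_delete[i:i + 1000]})
```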
I think this issue is still valid, as the GitLab Registry API is not able to bulk delete anything when a repository has lots of tags. In our worst case that's 60k+ tags (I'd say around 20k+ unique digests).
@cindy we are currently working on providing tag retention and expiration policies (&2270 (closed)), which include performance improvements (#208220 (closed)) and throttling (#208193 (closed)) to avoid problems like this.