GitLab container registry not cleaning up during manual run

Summary

We use the GitLab container registry extensively, and as soon as the feature became available we enabled automatic cleanup of repositories by setting a cleanup policy. Unfortunately, it does not work, possibly because there are too many images. The scheduled run ends with this message:

The cleanup policy timed out before it could delete all tags.

A manual run, as described in https://gitlab.nethone.io/help/administration/packages/container_registry#run-the-cleanup-policy-now, ends with the following message after a very long time:

could not generate manifest

Even more frustrating, the registry API has no pagination, so I also cannot write a script that deletes older images based on the API responses (it returns only a subset).
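A cleanup script of the kind described above would need a loop roughly like the one below. This is only a sketch: `fetch_page` is a hypothetical callable standing in for whatever HTTP request lists tags, and the sketch assumes an endpoint that honors page and page-size parameters, which is exactly what this report says is missing. The stub data is made up so the loop can be tried without a server.

```ruby
# Collect every tag by requesting pages until one comes back short.
# `fetch_page` is a hypothetical callable: (page, per_page) -> Array of tag names.
# `max_pages` is a safety limit so a misbehaving endpoint cannot loop forever.
def collect_all_tags(fetch_page, per_page: 100, max_pages: 1_000)
  tags = []
  (1..max_pages).each do |page|
    batch = fetch_page.call(page, per_page)
    tags.concat(batch)
    # A short (or empty) page means there is nothing left to fetch.
    break if batch.size < per_page
  end
  tags
end

# Stand-in for a real HTTP call (assumption: invented data, not a real API).
FAKE_TAGS = (1..250).map { |i| "build-#{i}" }
fake_fetch = lambda do |page, per_page|
  FAKE_TAGS.each_slice(per_page).to_a[page - 1] || []
end

all_tags = collect_all_tags(fake_fetch)
puts "collected #{all_tags.size} tags"
```

If the endpoint ignores the page parameters and always returns the same subset, a loop like this keeps collecting the same names until it reaches `max_pages`, so a script along these lines is not a workable workaround.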

Steps to reproduce

  1. Have a container registry with 26788 tags, using an S3 bucket as the storage backend
  2. Set the cleanup policy to delete tags older than 30 days
  3. Run the policy on schedule (it will fail)
  4. Run the policy manually (it will fail as well)

What is the current bug behavior?

Images are not deleted. Worse, there is no clear error message and no other way to remove them is offered. Setting the policy to as much as 360 days (which covers more than half of the images) does not change anything.

What is the expected correct behavior?

A manual run deletes the images. I know that the scheduled cleanup is meant to run on a regular basis, but a manual run should be able to remove all images in the range specified in the policy.
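To make the expectation concrete, the behavior asked for could be sketched as follows. This is not GitLab's implementation: the `Tag` struct, `expired_tags` and `delete_expired` are hypothetical names, and the block passed to `delete_expired` stands in for the real deletion call. The idea is only that tags older than the policy threshold are selected and then removed in small batches, so that a run which is interrupted still leaves some progress behind.

```ruby
# Hypothetical model of a registry tag: a name and a creation time.
Tag = Struct.new(:name, :created_at)

# Select the tags whose creation time is older than the policy threshold.
def expired_tags(tags, older_than_days:, now: Time.now)
  cutoff = now - older_than_days * 24 * 60 * 60
  tags.select { |tag| tag.created_at < cutoff }
end

# Delete expired tags batch by batch. The block performs the actual deletion
# (an assumption: in a real setup this would call the registry).
# Returns the names of the tags that were handed to the block.
def delete_expired(tags, older_than_days:, now: Time.now, batch_size: 100)
  removed = []
  expired_tags(tags, older_than_days: older_than_days, now: now)
    .each_slice(batch_size) do |batch|
      batch.each { |tag| yield tag.name }
      removed.concat(batch.map(&:name))
    end
  removed
end

# Usage with invented data: one tag inside the 30-day window, one outside it.
now = Time.utc(2020, 11, 4)
tags = [
  Tag.new("old-build", Time.utc(2020, 9, 1)),
  Tag.new("fresh-build", Time.utc(2020, 11, 1))
]
removed = delete_expired(tags, older_than_days: 30, now: now) do |name|
  puts "would delete #{name}"
end
puts "removed: #{removed.inspect}"
```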

Relevant logs and/or screenshots

irb(main):006:0> project.container_repositories.find_each do |repo|
irb(main):007:1*   puts repo.attributes
irb(main):008:1> 
irb(main):009:1>   # Start the tag cleanup
irb(main):010:1>   puts Projects::ContainerRepository::CleanupTagsService.new(project, user, policy.attributes.except("created_at", "updated_at")).execute(repo)
irb(main):011:1> end
{"id"=>1, "project_id"=>13, "name"=>"", "created_at"=>Mon, 24 Apr 2017 09:40:48 CEST +02:00, "updated_at"=>Wed, 04 Nov 2020 08:50:06 CET +01:00, "status"=>nil, "expiration_policy_started_at"=>Wed, 04 Nov 2020 08:50:06 CET +01:00}
{:message=>"could not generate manifest", :status=>:error}

Output of checks

Results of GitLab environment info

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check

(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:check SANITIZE=true`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true`)

(we will only investigate if the tests are passing)

Possible fixes