
Disable cleanup policies linked to projects with no container images

📻 Context

Cleanup policies are objects holding cleanup parameters for container image tags.

When a cleanup policy is executed, the backend basically takes those parameters and destroys tags that the policy doesn't include. This process makes heavy use of the container registry API. As such, cleanup policies are executed by background workers.

To help those workers, we created the following index: https://gitlab.com/gitlab-org/gitlab/-/blob/2a90aec857d55765fa7eaf597aef98e9bade02f2/db/structure.sql#L22128. Basically, the index references cleanup policies that are enabled.

Cleanup policies are automatically created when a Project is created: https://gitlab.com/gitlab-org/gitlab/-/blob/9403fbfe6e5b278e6b7af08e6403a1041719e77f/app/models/project.rb#L111. The main issue is that during one of the very first iterations of cleanup policies, they were created enabled by default.

Guess what happened with the above index? That's right, it is referencing many enabled policies that are linked to projects that don't have any container image = no tags at all = workers are not interested in those policies. That's issue #330315 (closed).

For gitlab.com, the index references:

  • ~3 million (internal) cleanup policies that are enabled but whose linked project has no container images
  • ~6000 (internal) cleanup policies that are enabled and whose linked project has at least one container image

🔬 What does this MR do?

  • Adds a background migration that will disable all the enabled cleanup policies that are linked to projects with 0 container images.
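
As a rough illustration of the selection rule (not the actual migration code), the sketch below uses hypothetical in-memory `Policy` and `Repository` structs as stand-ins for the real tables. It assumes that "project has a container image" means at least one repository row points at that project; the method name `disable_policies_without_images` is made up for this example.

```ruby
require 'set'

# Hypothetical in-memory stand-ins for the real tables; not GitLab's models.
Policy = Struct.new(:project_id, :enabled)
Repository = Struct.new(:project_id)

# Disable every enabled policy whose project has no container repository.
# Returns the number of policies that were disabled.
def disable_policies_without_images(policies, repositories)
  project_ids_with_images = repositories.map(&:project_id).to_set

  targets = policies.select do |policy|
    policy.enabled && !project_ids_with_images.include?(policy.project_id)
  end

  targets.each { |policy| policy.enabled = false }
  targets.size
end

policies = [
  Policy.new(1, true),  # enabled, project has an image: stays enabled
  Policy.new(2, true),  # enabled, project has no image: gets disabled
  Policy.new(3, false)  # already disabled: left untouched
]
repositories = [Repository.new(1)]

disabled_count = disable_policies_without_images(policies, repositories)
puts "disabled #{disabled_count} policy"
```

Policies that are already disabled are skipped, so running the same pass twice changes nothing the second time.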

📸 Screenshots (strongly suggested)

n / a

📐 Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

💿 Database review

Migration up

$ rails db:migrate 
== 20210518074332 DisableExpirationPoliciesLinkedToNoContainerImages: migrating 
== 20210518074332 DisableExpirationPoliciesLinkedToNoContainerImages: migrated (0.0105s) 

Migration down

Please note that the post deploy migration is not reversible, so the rollback is actually doing nothing 😺

$ rails db:rollback                                                                                           
== 20210518074332 DisableExpirationPoliciesLinkedToNoContainerImages: reverting 
== 20210518074332 DisableExpirationPoliciesLinkedToNoContainerImages: reverted (0.0000s) 

Queries / Explain plans

Background Migration Details

  • 2926937 policies to update
  • batch size: 30000
  • loops: 98
  • delay: 2 minutes
  • total time to migrate all the policies: 98 * 2min = 196min = ~3.3 hours
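
The estimate can be re-derived from the figures in the list above. This is only an arithmetic sketch: the constant names are chosen for illustration, and it assumes one batch is scheduled per delay interval.

```ruby
POLICIES_TO_UPDATE = 2_926_937
BATCH_SIZE = 30_000
DELAY_MINUTES = 2

# Integer ceiling division: the last, partially filled batch still needs a loop.
batches = (POLICIES_TO_UPDATE + BATCH_SIZE - 1) / BATCH_SIZE

# Assumption: batches are spaced DELAY_MINUTES apart, one per interval.
total_minutes = batches * DELAY_MINUTES
total_hours = (total_minutes / 60.0).round(1)

puts "#{batches} batches, #{total_minutes} min, ~#{total_hours} h"
```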
Edited by David Fernandez
