Add execute cleanup policy service for virtual registry

What does this MR do and why?

This MR adds a service class that executes cleanup policies for virtual registry cache entries:

  • Introduces a new index (idx_maven_cache_entries_requiring_cleanup_columns) on the virtual_registries_packages_maven_cache_entries table.
  • Implements a new requiring_cleanup scope in the VirtualRegistries::Packages::Maven::Cache::Entry and VirtualRegistries::Container::Cache::Entry models to identify entries that should be cleaned up based on age.
  • Creates a new ExecutePolicyService in the VirtualRegistries::Cleanup namespace that:
    • Fetches all of the group's upstreams and, for each upstream, processes its cache entries in batches
    • Marks old entries for destruction by updating their status and relative path
    • Tracks metrics about deleted entries (count and size)
    • Handles multiple upstream types
  • In a subsequent MR, the service will be called from a background worker to put it into action.
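
The flow described above can be approximated in plain Ruby. This is only an illustrative sketch, not the code added by this MR: Entry, Upstream, requiring_cleanup? and execute_policy are hypothetical stand-ins, the fallback from downloaded_at to created_at for never-downloaded entries is an assumption inferred from the validation steps, and the relative path update is left out.

```ruby
DAY_IN_SECONDS = 24 * 60 * 60

# Hypothetical in-memory stand-in for a cache entry record.
Entry = Struct.new(:size, :downloaded_at, :created_at, :status, keyword_init: true) do
  # Assumption: a never-downloaded entry is aged from its creation time.
  def requiring_cleanup?(cutoff)
    status == :default && (downloaded_at || created_at) < cutoff
  end
end

# Hypothetical stand-in for an upstream; type is :maven or :container.
Upstream = Struct.new(:type, :entries, keyword_init: true)

def execute_policy(upstreams, keep_n_days_after_download:, now: Time.now, batch_size: 2)
  cutoff = now - (keep_n_days_after_download * DAY_IN_SECONDS)
  metrics = Hash.new { |hash, type| hash[type] = { deleted_entries_count: 0, deleted_size: 0 } }

  upstreams.each do |upstream|
    # Touch the key so every upstream type is reported, even with zero deletions.
    type_metrics = metrics[upstream.type]
    stale = upstream.entries.select { |entry| entry.requiring_cleanup?(cutoff) }

    stale.each_slice(batch_size) do |batch|
      # Entries are only marked here; nothing is removed in this sketch.
      batch.each { |entry| entry.status = :pending_destruction }
      type_metrics[:deleted_entries_count] += batch.size
      type_metrics[:deleted_size] += batch.sum(&:size)
    end
  end

  metrics
end

now = Time.now
build_entries = lambda do
  [
    Entry.new(size: 1024, downloaded_at: now - (35 * DAY_IN_SECONDS), created_at: now - (40 * DAY_IN_SECONDS), status: :default),
    Entry.new(size: 1024, downloaded_at: nil, created_at: now - (35 * DAY_IN_SECONDS), status: :default),
    Entry.new(size: 1024, downloaded_at: now - (20 * DAY_IN_SECONDS), created_at: now - (40 * DAY_IN_SECONDS), status: :default)
  ]
end

upstreams = [
  Upstream.new(type: :maven, entries: build_entries.call),
  Upstream.new(type: :maven, entries: build_entries.call),
  Upstream.new(type: :container, entries: [])
]

result = execute_policy(upstreams, keep_n_days_after_download: 30, now: now)
```

With a 30-day policy, the sample data marks the two stale entries in each Maven upstream and leaves the recently downloaded ones alone, so result mirrors the shape of the service payload: one count and one size per upstream type.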

References

Screenshots or screen recordings

N/A

How to set up and validate locally

  1. Create a virtual registry cleanup policy for a group:

    group = Group.all.detect(&:root?)
    policy = VirtualRegistries::Cleanup::Policy.create!(group: group, keep_n_days_after_download: 30)

  2. Create some test Maven cache entries with old download dates:

    upstream1 = FactoryBot.create(:virtual_registries_packages_maven_upstream, group: group)
    upstream2 = FactoryBot.create(:virtual_registries_packages_maven_upstream, group: group)
    
    # Stub fixture_file_upload: the factories call it, but it is not defined in the Rails console
    def fixture_file_upload(*args, **kwargs)
      Rack::Test::UploadedFile.new(*args, **kwargs)
    end
    
    old_entry1 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream1, downloaded_at: Time.current - 35.days)
    never_downloaded_entry1 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream1, created_at: Time.current - 35.days)
    recent_entry1 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream1, downloaded_at: Time.current - 20.days)
    
    old_entry2 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream2, downloaded_at: Time.current - 35.days)
    never_downloaded_entry2 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream2, created_at: Time.current - 35.days)
    recent_entry2 = FactoryBot.create(:virtual_registries_packages_maven_cache_entry, upstream: upstream2, downloaded_at: Time.current - 20.days)

  3. Execute the cleanup policy:

    VirtualRegistries::Cleanup::ExecutePolicyService.new(policy).execute
    => #<ServiceResponse:0x000000014aad1f10
     @http_status=:ok,
     @message=nil,
     @payload={:maven=>{:deleted_entries_count=>4, :deleted_size=>4096}, :container=>{:deleted_entries_count=>0, :deleted_size=>0}},
     @reason=nil,
     @status=:success>
    
    upstream1.cache_entries.pending_destruction.size
    => 2
    
    upstream2.cache_entries.pending_destruction.size
    => 2

💾 Database analysis

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #570911 (closed)

Edited by Moaz Khalifa
