Maven VReg: use LFK update_column_to

🔥 Problem

In Maven virtual registry MVC (API only interactions) (&14137 - closed), cached responses are destroyed through a mark system: records that need to be destroyed carry a mark that flags them as "ready to be destroyed".

A background job then walks through the marked records and actually destroys them. This lets us cope better with Object Storage references (a cached response is a file on Object Storage) and with scale (we could be destroying thousands of cached responses).

Destroying a single record is not a challenge, but things get interesting when parent objects are destroyed. These are the associations to consider:
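
To make the mark-then-sweep idea concrete, here is a minimal in-memory sketch. It is not the GitLab implementation: `MarkedEntry`, `Sweeper`, the batch size, and the Hash standing in for Object Storage are all hypothetical names chosen for illustration.

```ruby
# Hypothetical in-memory stand-ins; none of these names come from the GitLab codebase.
MarkedEntry = Struct.new(:id, :file_key, :marked, keyword_init: true)

class Sweeper
  BATCH_SIZE = 2 # deliberately tiny so the batching is visible

  # entries:        Array<MarkedEntry>, plays the role of the database table
  # object_storage: Hash of file_key => contents, plays the role of Object Storage
  def initialize(entries, object_storage)
    @entries = entries
    @object_storage = object_storage
  end

  # Marking only flips a flag on the row; no file is touched here.
  def mark(ids)
    @entries.each { |entry| entry.marked = true if ids.include?(entry.id) }
  end

  # One run of the background job: take a batch of marked entries,
  # remove the file first, then the row. Returns how many were destroyed.
  def sweep_batch
    batch = @entries.select(&:marked).first(BATCH_SIZE)
    batch.each do |entry|
      @object_storage.delete(entry.file_key)
      @entries.delete(entry)
    end
    batch.size
  end
end
```

Marking stays cheap no matter how many rows are involved, while the expensive part (removing files) is spread over as many `sweep_batch` runs as needed.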

  1. upstream - 1:n -> cached_response.
  2. group - 1:n -> cached_response.

(1.) is already handled by the Loose Foreign Key nullify action. In this case, the "mark" is a NULL value in the upstream_id column of the cached response records.

(2.) is not currently covered, and we can't use the same approach: the group_id column is the sharding key, so it can't be set to NULL.

🚒 Solution

In #475204 (closed), we implemented a new action for Loose Foreign Keys: update_column_to. Instead of nullifying the foreign key column, it lets us set any desired value on any column of the child records.

Thus, we can have a status column that can be updated to pending_destruction.

(1.) should be updated to use that approach too. This way, the cleanup background job can simply look for records with the status pending_destruction (instead of looking for 2 different "marks").
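
The effect of having a single mark can be sketched in plain Ruby. This is only an in-memory illustration under stated assumptions, not the Loose Foreign Keys code: `CachedResponseRow`, `apply_update_column_to`, and `rows_to_clean_up` are hypothetical names, and the default status value `0` is assumed. The value `2` for pending_destruction is the one used in the validation steps of this description.

```ruby
# Hypothetical in-memory row; not the real CachedResponse model.
CachedResponseRow = Struct.new(:id, :group_id, :upstream_id, :status, keyword_init: true)

STATUS_DEFAULT = 0             # assumed value, for illustration only
STATUS_PENDING_DESTRUCTION = 2 # value quoted in the validation steps

# Mimics what an update_column_to action does for one deleted parent:
# every child row that references the parent through fk_column gets
# target_column set to target_value. The foreign key itself is left untouched.
def apply_update_column_to(rows, fk_column:, parent_id:, target_column:, target_value:)
  rows.each do |row|
    next unless row[fk_column] == parent_id

    row[target_column] = target_value
  end
end

# The cleanup job only needs to look for one kind of mark.
def rows_to_clean_up(rows)
  rows.select { |row| row.status == STATUS_PENDING_DESTRUCTION }
end

rows = [
  CachedResponseRow.new(id: 1, group_id: 10, upstream_id: 100, status: STATUS_DEFAULT),
  CachedResponseRow.new(id: 2, group_id: 10, upstream_id: 101, status: STATUS_DEFAULT),
  CachedResponseRow.new(id: 3, group_id: 11, upstream_id: 101, status: STATUS_DEFAULT)
]

# Group 10 is deleted: rows 1 and 2 are marked.
apply_update_column_to(rows, fk_column: :group_id, parent_id: 10,
                             target_column: :status, target_value: STATUS_PENDING_DESTRUCTION)

# Upstream 101 is deleted: rows 2 and 3 are marked through the same column.
apply_update_column_to(rows, fk_column: :upstream_id, parent_id: 101,
                             target_column: :status, target_value: STATUS_PENDING_DESTRUCTION)
```

Both parent deletions end up writing to the same status column, so group_id keeps its value and the cleanup query has a single condition.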

What does this MR do and why?

  • Convert the group_id FK in the virtual_registries_packages_maven_cached_responses database table into an LFK.
  • Use the LFK update_column_to action to set the status column of the associated virtual_registries_packages_maven_cached_responses rows to pending_destruction when the parent group is deleted.
  • Update the on_delete action of the virtual_registries_packages_maven_cached_responses.upstream_id LFK so that it also sets the status column of the associated rows to pending_destruction instead of nullifying upstream_id.
  • Replace the index on the group_id column with a composite index on group_id & status to be used in the LFK update query.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

N/A

How to set up and validate locally

Test when a group is destroyed:
In a Rails console:
# create a group
group = FactoryBot.create(:group)

# stub file upload
def fixture_file_upload(*args, **kwargs)
  Rack::Test::UploadedFile.new(*args, **kwargs)
end

# create a couple of cached responses that belong to the group
FactoryBot.create_list(:virtual_registries_packages_maven_cached_response, 3, group: group)

# delete the group
group.destroy!

# Run the loose foreign key cleanup worker
LooseForeignKeys::CleanupWorker.new.perform

# Verify that the status column has value 2 (pending_destruction)
VirtualRegistries::Packages::Maven::CachedResponse.for_group(group).pluck(:status)

Test when an upstream is destroyed:
In a Rails console:
# create an upstream
upstream = FactoryBot.create(:virtual_registries_packages_maven_upstream)

# stub file upload
def fixture_file_upload(*args, **kwargs)
  Rack::Test::UploadedFile.new(*args, **kwargs)
end

# create a couple of cached responses that belong to the upstream
FactoryBot.create_list(:virtual_registries_packages_maven_cached_response, 3, upstream: upstream)

# delete the upstream
upstream.destroy!

# Run the loose foreign key cleanup worker
LooseForeignKeys::CleanupWorker.new.perform

# Verify that the status column has value 2 (pending_destruction)
upstream.cached_responses.reset.pluck(:status)

Related to #486492 (closed)

Edited by Moaz Khalifa
