Use LFK to mark npm metadata caches for destruction

Context

In Loose foreign keys: introduce update_column_to ... (!164508 - merged), we introduced a new on_delete option for loose foreign keys called update_column_to (docs). The option is currently behind a feature flag. This MR is the first candidate to use the new option.

What does this MR do and why?

  • Introduces a pending_destruction status for Packages::Npm::MetadataCache. It replaces the existing stale scope that is currently used to recognize entries scheduled for removal.
  • Replaces the foreign key to the projects table in the packages_npm_metadata_caches table with a loose foreign key that sets status to 2 (pending_destruction) when the linked project is deleted.

Note: at the moment, the related background cleanup worker removes entries without a project_id. This won't be the case anymore because we will no longer set project_id to NULL when the project is deleted; instead, we'll set the status to pending_destruction. Since the cleanup worker uses the .pending_destruction scope, the operation shouldn't be affected. Additionally, all existing stale entries without a project_id will be deleted by the background migration in Delete stale npm metadata caches with backgroun... (!165813 - merged).
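For illustration, the loose-foreign-key definition for this table could look roughly like the following in config/gitlab_loose_foreign_keys.yml. The exact key names (target_column, target_value) are an assumption based on the update_column_to docs, not copied from this MR:

```yaml
# Hypothetical sketch: packages_npm_metadata_caches entry using the new
# update_column_to option instead of nullifying project_id.
packages_npm_metadata_caches:
  - table: projects
    column: project_id
    on_delete: update_column_to
    target_column: status
    target_value: 2 # pending_destruction
```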

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Not applicable.

Database analysis

Because of the changes to the .pending_destruction scope, a few database queries have changed.
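As a rough plain-Ruby sketch (not the actual ActiveRecord scopes), the change in selection criteria looks like this:

```ruby
# Plain-Ruby sketch of the selection change: the old `stale` scope matched
# rows with project_id IS NULL, while the new `pending_destruction` scope
# matches rows with status = 2.
PENDING_DESTRUCTION = 2

rows = [
  { id: 1, project_id: 42,  status: 0 },                   # live entry
  { id: 2, project_id: nil, status: 0 },                   # old-style stale entry
  { id: 3, project_id: 42,  status: PENDING_DESTRUCTION }, # marked for removal
]

# Old behaviour: scope :stale, -> { where(project_id: nil) }
stale_ids = rows.select { |r| r[:project_id].nil? }.map { |r| r[:id] }

# New behaviour: status-based .pending_destruction scope
pending_ids = rows.select { |r| r[:status] == PENDING_DESTRUCTION }.map { |r| r[:id] }

puts stale_ids.inspect   # => [2]
puts pending_ids.inspect # => [3]
```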

postgres.ai pre-setup

-- Create 1K entries with status 2 (pending_destruction)
exec do $$ begin for r in 1..1000 loop INSERT INTO packages_npm_metadata_caches(project_id, file_store, size, file, package_name, object_storage_key, created_at, updated_at, last_downloaded_at, status) VALUES (278964, 1, 1, 'metadata.json', r, 'key' || r, now(), now(), NULL, 2); end loop; end; $$;
exec DROP INDEX index_packages_npm_metadata_caches_on_project_id;
exec CREATE INDEX index_packages_npm_metadata_caches_on_project_id_status ON packages_npm_metadata_caches USING btree (project_id, status);
exec VACUUM ANALYZE packages_npm_metadata_caches;

Existence check

SELECT 1 AS one
FROM "packages_npm_metadata_caches"
WHERE "packages_npm_metadata_caches"."status" = 2
LIMIT 1;

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31526/commands/97802

Next pending destructible entry

SELECT "packages_npm_metadata_caches".*
FROM "packages_npm_metadata_caches"
WHERE "packages_npm_metadata_caches"."status" = 2
ORDER BY "packages_npm_metadata_caches"."updated_at" ASC
LIMIT 1
FOR
UPDATE SKIP LOCKED;

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31526/commands/97819

How to set up and validate locally

In rails console:

  1. Enable the feature flag Feature.enable(:loose_foreign_keys_update_column_to)

  2. Create a new entry in the packages_npm_metadata_caches table

    # stub file upload
    def fixture_file_upload(*args, **kwargs)
      Rack::Test::UploadedFile.new(*args, **kwargs)
    end
    
    cache = FactoryBot.create(:npm_metadata_cache)
  3. Delete the project to which the cache belongs

    cache.project.destroy!
  4. Run the loose foreign key cleanup worker

    LooseForeignKeys::CleanupWorker.new.perform
  5. Verify that the status column has value 2 (pending_destruction)

    cache.reload.status
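Conceptually, what the loose-foreign-key cleanup does for this table under update_column_to can be sketched in plain Ruby (an illustration, not the worker's actual implementation):

```ruby
# Plain-Ruby illustration of update_column_to: instead of nullifying
# project_id for records of deleted projects, the loose foreign key
# handling flips status to 2 (pending_destruction).
PENDING_DESTRUCTION = 2

deleted_project_ids = [42]

caches = [
  { id: 1, project_id: 42, status: 0 }, # belongs to the deleted project
  { id: 2, project_id: 7,  status: 0 }, # untouched
]

caches.each do |cache|
  cache[:status] = PENDING_DESTRUCTION if deleted_project_ids.include?(cache[:project_id])
end

puts caches.map { |c| c[:status] }.inspect # => [2, 0]
```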

Related to #461787 (closed)

Edited by Dzmitry (Dima) Meshcharakou
