Use LFK to mark npm metadata caches for destruction
Context
In Loose foreign keys: introduce update_column_to ... (!164508 - merged) we introduced a new on_delete option to loose foreign key called update_column_to docs. This option is currently behind the feature flag. The current MR is a first candidate to use the new option.
What does this MR do and why?
- Introduces status
pending_destructionforPackages::Npm::MetadataCache. It'll replace existingstalescope that is currently used to recognize the entries scheduled for removal. - Replaces the foreign key to
projectstable in thepackages_npm_metadata_cachestable with the loose foreign key that will setstatusto2(pending destruction) when a linked project is deleted.
Note: at this moment the related background cleanup worker removes the entries without project_id. This won't be the case anymore as we won't set project_id to NULL when the project is deleted. Instead, we'll set the status to pending_destruction. Since the cleanup worker works with .pending_destruction scope, the operation shouldn't be affected. Additionally, all existing stale entries without project_id will be deleted with the background migration in Delete stale npm metadata caches with backgroun... (!165813 - merged).
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
No.
Database analysis
Because of the changes in the .pending_destruction method a few database queries have changed.
postgres.ai pre-setup
--- Create 1K entries with status 2 (pending_destruction)
exec do $$ begin for r in 1..1000 loop INSERT INTO packages_npm_metadata_caches(project_id, file_store, size, file, package_name, object_storage_key, created_at, updated_at, last_downloaded_at, status) VALUES (278964, 1, 1, 'metadata.json', r, 'key' || r, now(), now(), NULL, 2); end loop; end; $$;
exec DROP INDEX index_packages_npm_metadata_caches_on_project_id;
exec CREATE INDEX index_packages_npm_metadata_caches_on_project_id_status ON packages_npm_metadata_caches USING btree (project_id, status);
exec VACUUM ANALYZE packages_npm_metadata_caches;
Existence check here
SELECT 1 AS one
FROM "packages_npm_metadata_caches"
WHERE "packages_npm_metadata_caches"."status" = 2
LIMIT 1;
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31526/commands/97802
Next pending destructable entry here
SELECT "packages_npm_metadata_caches".*
FROM "packages_npm_metadata_caches"
WHERE "packages_npm_metadata_caches"."status" = 2
ORDER BY "packages_npm_metadata_caches"."updated_at" ASC
LIMIT 1
FOR
UPDATE SKIP LOCKED;
https://console.postgres.ai/gitlab/gitlab-production-main/sessions/31526/commands/97819
How to set up and validate locally
In rails console:
-
Enable the feature flag
Feature.enable(:loose_foreign_keys_update_column_to) -
Create a new entry in the
packages_npm_metadata_cachestable# stub file upload def fixture_file_upload(*args, **kwargs) Rack::Test::UploadedFile.new(*args, **kwargs) end cache = FactoryBot.create(:npm_metadata_cache) -
Now delete the project which the cache belongs to
cache.project.destroy! -
Run the loose foreign key cleanup worker
LooseForeignKeys::CleanupWorker.new.perform -
Verify that the
statuscolumn has value2(pending_destruction)cache.reload.status
Related to #461787 (closed)