Loose foreign keys: introduce update_column_to action

Context

In grouppackage registry we're trying to solve the problem of delayed cleaning of the entries that have references in the object storage.

How it currently works?

For the database tables that have references in the object storage, we define a foreign key with ON DELETE SET NULL option. The example of such tables is packages_npm_metadata_caches, that has defined project_id foreign key. When the referenced project is deleted, the project_id column in this table for linked entries will be updated to NULL and later those entries will be destroyed, with the scheduled cleanup worker. In this way we also cleanup the object storage files, or do something else, for example, recalculate namespace's storage statistics.

With the ongoing Cells work an each table needs to have a sharding key that cann't be NULL, and in some cases it corresponds to a foreign key. Like in our example with the packages_npm_metadata_caches table. Given that, we're not able to use ON DELETE SET NULL option for a foreign key anymore, and need to find a way to mark entries in the database tables for a delayed cleanup.

We think that existing Loose foreign keys technique might be a good choice for that: when the parent is deleted, then update a target column to a target value. In our case, we could have a status column that might be updated to pending_destruction value.

What does this MR do and why?

This MR introduces the new on_delete action: update_column_to that can be used with two other attributes target_column and target_value to set a target column to a target value when the parent record was deleted.

packages_npm_metadata_caches:
  - table: projects
    column: project_id
    on_delete: update_column_to
    target_column: status
    target_value: 4

Note: I've not updated the documentation yet, and will do it after a preliminary review.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

No.

How to set up and validate locally

  1. Enable the feature flag in rails console

    Feature.enable(:loose_foreign_keys_update_column_to)
  2. Create the migration to remove the existing foreign key to projects table from packages_npm_metadata_caches table and execute it

    $ bundle exec rails g post_deployment_migration remove_fk_to_projects_from_packages_npm_metadata_caches_project_id
  3. Edit the content of migration file db/post_migrate/<timestamp>_remove_fk_to_projects_from_packages_npm_metadata_caches_project_id.rb

    # frozen_string_literal: true
    
    class RemoveFkToProjectsFromPackagesNpmMetadataCachesProjectId < Gitlab::Database::Migration[2.2]
      disable_ddl_transaction!
      milestone '17.4'
    
      def up
        with_lock_retries do
          remove_foreign_key_if_exists(:packages_npm_metadata_caches, :projects, name: 'fk_ada23b1d30')
        end
      end
    
      def down
        add_concurrent_foreign_key(:packages_npm_metadata_caches, :projects, name: 'fk_ada23b1d30', column: :project_id, target_column: :id, on_delete: :nullify)
      end
    end
  4. Add new loose foreign key definition to update status column in the packages_npm_metadata_caches to 1 when the linked row in the projects table is deleted. Edit config/gitlab_loose_foreign_keys.yml

    packages_npm_metadata_caches:
      - table: projects
        column: project_id
        on_delete: update_column_to
        target_column: status
        target_value: 1
  5. Create new entry in the packages_npm_metadata_caches table using rails console

    # stub file upload
    def fixture_file_upload(*args, **kwargs)
      Rack::Test::UploadedFile.new(*args, **kwargs)
    end
    
    cache = FactoryBot.create(:npm_metadata_cache, status: :default)
  6. Now delete the project which the cache belongs to

    cache.project.destroy!
  7. Run the loose foreign key cleanup worker

    LooseForeignKeys::CleanupWorker.new.perform
  8. Verify that the status column has value 1 (processing)

    cache.reload.status
  9. Rollback the migration to return the original database state and delete the file.

    cache.destroy! # to avoid foreign key violations when rolling back the migration
    $ rails db:rollback:main
    $ rm db/post_migrate/<timestamp>_remove_fk_to_projects_from_packages_npm_metadata_caches_project_id.rb
  10. Remove the loose foreign key definition for packages_npm_metadata_caches in config/gitlab_loose_foreign_keys.yml

Related to #475204 (closed)

Edited by Lukas Wanko

Merge request reports

Loading