Skip to content

Add migration helper to swap partitioned tables

Patrick Bair requested to merge 241267-swap-partitioned-tables-helper into master

What does this MR do?

Related to #241267 (closed)

Adds a new migration helper that is used to replace a non-partitioned table with its partitioned copy. This would be used as the final step of a partitioning migration, when the partitioned copy is fully populated with data from the source table, and the application is prepared to leverage the partitioning setup.

The approach is outlined in the related issue, and since the schemas are nearly identical between the two tables, it should be safe to swap the tables out while the application is running. Typically at the time the tables are swapped, the trigger that syncs writes would be dropped altogether, but for now we will create a new trigger which reverses the sync from the partitioned table back to the original table. By doing this, data will remain consistent between the two tables, allowing us the flexibility to rollback the swap in the event of an unforeseen issue.

Because most of the changes are metadata-only, this should be a fast operation. One note is that the series of migration helpers used for partitioning do not create the secondary indexes on the partitioned table, so they would need to be added manually before the swap.

Used a test migration against dblab to see that the helper works correctly:

class SwapPartitionedAuditEvents < ActiveRecord::Migration[6.0]
  include Gitlab::Database::PartitioningMigrationHelpers

  DOWNTIME = false

  def up
    replace_with_partitioned_table :audit_events
  end

  def down
    rollback_replace_with_partitioned_table :audit_events
  end
end
Result of running migration up
== 20201006181141 SwapPartitionedAuditEvents: migrating =======================
-- execute("DROP TRIGGER IF EXISTS table_sync_trigger_ee39a25f9d ON audit_events")
   -> 0.0787s
-- execute("DROP FUNCTION IF EXISTS table_sync_function_2be879775d()")
   -> 0.0810s
-- replace_table("ALTER TABLE "audit_events" ALTER COLUMN "id" DROP DEFAULT;

ALTER TABLE "audit_events_part_5fc467ac26" ALTER COLUMN "id" SET DEFAULT nextval('"audit_events_id_seq"'::regclass);

ALTER SEQUENCE "audit_events_id_seq" OWNED BY "audit_events_part_5fc467ac26"."id";

ALTER TABLE "audit_events" RENAME TO "audit_events_archived";

ALTER TABLE "audit_events_archived" RENAME CONSTRAINT "audit_events_pkey" TO "audit_events_archived_pkey";

ALTER TABLE "audit_events_part_5fc467ac26" RENAME TO "audit_events";

ALTER TABLE "audit_events" RENAME CONSTRAINT "audit_events_part_5fc467ac26_pkey" TO "audit_events_pkey"")
-- execute("CREATE  FUNCTION table_sync_function_2be879775d()\nRETURNS TRIGGER AS\n$$\nBEGIN\nIF (TG_OP = 'DELETE') THEN\n  DELETE FROM audit_events_archived where id = OLD.id;\nELSIF (TG_OP = 'UPDATE') THEN\n  UPDATE audit_events_archived\n  SET author_id = NEW.author_id,\n    entity_id = NEW.entity_id,\n    entity_type = NEW.entity_type,\n    details = NEW.details,\n    created_at = NEW.created_at,\n    ip_address = NEW.ip_address,\n    author_name = NEW.author_name,\n    entity_path = NEW.entity_path,\n    target_details = NEW.target_details,\n    target_type = NEW.target_type,\n    target_id = NEW.target_id\n  WHERE audit_events_archived.id = NEW.id;\nELSIF (TG_OP = 'INSERT') THEN\n  INSERT INTO audit_events_archived (id,\n    author_id,\n    entity_id,\n    entity_type,\n    details,\n    created_at,\n    ip_address,\n    author_name,\n    entity_path,\n    target_details,\n    target_type,\n    target_id)\n  VALUES (NEW.id,\n    NEW.author_id,\n    NEW.entity_id,\n    NEW.entity_type,\n    NEW.details,\n    NEW.created_at,\n    NEW.ip_address,\n    NEW.author_name,\n    NEW.entity_path,\n    NEW.target_details,\n    NEW.target_type,\n    NEW.target_id);\nEND IF;\nRETURN NULL;\n\nEND\n$$ LANGUAGE PLPGSQL\n")
   -> 0.1201s
-- execute("COMMENT ON FUNCTION table_sync_function_2be879775d IS 'Partitioning migration: table sync for audit_events table'")
   -> 0.0898s
-- current_schema()
   -> 0.0725s
-- execute("CREATE TRIGGER table_sync_trigger_ee39a25f9d\nAFTER INSERT OR UPDATE OR DELETE ON audit_events\nFOR EACH ROW\nEXECUTE FUNCTION table_sync_function_2be879775d()\n")
   -> 0.1246s
== 20201006181141 SwapPartitionedAuditEvents: migrated (1.3548s) ==============
Result of running migration down
== 20201006181141 SwapPartitionedAuditEvents: reverting =======================
-- execute("DROP TRIGGER IF EXISTS table_sync_trigger_ee39a25f9d ON audit_events")
   -> 0.0963s
-- execute("DROP FUNCTION IF EXISTS table_sync_function_2be879775d()")
   -> 0.0726s
-- replace_table("ALTER TABLE "audit_events" ALTER COLUMN "id" DROP DEFAULT;

ALTER TABLE "audit_events_archived" ALTER COLUMN "id" SET DEFAULT nextval('"audit_events_id_seq"'::regclass);

ALTER SEQUENCE "audit_events_id_seq" OWNED BY "audit_events_archived"."id";

ALTER TABLE "audit_events" RENAME TO "audit_events_part_5fc467ac26";

ALTER TABLE "audit_events_part_5fc467ac26" RENAME CONSTRAINT "audit_events_pkey" TO "audit_events_part_5fc467ac26_pkey";

ALTER TABLE "audit_events_archived" RENAME TO "audit_events";

ALTER TABLE "audit_events" RENAME CONSTRAINT "audit_events_archived_pkey" TO "audit_events_pkey"")
-- execute("CREATE  FUNCTION table_sync_function_2be879775d()\nRETURNS TRIGGER AS\n$$\nBEGIN\nIF (TG_OP = 'DELETE') THEN\n  DELETE FROM audit_events_part_5fc467ac26 where id = OLD.id;\nELSIF (TG_OP = 'UPDATE') THEN\n  UPDATE audit_events_part_5fc467ac26\n  SET author_id = NEW.author_id,\n    entity_id = NEW.entity_id,\n    entity_type = NEW.entity_type,\n    details = NEW.details,\n    ip_address = NEW.ip_address,\n    author_name = NEW.author_name,\n    entity_path = NEW.entity_path,\n    target_details = NEW.target_details,\n    created_at = NEW.created_at,\n    target_type = NEW.target_type,\n    target_id = NEW.target_id\n  WHERE audit_events_part_5fc467ac26.id = NEW.id;\nELSIF (TG_OP = 'INSERT') THEN\n  INSERT INTO audit_events_part_5fc467ac26 (id,\n    author_id,\n    entity_id,\n    entity_type,\n    details,\n    ip_address,\n    author_name,\n    entity_path,\n    target_details,\n    created_at,\n    target_type,\n    target_id)\n  VALUES (NEW.id,\n    NEW.author_id,\n    NEW.entity_id,\n    NEW.entity_type,\n    NEW.details,\n    NEW.ip_address,\n    NEW.author_name,\n    NEW.entity_path,\n    NEW.target_details,\n    NEW.created_at,\n    NEW.target_type,\n    NEW.target_id);\nEND IF;\nRETURN NULL;\n\nEND\n$$ LANGUAGE PLPGSQL\n")
   -> 0.0782s
-- execute("COMMENT ON FUNCTION table_sync_function_2be879775d IS 'Partitioning migration: table sync for audit_events table'")
   -> 0.0727s
-- current_schema()
   -> 0.0698s
-- execute("CREATE TRIGGER table_sync_trigger_ee39a25f9d\nAFTER INSERT OR UPDATE OR DELETE ON audit_events\nFOR EACH ROW\nEXECUTE FUNCTION table_sync_function_2be879775d()\n")
   -> 0.0713s
== 20201006181141 SwapPartitionedAuditEvents: reverted (1.2408s) ==============

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Patrick Bair

Merge request reports