Skip to content

Use Background Migration to copy historic data between non-partitioned and partitioned tables

This issue is part of the overarching goal to have an automated way to migrate data into partitioned tables. The high-level plan is outlined here: #202618 (closed)

This issue captures the implementation of the final step, which is using a Background Migration to batch copy data from the original table into the partitioned table.

Overview of the complete migration process:

Release N

Post migration to partition the table, which does the following:

  1. Create a copy of the original table partitioned by RANGE (date)
  2. On the new table, create a partition per month, based on data currently present in the original table
  3. Install a trigger on the original table that executes on INSERT/UPDATE/DELETE and writes the change to the partitioned table
  4. Schedule background migrations to copy data from the original table to the partitioned table

Release N+1

Post migration to cleanup after the background migration:

  1. Steal any remaining background migration jobs that haven't finished
  2. Query the background_migration_jobs table and copy data from the original table to the partitioned table, for each range of ids the background migration didn't successfully process
  3. Run a VACUUM FREEZE ANALYZE on the partitioned table

At this point we could manually verify the accuracy of the migration, and benchmark side-by-side performance of the original and partitioned tables in the production environment.

Release N+2

Migration to swap to the new partitioned table:

  1. Drop trigger on the original table
  2. Swap the table names of the original and partitioned table
  3. Delete the original table? (or some point further down the line)
Edited by Patrick Bair