Support additional filters when iterating in batches

Context:

Currently, the PrimaryKeyBatchingStrategy always iterates over the whole table. This can cause performance issues if we only want to transform a small subset of data.

Hypothetical example:

  • Table: Cars
  • Records: 200M
  • BMW cars: 100 (Cars.where(brand: 'bmw').size)

Problem: If we want to run a data transformation on BMW cars only, the batching strategy still iterates over all 200M records to reach the 100 that match.

Goal:

Add support for applying additional filters when iterating a table.
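
To illustrate the idea, here is a minimal, self-contained sketch in plain Ruby. It is not the PrimaryKeyBatchingStrategy implementation. The `Car` struct, the `CARS` array, the `each_batch` helper, and its `filter:` argument are all hypothetical names, and the in-memory array only stands in for a table (scaled down to 10,000 rows). The filter plays the role of an extra WHERE condition applied to the relation before it is split into primary-key batches.

```ruby
# Hypothetical in-memory stand-in for the Cars table:
# 10_000 rows ordered by id, 100 of which are BMWs.
Car = Struct.new(:id, :brand)
CARS = (1..10_000).map { |id| Car.new(id, (id % 100).zero? ? 'bmw' : 'other') }

# Yields batches ordered by primary key. Each batch starts after the last id
# of the previous one. `filter` is an optional predicate that narrows the
# scope before batching (an assumption standing in for an extra WHERE clause).
def each_batch(rows, batch_size:, filter: nil)
  scope = filter ? rows.select(&filter) : rows
  last_id = 0

  loop do
    batch = scope.select { |row| row.id > last_id }.first(batch_size)
    break if batch.empty?

    yield batch
    last_id = batch.last.id
  end
end

# Without a filter, every row in the table ends up in some batch.
unfiltered_batches = 0
each_batch(CARS, batch_size: 1_000) { unfiltered_batches += 1 }

# With a filter, only the matching rows are batched.
filtered_batches = 0
filtered_rows = 0
each_batch(CARS, batch_size: 1_000, filter: ->(car) { car.brand == 'bmw' }) do |batch|
  filtered_batches += 1
  filtered_rows += batch.size
end
```

In this sketch the unfiltered run produces 10 batches covering the whole table, while the filtered run produces a single batch of 100 rows. In a real database the benefit depends on the filter being backed by a suitable index; otherwise each batch query may still scan many non-matching rows.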

Edited by Krasimir Angelov