Target sink - optimization strategies for when to flush batches
When deciding when to drain or flush a collection of target sinks, priority could be given to a number of different objectives:
- Drain often for the benefit of:
    - reduced memory pressure by flushing stored records
    - reduced latency, at least for earlier-emitted record(s)
    - more frequent checkpoints, i.e. an increased frequency of emitted STATE messages
- Drain less often for the benefit of:
    - Fewer overall batches
    - Efficiency of bulk-loading records at high scale
    - Lower costs on destination platforms which may charge or meter per batch
        - For instance, Snowflake charges less when running 1 minute out of every 15 than when running intermittently for all 15 minutes.
- Other factors to consider:
    - defining 'full' for each sink
        - Each sink should report when it is full, either by writing custom `is_full()` logic or else by specifying a max record count (see the sketch after this list).
    - controlling max per-record latency
        - We may want to provide a max per-record latency threshold - for instance, prioritizing a batch to be loaded if it contains one or more records that have been in the queue for over 120 minutes.
    - draining multiple sinks when one is triggered ('toppling')
        - When one sink is being drained, we may want to opportunistically drain all others at the same time. This could have benefits for metering and platform costs.
            - For instance, in the Snowflake case it is cheaper to flush all sinks at once and spend fewer minutes of each hour running batches.
        - Draining all sinks also allows us to flush the stored state message(s).
    - memory pressure
        - If memory pressure is detected, this might force the flush of one or more streams.
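To make these factors concrete, below is a minimal sketch of how they could combine. This is not an existing SDK interface; the class, method, and parameter names (`Sink`, `is_full()`, `maybe_drain_all()`, `max_record_count`, `max_record_age_secs`) are hypothetical. It shows a sink that reports 'full' based on either a max record count or a max per-record latency, plus a 'toppling' helper that drains all sinks when any one of them triggers and then emits the stored STATE message.

```python
import json
import sys
import time


class Sink:
    """Hypothetical sink that buffers records and reports when it should be drained."""

    def __init__(self, stream_name, max_record_count=10_000, max_record_age_secs=120 * 60):
        self.stream_name = stream_name
        self.max_record_count = max_record_count        # 'full' threshold by record count
        self.max_record_age_secs = max_record_age_secs  # max per-record latency (e.g. 120 minutes)
        self._records = []
        self._oldest_record_ts = None

    def add_record(self, record):
        """Buffer a record, remembering when the oldest buffered record arrived."""
        if not self._records:
            self._oldest_record_ts = time.monotonic()
        self._records.append(record)

    def is_full(self):
        """Report 'full' when either the record-count or per-record-latency threshold is hit.

        A destination-specific sink could override this with custom logic instead.
        """
        if len(self._records) >= self.max_record_count:
            return True
        if self._oldest_record_ts is not None:
            age_secs = time.monotonic() - self._oldest_record_ts
            if age_secs >= self.max_record_age_secs:
                return True
        return False

    def drain(self):
        """Bulk-load the buffered records to the destination, then reset the buffer."""
        # ... destination-specific bulk load of self._records goes here ...
        self._records = []
        self._oldest_record_ts = None


def maybe_drain_all(sinks, latest_state):
    """'Toppling': when any one sink reports full, opportunistically drain every sink.

    Because every sink is now flushed, it is safe to emit the stored STATE message
    as a checkpoint. (Memory pressure could be another trigger for this same path.)
    """
    if any(sink.is_full() for sink in sinks):
        for sink in sinks:
            sink.drain()
        # Singer STATE message format: {"type": "STATE", "value": {...}}
        sys.stdout.write(json.dumps({"type": "STATE", "value": latest_state}) + "\n")
        sys.stdout.flush()
```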
Our strategy for this (broadly) should probably be to have at least two layers:
- The developer provides some default logic or prioritization strategy that is tuned to work well for the destination system.
- The user may optionally override these defaults at runtime using config options.
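A minimal sketch of those two layers, building on the hypothetical `Sink` class above (the `batch_max_records` config option name is likewise an assumption, not an existing setting):

```python
class SnowflakeSink(Sink):
    """Hypothetical destination-specific sink showing the two configuration layers."""

    # Layer 1: developer-provided default, tuned for the destination system.
    DEFAULT_MAX_RECORD_COUNT = 100_000

    def __init__(self, stream_name, config=None):
        config = config or {}
        # Layer 2: optional user override at runtime via a config option.
        max_record_count = config.get("batch_max_records", self.DEFAULT_MAX_RECORD_COUNT)
        super().__init__(stream_name, max_record_count=max_record_count)
```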