Properly handle stream selection and `selected_properties`, including both `RECORD` and `SCHEMA` messages
This would add proper stream selection and property selection via the input catalog (when provided).
Approach
Following from the discussion here, the selection filtering should occur at four steps in the process:
- Selecting or deselecting entire streams.
- As an input to `get_records()`, so tap developers can avoid collecting data they don't need.
  - This is available today in some manner, if the developer interrogates `Tap.input_catalog` (not a trivial effort).
- As a property filter on the standard (base class) transformation from raw data into the RECORD message.
  - Failsafe for excessive fields being returned from `get_records()`.
- As a property filter on the standard (base class) generation of SCHEMA messages.
  - Many taps don't perform this filtering today, which results in downstream columns being unnecessarily (and confusingly) created in the target's destination tables.
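
The four steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the SDK's implementation: all helper names (`is_stream_selected`, `selected_properties`, `filter_record`, `filter_schema`, `sync`) are hypothetical, and the selection rules are simplified to top-level properties with an explicit `selected` flag or `inclusion: automatic` in Singer-style catalog metadata.

```python
from typing import Any, Callable, Dict, Iterable, List, Set


def is_stream_selected(catalog_entry: Dict[str, Any]) -> bool:
    """Step 1: check the stream-level metadata (empty breadcrumb)."""
    for item in catalog_entry.get("metadata", []):
        if not item.get("breadcrumb"):
            return bool(item.get("metadata", {}).get("selected", False))
    return False


def selected_properties(catalog_entry: Dict[str, Any]) -> Set[str]:
    """Collect top-level property names that are selected or automatic."""
    keep: Set[str] = set()
    for item in catalog_entry.get("metadata", []):
        breadcrumb = item.get("breadcrumb", [])
        meta = item.get("metadata", {})
        if len(breadcrumb) == 2 and breadcrumb[0] == "properties":
            if meta.get("selected") or meta.get("inclusion") == "automatic":
                keep.add(breadcrumb[1])
    return keep


def filter_record(record: Dict[str, Any], keep: Set[str]) -> Dict[str, Any]:
    """Step 3: failsafe that drops fields the developer returned anyway."""
    return {k: v for k, v in record.items() if k in keep}


def filter_schema(schema: Dict[str, Any], keep: Set[str]) -> Dict[str, Any]:
    """Step 4: emit only the selected properties in the SCHEMA message."""
    filtered = dict(schema)
    props = schema.get("properties", {})
    filtered["properties"] = {k: v for k, v in props.items() if k in keep}
    return filtered


def sync(
    catalog_entry: Dict[str, Any],
    schema: Dict[str, Any],
    get_records: Callable[[Set[str]], Iterable[Dict[str, Any]]],
) -> List[Dict[str, Any]]:
    """Build the SCHEMA and RECORD messages for one stream."""
    messages: List[Dict[str, Any]] = []
    if not is_stream_selected(catalog_entry):
        return messages
    stream = catalog_entry["tap_stream_id"]
    keep = selected_properties(catalog_entry)
    messages.append(
        {"type": "SCHEMA", "stream": stream, "schema": filter_schema(schema, keep)}
    )
    # Step 2: the selection is passed in so the tap can skip unneeded data.
    for record in get_records(keep):
        messages.append(
            {"type": "RECORD", "stream": stream, "record": filter_record(record, keep)}
        )
    return messages


# Illustrative data only: a stream with one deselected property.
entry = {
    "tap_stream_id": "users",
    "metadata": [
        {"breadcrumb": [], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "id"], "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "name"], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "secret"], "metadata": {"selected": False}},
    ],
}
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "secret": {"type": "string"},
    },
}


def get_records(keep: Set[str]) -> Iterable[Dict[str, Any]]:
    # A tap that ignores `keep` and returns every field.
    yield {"id": 1, "name": "a", "secret": "x"}


messages = sync(entry, schema, get_records)
```

Here the example `get_records()` deliberately ignores the selection it receives, so the base-class style filter in `filter_record` is what keeps `secret` out of the RECORD message, and `filter_schema` keeps it out of the SCHEMA message.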
Edited by AJ Steers