New silent parent stream or "MultiStream" class proposal for taps
There are two recently uncovered use cases for a "MultiStream" Stream class in the Tap SDK, meaning a class that reads data serially from a single source containing multiple streams and schemas:
- Binlogs - Database logs are system- or database-wide, and the most efficient way to parse them is to read from beginning to end and emit whatever (qualified) stream records are found.
- "Replay" capability - For a proposed "--replay" capability, we may need to read a similar serial string of data containing multiple streams' records, and then emit each record to its stream.
For either scenario (binlogs being the more common and timely), we'd want to deserialize lines of text into "records" where we don't know in advance which stream the line would apply to.
Proposal
A new "MultiStream" class could read in data from any number of streams, qualify whether each record should be emitted, and then either (a) emit the record downstream directly or (b) delegate to another stream object which is custom to that stream type.
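To make the idea concrete, here is a rough sketch of how such a class might behave. It is not an existing Tap SDK API: the `MultiStream` shape, the `qualifies` and `read` method names, the `orders_delegate` helper, and the input format (one JSON object per line with `stream` and `record` keys) are all assumptions made for illustration.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Optional, Set

# Hypothetical type: a per-stream handler that receives a raw record
# and returns the record to emit.
Delegate = Callable[[dict], dict]


class MultiStream:
    """Illustrative sketch only; not part of the Tap SDK."""

    def __init__(
        self,
        selected_streams: Set[str],
        delegates: Optional[Dict[str, Delegate]] = None,
    ) -> None:
        self.selected_streams = selected_streams
        self.delegates = delegates or {}

    def qualifies(self, stream_name: str) -> bool:
        # Only records for selected streams should be emitted.
        return stream_name in self.selected_streams

    def read(self, lines: Iterable[str]) -> Iterator[dict]:
        # Read serially; the owning stream is only known after parsing a line.
        for line in lines:
            entry = json.loads(line)  # assumed line format, see lead-in
            stream_name = entry["stream"]
            if not self.qualifies(stream_name):
                continue
            record = entry["record"]
            delegate = self.delegates.get(stream_name)
            if delegate is not None:
                # (b) hand off to stream-specific logic
                record = delegate(record)
            # (a) otherwise the record is emitted as-is
            yield {"type": "RECORD", "stream": stream_name, "record": record}


def orders_delegate(record: dict) -> dict:
    # Hypothetical stream-specific handling: add a derived field.
    return {**record, "total": record["qty"] * record["unit_price"]}


log_lines = [
    '{"stream": "users", "record": {"id": 1, "name": "Ada"}}',
    '{"stream": "audit", "record": {"event": "login"}}',
    '{"stream": "orders", "record": {"id": 7, "qty": 2, "unit_price": 5}}',
]

multi = MultiStream({"users", "orders"}, {"orders": orders_delegate})
messages = list(multi.read(log_lines))
```

In this sketch the `audit` line is skipped because that stream is not selected, the `users` record is emitted directly, and the `orders` record passes through its delegate first. A real implementation would also need to handle schemas and state, which are left out here.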
Edited by AJ Steers