Allow syncing a stream without its parent stream
Summary
Could the SDK add a way to sync a stream without automatically syncing its parent stream?
Proposed benefits
In our use case with MeltanoLabs/tap-github, we sync different streams at different rates (e.g. issues are synced hourly, while the parent stream is only synced daily). With the current SDK logic, I cannot sync issues without also syncing the parent repositories stream.
The reason is the "cost" of accessing the resource, in the form of rate limits: every extra sync of the parent stream uses up API quota.
I think the use case probably applies to other taps where access to the resource is constrained somehow, either by money or technical limits, which is why I'm opening this ticket here rather than in the tap's repo.
Proposal details
The SDK decides whether to sync a stream with:
    for stream in self.streams.values():
        if not stream.selected and not stream.has_selected_descendents:
            self.logger.info(f"Skipping deselected stream '{stream.name}'.")
            continue
where has_selected_descendents is a @final property. In theory, a tap could simply override this property and return False depending on some config option, but I suspect that a clearly defined mechanism that all SDK-based taps can share would be nicer.
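To make the idea concrete, here is a minimal sketch of what such an opt-out could look like. It is a toy model, not the SDK's code: the Stream class is a stand-in, and the function streams_to_sync and the flag skip_unselected_parents are hypothetical names invented for illustration. It only models the selection check quoted above and leaves open how a child stream would get whatever it needs from its parent.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stream:
    """Toy stand-in for an SDK stream (NOT the real singer_sdk class)."""

    name: str
    selected: bool = False
    children: list[Stream] = field(default_factory=list)

    @property
    def has_selected_descendents(self) -> bool:
        # Same spelling as the SDK property quoted in this issue.
        return any(
            child.selected or child.has_selected_descendents
            for child in self.children
        )


def streams_to_sync(
    streams: list[Stream],
    skip_unselected_parents: bool = False,  # hypothetical config option
) -> list[str]:
    """Return the names of the streams the sync loop would not skip."""
    names = []
    for stream in streams:
        if stream.selected:
            names.append(stream.name)
        elif stream.has_selected_descendents and not skip_unselected_parents:
            # Current behaviour: a deselected parent is still synced
            # because one of its descendants is selected.
            names.append(stream.name)
    return names


issues = Stream("issues", selected=True)
repositories = Stream("repositories", children=[issues])

print(streams_to_sync([repositories, issues]))
print(streams_to_sync([repositories, issues], skip_unselected_parents=True))
```

With the flag left at its default the toy loop keeps today's behaviour, so the option would be opt-in and existing taps would be unaffected.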
Best reasons not to build
I can't think of any definitive reason not to implement this, though it could reasonably be argued that this should be left to each tap.