
Add `--test=schema` option to emit tap SCHEMA messages only

Laurent Savaëte requested to merge LaurentS/sdk:option-to-dump-schema into main

This MR proposes an extra CLI option `--schema` (now: `--test=schema`) which makes a tap emit a SCHEMA message for each of its streams but no RECORD messages. This lets a downstream target process the schemas without the tap having to fetch any data (and potentially incur the associated cost of getting it).
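To illustrate the intended output, here is a minimal sketch of "SCHEMA messages only" in plain Python. It is not the SDK's implementation: the function name `schema_messages` and the example streams are hypothetical, and the message layout assumes the standard Singer SCHEMA fields (`type`, `stream`, `schema`, `key_properties`).

```python
import json
from typing import Dict, Iterator, List


def schema_messages(
    streams: Dict[str, dict],
    key_properties: Dict[str, List[str]],
) -> Iterator[str]:
    """Yield one Singer SCHEMA message (as a JSON line) per stream.

    No RECORD or STATE messages are produced, so no source API is called.
    """
    for name, schema in streams.items():
        message = {
            "type": "SCHEMA",
            "stream": name,
            "schema": schema,
            "key_properties": key_properties.get(name, []),
        }
        yield json.dumps(message)


# Hypothetical stream definitions, for illustration only.
EXAMPLE_STREAMS = {
    "repositories": {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    },
    "issues": {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "title": {"type": "string"}},
    },
}
EXAMPLE_KEYS = {"repositories": ["id"], "issues": ["id"]}

if __name__ == "__main__":
    # Each printed line could be piped to a target, which would create the tables.
    for line in schema_messages(EXAMPLE_STREAMS, EXAMPLE_KEYS):
        print(line)
```

A target reading this output sees every stream's schema and can create the corresponding tables, which is all we need in CI.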

Reference Slack chat: https://meltano.slack.com/archives/C01PKLU5D1R/p1637173666242800

This comes from our use case with tap-github and target-postgres, where we currently need to run the tap on all streams, and therefore generate requests against the GitHub API, just to create the tables in the database. This is particularly problematic in CI, where we can't test any downstream code without these tables.

@aaronsteers suggested `tap-github --test` in the chat above. While this works, in my test it produced 3065 messages (for 18 streams) and the run took 40+ seconds to complete (!204 (merged) should improve this a bit). With this change, I was able to get the full set of tables set up in about 5 seconds, without connecting to the GitHub API.

On the downside, I am not sure how this would work with dynamic schemas.

Edited by AJ Steers
