Pipeline log contains confusing error messages when a tap doesn't support autodiscovery
When running a pipeline based on tap-gitlab, the job log starts like this:
Running extract & load...
Found state from 2019-10-31 11:07:34.275664.
before_invoke has failed: Command /Users/douwemaan/meltano-projects/carbon/.meltano/extractors/tap-gitlab/venv/bin/tap-gitlab ['/Users/douwemaan/meltano-projects/carbon/.meltano/extractors/tap-gitlab/venv/bin/tap-gitlab', '--config', '/Users/douwemaan/meltano-projects/carbon/.meltano/run/elt/pipeline-1572515341624/e8f753dc-176c-4261-8f7d-c2d3eac66024/tap.config.json', '--state', '/Users/douwemaan/meltano-projects/carbon/.meltano/run/elt/pipeline-1572515341624/e8f753dc-176c-4261-8f7d-c2d3eac66024/state.json'] returned 1
Could not select stream, catalog file is missing.
INFO Starting sync
INFO GET https://gitlab.com/api/v4/groups/meltano
...
The third and fourth lines both look like pretty serious error messages, but after that, extraction continues just fine and ultimately completes successfully. So what's the deal with those error messages?
Both have to do with schema autodiscovery and selection, but that's not clear from the messages themselves. tap-gitlab doesn't actually support autodiscovery, which explains why the command failed and why this doesn't affect extraction, but the user reading the logs doesn't know that.
If an extractor/tap doesn't support autodiscovery, we shouldn't attempt to autodiscover at all, and we definitely shouldn't complain to the user that something that isn't expected to work isn't working.
If an extractor does support autodiscovery, but discovery fails for some reason, the error message should say that discovery is what failed and why, instead of printing the raw command and its exit code.
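A rough sketch of the behavior proposed above, not Meltano's actual implementation. It assumes each tap declares a list of capability strings, and that a tap supporting discovery follows the Singer convention of accepting a `--discover` flag. The names `discover_catalog`, `DiscoveryError`, and the `capabilities` argument are hypothetical and only serve the illustration.

```python
import subprocess
from typing import Optional, Sequence


class DiscoveryError(Exception):
    """Raised when a tap that declares discovery support fails to discover."""


def discover_catalog(
    tap_command: Sequence[str],
    capabilities: Sequence[str],
    run=subprocess.run,  # injectable so the sketch can be exercised without a real tap
) -> Optional[str]:
    """Return the discovered catalog as text, or None if discovery is unsupported.

    `capabilities` is a hypothetical list of strings declared for the tap.
    """
    if "discover" not in capabilities:
        # The tap is not expected to support discovery: skip it silently
        # instead of invoking the tap and logging a failure.
        return None

    result = run([*tap_command, "--discover"], capture_output=True, text=True)
    if result.returncode != 0:
        # The tap claims discovery support, so this is a real problem.
        # Say what failed and why, rather than dumping the raw command.
        details = (result.stderr or "").strip()
        raise DiscoveryError(
            f"Schema discovery failed for {tap_command[0]} "
            f"(exit code {result.returncode}): {details}"
        )
    return result.stdout
```

With something like this, a tap without the `discover` capability would produce no discovery-related log lines at all, and a tap with it would produce a single message that names discovery as the failing step.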