`meltano elt` should fail when a pipeline with the same job ID is still running
As brought up by Niall Woodward on Slack:
How does Meltano behave when a run is started before the previous has completed? I can't find that in the docs. Example where this could happen is with a scheduler running hourly and a run that takes over an hour
I responded there:
`meltano elt` could check the system database to see if a job with the same job ID is already running, but it doesn't currently, so 2 parallel `meltano elt`s would start with the same incremental state (of the most recent complete run) and would both try to extract (and load) all the new data. In that scenario, it'd be up to the loader and the database to use UPSERT, primary keys, and uniqueness constraints to prevent duplicate records from being created.
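To illustrate the kind of protection a loader and database can provide when two overlapping runs load the same records, here is a minimal sketch using SQLite from Python's standard library. The `users` table and the `load_batch` helper are hypothetical and only for illustration; real loaders implement this in their own way.

```python
import sqlite3


def load_batch(conn, rows):
    """Upsert (id, name) rows so that loading the same batch twice is idempotent."""
    conn.executemany(
        """
        INSERT INTO users (id, name) VALUES (?, ?)
        ON CONFLICT (id) DO UPDATE SET name = excluded.name
        """,
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# The primary key is what lets the database detect a duplicate record.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

batch = [(1, "ada"), (2, "grace")]
load_batch(conn, batch)  # first run
load_batch(conn, batch)  # overlapping second run loads the same records

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Without the primary key and the `ON CONFLICT` clause, the second load would either fail or insert the same records again, depending on the table definition.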
Since a job in the system database could get stuck in the `running` state if the job is killed, `meltano elt` should support a `--force` flag (or something to that effect) to skip this "is this pipeline already running?" check.
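A rough sketch of what the proposed check could look like, assuming a simplified `jobs` table with `job_id` and `state` columns. The table layout, the `start_job` function, and the `PipelineAlreadyRunningError` exception are all hypothetical and do not reflect Meltano's actual system database schema or code.

```python
import sqlite3


class PipelineAlreadyRunningError(Exception):
    """Raised when a job with the same job ID is still marked as running."""


def start_job(conn, job_id, force=False):
    """Record a new running job, refusing if one with the same ID is already running.

    Passing force=True skips the check, e.g. when a killed job was left
    stuck in the running state.
    """
    running = conn.execute(
        "SELECT COUNT(*) FROM jobs WHERE job_id = ? AND state = 'RUNNING'",
        (job_id,),
    ).fetchone()[0]
    if running and not force:
        raise PipelineAlreadyRunningError(
            f"A pipeline with job ID '{job_id}' is already running; "
            "use force to start anyway."
        )
    cursor = conn.execute(
        "INSERT INTO jobs (job_id, state) VALUES (?, 'RUNNING')", (job_id,)
    )
    conn.commit()
    return cursor.lastrowid


# Hypothetical stand-in for the system database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_id TEXT, state TEXT)")

first = start_job(conn, "gitlab-to-postgres")
try:
    start_job(conn, "gitlab-to-postgres")
    blocked = False
except PipelineAlreadyRunningError:
    blocked = True
forced = start_job(conn, "gitlab-to-postgres", force=True)
```

Note that the check and the insert in this sketch are separate statements, so two runs starting at nearly the same moment could both pass the check; a real implementation would likely need a transaction or a uniqueness constraint to close that gap.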