Retry connections to database used for internal data
Problem to solve
We need to retry the connection to the database used for internal storage so that Meltano doesn't fail to start because of network interruptions or slow database startup. This matters most when using an external database, as recommended by the production setup guide.
Target audience
People who run Meltano in production.
Further details
There should probably be two ENV settings that let the end user configure the maximum number of retries and the timeout (wait time) between those retries.
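As a rough sketch of how the two settings could be read, assuming hypothetical variable names (`MELTANO_DATABASE_MAX_RETRIES`, `MELTANO_DATABASE_RETRY_TIMEOUT`) and placeholder defaults that are not confirmed Meltano settings:

```python
import os

# Hypothetical defaults, chosen only for illustration.
DEFAULT_MAX_RETRIES = 3
DEFAULT_RETRY_TIMEOUT = 5.0  # seconds to wait between attempts


def load_retry_settings(environ=os.environ):
    """Read the retry settings from the environment, falling back to defaults.

    The variable names below are assumptions made for this sketch.
    """
    max_retries = int(environ.get("MELTANO_DATABASE_MAX_RETRIES", DEFAULT_MAX_RETRIES))
    retry_timeout = float(environ.get("MELTANO_DATABASE_RETRY_TIMEOUT", DEFAULT_RETRY_TIMEOUT))
    return max_retries, retry_timeout
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.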
What does success look like, and how can we measure that?
- The Meltano application can be started before the database without immediately failing, and it connects successfully after retrying once the database is online.
- Meltano should also attempt to reconnect to the database whenever the connection is lost and continue operating once it has reconnected.
- Meltano uses the ENV configuration for the maximum number of retries and fails with an error if that number is exceeded.
- Meltano uses the ENV configuration for timeout to space out its attempts to reconnect.
- Defaults are included for the number of retries and the timeout, so end users don't need to configure them and Meltano still operates if those settings are missing from the ENV.
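The criteria above could be met by a loop along these lines. This is a minimal sketch, not Meltano's implementation: `connect_with_retry` and its parameters are hypothetical, and `connect` stands in for whatever callable opens the database connection. It assumes failures surface as `ConnectionError`; with SQLAlchemy, for example, the exception to catch would more likely be `sqlalchemy.exc.OperationalError`.

```python
import time


def connect_with_retry(connect, max_retries=3, retry_timeout=5.0, sleep=time.sleep):
    """Call `connect` until it succeeds or the retry budget is used up.

    `connect` is any zero-argument callable that returns a connection or
    raises ConnectionError (an assumption made for this sketch).
    """
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError as err:
            if attempt >= max_retries:
                # Retry budget exhausted: fail with an explicit error.
                raise RuntimeError(
                    f"Could not connect to the database after {max_retries} retries"
                ) from err
            attempt += 1
            # Space out the attempts by the configured timeout.
            sleep(retry_timeout)
```

The same function could be called at startup and again whenever an established connection is lost, which covers both the first and second criteria.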
Links / references
Related to this MR for use in a Docker environment.
Relevant recommendation from the Docker startup order documentation:
To handle this, design your application to attempt to re-establish a connection to the database after a failure. If the application retries the connection, it can eventually connect to the database.
The best solution is to perform this check in your application code, both at startup and whenever a connection is lost for any reason.