Make health check timing configurable for logical restore
We are experiencing failed data refreshes, and we think this is because the restore container is being marked as unhealthy too quickly. We see the following log lines:
2022/03/31 01:43:20 restore.go:217: [INFO] Running container: dblab_lr_c7n823nlih4ne84j42e0. ID: ffddc121c4b5b399418120e297a195e35648c735f9309bc3974acc8d3c8c55ae
2022/03/31 01:43:20 restore.go:225: [INFO] Waiting for container readiness
2022/03/31 01:43:20 tools.go:285: [INFO] Check container readiness: ffddc121c4b5b399418120e297a195e35648c735f9309bc3974acc8d3c8c55ae
2022/03/31 01:43:47 tools.go:343: [INFO] Container logs:
2022/03/31 01:43:47 tools.go:371: [INFO] Removing container ID: ffddc121c4b5b399418120e297a195e35648c735f9309bc3974acc8d3c8c55ae
2022/03/31 01:44:20 tools.go:377: [INFO] Container "ffddc121c4b5b399418120e297a195e35648c735f9309bc3974acc8d3c8c55ae" has been stopped
2022/03/31 01:44:20 tools.go:388: [INFO] Container "ffddc121c4b5b399418120e297a195e35648c735f9309bc3974acc8d3c8c55ae" has been removed
2022/03/31 01:44:20 retrieval.go:412: [ERROR] Failed to run full-refresh failed to readiness check: container health check failed
After some tests, we managed to start the dblab_lr_c7n823nlih4ne84j42e0 container manually, using the extended-postgres image, with the following command:
docker run --name db-repair \
-e PGDATA=/var/lib/dblab/outbound/outbound_01/data \
-e POSTGRES_PASSWORD=does_not_matter \
-e PG_SERVER_VERSION=11 \
-v /var/lib/dblab/outbound/outbound_01/data:/var/lib/dblab/outbound/outbound_01/data \
-v /var/lib/dblab/outbound/outbound_01/dump:/var/lib/dblab/outbound/outbound_01/dump \
oasissharedregistry.azurecr.io/postgresai/extended-postgres-ssl:11 sleep infinity
By doing this, the logical restore container has more time to start up and do its database work. After this, DLE is in a healthy state and we can create clones.
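For illustration, here is what more generous timing could look like when expressed with Docker's own health check flags. In the log above, the readiness check gave up about 27 seconds after it started. This is only a sketch: it assumes the readiness check relies on Docker's container health status, the variable names and default values are hypothetical, and `pg_isready` is an assumed health command, not necessarily the one DLE uses. The script only prints the command (dry run) and does not start a container.

```shell
#!/bin/sh
# Hypothetical tunables: names and defaults are illustrative, not DLE settings.
HEALTH_INTERVAL="${HEALTH_INTERVAL:-5s}"
HEALTH_TIMEOUT="${HEALTH_TIMEOUT:-2s}"
HEALTH_START_PERIOD="${HEALTH_START_PERIOD:-120s}"
HEALTH_RETRIES="${HEALTH_RETRIES:-60}"

# Collect the docker run health check flags as positional parameters.
# pg_isready is an assumed health command for this sketch.
set -- \
  --health-cmd "pg_isready" \
  --health-interval "$HEALTH_INTERVAL" \
  --health-timeout "$HEALTH_TIMEOUT" \
  --health-start-period "$HEALTH_START_PERIOD" \
  --health-retries "$HEALTH_RETRIES"

# Dry run: build and print the command instead of executing it.
IMAGE="oasissharedregistry.azurecr.io/postgresai/extended-postgres-ssl:11"
DRY_RUN_CMD="docker run --name db-repair $* $IMAGE sleep infinity"
echo "$DRY_RUN_CMD"
```

Failed probes during the start period do not count toward the retry limit, so a slow first start is not reported as unhealthy right away. Values can be overridden per run, for example `HEALTH_START_PERIOD=300s sh health-flags.sh` (the script name is hypothetical).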
It would be helpful to be able to influence the following hardcoded healthcheck configuration.