Add timeout monitoring to zlonk
Improve observability in zlonk: last week's failure went unnoticed until the Data Team reported that data ingestion had failed.
- ensure we detect when the database fails to go into production mode after recovery
- add timeout monitoring (runbook)
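
A minimal sketch of what the timeout check could look like, assuming the "is the database in production mode yet?" test can be wrapped in a callable. The names `wait_until_ready`, `is_ready`, `timeout_s`, and `poll_interval_s` are hypothetical and are not part of zlonk; the real readiness check and the alerting hook would need to be wired in separately.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the check succeeded in time, False on timeout.
    is_ready is a hypothetical stand-in for whatever confirms that the
    database has left recovery and entered production mode.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Timed out: the caller should raise an alert here.
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, remaining))
```

A caller could treat a `False` return as the signal to emit a metric or page the on-call, so that a stuck recovery is caught by monitoring rather than by a downstream team.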
Edited by Gerardo Lopez-Fernandez