Set a timeout for Praefect's SQL metrics

Praefect currently doesn't set a timeout when collecting SQL metrics. The same is true of all the Prometheus metrics it collects, but most of them are read from in-memory counters. Some, however, have to be fetched from the database. If such a query runs for longer than Prometheus waits for the scrape results, Prometheus may start another scrape while the original query is still running. This can cause metrics queries to pile up in the database and eventually overload it.

While it would be ideal to have a context available to cancel the queries as soon as Prometheus stops waiting for the scrape results, the Collector interface doesn't pass one down. This commit adds a configurable timeout that is applied to the metrics queries. The default is 10 seconds, which is ample time to collect the metrics.

Closes #3242 (closed)

Edited by Sami Hiltunen
