
Drop health checks from WebExporter

What does this MR do and why?

This change removes the code in WebExporter that previously performed health/readiness checks.

These checks have been deprecated for a while now because WebExporter runs only in the Puma primary, not in the workers, and it is the workers that we want to health check. We now do this via HealthController, as documented here: https://docs.gitlab.com/ee/user/admin_area/monitoring/health_check.html#health-check

Note that technically, WebExporter still responds to /readiness and /liveness, since these are exposed by its base class. What we remove are the probes that WebExporter invoked behind those endpoints. We still need those endpoints for Sidekiq at the moment. We have a separate issue to extract them: #345804 (closed)
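To illustrate the distinction between the endpoints and the probes behind them, here is a minimal sketch. All class names and the probe interface are hypothetical and do not mirror the actual GitLab code; it also assumes that an endpoint with no registered probes reports success.

```ruby
# Hypothetical sketch; names and behavior are assumptions, not GitLab's code.
class BaseExporterSketch
  # Probes are callables that return true when the component is healthy.
  def initialize(readiness_probes: [], liveness_probes: [])
    @probes = {
      '/readiness' => readiness_probes,
      '/liveness' => liveness_probes
    }
  end

  # Returns an HTTP-like status code for the given path.
  def call(path)
    return 200 if path == '/metrics'

    probes = @probes[path]
    return 404 if probes.nil?

    # With no probes registered, all? is vacuously true,
    # so the endpoint still answers with 200.
    probes.all?(&:call) ? 200 : 503
  end
end

# The web exporter registers no probes of its own,
# yet inherits both endpoints from the base class.
class WebExporterSketch < BaseExporterSketch
  def initialize
    super()
  end
end

# The Sidekiq exporter still passes a probe to the base class.
class SidekiqExporterSketch < BaseExporterSketch
  def initialize(probe)
    super(readiness_probes: [probe], liveness_probes: [probe])
  end
end
```

In this sketch, WebExporterSketch.new.call('/readiness') still returns a status even though no probe runs, while SidekiqExporterSketch continues to depend on the probe it was given.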

This change also helps decouple WebExporter from the main Rails app, as we are looking to extract it into a separate process in &7304 (closed)

Screenshots or screen recordings

To verify that these endpoints really are unused, I leveraged a recent change that introduced HTTP request metrics for the exporter server instances. The graph below shows that /metrics is indeed the only path queried across all webservice instances:

[Image: Screenshot_from_2022-01-18_13-49-22]

https://thanos-query.ops.gitlab.net/graph?g0.expr=sum%20by%20(type%2C%20pid%2C%20path)%20(exporter_http_requests_total%7Benv%3D%22gprd%22%7D)&g0.tab=1&g0.stacked=0&g0.range_input=1h&g0.max_source_resolution=0s&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D
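For readability, the query encoded in the link above can be decoded from its g0.expr parameter. A small sketch follows; the constant and method names are chosen here for illustration only.

```ruby
require 'uri'

# Percent-encoded value of the g0.expr parameter, copied from the link above.
ENCODED_EXPR = 'sum%20by%20(type%2C%20pid%2C%20path)%20' \
               '(exporter_http_requests_total%7Benv%3D%22gprd%22%7D)'

# Decodes a percent-encoded query parameter value into plain text.
def decode_expr(encoded)
  URI.decode_www_form_component(encoded)
end

puts decode_expr(ENCODED_EXPR)
# prints: sum by (type, pid, path) (exporter_http_requests_total{env="gprd"})
```

The decoded expression sums the exporter request counter by type, pid, and path for the gprd environment, which is what allows the graph to show which paths are actually requested.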

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #350546 (closed)

Edited by Matthias Käppler
