Create Elasticsearch indexing utilization SLI
To set an SLO on the indexing process, we need a good way of measuring the actual performance of the whole process, which we currently don't have. We monitor the Elasticsearch indexing/initial indexing queue length and RPS, but we don't have a clear metric for the utilization SLI.

This metric should behave as follows:

- Whenever the queue size increases, it should decrease
- Whenever the queue size decreases, it should increase
- Whenever the queue size < RPS, it should be 0 (or baseline)
- Whenever the RPS increases, it should increase
- Whenever the RPS decreases, it should decrease

Simply put, we should derive a metric that expresses the rate at which we are clearing the queue, whenever there is a queue. Then I think we'll be in a good position to set up SLOs on this metric.
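As a starting point for discussion, one candidate that satisfies the properties listed above is the ratio of RPS to queue size, which is the fraction of the queue cleared per second. This is only a sketch: the function name `clearing_rate`, its parameters, and the choice of a simple ratio are assumptions for illustration, not an existing metric or an agreed definition.

```python
def clearing_rate(queue_size: float, rps: float, baseline: float = 0.0) -> float:
    """Hypothetical SLI: fraction of the queue cleared per second.

    Returns `baseline` when there is no meaningful backlog, which is
    assumed here to mean the queue is empty or smaller than the RPS.
    """
    # No backlog: the queue would be drained in under a second.
    if queue_size <= 0 or queue_size < rps:
        return baseline
    # Backlog present: a higher RPS raises the value, a longer queue lowers it.
    # Negative RPS samples are treated as 0 (assumption).
    return max(rps, 0.0) / queue_size


if __name__ == "__main__":
    print(clearing_rate(1000, 100))  # backlog of 1000 items at 100 RPS
    print(clearing_rate(2000, 100))  # longer queue, same RPS: lower value
    print(clearing_rate(1000, 200))  # same queue, higher RPS: higher value
    print(clearing_rate(50, 100))    # queue smaller than RPS: baseline
```

One property to be aware of with this candidate: the value jumps from 1 to the baseline as the queue size drops below the RPS, so an SLO threshold on it would probably need to treat the baseline as "healthy" rather than as the lowest value.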
issue