Expose metrics on Advanced Search project indexing status
Problem to solve
As a self-managed customer, we want to monitor the Advanced Search indexing state of projects via metrics exposed by GitLab. Specifically, we want to monitor the number of indexed vs. not-indexed projects, so that our team can take action (e.g. manually trigger project reindexing) if indexing backs up.
Proposal
Updated proposal from the discussion below. GitLab.com will also consume the new metrics, and will add new alerts or modify existing ones to use them.
Add two new Prometheus metrics:
- global_search_indexing_paused: true/false (or 0/1)
- global_search_index_repair_total: a counter incremented every time index repair queues a project for indexing
Alerting can look at:
- global_search_index_repair_total: a large number of these could indicate an issue. If this alerts, open a request for help or an issue with group::global search
- global_search_indexing_paused combined with existing metrics over a threshold. I recommend a threshold depending on how busy the Dedicated instance is. You could start at 10,000 and adjust if needed
  - global_search_bulk_cron_queue_size
  - global_search_bulk_cron_initial_queue_size
  - global_search_awaiting_indexing_queue_size
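To make the two alert conditions above concrete, here is a minimal Python sketch of how they could be evaluated against one scrape of the metrics. It is one possible reading of the proposal, not the actual alert rules: the function names, the alert names, and the repair threshold of 100 are hypothetical. Only the metric names and the starting queue threshold of 10,000 come from this issue. It also assumes "combined with" means the alert fires only when indexing is paused and at least one queue is over the threshold.

```python
# Sketch only: function names, alert names, and repair_threshold are
# hypothetical. Metric names and the 10,000 queue threshold are from the issue.

QUEUE_METRICS = (
    "global_search_bulk_cron_queue_size",
    "global_search_bulk_cron_initial_queue_size",
    "global_search_awaiting_indexing_queue_size",
)


def counter_increase(previous: float, current: float) -> float:
    """Increase of a counter between two scrapes, tolerating a reset to zero."""
    if current < previous:
        # The counter restarted (e.g. process restart), so count from zero.
        return current
    return current - previous


def evaluate_alerts(
    samples: dict,
    previous_repair_total: float,
    queue_threshold: float = 10_000,  # starting point suggested in the issue
    repair_threshold: float = 100,    # hypothetical value, tune per instance
) -> list:
    """Return the names of the alerts that would fire for this scrape."""
    alerts = []

    # Condition 1: index repair queued an unusually large number of projects.
    repairs = counter_increase(
        previous_repair_total,
        samples.get("global_search_index_repair_total", 0.0),
    )
    if repairs > repair_threshold:
        alerts.append("index_repair_spike")

    # Condition 2 (assumed reading): indexing is paused while a queue backs up.
    paused = samples.get("global_search_indexing_paused", 0) == 1
    backed_up = [
        name for name in QUEUE_METRICS
        if samples.get(name, 0) > queue_threshold
    ]
    if paused and backed_up:
        alerts.append("paused_with_backlog")

    return alerts
```

In a real deployment these conditions would more likely be written as Prometheus alerting rules. The sketch only illustrates the logic, and that a counter such as global_search_index_repair_total is alerted on by its increase over a window rather than by its absolute value.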
Additional Context
We (Dedicated) have not been able to formulate a query that accurately reflects this indexing state using the general Elasticsearch metrics (elasticsearch_exporter). More details on the investigation can be found in https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/5143.