Improve ClusterReindexingService states and logging

Description

We have run into several problems with Elasticsearch (ES) migrations that use MigrationReindexTaskHelper, and the available logs did not contain enough information to understand them, e.g. !189044 (comment 2475476273) or !190508 (comment 2504271245).

After a discussion with members of the Global Search group, we decided to improve the ClusterReindexingService class because the states of the current reindexing task don't reflect the real state of the system.

ReindexingTask States

Current states

Initial - indexing is paused here
Indexing paused - tasks kicked off to ES
Reindexing - monitoring, check for success at the end

New proposed states

Preflight check - everything that runs before any changes are made, so errors can be returned to the user
Pause ES indexing
Set the index as read-only
Kick off reindexing tasks to ES
Reindexing monitoring
Check for success
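
The proposed states could be modeled as an ordered sequence in which each transition is logged, which would also help with the missing-logs problem described above. The following is only a minimal sketch in plain Ruby: the class name, the state symbols, and the `advance!` method are hypothetical and do not reflect the actual ClusterReindexingService or ReindexingTask code.

```ruby
# Hypothetical sketch: names are illustrative, not the real GitLab implementation.
class ReindexingStateSketch
  # One symbol per proposed state, in the order the task moves through them.
  STATES = %i[
    preflight_check
    pause_indexing
    set_read_only
    start_reindexing
    monitor_reindexing
    check_success
  ].freeze

  attr_reader :state, :log

  def initialize
    @state = STATES.first
    @log = []
  end

  # Move to the next state and record the transition, so every step is visible in the log.
  def advance!
    next_state = STATES[STATES.index(@state) + 1]
    raise "already at final state #{@state}" if next_state.nil?

    @log << "#{@state} -> #{next_state}"
    @state = next_state
  end

  def final?
    @state == STATES.last
  end
end
```

Because each step is a separate state, a failure can be attributed to one specific step (for example, setting the index as read-only) rather than to a broad "reindexing" phase.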

Implementation

We will make significant changes by introducing new states, but we don't want to break existing or new reindexing tasks. One possible approach is to version the reindexing task records with a new constant, similar to Elasticsearch's SCHEMA_VERSION. We would save this constant in the task itself, for example in ReindexingTask#options. If the stored version doesn't match the current one, we can mark the task as failed. A failed reindexing task is not a problem because the system returns to the original index, and we can show a clear message to the user: "The reindexing task started in a previous version of the GitLab code, please retry it".
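
The version check could look roughly like the sketch below. It is plain Ruby with a Struct standing in for the task record. The constant name `REINDEXING_TASK_VERSION`, the `'version'` options key, the `:failure` state, and the helper methods are all assumptions made for illustration, not existing GitLab code.

```ruby
# Hypothetical sketch: a Struct stands in for the ReindexingTask record.
module ReindexingVersionSketch
  # Assumed constant; it would be bumped whenever the set of states changes.
  REINDEXING_TASK_VERSION = 2

  STALE_TASK_MESSAGE =
    'The reindexing task started in a previous version of the GitLab code, please retry it'

  Task = Struct.new(:options, :state, :error_message)

  module_function

  # New tasks store the current version in their options.
  def build_task
    Task.new({ 'version' => REINDEXING_TASK_VERSION }, :initial, nil)
  end

  # Returns true when the task was created by the current code.
  # Otherwise marks the task as failed with a user-facing message and returns false.
  def ensure_current_version(task)
    return true if task.options['version'] == REINDEXING_TASK_VERSION

    task.state = :failure
    task.error_message = STALE_TASK_MESSAGE
    false
  end
end
```

Tasks created before versioning was introduced would have no version in their options, so in this sketch they are also treated as stale and marked as failed.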

Edited Aug 05, 2025 by 🤖 GitLab Bot 🤖