Backfill traversal_ids in notes index

What does this MR do and why?

This MR adds a migration that backfills traversal_ids in the Elasticsearch notes index. The migration is scheduled for version 18.1 and processes notes in batches of 10,000 with a 1-minute delay between batches to reduce system load. The accompanying spec verifies that the migration works with different types of notes (issue notes, project snippet notes, commit notes, and merge request notes). The MR also includes a small fix to the shared test examples so that they correctly handle test cases with more than 4 objects. This migration is part of the Global Search group's effort to improve search functionality.
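For a rough sense of what the batch size and delay mean for rollout time, the sketch below estimates a lower bound on how long the backfill spends waiting between batches. The helper name is hypothetical and is not part of the migration. Only the batch size of 10,000 and the 1-minute delay come from this MR, and the estimate ignores the time each batch takes to index.

```ruby
# Values stated in this MR description.
BATCH_SIZE = 10_000
THROTTLE_DELAY_MINUTES = 1

# Hypothetical helper: lower bound on the minutes spent waiting between batches.
def estimated_backfill_minutes(total_notes, batch_size: BATCH_SIZE, delay_minutes: THROTTLE_DELAY_MINUTES)
  return 0 if total_notes <= 0

  # Integer ceiling division: number of batches needed to cover all notes.
  batches = (total_notes + batch_size - 1) / batch_size

  # One delay separates each pair of consecutive batches.
  (batches - 1) * delay_minutes
end
```

Under these assumptions, an index with 1,000,000 notes needs 100 batches, so `estimated_backfill_minutes(1_000_000)` returns 99.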

This MR is step 2 of improving notes query performance. Notes currently use the legacy authorization in Elasticsearch queries, which can send thousands of project_id values to Elasticsearch. Moving notes to the new authorization will improve performance for global and group searches (the same was previously done for code, merge requests, issues, etc.). The query needs to be improved because it is now used during issue search. The plan for improving the query is:

  1. add traversal_ids to notes index !193056 (merged)
  2. backfill traversal_ids in notes index (this MR)
  3. switch notes index to use new authorization in queries (behind a FF)
  4. remove FF
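To illustrate why the plan above should shrink the query, the sketch below contrasts the two filter shapes. It is not the query GitLab generates. The method names are hypothetical, and the hyphen-joined string format for traversal_ids (for example "1-22-") is an assumption about how the field is indexed.

```ruby
require 'json'

# Legacy-style filter: one term per authorized project, so the clause
# grows with the number of projects the user can access.
def legacy_project_filter(project_ids)
  { terms: { project_id: project_ids } }
end

# Traversal-ID-style filter: one prefix clause per authorized namespace.
# ASSUMPTION: traversal_ids is stored as a hyphen-joined string with a
# trailing hyphen, so a namespace prefix also matches all of its descendants.
def traversal_ids_filter(namespace_traversal_ids)
  clauses = namespace_traversal_ids.map do |ids|
    { prefix: { traversal_ids: { value: "#{ids.join('-')}-" } } }
  end

  { bool: { should: clauses, minimum_should_match: 1 } }
end
```

Calling `JSON.generate(traversal_ids_filter([[1, 22]]))` produces a single prefix clause for a whole group subtree, whereas `legacy_project_filter` needs one ID for every project in that subtree.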

How to set up and validate locally

After checking out the branch, you will probably need to restart rails-background-jobs so that the workers pick up the Sidekiq jobs.

  1. enable advanced search
  2. reset schema_version in indexed data
    curl --request POST \
      --url 'http://localhost:9200/gitlab-development-notes/_update_by_query?wait_for_completion=true&refresh=true' \
      --header 'Content-Type: application/json' \
      --data '{
    	"script": {
    		"source": "ctx._source.schema_version=2222"
    	},
    	"query": {
    		"match_all": {}
    	}
    }'
  3. remove migration from migrations index (if it exists)
    curl --request DELETE --url http://localhost:9200/gitlab-development-migrations/_doc/20250530160142
  4. open rails console, run the migration worker: Elastic::MigrationWorker.new.perform
  5. (optional) in rails console, run the indexing process manually: Elastic::ProcessInitialBookkeepingService.new.execute
  6. run the migration worker: Elastic::MigrationWorker.new.perform until it's completed
  7. watch the logs in log/elasticsearch.log to make sure the migration runs and passes
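As an extra check after the steps above, the following sketch counts note documents that still lack traversal_ids, using the Elasticsearch `_count` API. The method names are hypothetical. The index URL is taken from the curl commands above, and the sketch assumes that every note document should have the field once the backfill completes, which may not hold for every note type.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Index URL as used in the curl commands in this MR description.
NOTES_INDEX_URL = 'http://localhost:9200/gitlab-development-notes'

# Query body matching documents that have no traversal_ids field.
def missing_traversal_ids_query
  { query: { bool: { must_not: { exists: { field: 'traversal_ids' } } } } }
end

# Hypothetical helper: returns how many note documents still lack traversal_ids.
def count_notes_missing_traversal_ids(index_url = NOTES_INDEX_URL)
  uri = URI("#{index_url}/_count")
  response = Net::HTTP.post(
    uri,
    JSON.generate(missing_traversal_ids_query),
    'Content-Type' => 'application/json'
  )
  JSON.parse(response.body).fetch('count')
end
```

Running `count_notes_missing_traversal_ids` in the rails console (or plain `irb`) against the local cluster should trend toward 0 as the migration worker progresses, if the assumption above holds.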

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Terri Chu
