Add migration for backfilling traversal_ids in blobs and wiki blobs

Siddharth Dungarwal requested to merge 351381-backfill-blobs-and-wiki-blobs into master

What does this MR do and why?

Backfills traversal_ids for blobs and wiki blobs in the main index. See issue #351381 (closed) for more details.

Estimated time to completion (calculation: internal link): 278 hours. It may take a little longer because the migration has to work through each project. Indexing will not be paused during the migration.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

How to set up and validate locally

  1. Run the following query to find the blobs and wiki_blobs with missing traversal_ids:
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "traversal_ids"
        }
      },
      "must": {
        "terms": {
          "type": [
            "blob",
            "wiki_blob"
          ]
        }
      }
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "size": 1000,
        "field": "project_id"
      }
    }
  }
}
  2. Make sure advanced search is enabled, then run the migration from the Rails console by entering the following lines:
require File.expand_path('ee/elastic/migrate/20221221110300_add_traversal_ids_in_blobs_and_wiki_blobs.rb')
BackfillTraversalIdsToBlobsAndWikiBlobs.new(20221221110300).migrate
  3. Run the query again in ES to verify that there are no records with missing traversal_ids.
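
The before/after check in the steps above could also be scripted. The following is a minimal Ruby sketch, not part of this MR: it rebuilds the same query body as a hash and adds a hypothetical helper, `projects_missing_traversal_ids`, that reads the project IDs out of a parsed search response. It assumes the response has the standard terms-aggregation shape (`aggregations` → aggregation name → `buckets`). How the request is sent to Elasticsearch is left out on purpose.

```ruby
require 'json'

# Builds the search body from the validation steps: blob and wiki_blob
# documents that have no traversal_ids field, grouped by project_id.
def missing_traversal_ids_query(bucket_size: 1000)
  {
    size: 0,
    query: {
      bool: {
        must_not: { exists: { field: 'traversal_ids' } },
        must: { terms: { type: %w[blob wiki_blob] } }
      }
    },
    aggs: {
      'my-agg-name' => {
        terms: { size: bucket_size, field: 'project_id' }
      }
    }
  }
end

# Hypothetical helper (an assumption, not a GitLab API): given a parsed
# search response, returns the project IDs that still have documents
# without traversal_ids. An empty array means the backfill looks complete.
def projects_missing_traversal_ids(response)
  buckets = response.dig('aggregations', 'my-agg-name', 'buckets') || []
  buckets.map { |bucket| bucket['key'] }
end

# Print the body so it can be pasted into a search request.
puts JSON.pretty_generate(missing_traversal_ids_query)
```

After the migration finishes, passing the parsed response to `projects_missing_traversal_ids` should return an empty array. Note that the aggregation only returns up to `bucket_size` projects, so a non-empty result may be a partial list.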

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #351381 (closed)

Edited by Terri Chu
