
Reindex and remove leftover merge_request documents from the main index

What does this MR do and why?

This MR finds merge_request documents that are still in the main index, queues them for reindexing by calling ProcessBookkeepingService.track!, and then issues a delete_by_query call to remove those documents from the main index.
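
The flow described above can be sketched with in-memory stand-ins. This is a minimal illustration, not the migration's actual code: `FakeMainIndex`, `migrate_batch`, and the `tracker` lambda are hypothetical names, and the lambda only stands in for ProcessBookkeepingService.track!.

```ruby
# Hypothetical in-memory stand-in for the main index (not the GitLab search client).
class FakeMainIndex
  def initialize(docs)
    @docs = docs # array of hashes with :id and :type
  end

  # Returns up to `limit` documents of the given type.
  def search(type:, limit:)
    @docs.select { |d| d[:type] == type }.first(limit)
  end

  # Mimics a delete_by_query restricted to the given ids; returns the number deleted.
  def delete_by_ids(type:, ids:)
    before = @docs.size
    @docs.reject! { |d| d[:type] == type && ids.include?(d[:id]) }
    before - @docs.size
  end

  def count(type:)
    @docs.count { |d| d[:type] == type }
  end
end

# Processes one batch: track the leftovers for reindexing first, then delete them.
# `tracker` is any callable; here it stands in for ProcessBookkeepingService.track!.
def migrate_batch(index, tracker, batch_size: 1000)
  leftovers = index.search(type: 'merge_request', limit: batch_size)
  return 0 if leftovers.empty?

  ids = leftovers.map { |d| d[:id] }
  tracker.call(ids)
  index.delete_by_ids(type: 'merge_request', ids: ids)
end

# Usage: repeat batches until no merge_request documents remain.
docs = (1..5).map { |i| { id: i, type: 'merge_request' } } + [{ id: 99, type: 'issue' }]
index = FakeMainIndex.new(docs)
tracked = []
tracker = ->(ids) { tracked.concat(ids) }
migrate_batch(index, tracker, batch_size: 2) until index.count(type: 'merge_request').zero?
```

Tracking before deleting means a document is queued for reindexing before it disappears from the main index, which matches the order described in this MR.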

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.


How to set up and validate locally

  • Make sure Elasticsearch is enabled in GDK.
  1. Open the Rails console:
bundle exec rails c
  1. Populate merge_request documents in the main index:
  # Writes a merge_request document for `mr` directly into the main index to simulate a leftover.
  def populate_merge_request_in_main_index!(mr)
    client = ::Gitlab::Search::Client.new
    index_name = 'gitlab-development'
    client.index(index: index_name, routing: "project_#{mr.project_id}", id: "merge_request_#{mr.id}",
      refresh: true, body: {
        id: mr.id, iid: mr.iid, target_branch: mr.target_branch, source_branch: mr.source_branch, title: mr.title,
        description: mr.description, state: mr.state, merge_status: mr.merge_status, project_id: mr.project_id,
        source_project_id: mr.source_project_id, target_project_id: mr.target_project_id, author_id: mr.author_id,
        created_at: mr.created_at.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'), visibility_level: mr.project.visibility_level,
        updated_at: mr.updated_at.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'),
        join_field: { name: 'merge_request', parent: "project_#{mr.project_id}" }, type: 'merge_request',
        merge_requests_access_level: mr.project.merge_requests_access_level
      }
    )
  end

  # find_each loads records in batches instead of all at once.
  MergeRequest.find_each { |mr| populate_merge_request_in_main_index!(mr) }
  1. Confirm that the main index contains at least one merge_request document by running the following curl command in bash:
curl -XGET "http://localhost:9200/gitlab-development/_count" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "type": "merge_request" } }
      ]
    }
  }
}'

The count should be greater than 0.

  1. Run the migration from the Rails console:
 Elastic::DataMigrationService[20231005103449].send(:migration).migrate
  1. Run the curl command again and confirm that the count is 0.
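
The count check used before and after the migration can also be run from Ruby. The sketch below is an assumption-laden equivalent of the curl command: `leftover_count_query` and `leftover_count` are hypothetical helper names, the host and index name are the ones used in the steps above, and the request is sent as a POST because Ruby's Net::HTTP does not expect a body on GET.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Builds the same filter that the curl command sends to the _count endpoint.
def leftover_count_query(type = 'merge_request')
  { query: { bool: { filter: [{ term: { type: type } }] } } }
end

# Sends the count request and returns the integer count.
# base_url and index are assumptions taken from the local setup steps; adjust them if yours differ.
def leftover_count(base_url: 'http://localhost:9200', index: 'gitlab-development')
  uri = URI("#{base_url}/#{index}/_count")
  request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
  request.body = JSON.generate(leftover_count_query)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  JSON.parse(response.body).fetch('count')
end
```

Calling `leftover_count` before the migration should return a value greater than 0, and calling it again afterwards should return 0.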

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Run time

~81 minutes, estimated from the document count, batch size, and throttle delay:

[1] pry(main)> no_of_documents = 27014.0
=> 27014.0
[2] pry(main)> batch_size = 1000
=> 1000
[3] pry(main)> throttle_delay = 3.minute
=> 3 minutes
[4] pry(main)> (no_of_documents.to_f / batch_size) * throttle_delay
=> 81.042 minutes
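
The same estimate can be expressed as a small plain-Ruby helper that does not need ActiveSupport. This is a sketch: `estimated_runtime_minutes` is a hypothetical name, and the `round_up` option rests on the assumption that a partial final batch still costs a full throttle interval, which this MR does not confirm.

```ruby
# Estimates the migration run time in minutes.
# With round_up: true, a partial final batch is counted as a whole batch (an assumption).
def estimated_runtime_minutes(documents:, batch_size:, throttle_delay_minutes:, round_up: false)
  batches = documents.to_f / batch_size
  batches = batches.ceil if round_up
  (batches * throttle_delay_minutes).round(3)
end

# Figures from the console session above.
fractional = estimated_runtime_minutes(documents: 27_014, batch_size: 1000, throttle_delay_minutes: 3)
whole_batches = estimated_runtime_minutes(documents: 27_014, batch_size: 1000, throttle_delay_minutes: 3, round_up: true)
puts "fractional batches: #{fractional} min, whole batches: #{whole_batches} min"
```

The fractional form reproduces the console calculation, while the whole-batch form gives an upper bound if every batch, including the last partial one, waits for the full delay.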

Related to #424872 (closed)

Edited by Ravi Kumar
