
Reindex and remove leftover notes from main index

What does this MR do and why?

This MR looks for note documents left over in the main index. Any leftover notes are passed to ProcessBookkeepingService.track! so that they are reindexed. After this, a delete_by_query call is fired to delete the note documents from the main index.
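The flow described above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the migration's actual code: `FakeMainIndex`, `FakeTracker`, and `process_leftover_notes_batch` are hypothetical names standing in for the Elasticsearch client, `ProcessBookkeepingService.track!`, and the migration body, and the default batch size is borrowed from the run time estimate in this description.

```ruby
# Hypothetical in-memory stand-in for the main Elasticsearch index.
class FakeMainIndex
  def initialize(docs)
    @docs = docs
  end

  # Return up to `limit` documents of the given type.
  def search(type:, limit:)
    @docs.select { |doc| doc[:type] == type }.first(limit)
  end

  # Delete documents of the given type whose id is in `ids`;
  # return how many documents were deleted.
  def delete_by_query(type:, ids:)
    before = @docs.size
    @docs.reject! { |doc| doc[:type] == type && ids.include?(doc[:id]) }
    before - @docs.size
  end

  def count(type:)
    @docs.count { |doc| doc[:type] == type }
  end
end

# Hypothetical stand-in for ProcessBookkeepingService.track!:
# it only records which ids were queued for reindexing.
class FakeTracker
  attr_reader :tracked_ids

  def initialize
    @tracked_ids = []
  end

  def track!(*ids)
    @tracked_ids.concat(ids)
  end
end

# Process one batch: queue leftover notes for reindexing, then delete
# exactly those notes from the main index. Returns the number deleted.
def process_leftover_notes_batch(index, tracker, batch_size: 2000)
  leftover = index.search(type: 'note', limit: batch_size)
  return 0 if leftover.empty?

  ids = leftover.map { |doc| doc[:id] }
  tracker.track!(*ids)
  index.delete_by_query(type: 'note', ids: ids)
end
```

In this sketch the delete is restricted to the ids that were just tracked, so no note is removed from the main index before it has been queued for reindexing.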

How to set up and validate locally

Make sure Elasticsearch is enabled in GDK.

  1. Open the rails console
bundle exec rails c
  1. Populate notes in the main index
  def populate_notes_in_main_index!(note)
    client = ::Gitlab::Search::Client.new
    index_name = 'gitlab-development'
    client.index(index: index_name, routing: "project_#{note.project_id}", id: "note_#{note.id}", refresh: true,
      body: {
        id: note.id, note: note.note, noteable_type: note.noteable_type, noteable_id: note.noteable_id,
        created_at: note.created_at.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'),
        updated_at: note.updated_at.strftime('%Y-%m-%dT%H:%M:%S.%3NZ'),
        issue: { assignee_id: [], author_id: note.author_id, confidential: note.noteable.try(:confidential?).presence },
        join_field: { name: 'note', parent: "project_#{note.project_id}" }, project_id: note.project_id,
        repository_access_level: note.project.repository_access_level, visibility_level: note.project.visibility_level,
        issues_access_level: note.project.issues_access_level, type: 'note', confidential: note.confidential?
      }
    )
  end

  Note.find_each { |n| populate_notes_in_main_index!(n) }
  1. Ensure there is at least one note in the main index by running the following curl command in bash
curl -XGET "http://localhost:9200/gitlab-development/_count" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "type": "note" } }
      ]
    }
  }
}'

The count should be greater than 0.

  1. Now run the following command in the rails console
 Elastic::DataMigrationService[20231004124852].send(:migration).migrate
  1. Run the curl command again and ensure the count is 0
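The count check in the steps above can also be run from Ruby instead of curl. The following is a rough sketch that assumes the same local Elasticsearch URL and index name as the curl command; `note_count_query` and `leftover_note_count` are hypothetical helper names, not part of GitLab.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the same filter as the curl command: documents whose type is "note".
def note_count_query
  { query: { bool: { filter: [{ term: { type: 'note' } }] } } }
end

# Send the _count request and return the "count" field of the response.
# Assumes Elasticsearch is reachable at base_url, as in the curl command.
def leftover_note_count(base_url: 'http://localhost:9200', index: 'gitlab-development')
  uri = URI("#{base_url}/#{index}/_count")
  request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
  request.body = JSON.generate(note_count_query)
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  JSON.parse(response.body).fetch('count')
end
```

Calling `leftover_note_count` before the migration should return a value greater than 0, and calling it again after the migration should return 0.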

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #424874 (closed)

Run time

~2 minutes

[6] pry(main)> no_of_documents = 1096.0
=> 1096.0
[7] pry(main)> batch_size = 2000
=> 2000
[8] pry(main)> throttle_delay = 3.minute
=> 3 minutes
[9] pry(main)> (no_of_documents.to_f / batch_size) * throttle_delay
=> 1.6440000000000001 minutes
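The same estimate can be reproduced outside the Rails console. This is a small sketch that uses plain numbers in place of ActiveSupport durations; `estimated_runtime_minutes` is a hypothetical helper name introduced only for illustration.

```ruby
# Estimate the migration run time in minutes, mirroring the console arithmetic:
# (number of documents / batch size) * throttle delay.
def estimated_runtime_minutes(documents:, batch_size:, throttle_delay_minutes:)
  (documents.to_f / batch_size) * throttle_delay_minutes
end

estimate = estimated_runtime_minutes(
  documents: 1096,
  batch_size: 2000,
  throttle_delay_minutes: 3
)
puts estimate.round(3)
```

With the figures above this comes to roughly 1.6 minutes, which is where the ~2 minutes figure comes from.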
Edited by Ravi Kumar
