Create Epic Elasticsearch index

Madelein van Niekerk requested to merge 250699-epic-index into master

What does this MR do and why?

Creates an index for Epics in Elasticsearch.

Currently, all epic searches use Basic Search. We want Advanced Search to be used when Elasticsearch is available, for faster and better searching.

To achieve this, we need the following:

Note: this feature is guarded by a feature flag, which prevents epics from being added to the index until the second MR is merged. The first three MRs should be merged in the same milestone.

Epic index settings and mappings:

{
  "gitlab-development-epics-20230613-1336" : {
    "aliases" : {
      "gitlab-development-epics" : { }
    },
    "mappings" : {
      "dynamic" : "strict",
      "_meta" : {
        "created_by" : "16.1.0-pre"
      },
      "properties" : {
        "author_id" : {
          "type" : "integer"
        },
        "confidential" : {
          "type" : "boolean"
        },
        "created_at" : {
          "type" : "date"
        },
        "description" : {
          "type" : "text"
        },
        "due_date" : {
          "type" : "date"
        },
        "group_id" : {
          "type" : "integer"
        },
        "hashed_root_namespace_id" : {
          "type" : "integer"
        },
        "id" : {
          "type" : "integer"
        },
        "iid" : {
          "type" : "integer"
        },
        "label_ids" : {
          "type" : "keyword"
        },
        "schema_version" : {
          "type" : "short"
        },
        "start_date" : {
          "type" : "date"
        },
        "state" : {
          "type" : "keyword"
        },
        "title" : {
          "type" : "text"
        },
        "traversal_ids" : {
          "type" : "keyword"
        },
        "type" : {
          "type" : "keyword"
        },
        "updated_at" : {
          "type" : "date"
        }
      }
    },
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "highlight" : {
          "max_analyzed_offset" : "1048576"
        },
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "5",
        "provided_name" : "gitlab-development-epics-20230613-1336",
        "creation_date" : "1686663417552",
        "analysis" : {
          "filter" : {
            "word_delimiter_graph_filter" : {
              "type" : "word_delimiter_graph",
              "preserve_original" : "true"
            }
          },
          "normalizer" : {
            "sha_normalizer" : {
              "filter" : [
                "lowercase"
              ],
              "type" : "custom"
            }
          },
          "analyzer" : {
            "email_analyzer" : {
              "tokenizer" : "email_tokenizer"
            },
            "default" : {
              "filter" : [
                "lowercase",
                "stemmer"
              ],
              "tokenizer" : "standard"
            },
            "whitespace_reverse" : {
              "filter" : [
                "lowercase",
                "asciifolding",
                "reverse"
              ],
              "tokenizer" : "whitespace"
            },
            "path_analyzer" : {
              "filter" : [
                "lowercase",
                "asciifolding"
              ],
              "type" : "custom",
              "tokenizer" : "path_tokenizer"
            },
            "code_analyzer" : {
              "filter" : [
                "word_delimiter_graph_filter",
                "flatten_graph",
                "lowercase",
                "asciifolding",
                "remove_duplicates"
              ],
              "type" : "custom",
              "tokenizer" : "whitespace"
            },
            "my_ngram_analyzer" : {
              "filter" : [
                "lowercase"
              ],
              "tokenizer" : "my_ngram_tokenizer"
            }
          },
          "tokenizer" : {
            "my_ngram_tokenizer" : {
              "token_chars" : [
                "letter",
                "digit"
              ],
              "min_gram" : "2",
              "type" : "ngram",
              "max_gram" : "3"
            },
            "email_tokenizer" : {
              "type" : "uax_url_email"
            },
            "path_tokenizer" : {
              "reverse" : "true",
              "type" : "path_hierarchy"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "UoNgar5CRkOiFgU7fedELg",
        "version" : {
          "created" : "8060299"
        }
      }
    }
  }
}
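
Because the mapping sets "dynamic" to "strict", Elasticsearch rejects documents containing fields that the mapping does not declare, instead of adding them automatically. The following is a minimal Ruby sketch of a pre-indexing check under that assumption. The constant MAPPED_FIELDS, the helper unmapped_fields, and the field hypothetical_field are illustrative names and are not part of this MR.

```ruby
require 'set'

# Field names copied from the "properties" section of the mapping above.
MAPPED_FIELDS = Set.new(%w[
  author_id confidential created_at description due_date group_id
  hashed_root_namespace_id id iid label_ids schema_version start_date
  state title traversal_ids type updated_at
]).freeze

# Hypothetical helper: returns the top-level keys of `document` that the
# mapping does not declare. With "dynamic": "strict", any such key would
# cause Elasticsearch to reject the document.
def unmapped_fields(document)
  document.keys.map(&:to_s).reject { |key| MAPPED_FIELDS.include?(key) }
end

# `hypothetical_field` is made up for illustration.
doc = { 'id' => 56, 'title' => 'Some epic', 'hypothetical_field' => 1 }
puts unmapped_fields(doc).inspect
```

A check like this could run in a spec to catch a serializer that emits a field before the mapping has been updated to include it.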

Routing is done by root ancestor group, so that we can leverage traversal_ids and limit documents to epics in the ancestor group and its descendant groups.

Example document
{
  "_index" : "gitlab-development-epics-20230628-0730",
  "_id" : "epic_56",
  "_routing" : "group_78",
  "_source" : {
    "id" : 56,
    "iid" : 1,
    "group_id" : 79,
    "created_at" : "2023-06-28T07:27:44.205Z",
    "updated_at" : "2023-06-28T07:27:44.205Z",
    "title" : "Some epic",
    "description" : "description",
    "state" : "opened",
    "confidential" : false,
    "author_id" : 1,
    "label_ids" : [ ],
    "start_date" : null,
    "due_date" : null,
    "traversal_ids" : "78-79-",
    "hashed_root_namespace_id" : 368,
    "visibility_level" : 0,
    "schema_version" : 2306,
    "type" : "epic"
  }
}
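
In the example document, the epic belongs to group 79, whose root ancestor is group 78, which gives the routing value group_78 and the traversal_ids value 78-79-. The Ruby sketch below shows one way these values could be derived and how a prefix comparison on traversal_ids would limit results to a group and its descendants. All three helper names are hypothetical, and the prefix comparison is an assumption about how the field is queried, not a description of the actual implementation.

```ruby
# Hypothetical helpers for illustration; not the code added in this MR.

# Builds the hyphen-terminated string form seen in the example document,
# e.g. [78, 79] becomes "78-79-".
def traversal_ids_string(ancestor_ids)
  ancestor_ids.map { |id| "#{id}-" }.join
end

# Routes by the root ancestor, which is the first ID in the ancestor chain.
def routing_key(ancestor_ids)
  "group_#{ancestor_ids.first}"
end

# Assumed filter: a document is in scope when its traversal_ids starts with
# the searched group's traversal_ids. The trailing hyphen keeps group 78
# from matching an unrelated group such as 780.
def within_group?(doc_traversal_ids, group_traversal_ids)
  doc_traversal_ids.start_with?(group_traversal_ids)
end

ids = [78, 79]
puts traversal_ids_string(ids)
puts routing_key(ids)
puts within_group?(traversal_ids_string(ids), '78-')
```

Because every document under the same root group shares one routing value, a search scoped to that hierarchy only needs to visit the shard that the routing value maps to.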

Logs

Creating the index:

"Elastic::MigrationWorker","message":"MigrationWorker: migration[CreateEpicIndex] executing migrate method"
"CreateEpicIndex","message":"[Elastic::Migration: 20230518135700] Creating standalone epic index gitlab-development-epics"
"Elastic::MigrationWorker","message":"MigrationWorker: migration[CreateEpicIndex] updating with completed: true"

How to set up and validate locally

  1. Check that an epic index doesn't exist: curl "http://localhost:9200/_cat/aliases/gitlab-development-epics?h=i"
  2. Execute the migration worker a few times: Elastic::MigrationWorker.new.perform
  3. Optional: view the logs: tail -f log/elasticsearch.log
  4. Check that the epic index now exists: curl "http://localhost:9200/_cat/aliases/gitlab-development-epics?h=i"
  5. Disable the feature flag: Feature.disable(:elastic_index_epics)
  6. Update an epic, e.g. Epic.first.update(title: "test")
  7. Notice that the epic isn't scheduled to be updated in Elasticsearch (no logs in elasticsearch.log)
  8. Enable the feature flag: Feature.enable(:elastic_index_epics)
  9. Update an epic, e.g. Epic.first.update(title: "test2")
  10. Notice that the epic is scheduled to be updated in Elasticsearch.
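
Steps 1 and 4 can also be scripted from the Rails console or a plain Ruby session. The sketch below assumes that the _cat/aliases endpoint with h=i returns one backing index name per line, and an empty body when no alias matches. The helper names are hypothetical.

```ruby
require 'net/http'
require 'uri'

ES_URL = 'http://localhost:9200'         # same host as in the curl commands
ALIAS_NAME = 'gitlab-development-epics'  # alias name from the steps above

# Builds the same URL that the curl commands in steps 1 and 4 request.
def alias_uri(base_url, alias_name)
  URI("#{base_url}/_cat/aliases/#{alias_name}?h=i")
end

# Parses a _cat/aliases response body into a list of index names.
# Assumption: an empty body means that no alias matched.
def backing_indices(body)
  body.to_s.lines.map(&:strip).reject(&:empty?)
end

# Performs the HTTP request; requires a running Elasticsearch instance.
def fetch_backing_indices(base_url = ES_URL, alias_name = ALIAS_NAME)
  backing_indices(Net::HTTP.get(alias_uri(base_url, alias_name)))
end

puts alias_uri(ES_URL, ALIAS_NAME)
```

Calling fetch_backing_indices before and after running the migration worker should show the list going from empty to a single index name.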

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #250699 (closed)

Edited by Madelein van Niekerk
