
Migration to add knn mappings to workitems index

Madelein van Niekerk requested to merge 454764-knn-mappings-migration into master

Context

Create abstraction layer to support Elasticsear... (#454764 - closed) and Move embeddings from issues index to workitems ... (#476537 - closed) are being done together in this order:

  1. !163009 (merged)
  2. !163946 (merged) 👈 this MR
  3. !164059 (merged)
  4. #479776 (closed)
  5. #479778 (closed)
  6. #479777 (closed)

What does this MR do and why?

This MR adds a migration that adds the embeddings mapping to the workitems index when that index already exists. The mapping is the same as the one introduced in Allow Elasticsearch and OpenSearch specific map... (!163009 - merged).
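As a rough illustration of what the migration has to add, the engine-specific mapping can be sketched as a plain Ruby method. The method name `embedding_mapping` and the `:elasticsearch` / `:opensearch` symbols are hypothetical and are not taken from the MR; only the mapping values are copied from the validation output in this description.

```ruby
# Hypothetical helper, not the MR's actual code: returns the embedding_0
# mapping for the given search engine, using the values shown in this MR.
def embedding_mapping(engine)
  case engine
  when :elasticsearch
    # Elasticsearch stores embeddings as a dense_vector field.
    {
      'embedding_0' => {
        'type' => 'dense_vector',
        'dims' => 768,
        'index' => true,
        'similarity' => 'cosine'
      }
    }
  when :opensearch
    # OpenSearch stores embeddings as a knn_vector field with an HNSW method.
    {
      'embedding_0' => {
        'type' => 'knn_vector',
        'dimension' => 768,
        'method' => {
          'engine' => 'nmslib',
          'space_type' => 'cosinesimil',
          'name' => 'hnsw',
          'parameters' => { 'ef_construction' => 100, 'm' => 16 }
        }
      }
    }
  else
    raise ArgumentError, "unknown engine: #{engine.inspect}"
  end
end
```

The two branches differ in the field type and in the key that carries the vector size (`dims` for Elasticsearch, `dimension` for OpenSearch), which is why a single shared mapping would not work for both engines.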

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

Elasticsearch:

  1. Run the migration worker repeatedly: Elastic::MigrationWorker.new.perform
  2. Check the logs to confirm that the migration was executed
  3. Verify that the embedding_0 field exists in the workitems index:
::Gitlab::Elastic::Helper.default.get_mapping(index_name: "gitlab-development-work_items")
=> {"archived"=>{"type"=>"boolean"},
 "assignee_id"=>{"type"=>"integer"},
 "author_id"=>{"type"=>"integer"},
 "confidential"=>{"type"=>"boolean"},
 "created_at"=>{"type"=>"date"},
 "description"=>{"type"=>"text", "analyzer"=>"code_analyzer"},
 "due_date"=>{"type"=>"date"},
 "embedding_0"=>{"type"=>"dense_vector", "dims"=>768, "index"=>true, "similarity"=>"cosine"},

(Optional) OpenSearch:

  1. Connect to OpenSearch
  2. Run the migration worker repeatedly: Elastic::MigrationWorker.new.perform
  3. Check the logs to confirm that the migration was executed
  4. Verify that the embedding_0 field exists in the workitems index:
::Gitlab::Elastic::Helper.default.get_mapping(index_name: "gitlab-development-work_items")
=> {"archived"=>{"type"=>"boolean"},
 "assignee_id"=>{"type"=>"integer"},
 "author_id"=>{"type"=>"integer"},
 "confidential"=>{"type"=>"boolean"},
 "created_at"=>{"type"=>"date"},
 "description"=>{"type"=>"text", "analyzer"=>"code_analyzer"},
 "due_date"=>{"type"=>"date"},
 "embedding_0"=>{"type"=>"knn_vector", "dimension"=>768, "method"=>{"engine"=>"nmslib", "space_type"=>"cosinesimil", "name"=>"hnsw", "parameters"=>{"ef_construction"=>100, "m"=>16}}},
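The manual check in both walkthroughs can also be expressed as a small predicate over the mapping hash. This is only a sketch: the method names `embedding_field_type` and `embedding_mapped?` are hypothetical, and it assumes the mapping is a plain Ruby hash keyed by field name, as in the output above.

```ruby
# Hypothetical helpers for inspecting a mapping hash shaped like the
# get_mapping output shown above (field name => field definition).

# Returns the type of the embedding_0 field, or nil if the field is absent.
def embedding_field_type(mapping)
  field = mapping['embedding_0']
  return nil unless field.is_a?(Hash)

  field['type']
end

# True when embedding_0 is mapped as a vector field on either engine:
# dense_vector (Elasticsearch) or knn_vector (OpenSearch).
def embedding_mapped?(mapping)
  %w[dense_vector knn_vector].include?(embedding_field_type(mapping))
end
```

In a Rails console, the hash returned by the `get_mapping` call shown above could be passed to `embedding_mapped?` instead of reading the output by eye.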

Related to #454764 (closed)

Edited by Madelein van Niekerk
