Add support for WorkItem embeddings

Context

Two issues are being implemented together: Create abstraction layer to support Elasticsear... (#454764 - closed) and Move embeddings from issues index to workitems ... (#476537 - closed). The work lands in this order:

  1. !163009 (merged)
  2. !163946 (merged) (dependent on 1)
  3. !164059 (merged) 👈 this MR
  4. #479778 (closed) (dependent on 1 and 3)
  5. #479777 (closed) (dependent on all of the above)
  6. #479776 (closed) (dependent on Implement WorkItemQueryBuilder and start using ... (#478007 - closed))

What does this MR do and why?

This MR adds support for WorkItem embedding references. It changes:

  • index_name, so that it can use Search::Elastic::Types::WorkItem.index_name
  • as_indexed_json, so that it uses a different content definition for work items
  • the preload_for_indexing scope, which is renamed to preload_indexing_data so that it is compatible with DatabaseClassReference
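
As a rough illustration of the first two changes, here is a minimal, self-contained sketch of a reference that picks its index name and embedding content based on the record's class. It is not the code in this MR: the `Sketch` module, the `Issue` and `WorkItem` structs, the `'issues'` and `'work_items'` index names, the content strings, and the injected `embedder` are all hypothetical stand-ins chosen so the example runs outside of GitLab.

```ruby
# Hypothetical stand-ins for illustration only; these are not GitLab's classes.
module Sketch
  Issue    = Struct.new(:id, :title, :description)
  WorkItem = Struct.new(:id, :title, :description)

  # Stand-in for a search type that owns its index name
  # (assumed to play the role of Search::Elastic::Types::WorkItem).
  module WorkItemType
    def self.index_name
      'work_items' # assumed name, not taken from the MR
    end
  end

  class EmbeddingReference
    # embedder: any callable that turns a String into an Array of Floats.
    def initialize(record, embedder:)
      @record = record
      @embedder = embedder
    end

    # Work items go to the work item index; everything else keeps the old index.
    def index_name
      @record.is_a?(WorkItem) ? WorkItemType.index_name : 'issues'
    end

    # The text sent to the embedding model differs per record class.
    # Both strings are invented for this sketch.
    def content
      prefix = @record.is_a?(WorkItem) ? 'work item' : 'issue'
      "#{prefix} with title '#{@record.title}' and description '#{@record.description}'"
    end

    # Document body that would be sent to the search cluster.
    def as_indexed_json
      { id: @record.id, embedding: @embedder.call(content) }
    end
  end
end

# A fake embedder so the sketch needs no network access.
fake_embedder = ->(text) { [text.length.to_f] }

reference = Sketch::EmbeddingReference.new(
  Sketch::WorkItem.new(1, 'Add search', 'Details'),
  embedder: fake_embedder
)
puts reference.index_name
puts reference.as_indexed_json.inspect
```

Keeping the class check inside the reference means callers that track or flush embeddings do not need to know which index a record belongs to.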

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

This requires being able to generate embeddings locally. The following call should return an embedding:

Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'documentation_search').text_embeddings(content: "How can I create an issue?")

  1. Index a work item: track it with ::Elastic::ProcessBookkeepingService.track!(WorkItem.first), then process the queue with ::Elastic::ProcessBookkeepingService.new.execute
  2. Track the work item's embedding: ::Search::Elastic::ProcessEmbeddingBookkeepingService.track_embedding!(WorkItem.first)
{"severity":"DEBUG","time":"2024-08-27T13:18:14.604Z","class":"Search::Elastic::ProcessEmbeddingBookkeepingService","message":"track_items","meta.indexing.redis_set":"elastic:embedding:updates:0:zset","meta.indexing.count":1,"meta.indexing.tracked_items_encoded":"[[1,\"Embedding|WorkItem|1|group_22\"]]"}
  3. Execute the embedding service: Search::Elastic::ProcessEmbeddingBookkeepingService.new.execute
{"severity":"INFO","time":"2024-08-27T13:18:39.615Z","correlation_id":"8b6d01d961fd4a028541330af43dd70b","class":"Search::Elastic::ProcessEmbeddingBookkeepingService","message":"indexing_done","meta.indexing.reference_class":"Embedding","meta.indexing.database_id":1,"meta.indexing.identifier":1,"meta.indexing.routing":"group_22","meta.indexing.search_indexing_duration_s":1139877.838794,"meta.indexing.search_indexing_flushing_duration_s":0.03269299981184304}
  4. Check that the indexed document now contains the embedding.
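
When reading the logs above, it can help to split the tracked item string into its parts. The sketch below assumes that `Embedding|WorkItem|1|group_22` is four pipe-separated fields (reference class, model class, identifier, routing). That layout is inferred from the log output, not from the serializer itself, and `TrackedReference` and `parse_tracked_reference` are hypothetical names.

```ruby
# Hypothetical helper for reading log output; not part of GitLab.
# Field meanings are inferred from the log lines above.
TrackedReference = Struct.new(:reference_class, :model_class, :identifier, :routing)

def parse_tracked_reference(serialized)
  parts = serialized.split('|', 4)
  raise ArgumentError, "expected 4 fields, got #{parts.size}" unless parts.size == 4

  reference_class, model_class, identifier, routing = parts
  # Integer() raises on non-numeric input instead of silently returning 0.
  TrackedReference.new(reference_class, model_class, Integer(identifier), routing)
end

reference = parse_tracked_reference('Embedding|WorkItem|1|group_22')
puts reference.to_h
```

The routing value here matches `meta.indexing.routing` in the `indexing_done` log line, which is a quick way to confirm that the item that was tracked is the one that was indexed.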

Related to #454764 (closed)

Edited by Madelein van Niekerk
