Add support for WorkItem embeddings
Context
Create abstraction layer to support Elasticsear... (#454764 - closed) and Move embeddings from issues index to workitems ... (#476537 - closed) are being done together in this order:
- !163009 (merged)
- !163946 (merged) (dependent on 1)
- !164059 (merged) 👈 this MR
- #479778 (closed) (dependent on 1 and 3)
- #479777 (closed) (dependent on all of the above)
- #479776 (closed) (dependent on Implement WorkItemQueryBuilder and start using ... (#478007 - closed))
What does this MR do and why?
This MR allows for WorkItem embedding references. It changes:
- index_name to be able to use Search::Elastic::Types::WorkItem.index_name
- as_indexed_json to use a different content definition for workitems
- rename preload_for_indexing scope to preload_indexing_data so that it's compatible with DatabaseClassReference.
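As a rough illustration of the first two changes, the sketch below shows one way a reference object could pick its index name and its embedding content from the class of the record it wraps. Everything in it is hypothetical: the SketchIssue, SketchWorkItem, and SketchEmbeddingReference classes, the index names, and the content strings are stand-ins chosen for the example and are not taken from this MR's diff.

```ruby
# Hypothetical stand-ins for the real models; these are not GitLab's classes.
SketchIssue = Struct.new(:id, :title, :description)
SketchWorkItem = Struct.new(:id, :title, :description)

# Hypothetical reference object that selects the index and the embedding
# content based on the class of the record it wraps.
class SketchEmbeddingReference
  # Assumed index names, for illustration only.
  INDEX_NAMES = {
    SketchIssue => 'sketch-issues',
    SketchWorkItem => 'sketch-work-items'
  }.freeze

  def initialize(record)
    @record = record
  end

  # Look up the target index from the record's class.
  def index_name
    INDEX_NAMES.fetch(@record.class)
  end

  # Build the document body; the content string differs per model class.
  def as_indexed_json
    { 'id' => @record.id, 'content' => content }
  end

  private

  # Assumed content definitions; the real ones are not given in this description.
  def content
    case @record
    when SketchWorkItem
      "work item with title '#{@record.title}' and description '#{@record.description}'"
    else
      "issue with title '#{@record.title}' and description '#{@record.description}'"
    end
  end
end

ref = SketchEmbeddingReference.new(SketchWorkItem.new(1, 'Fix login', 'Users cannot sign in'))
puts ref.index_name
puts ref.as_indexed_json['content']
```

Keeping the per-class differences behind index_name and as_indexed_json means callers can treat issue and workitem references the same way.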
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
Requires being able to generate embeddings locally:
Gitlab::Llm::VertexAi::Client.new(User.first, unit_primitive: 'documentation_search').text_embeddings(content: "How can I create an issue?")
should return an embedding.
- Index a workitem:
::Elastic::ProcessBookkeepingService.track!(WorkItem.first), ::Elastic::ProcessBookkeepingService.new.execute
- Track the workitem's embedding:
::Search::Elastic::ProcessEmbeddingBookkeepingService.track_embedding!(WorkItem.first)
{"severity":"DEBUG","time":"2024-08-27T13:18:14.604Z","class":"Search::Elastic::ProcessEmbeddingBookkeepingService","message":"track_items","meta.indexing.redis_set":"elastic:embedding:updates:0:zset","meta.indexing.count":1,"meta.indexing.tracked_items_encoded":"[[1,\"Embedding|WorkItem|1|group_22\"]]"}
- Execute the embedding service:
Search::Elastic::ProcessEmbeddingBookkeepingService.new.execute
{"severity":"INFO","time":"2024-08-27T13:18:39.615Z","correlation_id":"8b6d01d961fd4a028541330af43dd70b","class":"Search::Elastic::ProcessEmbeddingBookkeepingService","message":"indexing_done","meta.indexing.reference_class":"Embedding","meta.indexing.database_id":1,"meta.indexing.identifier":1,"meta.indexing.routing":"group_22","meta.indexing.search_indexing_duration_s":1139877.838794,"meta.indexing.search_indexing_flushing_duration_s":0.03269299981184304}
- Note that the document has the embedding.
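When validating, it can help to check what was actually queued. The tracked item in the debug log above, Embedding|WorkItem|1|group_22, appears to be a pipe-separated string whose parts line up with the reference_class, database_id/identifier, and routing fields in the later indexing_done log. The standalone sketch below decodes a trimmed copy of that log line. The field names in parse_tracked_item are an assumed reading of the log, not the serializer's documented format, and the leading 1 in each pair is ignored because its meaning is not stated here.

```ruby
require 'json'

# Debug log line from the tracking step, trimmed to the relevant keys.
log_line = '{"message":"track_items","meta.indexing.count":1,' \
           '"meta.indexing.tracked_items_encoded":"[[1,\"Embedding|WorkItem|1|group_22\"]]"}'

# Split a serialized reference into its parts.
# The key names are assumptions based on the fields seen in the logs.
def parse_tracked_item(serialized)
  reference_class, model_class, id, routing = serialized.split('|')
  {
    reference_class: reference_class,
    model_class: model_class,
    id: Integer(id),
    routing: routing
  }
end

entry = JSON.parse(log_line)
# The tracked items are themselves JSON-encoded inside the log entry.
tracked = JSON.parse(entry.fetch('meta.indexing.tracked_items_encoded'))

tracked.each do |_first, serialized|
  p parse_tracked_item(serialized)
end
```

If the model class in the output is WorkItem rather than Issue, the reference was tracked through the new workitem path.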
Related to #454764 (closed)
Edited by Madelein van Niekerk