ActiveContext: add indexing_embedding_fields to Collection class

What does this MR do and why?

  • Adds the indexing_embedding_fields method to Collection
  • Updates the ActiveContext::Databases::Elasticsearch::Client to use both Collection.current_embedding_fields and Collection.indexing_embedding_fields when adding the embedding fields to the query
    • see explanation comment in the code
  • Adds tests for the MarkRepositoryAsReadyEventWorker, which makes use of the current_embedding_fields
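To illustrate the idea of using both field lists when building the query, here is a minimal sketch. It is not the actual client code: SketchCollection, source_fields_for, and the field names "embeddings_v1" and "embeddings_v2" are hypothetical, and it assumes both methods return arrays of field-name strings.

```ruby
# Hypothetical stand-in for a Collection; only the two field lists matter here.
SketchCollection = Struct.new(
  :current_embedding_fields,
  :indexing_embedding_fields,
  keyword_init: true
)

# Returns the base fields plus every embedding field that is either currently
# in use or being indexed, with duplicates removed.
def source_fields_for(collection, base_fields)
  embedding_fields =
    Array(collection.current_embedding_fields) +
    Array(collection.indexing_embedding_fields)

  (base_fields + embedding_fields).uniq
end

collection = SketchCollection.new(
  current_embedding_fields: ["embeddings_v1"],
  indexing_embedding_fields: ["embeddings_v1", "embeddings_v2"]
)

fields = source_fields_for(collection, ["id", "content"])
```

With an empty indexing_embedding_fields, as is expected in production for now, the result in this sketch reduces to the base fields plus the current embedding fields.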

Step-by-step changes summary

MR | Status
Introduce the new hash/object-based models |
Add indexing_embedding_fields to Collection class | This MR
Add embeddings_with_model_redesign preprocessor | Ready for review
Integrate model redesign into Code Embeddings pipeline | Pending

References

Screenshots or screen recordings

N/A

How to set up and validate locally

Since we can be sure that indexing_embedding_fields is still empty in production, and that it's not yet used anywhere outside of the ActiveContext::Databases::Elasticsearch::Client.add_source_fields method, the unit tests should cover the validation.

You may validate in the console that indexing_embedding_fields is indeed an empty array when the current_indexing_embedding_model and next_indexing_embedding_model metadata values are both nil.

  1. Set the metadata values:

    Ai::ActiveContext::Collections::Code.collection_record.update_metadata!(current_indexing_embedding_model: nil, next_indexing_embedding_model: nil)
  2. indexing_embedding_fields should return an empty array

    Ai::ActiveContext::Collections::Code.indexing_embedding_fields
    => []
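The expected behavior in the steps above can be pictured with a standalone sketch. This is an assumption about how such a method might derive its result, not the implementation in this MR: sketch_indexing_embedding_fields, the hash-based metadata shape, the :field key, and the "embeddings_v1" name are all hypothetical.

```ruby
# Hypothetical metadata shape: each model entry is either nil or a hash
# with a :field key naming the embedding field it writes to.
def sketch_indexing_embedding_fields(metadata)
  [
    metadata[:current_indexing_embedding_model],
    metadata[:next_indexing_embedding_model]
  ].compact.map { |model| model[:field] }.uniq
end

both_nil = {
  current_indexing_embedding_model: nil,
  next_indexing_embedding_model: nil
}

one_set = {
  current_indexing_embedding_model: { field: "embeddings_v1" },
  next_indexing_embedding_model: nil
}

empty_result = sketch_indexing_embedding_fields(both_nil)
single_result = sketch_indexing_embedding_fields(one_set)
```

In this sketch, nil entries are dropped before any field name is read, which is why two nil metadata values yield an empty array rather than an error.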

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #588847

Edited by Pam Artiaga
