ActiveContext embedding model redesign

What does this MR do and why?

As part of [Code Embeddings] Allow SM instances with self-... (gitlab-org#20110), we agreed to redesign how embedding models are referenced in ActiveContext in order to support self-hosted models.

This MR is part 1 of the model redesign.

In this MR, we:

  • introduce additional metadata fields for the collection record, all sharing the same schema

    • new metadata fields: current_indexing_embedding_model, next_indexing_embedding_model, search_embedding_model

    • schema:

      {
        "model_type": ["string", "null"],
        "model_ref": "string",
        "field": "string",
        "dimensions": ["integer", "null"]
      }
  • introduce an ::ActiveContext::EmbeddingModel class

    • properties: model_name, field, llm_class, llm_params
    • method: generate_embeddings - delegates to llm_class.new(...).execute
  • introduce a ModelSelector class to be used by Ai::ActiveContext::Collections::Code

    • this is essentially a factory class that builds the ActiveContext::EmbeddingModel object according to the model metadata
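The two classes above can be sketched as follows. This is a minimal illustration of the design, not the actual implementation: FakeLlm is a hypothetical stand-in for a real LLM client class, and the MODELS registry and from_metadata method are assumptions about how a factory like ModelSelector could map model_ref values to configuration.

```ruby
# Stand-in LLM client: the real llm_class is expected to respond to
# .new(content, **params).execute and return the embeddings.
class FakeLlm
  def initialize(content, **params)
    @content = content
    @params = params
  end

  def execute
    # Return a fixed-size dummy vector instead of calling a model API.
    Array.new(3) { @content.length.to_f }
  end
end

# Mirrors the ::ActiveContext::EmbeddingModel properties from this MR:
# model_name, field, llm_class, llm_params.
EmbeddingModel = Struct.new(:model_name, :field, :llm_class, :llm_params, keyword_init: true) do
  # generate_embeddings delegates to llm_class.new(...).execute.
  def generate_embeddings(content, **params)
    llm_class.new(content, **llm_params.merge(params)).execute
  end
end

# Factory in the spirit of ModelSelector: builds an EmbeddingModel
# from a metadata hash that follows the collection-record schema.
class ModelSelector
  MODELS = {
    'text_embedding_005_vertex' => { model_name: 'text-embedding-005', llm_class: FakeLlm }
  }.freeze

  def self.from_metadata(metadata)
    config = MODELS.fetch(metadata[:model_ref])

    EmbeddingModel.new(
      model_name: config[:model_name],
      field: metadata[:field],
      llm_class: config[:llm_class],
      llm_params: {}
    )
  end
end

model = ModelSelector.from_metadata(model_ref: 'text_embedding_005_vertex', field: 'embeddings_v1')
model.generate_embeddings('test') # dummy vector from FakeLlm
```

The point of the factory split is that callers such as Ai::ActiveContext::Collections::Code only deal with metadata hashes; how a model_ref resolves to a concrete client class stays in one place.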

In a follow-up MR, we will:

  • integrate the new model design into the embeddings generation process, in particular the bulk embeddings processing and the embedding generation for search

In follow-up issues, we will:

  • add support for different model types (self-hosted|gitlab-managed|byok) for self-hosted AIGW setups

References

Screenshots or screen recordings

N/A

How to set up and validate locally

Since this is not yet integrated into the actual ActiveContext processes, the unit tests should cover most validations. However, we can still validate a few things locally:

Test new embeddings generation approach

Test embeddings generation through the ::ActiveContext::EmbeddingModel#generate_embeddings method. On the Rails console:

  1. Set the current_indexing_embedding_model metadata of your Code Collection record:

    Ai::ActiveContext::Collections::Code.collection_record.update_metadata!(
      current_indexing_embedding_model: { model_ref: 'text_embedding_005_vertex', field: 'embeddings_v1' }
    )
  2. Call generate_embeddings on Ai::ActiveContext::Collections::Code.current_indexing_embedding_model. This should return the embeddings as expected:

    Ai::ActiveContext::Collections::Code.current_indexing_embedding_model.generate_embeddings(
      "test", unit_primitive: 'generate_embeddings_codebase', user: User.first
    )

Test that the ActiveContext processes are still working as expected

We need to make sure that the current ActiveContext processes (which still use the version-based models) are still working as expected.

  1. Set up your Code Embeddings Indexing pipeline

  2. In ee/app/services/ai/active_context/code/indexing_service_base.rb, comment out the code inside enqueue_refs!

    Note: this is for easier verification, allowing you to manually validate the refs processed by the ::Ai::ActiveContext::BulkProcessWorker without the confusion of the same worker automatically picking up queued refs in the background.

    diff --git a/ee/app/services/ai/active_context/code/indexing_service_base.rb b/ee/app/services/ai/active_context/code/indexing_service_base.rb
    index 125148469843..afb30d49ac4e 100644
    --- a/ee/app/services/ai/active_context/code/indexing_service_base.rb
    +++ b/ee/app/services/ai/active_context/code/indexing_service_base.rb
    @@ -34,7 +34,7 @@ def run_indexer!(&block)
            end
    
            def enqueue_refs!(ids)
    -          ::Ai::ActiveContext::Collections::Code.track_refs!(hashes: ids, routing: repository.project_id)
    +          # ::Ai::ActiveContext::Collections::Code.track_refs!(hashes: ids, routing: repository.project_id)
            end
  3. Index new code by doing either of the following:

    • run initial indexing for a project you have not indexed before
    • push new commits to a project that has already gone through initial indexing
  4. On the gitlab_active_context_code index, pick one of the chunks created during indexing, and verify that its embeddings_v1 field is still empty.

  5. Manually add the chunk's id/ref to the bulk processing queue:

    ::Ai::ActiveContext::Collections::Code.track_refs!(routing: "1", hashes: ["4b48fbce868f829cd39d1757dc3937af5d7a56d7dc9973f45d096050b54330dd"])
  6. Wait for the ::Ai::ActiveContext::BulkProcessWorker to process the queued ref, or run it manually:

    ::Ai::ActiveContext::BulkProcessWorker.new.perform("Ai::ActiveContext::Queues::Code", 0)
  7. Check the document on the vector store index and verify that its embeddings_v1 field has been filled.

  8. For further verification, check log/active_context.log and verify that there are no errors related to the embeddings version and processing.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #588847

Edited by Pam Artiaga
