ActiveContext embedding model redesign
What does this MR do and why?
As part of [Code Embeddings] Allow SM instances with self-... (gitlab-org#20110), we agreed to redesign how embedding models are referenced in ActiveContext in order to support self-hosted models.
This MR is part 1 of the model redesign.
In this MR, we:
- introduce additional `metadata` fields for the collection record, all with the same model schema
  - new metadata fields: `current_indexing_embedding_model`, `next_indexing_embedding_model`, `search_embedding_model`
  - schema:

    ```json
    { "model_type": ["string", "null"], "model_ref": "string", "field": "string", "dimensions": ["integer", "null"] }
    ```
- introduce an `::ActiveContext::EmbeddingModel` class (a rough sketch follows this list)
  - properties: `model_name`, `field`, `llm_class`, `llm_params`
  - method: `generate_embeddings`, which delegates to `llm_class.new(...).execute`
- introduce a `ModelSelector` class to be used by `Ai::ActiveContext::Collections::Code`
  - this is essentially a factory class that builds the `ActiveContext::EmbeddingModel` object according to the model metadata
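For orientation, here is a rough sketch of the `::ActiveContext::EmbeddingModel` shape described above. It is illustrative only, derived from the properties and delegation listed in this description; the actual class in the diff may differ in naming and details.

```ruby
# Illustrative sketch only; see the MR diff for the real implementation.
module ActiveContext
  class EmbeddingModel
    attr_reader :model_name, :field, :llm_class, :llm_params

    def initialize(model_name:, field:, llm_class:, llm_params: {})
      @model_name = model_name
      @field = field
      @llm_class = llm_class
      @llm_params = llm_params
    end

    # Delegates embedding generation to the configured LLM class,
    # as described above (`llm_class.new(...).execute`).
    def generate_embeddings(content, **options)
      llm_class.new(content, **llm_params.merge(options)).execute
    end
  end
end
```

The `ModelSelector` factory would then build one of these objects from the collection's model metadata.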
In a follow-up MR, we will:
- integrate the new model design into the embeddings generation process, in particular the bulk embeddings processing and the embedding generation for search
  - MR: ActiveContext: integrate new model design into ... (!222417 - merged)
  - note: this effort is still part of the model redesign, but it is submitted in a separate MR to minimize risk and make review easier
In follow-up issues, we will:
- add support for different model types (`self-hosted` | `gitlab-managed` | `byok`) for self-hosted AIGW setups
References
- Issue: [ActiveContext] Redesign how models are referen... (#588847)
- Proposal thread: &20110 (comment 3056711151)
- Initial POC: Draft: POC: ActiveContext model selection redesign (!222017 - closed)
- Epic: [Code Embeddings] Allow SM instances with self-... (gitlab-org#20110)
Screenshots or screen recordings
N/A
How to set up and validate locally
Since this is not yet integrated into the actual ActiveContext processes, the unit tests should cover most validations. However, we can still validate a few things locally:
Test new embeddings generation approach
Test embeddings generation through the `::ActiveContext::EmbeddingModel#generate_embeddings` method. On the Rails console:
- Set the `current_indexing_embedding_model` metadata of your Code Collection record:

  ```ruby
  Ai::ActiveContext::Collections::Code.collection_record.update_metadata!(
    current_indexing_embedding_model: {
      model_ref: 'text_embedding_005_vertex',
      field: 'embeddings_v1'
    }
  )
  ```

- Call `generate_embeddings` on `Ai::ActiveContext::Collections::Code.current_indexing_embedding_model`. This should return the embeddings as expected (a quick sanity check follows these steps):

  ```ruby
  Ai::ActiveContext::Collections::Code.current_indexing_embedding_model.generate_embeddings(
    "test",
    unit_primitive: 'generate_embeddings_codebase',
    user: User.first
  )
  ```
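As a rough sanity check on the return value (an assumption for illustration, not part of this MR: it presumes the call returns a flat array of floats whose length matches the model's dimensions), you can inspect the result:

```ruby
# Assumes generate_embeddings returns a flat array of floats; adjust if the
# actual return shape differs (e.g. if it is wrapped in a response object).
embeddings = Ai::ActiveContext::Collections::Code.current_indexing_embedding_model.generate_embeddings(
  "test",
  unit_primitive: 'generate_embeddings_codebase',
  user: User.first
)
embeddings.size     # expected to match the model's configured dimensions
embeddings.first(5) # a few float values, e.g. [0.0123, -0.0456, ...]
```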
Test that the ActiveContext processes are still working as expected
We need to make sure that the current ActiveContext processes (which still use the version-based models) are still working as expected.
- Set up your Code Embeddings Indexing pipeline

- In `ee/app/services/ai/active_context/code/indexing_service_base.rb`, comment out the code inside `enqueue_refs!`

  Note: this is for easier verification, allowing you to manually validate the refs processed by the `::Ai::ActiveContext::BulkProcessWorker` without the confusion of the same worker automatically picking up queued refs in the background.

  ```diff
  diff --git a/ee/app/services/ai/active_context/code/indexing_service_base.rb b/ee/app/services/ai/active_context/code/indexing_service_base.rb
  index 125148469843..afb30d49ac4e 100644
  --- a/ee/app/services/ai/active_context/code/indexing_service_base.rb
  +++ b/ee/app/services/ai/active_context/code/indexing_service_base.rb
  @@ -34,7 +34,7 @@ def run_indexer!(&block)
     end

     def enqueue_refs!(ids)
  -    ::Ai::ActiveContext::Collections::Code.track_refs!(hashes: ids, routing: repository.project_id)
  +    # ::Ai::ActiveContext::Collections::Code.track_refs!(hashes: ids, routing: repository.project_id)
     end
  ```
- Index new code by doing either of the following:
  - run initial indexing for a project you have not indexed before
  - push new commits to a project that has already gone through initial indexing
- On the `gitlab_active_context_code` index, pick one of the chunks created during indexing, and verify that the `embeddings_v1` field of this chunk is still empty (see the index inspection sketch after these steps).
- Manually add the chunk's id/ref to the bulk processing queue:

  ```ruby
  ::Ai::ActiveContext::Collections::Code.track_refs!(routing: "1", hashes: ["4b48fbce868f829cd39d1757dc3937af5d7a56d7dc9973f45d096050b54330dd"])
  ```
- Wait for the `::Ai::ActiveContext::BulkProcessWorker` to process the queued ref, or run it manually:

  ```ruby
  ::Ai::ActiveContext::BulkProcessWorker.new.perform("Ai::ActiveContext::Queues::Code", 0)
  ```
- Check the document on the vector store index and verify that its `embeddings_v1` field has been filled.
- For further verification, check `log/active_context.log` and verify that there are no errors related to the embeddings version and processing.
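To inspect the chunk document referenced in the steps above, one option is to query the index from the Rails console. This is a sketch under stated assumptions: it assumes the `gitlab_active_context_code` index lives in the same Elasticsearch/OpenSearch cluster as Advanced Search and is reachable through the standard `Gitlab::Elastic::Helper` client; if your ActiveContext adapter is different (for example PostgreSQL), use that store's own tooling instead.

```ruby
# Assumption: the ActiveContext code index is reachable through the
# Advanced Search Elasticsearch client; adjust for your adapter.
client = ::Gitlab::Elastic::Helper.default.client

# Fetch a chunk document by its id/ref (replace with your chunk's id)
# and inspect the embeddings field before and after processing.
doc = client.get(index: 'gitlab_active_context_code',
                 id: '4b48fbce868f829cd39d1757dc3937af5d7a56d7dc9973f45d096050b54330dd')
doc['_source']['embeddings_v1'] # empty before processing, filled afterwards
```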
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #588847