ActiveContext: integrate new model design into the pipeline (!222417)

What does this MR do and why?

Integrates the hash/object-based model redesign introduced in previous MRs (see below) into the ActiveContext pipelines:

Introduces a derisk Feature Flag, `active_context_embedding_model_redesign`, to control the switch to the model redesign.

Switches to accessing the redesigned model metadata when `active_context_embedding_model_redesign` is enabled:

| Class/Method | Original behavior | Updated behavior |
| --- | --- | --- |
| `Collections::Code.indexing?` | check the presence of `current_indexing_embedding_versions` | check the presence of `indexing_embedding_models` |
| `References::Code.embeddings` block | call `apply_embeddings` | call `apply_embeddings_with_model_redesign` |
| `MarkRepositoryAsReadyEventWorker.embedding_fields` | return `Collections::Code.current_embedding_fields` | return `Collections::Code.indexing_embedding_fields` |
| `Queries::Code` | use the `Collections::Code.current_search_embedding_version` | use the `Collections::Code.search_embedding_model` |
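
As a rough illustration of the first row of the table, the flag-gated check could look like the sketch below. `FakeFeature` and `FakeCodeCollection` are hypothetical stand-ins written so the example runs outside Rails; they are not the classes touched by this MR, and the real `indexing?` may differ in detail.

```ruby
# Hypothetical stand-in for GitLab's Feature API, so the sketch runs outside Rails.
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled[name] = false
  end

  def self.enabled?(name)
    @enabled.fetch(name, false)
  end
end

# Hypothetical, simplified collection: only the two metadata readers named in
# the table are modelled, and both are backed by plain arrays.
class FakeCodeCollection
  def initialize(current_indexing_embedding_versions:, indexing_embedding_models:)
    @current_indexing_embedding_versions = current_indexing_embedding_versions
    @indexing_embedding_models = indexing_embedding_models
  end

  # Which metadata is consulted depends on the feature flag.
  def indexing?
    if FakeFeature.enabled?(:active_context_embedding_model_redesign)
      !@indexing_embedding_models.empty?
    else
      !@current_indexing_embedding_versions.empty?
    end
  end
end

collection = FakeCodeCollection.new(
  current_indexing_embedding_versions: [],
  indexing_embedding_models: [{ field: :embeddings_v1, model_ref: 'text_embedding_005_vertex' }]
)

FakeFeature.disable(:active_context_embedding_model_redesign)
flag_off = collection.indexing? # old path: reads the (empty) version list

FakeFeature.enable(:active_context_embedding_model_redesign)
flag_on = collection.indexing?  # new path: reads the model list

puts "flag off: #{flag_off}, flag on: #{flag_on}"
```

Keeping both branches behind a single flag check is what lets the old version-based path be exercised again simply by disabling the flag.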

Additional minor refactor:
- in ::ActiveContext::EmbeddingModel, symbolize the field property to follow the old version-based approach
- in ::ActiveContext::Concerns::Collection, stringify the indexing_embedding_fields to follow the old version-based approach
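
A minimal sketch of the two conversions described above, assuming metadata entries are hashes with string keys as in the console output later in this description. `FakeEmbeddingModel`, `build_models`, and this standalone `indexing_embedding_fields` are hypothetical names for illustration only, not the real implementations.

```ruby
# Hypothetical stand-in for ::ActiveContext::EmbeddingModel.
class FakeEmbeddingModel
  attr_reader :model_ref, :field

  def initialize(model_ref:, field:)
    @model_ref = model_ref
    # Symbolized regardless of whether the metadata stored a String or a Symbol.
    @field = field.to_sym
  end
end

# Builds models from metadata hashes, skipping unset (nil) entries such as
# an empty next_indexing_embedding_model.
def build_models(metadata_entries)
  metadata_entries.compact.map do |entry|
    FakeEmbeddingModel.new(model_ref: entry.fetch('model_ref'), field: entry.fetch('field'))
  end
end

# Stringifies the field names for callers that expect strings.
def indexing_embedding_fields(models)
  models.map { |model| model.field.to_s }
end

models = build_models([
  { 'field' => 'embeddings_v1', 'model_ref' => 'text_embedding_005_vertex' },
  nil
])
fields = indexing_embedding_fields(models)

puts models.first.field.inspect
puts fields.inspect
```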

Step-by-step changes summary

| MR | Status |
| --- | --- |
| Introduce the new hash/object-based models | ✅ |
| Add `indexing_embedding_fields` to Collection class | ✅ |
| Add `embeddings_with_model_redesign` preprocessor | ✅ |
| Add migration to update models metadata | ✅ |
| Integrate model redesign into Code Embeddings pipeline | This MR |

References

Issue: [ActiveContext] Redesign how models are referen... (#588847)
Feature Flag rollout issue: [FF] `active_context_embedding_model_redesign` -- (#588677)

Screenshots or screen recordings

Please see validation steps and expected outcomes below.

How to set up and validate locally

Initial setup

Set up your Code Embeddings Indexing pipeline

Verify the model metadata. This MR is rebased on top of ActiveContext: migration for update_code_model_... (!223302 - closed), so the migration that sets the model metadata should already have run. You can still verify that the metadata has the expected values:

Ai::ActiveContext::Collections::Code.collection_record.current_indexing_embedding_model
=> {"field"=>"embeddings_v1", "model_ref"=>"text_embedding_005_vertex"}

Ai::ActiveContext::Collections::Code.collection_record.search_embedding_model
=> {"field"=>"embeddings_v1", "model_ref"=>"text_embedding_005_vertex"}

Ai::ActiveContext::Collections::Code.collection_record.next_indexing_embedding_model
=> nil

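If you prefer a single check over eyeballing three console results, the comparison can be sketched as below. `EXPECTED_METADATA` and `metadata_mismatches` are hypothetical helpers, and the sketch assumes you have collected the three values shown above into a hash with string keys (for example by calling each reader on `collection_record` in a Rails console).

```ruby
# Expected values, copied from the console output above.
EXPECTED_METADATA = {
  'current_indexing_embedding_model' => { 'field' => 'embeddings_v1', 'model_ref' => 'text_embedding_005_vertex' },
  'search_embedding_model' => { 'field' => 'embeddings_v1', 'model_ref' => 'text_embedding_005_vertex' },
  'next_indexing_embedding_model' => nil
}.freeze

# Returns a hash of only the keys whose actual value differs from the expected one.
def metadata_mismatches(actual, expected = EXPECTED_METADATA)
  expected.each_with_object({}) do |(key, want), diff|
    got = actual[key]
    diff[key] = { expected: want, actual: got } unless got == want
  end
end

# Example data: one collection that matches, one where the search model is unset.
matching = EXPECTED_METADATA.dup
stale = EXPECTED_METADATA.merge('search_embedding_model' => nil)

puts metadata_mismatches(matching).inspect
puts metadata_mismatches(stale).keys.inspect
```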
Enable the active_context_embedding_model_redesign Feature Flag
```
Feature.enable(:active_context_embedding_model_redesign)
```

Testing the indexing pipeline

Optional: testing a single reference - expand for test steps

The migration should already take care of this, but make sure that the current_indexing_embedding_model is set:

Ai::ActiveContext::Collections::Code.collection_record.update_metadata!(current_indexing_embedding_model: { model_ref: 'text_embedding_005_vertex', field: 'embeddings_v1' })

On the gitlab_active_context_code index, pick a document that still has an empty embeddings_v1 field, and manually add the document's ID to the bulk process queue:
```
::Ai::ActiveContext::Collections::Code.track_refs!(routing: "1", hashes: ["4b48fbce868f829cd39d1757dc3937af5d7a56d7dc9973f45d096050b54330dd"])
```
Wait for the ::Ai::ActiveContext::BulkProcessWorker to process the queued ref, or run it manually:
```
::Ai::ActiveContext::BulkProcessWorker.new.perform("Ai::ActiveContext::Queues::Code", 0)
```

Testing the entire pipeline - expand for test steps

This allows you to test both the reference processing (in particular the embeddings_with_model_redesign preprocessor) and the MarkRepositoryAsReadyEventWorker.

Pick an eligible project to test. Make sure it has not been indexed before. It is also best to test with a project that has only a few files, so that indexing completes quickly.

Note: you may use the seeded gitlab-duo/test project, with ID 1000000.

Run ad-hoc indexing for that project:

project_id = 1000000
Ai::ActiveContext::Code::AdHocIndexingWorker.new.perform(project_id)

Wait until the project's ai_active_context_repository record is marked as ready. Because the MarkRepositoryAsReadyEventWorker is scheduled to run only every 2 hours, you can also run it manually instead of waiting:

# optional: run worker
Ai::ActiveContext::Code::MarkRepositoryAsReadyEventWorker.new.handle_event("")

# verify the state of the test repository
Ai::ActiveContext::Code::Repository.find_by(project_id: project_id).state
=> "ready"

# additional verification for the last queued item
Ai::ActiveContext::Code::Repository.find_by(project_id: project_id).initial_indexing_last_queued_item
=> "8e0d0f24bc2cb6c4bc2a42cd5af55877991acc2319d16c183000748dfe0c83da"

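Rather than re-running the state check by hand, the waiting step can be sketched as a small polling helper. `wait_for_state` is a hypothetical helper, and `'not_ready'` is a placeholder rather than a real repository state; in a Rails console the block would instead return the result of the `Repository.find_by(...).state` call shown above.

```ruby
# Polls the given block until it returns the expected state or the timeout elapses.
# Returns true if the state was reached, false on timeout.
def wait_for_state(expected, timeout: 60, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    return true if yield == expected
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(interval)
  end
end

# Simulated sequence of states standing in for successive database reads.
simulated_states = %w[not_ready not_ready ready]
reached = wait_for_state('ready', timeout: 5, interval: 0.01) { simulated_states.shift }

# A state that never becomes ready times out and returns false.
timed_out = wait_for_state('ready', timeout: 0.05, interval: 0.01) { 'not_ready' }

puts "reached: #{reached}, timed out result: #{timed_out}"
```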
Verify that the project has documents with content and embeddings in the vector store.

In the browser or through curl, query this URL:

http://localhost:9200/gitlab_active_context_code/_search?q=project_id:1000000&pretty=true

Use ::Ai::ActiveContext::Collections::Code.search, filtering only by project_id:

query = ::ActiveContext::Query.filter(project_id: project_id)
result = ::Ai::ActiveContext::Collections::Code.search(query: query, user: nil)
items = [].tap { |arr| result.each {|r| arr << r } }
items.all? { |item| item['content'].present? && item['embeddings_v1'].present? }
=> true

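If the `all?` check above returns false, it helps to see which documents are incomplete. The sketch below assumes each item behaves like a hash with string keys and carries an `'id'` key, which is an assumption not confirmed by this description. `blank_value?` is a simplified stand-in for Rails' `present?` and does not treat whitespace-only strings as blank.

```ruby
REQUIRED_FIELDS = %w[content embeddings_v1].freeze

# Simplified blank check: nil or anything that reports itself as empty.
def blank_value?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

# Maps each incomplete item's id to the list of required fields it is missing.
def items_missing_fields(items, required = REQUIRED_FIELDS)
  items.each_with_object({}) do |item, report|
    missing = required.select { |field| blank_value?(item[field]) }
    report[item['id']] = missing unless missing.empty?
  end
end

# Example data standing in for search results.
sample_items = [
  { 'id' => 'a', 'content' => 'def foo; end', 'embeddings_v1' => [0.1, 0.2] },
  { 'id' => 'b', 'content' => '', 'embeddings_v1' => nil },
  { 'id' => 'c', 'content' => 'puts 1', 'embeddings_v1' => [] }
]

report = items_missing_fields(sample_items)
puts report.inspect
```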
Testing the `semantic_code_search` tool

The simplest way to test this is with an MCP debugging tool such as MCP Inspector.

Test with MCP inspector - expand for test steps

Authorize the local GitLab MCP server

NODE_TLS_REJECT_UNAUTHORIZED=0 npx \
mcp-remote@latest http://gdk.test:3000/api/v4/mcp --static-oauth-client-metadata '{"scope": "mcp"}' --debug --allow-http

Run the MCP inspector
```
npx @modelcontextprotocol/inspector npx
```
Test the semantic_code_search tool.

IMPORTANT REGRESSION TEST: Verify that the version-based model design still works

Disable the active_context_embedding_model_redesign Feature Flag
Repeat all of the tests detailed above with the flag disabled

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #588847

Edited Feb 19, 2026 by Pam Artiaga
