Zoekt RolloutWorker fails to create indices for namespaces with correct replica counts

Summary

The Zoekt RolloutWorker fails to create indices for namespaces that have correct replica counts but are missing indices. SelectionService selects only namespaces whose replica count does not match the expected count, so namespaces with the correct replica count but no indices are never picked up for indexing.

Steps to reproduce

  1. Enable Zoekt indexing for a namespace
  2. Ensure the namespace has the correct number of replicas created (e.g., 1 replica matching the default)
  3. Verify that no indices exist for the namespace (e.g., due to a previous failure or partial provisioning)
  4. Observe that RolloutWorker runs periodically but never creates indices for this namespace

What is the current bug behavior?

  • SelectionService#fetch_enabled_namespace_for_indexing uses the each_batch_with_mismatched_replicas scope
  • This scope only selects namespaces whose replica count does not match the expected count
  • Namespaces with correct replica counts but missing indices are never selected
  • RolloutWorker never processes these namespaces, leaving them without indices indefinitely
  • Customer impact: 27 namespaces enabled for indexing with correct replicas but 0 indices created
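
The gap can be illustrated with a minimal plain-Ruby sketch. This is not GitLab code: NamespaceState and its fields are hypothetical stand-ins for EnabledNamespace, and mismatched_replicas? only mirrors the intent of the filter described above.

```ruby
# Plain-Ruby model of the selection gap; not GitLab code.
# NamespaceState and its fields are hypothetical stand-ins for EnabledNamespace.
NamespaceState = Struct.new(:id, :replica_count, :expected_replicas, :index_count)

# Mirrors the intent of the mismatched-replicas filter (an assumption,
# not the real scope): select only when the replica count is off.
def mismatched_replicas?(namespace)
  namespace.replica_count != namespace.expected_replicas
end

# Reported customer state: 27 namespaces, 1 replica each (as expected), 0 indices.
namespaces = (1..27).map { |id| NamespaceState.new(id, 1, 1, 0) }

selected = namespaces.select { |ns| mismatched_replicas?(ns) }
missing_indices = namespaces.select { |ns| ns.index_count.zero? }

puts "selected for rollout: #{selected.size}"
puts "missing indices:      #{missing_indices.size}"
```

Under these assumptions nothing is selected even though every namespace lacks indices, which matches the reported state.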

What is the expected correct behavior?

  • RolloutWorker should process namespaces that need indices created, regardless of replica count status
  • Namespaces with missing indices should be selected for processing
  • ProvisioningService should create indices for namespaces that need them
  • After rollout completes, all enabled namespaces should have both correct replica counts AND indices

Relevant logs and/or screenshots

From the customer environment (GitLab 18.7.0):

Indexing enabled: yes
All 3 nodes online and reachable
27 namespaces enabled for indexing
27 namespaces with 1 replica each (matching expected)
27 namespaces without indices
0 tasks pending/processing
RolloutWorker running without errors

Possible fixes

The fix involves updating SelectionService to select namespaces that either:

  1. Have mismatched replica counts (existing behavior), OR
  2. Have missing indices (new behavior)
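
The two conditions above can be compared as a small truth table. The sketch below is plain Ruby with hypothetical names (RolloutState, old_selected?, new_selected?); it models only the boolean logic, not the SQL the real scopes would generate.

```ruby
# Truth table for the old and proposed selection rules.
# RolloutState and both predicates are hypothetical, for illustration only.
RolloutState = Struct.new(:label, :replicas_match, :has_indices)

# Existing behavior: select only on a replica mismatch.
def old_selected?(state)
  !state.replicas_match
end

# Proposed behavior: select on a replica mismatch OR missing indices.
def new_selected?(state)
  !state.replicas_match || !state.has_indices
end

ROLLOUT_STATES = [
  RolloutState.new("replicas ok, indices present", true, true),
  RolloutState.new("replicas ok, indices missing", true, false),
  RolloutState.new("replicas mismatched, indices present", false, true),
  RolloutState.new("replicas mismatched, indices missing", false, false)
].freeze

ROLLOUT_STATES.each do |state|
  puts format("%-38s old=%-5s new=%s", state.label, old_selected?(state), new_selected?(state))
end
```

Only the second row changes between the two rules, which is the case this issue describes.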

Files to modify:

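  • ee/app/models/search/zoekt/enabled_namespace.rb: Add the with_mismatched_replicas_or_missing_indices scope and the each_batch_with_mismatched_replicas_or_missing_indices class method
  • ee/app/services/search/zoekt/selection_service.rb: Use each_batch_with_mismatched_replicas_or_missing_indices instead of each_batch_with_mismatched_replicas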

Code changes:

# ee/app/models/search/zoekt/enabled_namespace.rb
scope :with_mismatched_replicas_or_missing_indices, -> do
  with_mismatched_replicas.or(with_missing_indices)
end

def self.each_batch_with_mismatched_replicas_or_missing_indices(batch_size: 5000, &block)
  each_batch(of: batch_size) do |batch|
    batch.with_mismatched_replicas_or_missing_indices.each(&block)
  end
end

# ee/app/services/search/zoekt/selection_service.rb
def fetch_enabled_namespace_for_indexing
  [].tap do |batch|
    ::Search::Zoekt::EnabledNamespace.with_rollout_allowed.each_batch_with_mismatched_replicas_or_missing_indices do |ns|
      batch << ns
      break if batch.count >= max_batch_size
    end
  end
end
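
The collection pattern in fetch_enabled_namespace_for_indexing relies on break leaving the batch iterator early while tap still returns the partially filled array. The following plain-Ruby sketch shows that control flow in isolation; each_matching_in_batches, fetch_for_indexing, and the :needs_rollout key are hypothetical stand-ins, not the real model or service methods.

```ruby
# Hypothetical stand-in for the each_batch_* class method: walks records
# in slices and yields only those that need a rollout.
def each_matching_in_batches(records, batch_size:, &block)
  records.each_slice(batch_size) do |slice|
    slice.select { |record| record[:needs_rollout] }.each(&block)
  end
end

# Same shape as the service method: collect into an array via tap and
# stop iterating once max_batch_size records have been gathered.
def fetch_for_indexing(records, max_batch_size:, batch_size: 10)
  [].tap do |batch|
    each_matching_in_batches(records, batch_size: batch_size) do |namespace|
      batch << namespace
      # break exits each_matching_in_batches; tap still returns batch.
      break if batch.size >= max_batch_size
    end
  end
end

records = (1..27).map { |id| { id: id, needs_rollout: true } }
result = fetch_for_indexing(records, max_batch_size: 5)
puts result.map { |record| record[:id] }.inspect
```

With 27 matching records and a cap of 5, the sketch returns the first five records and never visits the rest.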

Related Issues

  • Request for Help: https://gitlab.com/gitlab-com/request-for-help/-/issues/4126
  • Zendesk Ticket: https://gitlab.zendesk.com/agent/tickets/689154

cc @johnmason @rkumar555 @changzhengliu
