Advanced Search: Allow disabling the code scope
## Overview

This epic introduces the capability to disable code indexing in Advanced Search, giving administrators control over whether blob/code documents are indexed and searchable. GitLab instances can then optimize their Elasticsearch usage by disabling code search while keeping all other search functionality. This is especially relevant now that [Exact Code Search](https://docs.gitlab.com/user/search/exact_code_search/) is available for GitLab.com, Dedicated, and self-managed instances.

## Key Objectives

- **Resource Optimization**: Enable administrators to reduce Elasticsearch index size and improve performance by disabling code indexing when it is not needed
- **Flexible Configuration**: Provide a toggle that can be enabled or disabled based on organizational requirements
- **Data Integrity**: Ensure proper handling of existing indexed data when the setting changes
- **Seamless Operations**: Maintain search functionality for commits and other content while managing code-specific indexing

## Implementation Components

### 1. Application Setting Infrastructure

- Add a new `elasticsearch_code_scope` setting to the elasticsearch JSONB column (default: `true`)
- Establish the foundational configuration mechanism for the feature

### 2. Indexing Logic Integration

- Modify `Search::Elastic::CommitIndexerWorker` to respect the new setting
- Update the Go indexer to conditionally index blob documents based on the setting
- Ensure commit indexing continues regardless of the code scope setting

### 3. Data Management

- Implement automated cleanup of existing blob documents when the code scope is disabled
- Create a scheduling mechanism (similar to the Zoekt scheduling service) for document deletion
- Ensure efficient and safe removal of code-related documents from the index

### 4. Backfill Operations

- Develop a worker/scheduling task to re-index projects when the code scope is re-enabled
- Handle the transition from disabled to enabled state by backfilling missing code documents
- Ensure data consistency during setting toggles

### 5. Administrative Interface

- Expose the setting in the admin UI for easy configuration
- Update the `gitlab:elastic:info` rake task to display the current code scope status
- Provide clear visibility into the feature's current state

## Dependencies

All issues follow a sequential dependency chain:

1. Application setting must be created first
2. Indexing logic depends on the setting being available
3. Data cleanup and backfill operations build upon the core functionality
4. UI exposure comes after all backend functionality is complete

## Benefits

- **Performance**: Reduced index size and faster search operations for non-code content
- **Cost Efficiency**: Lower Elasticsearch resource consumption for instances that don't require code search
- **Flexibility**: Administrators can adapt search functionality to their specific use cases
- **Maintainability**: Clean separation between code and non-code search functionality
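
As a rough illustration of the behavior described under Implementation Components, the Ruby sketch below models the three decisions the setting drives: reading the setting with its default, choosing which document types to index, and picking the follow-up task when the setting is toggled. All method names (`code_scope_enabled?`, `document_types_to_index`, `follow_up_task`) and the plain-hash settings object are hypothetical and are not GitLab APIs. Only the `elasticsearch_code_scope` key and its default of `true` come from this epic.

```ruby
# Hypothetical sketch: these helpers do not exist in GitLab. They only
# illustrate the behavior this epic describes.

# Stand-in for reading the application setting from a plain hash.
# The epic specifies a default of true when the key is absent.
def code_scope_enabled?(settings)
  settings.fetch(:elasticsearch_code_scope, true)
end

# Document types the indexer should write for a project.
# Commits are always indexed; blobs only when the code scope is enabled.
def document_types_to_index(code_scope_enabled)
  types = [:commit]
  types << :blob if code_scope_enabled
  types
end

# Follow-up work to schedule when the setting changes.
#   enabled  -> disabled : delete existing blob documents
#   disabled -> enabled  : backfill the missing blob documents
#   unchanged            : nothing to do
def follow_up_task(previous_enabled, current_enabled)
  return :none if previous_enabled == current_enabled

  current_enabled ? :backfill_blobs : :delete_blobs
end

# Example: an administrator disables the code scope on a default instance.
before = code_scope_enabled?({})
after  = code_scope_enabled?({ elasticsearch_code_scope: false })

puts document_types_to_index(after).inspect
puts follow_up_task(before, after)
```

Keeping the transition logic separate from the indexing decision mirrors the dependency chain above: the indexing check needs only the current value of the setting, while cleanup and backfill need both the old and the new value.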
epic