Update ES cluster and shard size guidelines
What does this MR do and why?
- Add a new rake task gitlab:elastic:estimate_shard_sizes, which lets self-managed (SM) customers estimate document counts and get shard size recommendations for indexes that contain database records.
- Modify the existing rake task gitlab:elastic:info to report document counts. This helps SM customers (and customer support) gather information on cluster size.
- Modify the existing rake task gitlab:elastic:estimate_cluster_size to also include wiki repository sizes.
- Update the documentation with more guidance on Advanced search settings for shard and replica sizes.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
N/A - rake task
How to set up and validate locally
- Enable and configure advanced search in GDK.
- Run the new rake task:
bundle exec rake gitlab:elastic:estimate_shard_sizes
Example output:
Using approximate counts to estimate shard counts for data indexed from database. This does not include repository data.
The approximate document counts, recommended shard size, and replica size for each index are:
- gitlab-development-projects:
document count: 45
recommended shards: 5
recommended replicas: 1
- gitlab-development-users:
document count: 50
recommended shards: 5
recommended replicas: 1
- gitlab-development-issues:
document count: 626
recommended shards: 5
recommended replicas: 1
- gitlab-development-merge_requests:
document count: 104
recommended shards: 5
recommended replicas: 1
- gitlab-development-notes:
document count: 159
recommended shards: 5
recommended replicas: 1
- gitlab-development-epics:
document count: 37
recommended shards: 5
recommended replicas: 1
Please note that it is possible to index only selected namespaces/projects by using Advanced search indexing restrictions. This estimate does not take into account indexing restrictions.
- Run the existing rake task:
bundle exec rake gitlab:elastic:info
Example output:
➜ be rake gitlab:elastic:info
Advanced Search
Server version: 8.11.1
Server distribution: elasticsearch
Indexing enabled: yes
Search enabled: yes
Requeue Indexing workers: yes
Pause indexing: no
Indexing restrictions enabled: no
File size limit: 1024 KiB
Indexing number of shards: 2
Max code indexing concurrency: 30
Indexing Queues
Initial queue: 0
Incremental queue: 0
Indices
- gitlab-development-20240327-1532:
document_count: 42335
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-commits-20240327-1532:
document_count: 137916
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-epics-20240327-1533:
document_count: 37
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-issues-20240327-1532:
document_count: 625
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-merge_requests-20240327-1532:
document_count: 104
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-notes-20240327-1532:
document_count: 79
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-projects-20240327-1533:
document_count: 45
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-users-20240327-1533:
document_count: 50
number_of_shards: 5
number_of_replicas: 1
- gitlab-development-wikis-20240327-1533:
document_count: 4
number_of_shards: 5
number_of_replicas: 1
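
When validating, it can help to total the document counts reported under Indices and to compare them with the estimates from gitlab:elastic:estimate_shard_sizes. The following is a minimal sketch of a standalone parser, not part of this MR. It assumes the output layout shown in the example above, and the helper names (parse_indices, base_index_name, total_documents) are hypothetical.

```ruby
# Hypothetical helpers for reading pasted `gitlab:elastic:info` output.
# The line formats below are assumed from the example output, not from the
# rake task's source.
INDEX_LINE = /^\s*-\s+(?<name>[\w-]+):\s*$/
FIELD_LINE = /^\s*(?<key>document_count|number_of_shards|number_of_replicas):\s+(?<value>\d+)\s*$/

# Returns a hash of index name => { document_count:, number_of_shards:, ... }.
# Only lines after the "Indices" heading are considered.
def parse_indices(output)
  indices = {}
  current = nil
  in_indices = false

  output.each_line do |line|
    if line.strip == 'Indices'
      in_indices = true
      next
    end
    next unless in_indices

    if (m = INDEX_LINE.match(line))
      current = indices[m[:name]] = {}
    elsif current && (m = FIELD_LINE.match(line))
      current[m[:key].to_sym] = m[:value].to_i
    end
  end

  indices
end

# Drops a trailing -YYYYMMDD-HHMM suffix so that names line up with the
# alias-style names printed by the estimate task (assumed naming pattern).
def base_index_name(name)
  name.sub(/-\d{8}-\d{4}\z/, '')
end

# Sums document_count across all parsed indices.
def total_documents(indices)
  indices.values.sum { |stats| stats.fetch(:document_count, 0) }
end
```

To use it, save the task output to a file or pipe it in, then call parse_indices($stdin.read) and pass the result to total_documents. Small differences between estimated and actual counts are expected, because the estimate task uses approximate counts.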