Update ES cluster and shard size guidelines

What does this MR do and why?

  • Add a new rake task, gitlab:elastic:estimate_shard_sizes, that lets self-managed (SM) customers estimate document counts and get shard size recommendations for indexes that contain database records.

  • Modify the existing rake task gitlab:elastic:info to report document counts. This helps SM customers (and customer support) gather information about cluster size.

  • Modify the existing rake task gitlab:elastic:estimate_cluster_size to also include wiki repository sizes.

  • Update the documentation with more guidance on Advanced search settings for shard and replica sizes.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

N/A - rake task

How to set up and validate locally

  1. Enable and configure Advanced search in GDK.

  2. Run the new rake task:

bundle exec rake gitlab:elastic:estimate_shard_sizes
Example output:
Using approximate counts to estimate shard counts for data indexed from database. This does not include repository data.
The approximate document counts, recommended shard size, and replica size for each index are:
- gitlab-development-projects:
   document count: 45
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-users:
   document count: 50
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-issues:
   document count: 626
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-merge_requests:
   document count: 104
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-notes:
   document count: 159
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-epics:
   document count: 37
   recommended shards: 5
   recommended replicas: 1
Please note that it is possible to index only selected namespaces/projects by using Advanced search indexing restrictions. This estimate does not take into account indexing restrictions.
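When validating, it can help to turn the task output into structured data so the numbers are easy to check. The following is a minimal Python sketch, not part of this MR: it assumes the output keeps the layout shown in the example above (a "- index-name:" header followed by indented "key: number" lines). The names SAMPLE and parse_estimate are hypothetical, and SAMPLE is an excerpt copied from the example output.

```python
import re

# Excerpt copied from the example output above (assumed layout).
SAMPLE = """\
The approximate document counts, recommended shard size, and replica size for each index are:
- gitlab-development-projects:
   document count: 45
   recommended shards: 5
   recommended replicas: 1
- gitlab-development-issues:
   document count: 626
   recommended shards: 5
   recommended replicas: 1
"""


def parse_estimate(text):
    """Return {index_name: {field_name: int}} parsed from the task output."""
    result = {}
    current = None
    for line in text.splitlines():
        # A header line looks like "- gitlab-development-projects:".
        header = re.match(r"^- (\S+):\s*$", line)
        if header:
            current = header.group(1)
            result[current] = {}
            continue
        # A field line looks like "   document count: 45".
        field = re.match(r"^\s+([a-z_ ]+):\s*(\d+)\s*$", line)
        if field and current is not None:
            key = field.group(1).strip().replace(" ", "_")
            result[current][key] = int(field.group(2))
    return result


if __name__ == "__main__":
    for name, values in parse_estimate(SAMPLE).items():
        print(name, values)
```

Lines that match neither pattern, such as the introductory sentences and the closing note about indexing restrictions, are ignored.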
  3. Run the existing rake task:
bundle exec rake gitlab:elastic:info
Example output:
➜ be rake gitlab:elastic:info

Advanced Search
Server version:			8.11.1
Server distribution:		elasticsearch
Indexing enabled:		yes
Search enabled:			yes
Requeue Indexing workers:	yes
Pause indexing:			no
Indexing restrictions enabled:	no
File size limit:		1024 KiB
Indexing number of shards:	2
Max code indexing concurrency:	30

Indexing Queues
Initial queue:			0
Incremental queue:		0

Indices
- gitlab-development-20240327-1532:
  document_count: 42335
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-commits-20240327-1532:
  document_count: 137916
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-epics-20240327-1533:
  document_count: 37
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-issues-20240327-1532:
  document_count: 625
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-merge_requests-20240327-1532:
  document_count: 104
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-notes-20240327-1532:
  document_count: 79
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-projects-20240327-1533:
  document_count: 45
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-users-20240327-1533:
  document_count: 50
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-wikis-20240327-1533:
  document_count: 4
  number_of_shards: 5
  number_of_replicas: 1
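To compare the Indices section with a shard estimate, the per-index entries can be summarized by base index name. The sketch below is illustrative only and not part of this MR. It assumes the index names end in a "-YYYYMMDD-HHMM" suffix as in the example output, and it uses the standard Elasticsearch relationship that each replica adds one copy of every primary shard. SAMPLE_INFO, parse_indices, base_name, shard_copies, and summarize are hypothetical names.

```python
import re

# Excerpt copied from the Indices section of the example output above.
SAMPLE_INFO = """\
Indices
- gitlab-development-issues-20240327-1532:
  document_count: 625
  number_of_shards: 5
  number_of_replicas: 1
- gitlab-development-wikis-20240327-1533:
  document_count: 4
  number_of_shards: 5
  number_of_replicas: 1
"""


def parse_indices(text):
    """Return {index_name: {setting: int}} for each "- name:" entry."""
    indices = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^- (\S+):\s*$", line)
        if header:
            current = header.group(1)
            indices[current] = {}
            continue
        field = re.match(r"^\s+(\w+):\s*(\d+)\s*$", line)
        if field and current is not None:
            indices[current][field.group(1)] = int(field.group(2))
    return indices


def base_name(index_name):
    """Strip the trailing -YYYYMMDD-HHMM suffix (assumed naming pattern)."""
    return re.sub(r"-\d{8}-\d{4}$", "", index_name)


def shard_copies(settings):
    """Primary shards plus one copy of each primary per replica."""
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


def summarize(text):
    """Map each base index name to its document count and total shard copies."""
    return {
        base_name(name): {
            "document_count": settings["document_count"],
            "shard_copies": shard_copies(settings),
        }
        for name, settings in parse_indices(text).items()
    }


if __name__ == "__main__":
    for name, row in summarize(SAMPLE_INFO).items():
        print(name, row)
```

With 5 primary shards and 1 replica, each index in the excerpt accounts for 10 shard copies in the cluster, regardless of how few documents it holds.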
Edited by Terri Chu
