Elasticsearch Sizing, Capacity Planning, and Documentation for Deployed Elastic Cloud (formerly ECE)
## Overview

The purpose of this epic is to determine sizing and capacity planning details. Elastic Cloud is suited to GitLab.com's needs because, at GitLab.com's scale, individual node types need to be scaled at different times to account for increased usage and increased indexed content. Without scaling each node type according to its workload, we will either underuse processing or overuse memory.

- Provide details for estimating capacity and sizing of the Elasticsearch cluster(s) for GitLab.com
- Provide a guide for self-managed customers to estimate their needs

### Why this is important

Scaling is expensive not only in hardware but also in the time needed to research and evaluate solutions. We anticipate a large amount of growth from self-managed users if we do this correctly, which means that an inefficient approach to infrastructure costs is passed on to the customer.

## What are we researching

There are specific resources we should estimate when planning the Elasticsearch cluster: storage, network, compute, and memory. We should also consider architectural changes that might be needed, such as a hot-warm-frozen tier architecture.

### Desired Findings

- Optimal shard size (a back-of-envelope sketch of this arithmetic appears at the end of this description)
- Optimal shards per node
- Document growth over time
- Number of active users
- JVM memory allocation
- Caching logic: G1GC or CMS
- Growth projections for documents
- Growth projections for usage
- Ratio of dedicated nodes by node type
- Optimal indexing speed (documents per minute)
- Ideal dedicated node ratio for index- and delete-optimized workloads
- Triggers to scale up/down any specific node type

### Questions

- Can we use information already known from self-managed instance reporting?
- Can we use data to determine some of these values by percentages?
- Can we use a calculator that references repository details to estimate hardware needs?
- Can we use cloud marketplaces to set up deployments for a GitLab-certified ELK stack?
- Can hardware be generalized to sizes (XS, S, M, L, XL, XL+) with scaling?

### Method for researching

For each item we should determine the following to reach a conclusive answer:

- How are we testing this?
- What are the variables?
- What is the systematic approach to identifying these comparables?
- How do we set up the benchmarking process?
- What are the limitations on the variables? (For example, memory is available only in preset configurations at standard sizes.)
- How will we record the findings and make them shareable in a guide for customers? (See the measurement-snapshot sketch at the end of this description.)
- How can the conclusion be simplified?

## Reference Documentation

- [elasticsearch-sizing-and-capacity-planning.pdf](/uploads/e21f315177585c93ea6e748470367dca/elasticsearch-sizing-and-capacity-planning.pdf)
- https://gitlab.com/gitlab-org/gitlab/-/issues/1676

<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
*This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.*
<!-- triage-serverless v3 PLEASE DO NOT REMOVE THIS SECTION -->
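To make the shard-sizing and growth-projection items under "Desired Findings" concrete, the following is a minimal back-of-envelope sketch in Python. Every input (current data size, growth rate, target shard size, heap per node, shards-per-heap ratio) is a placeholder assumption, not a measured GitLab.com figure; the commonly cited rules of thumb used here (shards in roughly the 10–50 GB range, a ceiling on shards per GB of heap) are exactly the values this research should confirm or replace.

```python
"""Back-of-envelope Elasticsearch capacity estimate.

All numbers below are illustrative assumptions, not measured GitLab.com
values; replace them with figures produced by the benchmarking described
in "Method for researching".
"""

# --- Assumed inputs -------------------------------------------------------
primary_data_gb = 2_000          # current primary index size (assumption)
monthly_growth_rate = 0.10       # 10% document growth per month (assumption)
planning_horizon_months = 12     # how far ahead to size the cluster
replica_count = 1                # one replica per primary shard
target_shard_size_gb = 40        # commonly cited 10-50 GB range; to be validated
heap_per_data_node_gb = 30       # JVM heap per data node (assumption)
shards_per_gb_heap = 20          # rule-of-thumb ceiling; to be validated

# --- Growth projection ----------------------------------------------------
projected_primary_gb = primary_data_gb * (1 + monthly_growth_rate) ** planning_horizon_months
projected_total_gb = projected_primary_gb * (1 + replica_count)

# --- Shard and node counts ------------------------------------------------
primary_shards = -(-projected_primary_gb // target_shard_size_gb)   # ceiling division
total_shards = primary_shards * (1 + replica_count)
max_shards_per_node = heap_per_data_node_gb * shards_per_gb_heap
data_nodes_by_shard_count = -(-total_shards // max_shards_per_node)

print(f"Projected primary data: {projected_primary_gb:,.0f} GB")
print(f"Projected total data (with replicas): {projected_total_gb:,.0f} GB")
print(f"Primary shards needed: {primary_shards:.0f}")
print(f"Total shards (with replicas): {total_shards:.0f}")
print(f"Minimum data nodes by shard count: {data_nodes_by_shard_count:.0f}")
```

In practice, disk capacity, indexing throughput, and query latency each impose their own minimum node count, and the largest of those minimums wins; quantifying each of them is what the benchmarking work should produce.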
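For the "how will we record the findings" question under "Method for researching", a small script against the cluster's HTTP API can snapshot the measurements the research needs (per-shard sizes, shards per node) so repeated runs can be compared over time. This is a sketch only: the cluster URL is a placeholder assumption and the `requests` package is assumed to be available; the `_cat/shards` and `_cat/allocation` endpoints are standard Elasticsearch APIs.

```python
"""Snapshot per-shard and per-node figures from a running cluster.

ES_URL is a placeholder; point it at the cluster under test.
Requires the `requests` package.
"""
import csv
import requests

ES_URL = "http://localhost:9200"  # placeholder cluster address (assumption)

# One row per shard: index name, shard number, primary/replica, size in GB, node.
shards = requests.get(
    f"{ES_URL}/_cat/shards",
    params={"format": "json", "bytes": "gb", "h": "index,shard,prirep,store,node"},
    timeout=30,
).json()

# One row per data node: shard count and disk usage, for shards-per-node ratios.
allocation = requests.get(
    f"{ES_URL}/_cat/allocation",
    params={"format": "json", "bytes": "gb"},
    timeout=30,
).json()

# Write both snapshots to CSV so successive runs can be compared and shared.
for filename, rows in (("shards.csv", shards), ("allocation.csv", allocation)):
    if not rows:
        continue
    with open(filename, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)

# Quick summary: shard count and average shard size (unassigned shards excluded).
sizes = [float(row["store"]) for row in shards if row.get("store")]
print(f"shards: {len(sizes)}, avg size: {sum(sizes) / max(len(sizes), 1):.1f} GB")
```

Running this before and after each benchmarking iteration would give the raw data for the customer-facing sizing guide without depending on any particular monitoring stack.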