10,000 user reference architecture
GitLab would like to build a 10,000 user reference architecture. This will enable us to confidently share an architecture with customers that is:
- based on knowledge of running GitLab.com
- based on knowledge of working with existing large enterprise customers
- load tested for approximate real-world load expected with 10,000 users
- highly-available through the entire stack (to the extent currently possible)
It's important to note that this architecture won't be sufficient for all customers with 10,000 users - some have unique use cases, particularly large projects or a higher average number of active users. But it will be a solid starting point and 'reference'(!) to work from.
Things to think about:
- How will we validate this architecture?
- Where should it be documented? Probably in the HA documentation. How, and in what format, is most appropriate? Recent changes to the HA docs might facilitate this nicely, with sections for each type of scaled/HA architecture rolling out soon. See Basic Scaling format.
- It should probably include Prometheus and Grafana along with some useful dashboards to ensure Support have insight when problems occur.
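On the validation question, one possible starting point is to turn the user count into a target request rate for the load test. The sketch below is only an illustration: the function name `target_rps`, the active-user fraction, and the per-user request rate are all hypothetical placeholders, not figures from GitLab.com or from any customer, and would need to be replaced with measured values.

```python
def target_rps(total_users: int,
               active_fraction: float,
               requests_per_active_user_per_min: float) -> float:
    """Estimate the sustained requests per second a load test should generate.

    All three inputs are assumptions supplied by the caller; nothing here
    is derived from real GitLab traffic data.
    """
    if total_users < 0:
        raise ValueError("total_users must not be negative")
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    if requests_per_active_user_per_min < 0:
        raise ValueError("requests_per_active_user_per_min must not be negative")

    # Users assumed to be concurrently active during the test window.
    active_users = total_users * active_fraction
    # Convert the per-minute rate to a per-second rate.
    return active_users * requests_per_active_user_per_min / 60.0


if __name__ == "__main__":
    # Placeholder inputs only: 25% of 10,000 users active, 6 requests/min each.
    print(f"target load: {target_rps(10_000, 0.25, 6.0):.1f} requests/second")
```

Running the same calculation with a few different active-user fractions gives a range of load levels to test against, which may help show where the architecture stops being sufficient.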
@dawsmith @Finotto @glopezfernandez Can you please nominate someone from your team(s) to share GitLab.com infra knowledge and experience with us?
cc/ @lbot FYI
- Draft specifications for the 10k reference architecture (starting point) - DONE #1513 (comment 148751392)
- Build a POC environment where we can do some initial testing (DO for now. Hand built. Will likely move to QA and different environment later) - DONE #1513 (comment 160013945)
- Configure a load balancer
- Configure separate Sidekiq nodes (Sidekiq Cluster) - See #1513 (comment 162950214). Currently, we can't support this as intended.
- Configure Prometheus metrics so we can get some baseline metrics during load testing
- Work with @andrewn to do some initial load testing
- Tweak specifications, if necessary.
- Update documentation or create new issues to address documentation shortfalls
- Turn over reference architecture to Quality/Document the reference architecture/determine next steps
Edited by Drew Blessing