Validate topology service performance under production load conditions

Overview

The topology service is a critical component of the Cells infrastructure that needs to handle production-scale load and support future growth. Before deploying to production, we need to validate that the service can meet performance requirements under realistic load conditions.

This issue tracks the execution of comprehensive load testing following GitLab's performance guidelines to ensure production readiness.

Related Epic: &4 (closed)

Goal

Validate that the topology service can handle production load and scale to support growth by:

Executing load tests following GitLab performance testing guidelines
Establishing baseline performance metrics (throughput, latency, resource utilization)
Identifying performance bottlenecks and capacity limits
Documenting production readiness findings

Proposal

1. Define test scenarios based on expected production usage patterns:
   - Normal load conditions
   - Peak load scenarios
   - Growth projections (e.g., 2x, 5x current expected load)
2. Set up load testing environment following GitLab guidelines:
   - Configure realistic test data
   - Set up monitoring and observability
   - Prepare load generation tools
3. Execute load tests measuring:
   - Request throughput (requests/second)
   - Response latency (p50, p95, p99)
   - Resource utilization (CPU, memory, I/O)
   - Error rates under load
4. Analyze results and document:
   - Performance baselines
   - Identified bottlenecks
   - Capacity limits
   - Recommendations for optimization or scaling
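
As a rough illustration of how the measurements in the proposal could be reduced to comparable numbers, here is a minimal Python sketch. It assumes each load test run yields a list of per-request samples, and that growth scenarios are expressed as multiples of a baseline request rate. The `Sample` record, the `summarize` and `scenario_rates` helpers, and the nearest-rank percentile method are all hypothetical choices made for this example. They are not part of the topology service or of any GitLab tooling.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One request observed during a load test run (hypothetical record shape)."""
    latency_ms: float
    ok: bool


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(samples, duration_s):
    """Reduce raw samples to throughput, latency percentiles, and error rate."""
    if not samples or duration_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    latencies = sorted(s.latency_ms for s in samples)
    errors = sum(1 for s in samples if not s.ok)
    return {
        "throughput_rps": len(samples) / duration_s,
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
        "error_rate": errors / len(samples),
    }


def scenario_rates(baseline_rps, multipliers=(1, 2, 5)):
    """Target request rates for the baseline and growth scenarios."""
    return {f"{m}x": baseline_rps * m for m in multipliers}
```

Producing one `summarize` result per scenario (normal, peak, and each growth multiplier) would give a like-for-like table of baselines. Resource utilization (CPU, memory, I/O) is not covered here and would come from the monitoring set up in step 2.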

Exit Criteria

Load test scenarios defined based on production requirements
Load testing environment configured with monitoring
Load tests executed for normal, peak, and growth scenarios
Performance metrics documented (throughput, latency, resource usage)
Bottlenecks and capacity limits identified and documented
Production readiness assessment completed with recommendations

References

GitLab Performance Guidelines
Epic: &4 (closed)
