Investigate degradation in GPT AI test results with higher RPS
Summary
Degradation might be caused by the fact that all environments currently use the same machine size for the AI mock backend - ai_gateway_machine_type = "n1-highcpu-2"
The aim of this issue is to debug this further and identify what needs to be done to align results across environments of various sizes. If improving the specs of the AI backend doesn't improve performance, a new performance issue should be raised for the product following the template.
Details
Performance tests for AI features are failing under higher load on bigger environments. For example, below are the 10k / 200 RPS results:
NAME | RPS | RPS RESULT | TTFB AVG | TTFB P90 | REQ STATUS | RESULT
----------------------------------------------|-------|---------------------|-----------|---------------------|----------------|-------
api_v4_code_suggestions_completions | 200/s | 100.32/s (>48.00/s) | 1841.46ms | 4271.04ms (<8000ms) | 100.00% (>99%) | Passed
api_v4_code_suggestions_completions_streaming | 200/s | 101.04/s (>48.00/s) | 1826.49ms | 4236.84ms (<8000ms) | 100.00% (>99%) | Passed
api_v4_code_suggestions_generations | 200/s | 101.74/s (>48.00/s) | 1823.70ms | 5540.01ms (<8000ms) | 100.00% (>99%) | Passed
whereas the 1k / 20 RPS results show significantly lower latencies:
NAME | RPS | RPS RESULT | TTFB AVG | TTFB P90 | REQ STATUS | RESULT
----------------------------------------------|------|--------------------|-----------|--------------------|----------------|---------
api_v4_code_suggestions_completions | 20/s | 19.46/s (>16.00/s) | 84.84ms | 80.84ms (<500ms) | 100.00% (>99%) | Passed
api_v4_code_suggestions_completions_streaming | 20/s | 19.93/s (>16.00/s) | 76.92ms | 79.32ms (<500ms) | 100.00% (>99%) | Passed
api_v4_code_suggestions_generations | 20/s | 19.86/s (>16.00/s) | 85.14ms | 87.66ms (<500ms) | 100.00% (>99%) | Passed
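To quantify the gap between the two environments, the achieved RPS can be compared against each target. A minimal sketch using the figures from the tables above (environment labels are illustrative):

```python
# Compare achieved RPS against the target for each environment.
# Figures are taken from the result tables above.
results = {
    "10k/200rps": {"target": 200, "achieved": 100.32, "ttfb_p90_ms": 4271.04},
    "1k/20rps": {"target": 20, "achieved": 19.46, "ttfb_p90_ms": 80.84},
}

for env, r in results.items():
    ratio = r["achieved"] / r["target"]
    print(f"{env}: {ratio:.0%} of target RPS, TTFB P90 {r['ttfb_p90_ms']}ms")
```

The 10k environment reaches only about half of its target RPS with a TTFB P90 roughly 50x higher, while the 1k environment is within a few percent of its target - consistent with the mock backend, not the application, being the bottleneck.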
This might be caused by the fact that all environments currently use the same machine size for the AI mock backend - ai_gateway_machine_type = "n1-highcpu-2"
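If the fixed machine size is confirmed as the bottleneck, one possible fix is to scale the mock backend's machine type with the environment size. A minimal sketch of such an override, assuming GitLab Environment Toolkit-style Terraform variables (the module layout and the per-environment size shown are illustrative assumptions, not validated recommendations):

```hcl
# environments/10k/main.tf - illustrative sketch only.
# Sizes and module structure are assumptions to show the idea.
module "gitlab_ref_arch_gcp" {
  source = "../../modules/gitlab_ref_arch_gcp"

  # Scale the AI mock backend with the environment instead of the
  # fixed "n1-highcpu-2" currently used everywhere.
  ai_gateway_machine_type = "n1-highcpu-8"
}
```

Before any resizing, it would be worth confirming CPU saturation on the mock backend during a 200 RPS run to verify this hypothesis.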