Investigate environment network blocks on GCP
Since yesterday (2020-09-15) the 1k environment has been failing randomly during its performance test run. Investigation so far points to an intermittent network issue.
On the surface, `haproxy` detects via its health checks that the `gitlab-rails` node is down and stops routing traffic accordingly, with the base error message `reason: Layer4 timeout, info: " at initial connection step of tcp-check"`. Behind the scenes, internal network connections between `haproxy` and the rails node (internal connections have no restrictions) appear to be blocked, with the two nodes completely unable to communicate with each other. `ping` reports 100% packet loss.
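To pin down when the block starts and ends, a small probe run from the `haproxy` box could record each change in reachability. The following is only a minimal sketch: it does a plain TCP connect, which is roughly what a Layer4 check amounts to but is not haproxy's own implementation, and the function names, interval and timeout values are illustrative assumptions.

```python
import socket
import time
from datetime import datetime, timezone
from typing import Tuple


def tcp_check(host: str, port: int, timeout: float = 2.0) -> Tuple[bool, str]:
    """Attempt a plain TCP connect and report whether it succeeded."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        # No answer within the timeout, comparable to a "Layer4 timeout".
        return False, "timeout"
    except OSError as exc:
        # Refused, unreachable, and similar socket-level failures.
        return False, f"error: {exc}"


def watch(host: str, port: int, interval: float = 5.0, rounds: int = 0) -> None:
    """Print a UTC-stamped line whenever reachability changes.

    rounds=0 means keep probing until interrupted.
    """
    last_state = None
    count = 0
    while rounds == 0 or count < rounds:
        ok, reason = tcp_check(host, port)
        if ok != last_state:
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            state = "UP" if ok else "DOWN"
            print(f"{stamp} {host}:{port} {state} ({reason})")
            last_state = ok
        count += 1
        if rounds == 0 or count < rounds:
            time.sleep(interval)
```

Calling something like `watch("10.0.0.10", 80)` (a hypothetical internal address and port for the rails node) during a test run would give UP/DOWN timestamps that can be lined up against the haproxy log and any GCP-side events.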
As a gut feel this looks like a block is being put in place, a change in GCP policy perhaps? Connectivity between the two boxes eventually returns after around 10 minutes. Other boxes in the environment appear to be fine, and at first glance none of the other environments are showing the behavior.
Today's test run on 1k shows the breakdown in communication happening at least twice:
* Environment: 1k
* Environment Version: 13.4.0-pre `0e9ed172118`
* Option: 60s_20rps
* Date: 2020-09-16
* Run Time: 1h 10m 31.16s (Start: 01:18:08 UTC, End: 02:28:39 UTC)
* GPT Version: v2.0.6
❯ Overall Results Score: 47.72%
NAME | RPS | RPS RESULT | TTFB AVG | TTFB P90 | REQ STATUS | RESULT
---------------------------------------------------------|------|--------------------|-----------|-----------------------|-----------------|--------
api_v4_groups | 20/s | 19.53/s (>16.00/s) | 44.81ms | 56.20ms (<500ms) | 100.00% (>95%) | Passed
api_v4_groups_group | 20/s | 4.77/s (>1.60/s) | 3733.04ms | 7233.80ms (<7500ms) | 100.00% (>95%) | Passed
api_v4_groups_group_subgroups | 20/s | 19.72/s (>16.00/s) | 45.47ms | 50.07ms (<1500ms) | 100.00% (>95%) | Passed
api_v4_groups_projects | 20/s | 13.9/s (>1.60/s) | 1251.40ms | 1636.01ms (<7000ms) | 100.00% (>95%) | Passed
api_v4_projects | 20/s | 19.72/s (>0.80/s) | 0.46ms | 0.51ms (<11000ms) | 0.00% (>95%) | FAILED²
api_v4_projects_deploy_keys | 20/s | 19.72/s (>16.00/s) | 0.49ms | 0.56ms (<500ms) | 0.00% (>95%) | FAILED²
api_v4_projects_issues | 20/s | 19.72/s (>9.60/s) | 0.46ms | 0.51ms (<2000ms) | 0.00% (>95%) | FAILED²
api_v4_projects_issues_issue | 20/s | 19.72/s (>9.60/s) | 0.49ms | 0.56ms (<3000ms) | 0.00% (>95%) | FAILED²
api_v4_projects_languages | 20/s | 19.72/s (>16.00/s) | 0.47ms | 0.52ms (<500ms) | 0.00% (>95%) | FAILED²
api_v4_projects_merge_requests | 20/s | 19.72/s (>9.60/s) | 0.45ms | 0.51ms (<2000ms) | 0.00% (>95%) | FAILED²
api_v4_projects_merge_requests_merge_request | 20/s | 19.72/s (>16.00/s) | 0.48ms | 0.55ms (<500ms) | 0.00% (>95%) | FAILED²
api_v4_projects_merge_requests_merge_request_changes | 20/s | 2.73/s (>1.60/s) | 5990.80ms | 8548.67ms (<12000ms) | 100.00% (>95%) | Passed
api_v4_projects_merge_requests_merge_request_commits | 20/s | 19.53/s (>16.00/s) | 42.59ms | 53.12ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_merge_requests_merge_request_discussions | 20/s | 19.35/s (>12.80/s) | 131.15ms | 174.18ms (<1500ms) | 100.00% (>95%) | Passed
api_v4_projects_pagination_keyset | 20/s | 5.98/s (>0.80/s) | 2961.53ms | 3695.12ms (<11000ms) | 100.00% (>95%) | Passed
api_v4_projects_project | 20/s | 19.58/s (>16.00/s) | 64.78ms | 70.24ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_project_pipelines | 20/s | 19.6/s (>16.00/s) | 34.96ms | 41.39ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_project_services | 20/s | 19.67/s (>16.00/s) | 26.62ms | 31.73ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_branches | 20/s | 12.37/s (>3.20/s) | 1393.03ms | 1636.95ms (<7500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_branches_branch | 20/s | 19.7/s (>16.00/s) | 43.31ms | 48.74ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_commits | 20/s | 19.68/s (>16.00/s) | 45.73ms | 55.99ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_commits_commit | 20/s | 18.93/s (>16.00/s) | 93.11ms | 89.35ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_commits_commit_diff | 20/s | 19.6/s (>16.00/s) | 87.98ms | 92.64ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_compare_commits | 20/s | 19.17/s (>16.00/s) | 121.62ms | 128.55ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_files_file | 20/s | 19.57/s (>16.00/s) | 85.59ms | 123.99ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_files_file_blame | 20/s | 2.37/s (>0.16/s) | 7131.60ms | 10314.93ms (<35000ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_files_file_raw | 20/s | 19.55/s (>16.00/s) | 92.12ms | 130.73ms (<500ms) | 100.00% (>95%) | Passed
api_v4_projects_repository_tree | 20/s | 19.62/s (>16.00/s) | 58.68ms | 65.44ms (<500ms) | 100.00% (>95%) | Passed
api_v4_search_global | 20/s | 19.28/s (>4.80/s) | 168.10ms | 265.04ms (<25000ms) | 100.00% (>9.5%) | Passed
api_v4_search_groups | 20/s | 19.15/s (>4.80/s) | 106.08ms | 171.34ms (<25000ms) | 100.00% (>9.5%) | Passed
api_v4_search_projects | 20/s | 19.2/s (>4.80/s) | 118.26ms | 185.76ms (<25000ms) | 100.00% (>9.5%) | Passed
api_v4_user | 20/s | 19.7/s (>16.00/s) | 20.14ms | 23.50ms (<500ms) | 100.00% (>95%) | Passed
api_v4_users | 20/s | 19.67/s (>16.00/s) | 35.03ms | 42.42ms (<500ms) | 100.00% (>95%) | Passed
git_ls_remote | 2/s | 1.98/s (>1.60/s) | 43.48ms | 37.78ms (<500ms) | 100.00% (>95%) | Passed
git_pull | 2/s | 2.03/s (>1.60/s) | 45.06ms | 48.21ms (<500ms) | 100.00% (>95%) | Passed
git_push | 2/s | 1.93/s (>1.60/s) | 354.15ms | 407.23ms (<600ms) | 100.00% (>95%) | Passed
scenario_api_new_branches | 1/s | 1.03/s (>0.16/s) | 456.96ms | 1042.10ms (<1500ms) | 100.00% (>95%) | Passed
scenario_api_new_commits | 1/s | 1.07/s (>0.16/s) | 425.37ms | 590.31ms (<700ms) | 100.00% (>95%) | Passed
scenario_api_new_group_variables | 1/s | 1.05/s (>0.16/s) | 53.04ms | 74.63ms (<500ms) | 100.00% (>95%) | Passed
scenario_api_new_issues | - | - | - | - | - | FAILED²
scenario_api_new_project_variables | - | - | - | - | - | FAILED²
web_group | 2/s | 1.98/s (>1.60/s) | 0.52ms | 0.57ms (<500ms) | 0.00% (>95%) | FAILED²
web_project | 2/s | 1.98/s (>1.60/s) | 0.58ms | 0.66ms (<750ms) | 0.00% (>95%) | FAILED²
web_project_branches | - | - | - | - | - | FAILED²
web_project_commit | - | - | - | - | - | FAILED²
web_project_commits | 2/s | 1.98/s (>1.60/s) | 0.50ms | 0.57ms (<750ms) | 0.00% (>95%) | FAILED²
web_project_file_blame | - | - | - | - | - | FAILED²
web_project_file_rendered | 2/s | 1.98/s (>0.02/s) | 0.51ms | 0.62ms (<30000ms) | 0.00% (>95%) | FAILED²
web_project_file_source | 2/s | 1.98/s (>0.16/s) | 0.48ms | 0.55ms (<5000ms) | 0.00% (>95%) | FAILED²
web_project_files | 2/s | 1.98/s (>1.20/s) | 0.57ms | 0.58ms (<1000ms) | 0.00% (>95%) | FAILED²
web_project_issue | - | - | - | - | - | FAILED²
web_project_issues | - | - | - | - | - | FAILED²
web_project_merge_request_changes | - | - | - | - | - | FAILED²
web_project_merge_request_commits | - | - | - | - | - | FAILED²
web_project_merge_request_discussions | - | - | - | - | - | FAILED²
web_project_merge_requests | - | - | - | - | - | FAILED²
web_project_pipelines | 2/s | 1.98/s (>0.96/s) | 0.48ms | 0.53ms (<1000ms) | 0.00% (>95%) | FAILED²
web_search_global | 2/s | 1.92/s (>1.60/s) | 0.45ms | 0.52ms (<1500ms) | 0.00% (>95%) | FAILED²
web_search_groups | - | - | - | - | - | FAILED²
web_search_projects | - | - | - | - | - | FAILED²
web_user | 2/s | 1.98/s (>0.96/s) | 0.47ms | 0.55ms (<4000ms) | 0.00% (>95%) | FAILED²
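The "at least twice" reading comes from the failures clustering into separate consecutive blocks in the table. As a rough sketch, and assuming the rows are listed in execution order, those blocks can be pulled out of the pasted results like this (the `failure_runs` helper is hypothetical and not part of GPT):

```python
from itertools import groupby
from typing import List, Tuple


def failure_runs(rows: List[str]) -> List[Tuple[str, str, int]]:
    """Return (first_test, last_test, count) for each consecutive run of failed rows."""
    parsed = []
    for row in rows:
        cells = [cell.strip() for cell in row.split("|")]
        # Skip the header, the dashed separator, and anything that is not a 7-column row.
        if len(cells) < 7 or cells[0] == "NAME" or set(cells[0]) <= {"-"}:
            continue
        parsed.append((cells[0], cells[-1].startswith("FAILED")))

    runs = []
    for failed, group in groupby(parsed, key=lambda item: item[1]):
        names = [name for name, _ in group]
        if failed:
            runs.append((names[0], names[-1], len(names)))
    return runs
```

Applied to the table above, this yields two runs: 7 rows from `api_v4_projects` to `api_v4_projects_merge_requests_merge_request`, and 22 rows from `scenario_api_new_issues` to `web_user`.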
Task is to investigate and fix.