dedicated-test: Improve SSH reconnection and error handling

Vitaliy Kukharik requested to merge ssh-retry-error-handling into master

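Fixes: https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/jobs/6183937136

In that job, all 60 reconnection attempts were used up, yet the job was still reported as successful: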

WARNING: Connection terminated, reconnection attempt... (54 from 60).
WARNING: Connection terminated, reconnection attempt... (55 from 60).
WARNING: Connection terminated, reconnection attempt... (56 from 60).
WARNING: Connection terminated, reconnection attempt... (57 from 60).
WARNING: Connection terminated, reconnection attempt... (58 from 60).
WARNING: Connection terminated, reconnection attempt... (59 from 60).
WARNING: Connection terminated, reconnection attempt... (60 from 60).
$ if [[ $? -eq 0 ]]; then # collapsed multi-line command

Uploading artifacts for successful job
00:01
Uploading artifacts...
pgbench_test.passed: found 1 matching artifact files and directories 
Uploading artifacts as "archive" to coordinator... 201 Created  id=6183937136 responseStatus=201 Created token=glcbt-65

Cleaning up project directory and file based variables
00:01
Job succeeded

Changes

  1. Exit with an error once all SSH reconnection attempts have been used.

Example: https://gitlab.com/postgres-ai/postgresql-consulting/tests-and-benchmarks/-/jobs/6192707426

WARNING: Connection terminated, reconnection attempt... (26 from 30).
ssh: connect to host 34.89.229.60 port 22: Connection timed out
ssh: connect to host 34.89.229.60 port 22: Connection timed out
WARNING: Connection terminated, reconnection attempt... (27 from 30).
ssh: connect to host 34.89.229.60 port 22: Connection timed out
ssh: connect to host 34.89.229.60 port 22: Connection timed out
WARNING: Connection terminated, reconnection attempt... (28 from 30).
ssh: connect to host 34.89.229.60 port 22: Connection timed out
ssh: connect to host 34.89.229.60 port 22: Connection timed out
WARNING: Connection terminated, reconnection attempt... (29 from 30).
ssh: connect to host 34.89.229.60 port 22: Connection timed out
ssh: connect to host 34.89.229.60 port 22: Connection timed out
WARNING: Connection terminated, reconnection attempt... (30 from 30).
ERROR: Maximum number of reconnection attempts (30) reached. Exiting...

Uploading artifacts for failed job
00:00
Uploading artifacts...
WARNING: pgbench_test.passed: no matching files. Ensure that the artifact path is relative to the working directory (/builds/postgres-ai/postgresql-consulting/tests-and-benchmarks) 
ERROR: No files to upload                          

Cleaning up project directory and file based variables
00:01
ERROR: Job failed: exit code 1
  2. Avoid exiting immediately when the ssh command fails.
  • Use all attempts (the max_attempts variable) to retry the SSH connection.
  3. Add the ConnectTimeout=10 option to the ssh_command function.
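
The three changes above can be illustrated with a minimal sketch. This is not the actual diff from this merge request: only `max_attempts`, `ssh_command`, and `ConnectTimeout=10` come from the description, while `SSH_HOST`, `retry_delay`, and `run_with_reconnect` are hypothetical names introduced for illustration. The sketch also assumes that any non-zero exit status from `ssh_command` should trigger a retry.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not the code changed in this merge request.

max_attempts=${max_attempts:-30}   # retry budget (30 in the example job log)
retry_delay=${retry_delay:-10}     # assumed pause between attempts, in seconds

# Run a remote command over SSH. ConnectTimeout=10 bounds each connection
# attempt so an unreachable host fails fast instead of hanging.
# SSH_HOST is an assumed variable holding the target host.
ssh_command() {
  ssh -o ConnectTimeout=10 "${SSH_HOST}" "$@"
}

# Retry ssh_command until it succeeds or max_attempts is used up.
# Returns 1 (instead of silently continuing) when the budget is exhausted.
run_with_reconnect() {
  local attempt=0
  until ssh_command "$@"; do
    attempt=$((attempt + 1))
    echo "WARNING: Connection terminated, reconnection attempt... (${attempt} from ${max_attempts})."
    if (( attempt >= max_attempts )); then
      echo "ERROR: Maximum number of reconnection attempts (${max_attempts}) reached. Exiting..."
      return 1
    fi
    sleep "${retry_delay}"
  done
}
```

A caller would then fail the job explicitly, for example `run_with_reconnect 'echo ok' || exit 1`, so the CI job ends with a non-zero exit code rather than continuing to the success check.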
Edited by Vitaliy Kukharik
