
Fix gtm_tls_impl.c regression (in a7d6551e) related to ECONNRESET handling

Narayanan Iyer requested to merge nars1/YDBEncrypt:openssl3 into master

Background

  • As part of a prior commit (a7d6551e), the SSL_ERROR_SYSCALL case of the ssl_error() function was changed to no longer set tls_errno to ECONNRESET when errno is 0. The reasoning was that OpenSSL 3.0 ensures errno is never 0 in that case.

  • But as part of that change, the assignment tls_errno = errno; was also moved into code that executes only for pre-OpenSSL-3.0 versions.

  • This turned out to be incorrect, because it is still possible for OpenSSL 3.0 to return an error code of SSL_ERROR_SYSCALL. All we are guaranteed is that errno is not 0 in that case. If a connection is reset and a system call fails because of that, one sees an errno of ECONNRESET with OpenSSL 3.0, whereas one sees an errno of 0 with pre-OpenSSL-3.0 versions.

  • This scenario was observed in various replication tests in the YDBTest suite when they ran with TLS randomly enabled and shut the receiver server down while keeping the source server running. In that case, the source server notices an error during the send() system call and reaches ssl_error() with an error code of SSL_ERROR_SYSCALL and errno set to ECONNRESET. Since we incorrectly did nothing in that case for OpenSSL 3.0, tls_errno did not get set to ECONNRESET. That caused the replication source server logic (which invokes this reference implementation of the encryption plugin in gtm_tls_impl.c) to think no error had occurred, so it retried the send indefinitely. The source server ended up in a spin-loop that used a full CPU and produced an ever-growing log file with messages of the following form.

    Sat Apr  9 10:01:52 2022 : Returning err: 0

Fix

  • The assignment tls_errno = errno; is now done for all OpenSSL versions (older than 3.0 as well as 3.0 and greater). Only the check for 0 == tls_errno (and the accompanying reset of tls_errno to ECONNRESET) remains restricted to OpenSSL versions older than 3.0.
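
To make the difference concrete, here is a minimal sketch of the SSL_ERROR_SYSCALL handling before and after this fix. It is not the actual code in gtm_tls_impl.c: the helper names tls_errno_regressed() and tls_errno_fixed() are hypothetical, the runtime parameter openssl_major stands in for what is presumably a compile-time OpenSSL version check in the real source, and the initial tls_errno value of 0 in the regressed variant is an assumption inferred from the "Returning err: 0" log line.

```c
#include <errno.h>

/* Hypothetical illustration only; not the real ssl_error() in gtm_tls_impl.c.
 * saved_errno   : the errno value seen when SSL_ERROR_SYSCALL was reported.
 * openssl_major : stand-in for the (assumed compile-time) OpenSSL version check.
 */

/* Behavior after a7d6551e: both the assignment and the 0 check run only
 * for pre-3.0 versions, so OpenSSL 3.0 never records the error.
 */
static int tls_errno_regressed(int saved_errno, int openssl_major)
{
	int	tls_errno = 0;	/* assumption: no error recorded yet */

	if (3 > openssl_major)
	{
		tls_errno = saved_errno;
		if (0 == tls_errno)
			tls_errno = ECONNRESET;
	}
	return tls_errno;
}

/* Behavior after this fix: the assignment is unconditional; only the
 * 0 == tls_errno fallback to ECONNRESET stays restricted to pre-3.0 versions.
 */
static int tls_errno_fixed(int saved_errno, int openssl_major)
{
	int	tls_errno = saved_errno;

	if ((3 > openssl_major) && (0 == tls_errno))
		tls_errno = ECONNRESET;
	return tls_errno;
}
```

With these sketches, a connection reset under OpenSSL 3.0 (saved_errno of ECONNRESET, openssl_major of 3) yields 0 from the regressed variant, which matches the "Returning err: 0" symptom, and ECONNRESET from the fixed variant.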
