Possible race condition in gnutls_x509_trust_list_verify_crt2
Description of problem:
Possible race condition leading to memory corruption in trust_list_add_compat, called indirectly from gnutls_x509_trust_list_verify_crt2 (see below), when handling outgoing (client) TLS connections from multiple threads. Or possibly I'm holding GnuTLS wrong.
Version of gnutls used:
3.7.2
Distributor of gnutls (e.g., Ubuntu, Fedora, RHEL)
Arch Linux and compiled from source.
I'm experiencing a race condition leading to a memory corruption issue in dnsdist 1.7.0-alpha1 (developer here), when using GnuTLS 3.7.2 to handle outgoing (client) TLS connections from multiple threads, and I'm trying to understand whether I'm holding GnuTLS wrong or if this is an issue that needs to be fixed in GnuTLS itself.
Our design is that we create a single gnutls_certificate_credentials_t object while parsing the configuration, in this particular case calling gnutls_certificate_set_x509_system_trust to use the system CA store. PKCS11 support is enabled in this GnuTLS build, which will be important later.
Later, several worker threads each create several new TLS connections. Each gnutls_session_t is only ever accessed by one thread, but the gnutls_certificate_credentials_t is shared by all connections by calling gnutls_credentials_set with GNUTLS_CRD_CERTIFICATE. My understanding after reading the "Thread safety" and "gnutls_credentials_set" parts of the documentation is that it should be safe to do so, but perhaps I'm wrong and this is the root cause of my issue.
We also require certificate verification during the handshake (auto_verify_cb in the trace below).
We then see memory corruption in the certificate verification code when several handshakes are processed at the same time from different threads:
=================================================================
==82302==ERROR: AddressSanitizer: attempting double-free on 0x627000085100 in thread T19 (dnsdist/healthC):
    #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)
    #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8
    #2 0x7fc1b57c6b03 in trust_list_add_compat /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:310:3
    #3 0x7fc1b57c6b03 in gnutls_x509_trust_list_get_issuer /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:1165:10
    #4 0x7fc1b57c732b in gnutls_x509_trust_list_verify_crt2 /data/sources/gnutls-3.7.2/lib/x509/verify-high.c:1521:7
    #5 0x7fc1b5755208 in _gnutls_x509_cert_verify_peers /data/sources/gnutls-3.7.2/lib/cert-session.c:597:10
    #6 0x7fc1b57541c0 in auto_verify_cb /data/sources/gnutls-3.7.2/lib/auto-verify.c:40:9
    #7 0x7fc1b5719148 in _gnutls_run_verify_callback /data/sources/gnutls-3.7.2/lib/handshake.c:2972:10
    #8 0x7fc1b5719148 in _gnutls_run_verify_callback /data/sources/gnutls-3.7.2/lib/handshake.c:2938:5
    #9 0x7fc1b571156c in _gnutls13_handshake_client /data/sources/gnutls-3.7.2/lib/handshake-tls13.c:132:9
    #10 0x7fc1b571cf41 in handshake_client /data/sources/gnutls-3.7.2/lib/handshake.c:3012:10
    #11 0x7fc1b571cf41 in gnutls_handshake /data/sources/gnutls-3.7.2/lib/handshake.c:2855:10
    #12 0x5610db811868 in GnuTLSConnection::tryHandshake() /work/pdns/pdns/dnsdistdist/tcpiohandler.cc:1103:13
    #13 0x5610db81396b in GnuTLSConnection::tryWrite(std::vector<unsigned char, noinit_adaptor<std::allocator<unsigned char> > > const&, unsigned long&, unsigned long) /work/pdns/pdns/dnsdistdist/tcpiohandler.cc:1145:20
    #14 0x5610da8af7ab in TCPIOHandler::tryWrite(std::vector<unsigned char, noinit_adaptor<std::allocator<unsigned char> > > const&, unsigned long&, unsigned long) /work/pdns/pdns/dnsdistdist/./tcpiohandler.hh:402:22
    #15 0x5610da8a88b3 in healthCheckTCPCallback(int, boost::any&) /work/pdns/pdns/dnsdistdist/dnsdist-healthchecks.cc:261:37
    #16 0x5610db7d3bb4 in boost::function2<void, int, boost::any&>::operator()(int, boost::any&) const /usr/include/boost/function/function_template.hpp:763:14
    #17 0x5610db84be27 in EpollFDMultiplexer::run(timeval*, int) /work/pdns/pdns/dnsdistdist/epollmplexer.cc:193:9
    #18 0x5610da8a9f64 in handleQueuedHealthChecks(FDMultiplexer&, bool) /work/pdns/pdns/dnsdistdist/dnsdist-healthchecks.cc:451:23
    #19 0x5610db6d0ed9 in healthChecksThread() /work/pdns/pdns/dnsdistdist/dnsdist.cc:1907:5
    #20 0x7fc1b55433c3 in execute_native_thread_routine /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:82:18
    #21 0x7fc1b568f258 in start_thread (/usr/lib/libpthread.so.0+0x9258)
    #22 0x7fc1b522f5e2 in clone (/usr/lib/libc.so.6+0xfe5e2)

0x627000085100 is located 0 bytes inside of 14000-byte region [0x627000085100,0x6270000887b0)
freed by thread T6 (dnsdist/tcpClie) here:
    #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)
    #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8

previously allocated by thread T3 (dnsdist/tcpClie) here:
    #0 0x5610da6eada2 in realloc (/work/pdns/pdns/dnsdistdist/dnsdist+0x12fcda2)
    #1 0x7fc1b573ab14 in _gnutls_reallocarray_fast /data/sources/gnutls-3.7.2/lib/mem.c:63:8
We see that the certificate verification code is reallocating an array inside the cred's trust list in trust_list_add_compat, after being called by gnutls_x509_trust_list_get_issuer (frames #2 and #3 above).
That happens only if PKCS11 support is enabled and the trust list's pkcs11_token field is set.
The documentation for gnutls_x509_trust_list_get_issuer states that "the flag GNUTLS_TL_GET_COPY is required for this function to work with PKCS#11 trust lists in a thread-safe way", but gnutls_x509_trust_list_verify_crt2 does not set that flag.
Unfortunately that means that another thread might be trying to access the array at the same time, or even reallocating it, which leads to memory corruption (use-after-free).
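To make the suspected failure mode concrete, here is a minimal sketch in C. It is a simplified model and not GnuTLS code: cert_list, cert_list_add, worker and run_workers are hypothetical names, and the array of ints only stands in for the array that trust_list_add_compat grows. Without the lock, two threads could both read the same old pointer and hand it to realloc, which would match the double-free that AddressSanitizer reports; the sketch shows what serialising that read-realloc-store sequence would look like.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared list that grows on demand.
 * This is NOT the real GnuTLS trust list structure. */
struct cert_list {
    int *items;            /* stand-in for the certificate array */
    size_t size;           /* number of entries currently stored */
    pthread_mutex_t lock;  /* serialises every realloc of items */
};

int cert_list_init(struct cert_list *l)
{
    l->items = NULL;
    l->size = 0;
    return pthread_mutex_init(&l->lock, NULL);
}

/* Append one entry. The lock is held across the whole
 * read-pointer / realloc / store-pointer sequence, so no other
 * thread can observe or free the old pointer in between. */
int cert_list_add(struct cert_list *l, int value)
{
    int rc = -1;

    pthread_mutex_lock(&l->lock);
    int *grown = realloc(l->items, (l->size + 1) * sizeof(*grown));
    if (grown != NULL) {
        grown[l->size] = value;
        l->items = grown;
        l->size++;
        rc = 0;
    }
    pthread_mutex_unlock(&l->lock);
    return rc;
}

struct worker_arg {
    struct cert_list *list;
    int count;
};

/* Each worker plays the role of one thread running a handshake
 * that ends up appending to the shared list. */
void *worker(void *p)
{
    struct worker_arg *arg = p;

    for (int i = 0; i < arg->count; i++)
        (void)cert_list_add(arg->list, i);
    return NULL;
}

/* Start up to 16 workers sharing one list and return the number of
 * entries stored once all of them have finished. */
size_t run_workers(int nthreads, int per_thread)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    struct cert_list list;
    struct worker_arg arg;
    int started = 0;

    assert(nthreads > 0 && per_thread >= 0);
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    if (cert_list_init(&list) != 0)
        return 0;
    arg.list = &list;
    arg.count = per_thread;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    size_t total = list.size;
    free(list.items);
    pthread_mutex_destroy(&list.lock);
    return total;
}
```

Calling run_workers(4, 1000) runs four threads that each append 1000 entries to the same list. On the application side the equivalent would be a lock around the calls that can reach the verification code, which costs concurrency, so I would only consider it a stopgap.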
gnutls_x509_trust_list_get_issuer was not called on this path before commit e97a5f07, so this behaviour might have been introduced in 3.7.1.
Expected results:
No memory corruption.
I would welcome some help understanding whether I should be doing things differently in dnsdist in order to prevent this. Many thanks in advance :)