Connections to the database timed out through the Envoy port
Summary
In a SG cluster suddenly connections to the database using the service start to throw connections errors
$ psql -U pguser -d mydb -h 10.10.10.10
psql: error: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
Postgres and pgbouncer services were working as expected:
pgbouncer=# show databases;
name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections | paused | disabled
-----------+------+------+-----------+------------+-----------+--------------+-----------+-----------------+---------------------+--------+----------
pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0 | 0 | 0
(1 row)
Connection to the database was tested through the pod using pgbouncer port 6432 and it works normally. Then the connection through envoy port 7432 was not working.
These are the logs messages from envoy:
[2021-03-24 10:56:06.574][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:11.574][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:16.575][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:21.577][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:26.577][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:30.893][116][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:236] [C298064] new tcp proxy session
[2021-03-24 10:56:30.893][116][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:380] [C298064] Creating connection to cluster postgres_cluster
[2021-03-24 10:56:30.893][116][debug][pool] [source/common/tcp/original_conn_pool.cc:98] creating a new connection
[2021-03-24 10:56:30.894][116][debug][pool] [source/common/tcp/original_conn_pool.cc:383] [C298065] connecting
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:753] [C298065] connecting to 127.0.0.1:6432
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:769] [C298065] connection in progress
[2021-03-24 10:56:30.894][116][debug][pool] [source/common/tcp/original_conn_pool.cc:125] queueing request due to no available connections
[2021-03-24 10:56:30.894][116][debug][conn_handler] [source/server/connection_handler_impl.cc:422] [C298064] new connection
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:616] [C298065] connected
[2021-03-24 10:56:30.894][116][debug][pool] [source/common/tcp/original_conn_pool.cc:303] [C298065] assigning connection
[2021-03-24 10:56:30.894][116][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:631] TCP:onUpstreamEvent(), requestedServerName:
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:584] [C298065] remote close
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:208] [C298065] closing socket: 0
[2021-03-24 10:56:30.894][116][debug][pool] [source/common/tcp/original_conn_pool.cc:140] [C298065] client disconnected
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:112] [C298062] closing data_to_write=0 type=0
[2021-03-24 10:56:30.894][116][debug][connection] [source/common/network/connection_impl.cc:208] [C298062] closing socket: 1
[2021-03-24 10:56:30.894][116][debug][conn_handler] [source/server/connection_handler_impl.cc:111] [C298062] adding to cleanup list
[2021-03-24 10:56:30.894][116][debug][pool] [source/common/tcp/original_conn_pool.cc:255] [C298065] connection destroyed
[2021-03-24 10:56:31.579][75][debug][main] [source/server/server.cc:189] flushing stats
[2021-03-24 10:56:33.995][75][debug][http] [source/common/http/conn_manager_impl.cc:261] [C0] new stream
[2021-03-24 10:56:33.995][75][debug][http] [source/common/http/conn_manager_impl.cc:808] [C0][S14922136635809977346] request headers complete (end_stream=true):
as you can see in the pgbouncer pool there weren't any connections to the database and the envoy message says
queueing request due to no available connections
Then connection using 7433 was working normally.
Possible Solution
The solution to solve the problem was to restart the envoy service and then the service started to work normally.
Environment
- StackGres version:
0.9.4
- Kubernetes version: 1.17
- Cloud provider or hardware configuration: GKE
Edited by Kadaffy Talavera