
fix(notifications): retrying sink does not stop after threshold is reached

The retryingSink in the notifications system does not stop after reaching the threshold; instead, it keeps trying to connect. As a result, the goroutine leaks because the retry loop never exits.

ERRO[1043] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1043] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1044] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1044] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1045] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1045] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1046] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1046] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1047] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1047] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1048] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1048] httpSink{http://registry.test:3333/} encountered too many errors, backing off
ERRO[1049] retryingsink: error writing events: httpSink{http://registry.test:3333/}: error posting: Post "http://registry.test:3333/": dial tcp 172.16.123.1:3333: connect: connection refused, retrying
WARN[1049] httpSink{http://registry.test:3333/} encountered too many errors, backing off

Solution

We have introduced a maxretries parameter to the notifications section via !1606 (merged). To fix this issue, users will need to wait for the new release of the registry and the updated versions of the Linux and Helm Charts installations.

The MR also deprecates threshold.


Edited by Jaime Martinez