Skip to content

structerr: Fix test flake caused by async logging

Patrick Steinhardt requested to merge pks-structerr-fix-async-logging-flake into master

The default server setup for Gitaly uses the PropagationMessageProducer to generate the log messages. This type is responsible for writing the log message via logrus after we have seen grpc/stats.End, which is propagated by the gRPC framework after we have sent the response to the client. This is done so that we can count all request and response bytes for a specific RPC call and attach it to the log message.

The downside of this framework is that grpc/stats.End is propagated only after we have sent the error to the client. As a result it can happen that we have already observed the error in our tests, but the log message wasn't yet generated. This causes tests for the structerr's fields producer to be quite flaky as we frequently don't see the log message at all.

Fix this by converting the test setup to use the DefaultMessageProducer directly instead of using the PropagationMessageProducer. Like this we directly log the error message when we see the error as we don't wait for grpc/stats.End first, so we can be sure that the log message has been generated when we observe the client-side error.

Note that in production systems this works as intended. While the log message may get deferred a bit, we will ultimately see it and produce the expected output.

Merge request reports