updateref: Fix handling of context cancellation errors

Processing reference updates when the context gets cancelled is not guaranteed to return the cancellation error. Instead, the caller will often see I/O errors that are caused by us not being able to either read from or write to the underlying git-update-ref(1) process anymore. This race is visible in one of our flaky tests:

=== FAIL: internal/gitaly TestTransactionManager/propose_returns_if_transaction_processing_stops_before_transaction_acceptance (0.14s)
    transaction_manager_test.go:1553:
                Error Trace:        /builds/gitlab-org/gitaly/internal/gitaly/transaction_manager_test.go:1553
                                                        /builds/gitlab-org/gitaly/internal/gitaly/transaction_manager_test.go:1554
                Error:              Target error should be in err chain:
                                expected: "transaction processing stopped"
                                in chain: "verify references: verify references with git: prepare reference transaction: start: state update to \"start\" failed: EOF, stderr: \"\""
                                        "verify references with git: prepare reference transaction: start: state update to \"start\" failed: EOF, stderr: \"\""
                                        "prepare reference transaction: start: state update to \"start\" failed: EOF, stderr: \"\""
                                        "start: state update to \"start\" failed: EOF, stderr: \"\""
                                        "state update to \"start\" failed: EOF, stderr: \"\""
                                        "EOF"
                Test:               TestTransactionManager/propose_returns_if_transaction_processing_stops_before_transaction_acceptance

As you can see, we only get the EOF error without any standard error output at all. This is annoying because it makes the test flaky, and it also causes us to misclassify the actual error condition. Last but not least, it is quite unhelpful for the poor reader of the error, as they won't know what caused it in the first place.

Fix this bug by specifically handling context cancellation errors when handling I/O errors.

Part of #4740 (closed).

Edited by Patrick Steinhardt