git: Fix race between cancelling git-version(1) and reading its output
When detecting the currently running Git version we spawn the command, read its output and then wait for it to finish. This behaviour is racy with context cancellation though:
Goroutine 1 Goroutine 2 git-version
1. We spawn git-version(1).
2. We start reading from the
pipe.
3. The context gets cancelled.
4. The command package sends
SIGTERM to the command.
5. The command package calls
`Wait()` on the command and
thus closes its stdout pipe.
6. git-version tries to
print the version.
As the stdout pipe has already been closed at this point in time, Git will fail to print the Git version. It does not have an error check for the printf(3P) statement though and thus exits successfully.
7. The async signal gets
delivered, but Git has
already finished and
exited successfully.
8. We finish reading the
now-empty Git version.
9. We wait for the command
to finish and observe a
successful exit code.
Even though this race seems unlikely, we frequently hit it in the transaction manager:
=== FAIL: internal/gitaly TestTransactionManager/propose_returns_if_transaction_processing_stops_before_transaction_acceptance (0.13s)
transaction_manager_test.go:1553:
Error Trace: /builds/gitlab-org/gitaly/internal/gitaly/transaction_manager_test.go:1553
/builds/gitlab-org/gitaly/internal/gitaly/transaction_manager_test.go:1554
Error: Target error should be in err chain:
expected: "transaction processing stopped"
in chain: "verify references: resolve revision: getting Git version: invalid version format: \"\""
"resolve revision: getting Git version: invalid version format: \"\""
"getting Git version: invalid version format: \"\""
"invalid version format: \"\""
Test: TestTransactionManager/propose_returns_if_transaction_processing_stops_before_transaction_acceptance
Fix the issue by providing a buffer as stdout to git-version(1). Like
this, the os/exec
package will not close stdout when we call Wait()
on the command, which should fix the race as Git will now either have
printed something to the buffer, or it will have exited with an error
code.
Note that I have not been able to come up with a test that reproduced the issue on my machine. I thus refrained from adding one.
Changelog: fixed
Fixes #4740 (closed).