praefect/server: Fix Goroutine leak in readiness test

One of our tests for the Praefect server's readiness checks verifies that the readiness check fails when one of the Gitaly nodes is not reachable. The test setup uses a random DNS name that we hope is unresolvable, which is fragile in itself. On some systems it also causes a Goroutine leak when Go resolves DNS addresses via Cgo:

panic: goroutines running: found unexpected goroutines:
[Goroutine 132 in state syscall, with net._C2func_getaddrinfo on top of the stack:
goroutine 132 [syscall]:
net._C2func_getaddrinfo(0xc004520da0, 0x0, 0xc004529a40, 0xc000902b10)
        _cgo_gotypes.go:94 +0x56
net.cgoLookupIPCNAME.func1({0xc004520da0, 0x200?, 0x22?}, 0xc00451c96f?, 0x636dfb?)
        /nix/store/ixhyjwl4qwdy0dr9k6c1zhphjr1b00gn-go-1.19.6/share/go/src/net/cgo_unix.go:160 +0x9f
net.cgoLookupIPCNAME({0x192d1ba, 0x3}, {0xc00451c96f, 0xc})
        /nix/store/ixhyjwl4qwdy0dr9k6c1zhphjr1b00gn-go-1.19.6/share/go/src/net/cgo_unix.go:160 +0x173
net.cgoIPLookup(0x3478a50?, {0x192d1ba?, 0xc004520c80?}, {0xc00451c96f?, 0xc000662000?})
        /nix/store/ixhyjwl4qwdy0dr9k6c1zhphjr1b00gn-go-1.19.6/share/go/src/net/cgo_unix.go:217 +0x3b
created by net.cgoLookupIP
        /nix/store/ixhyjwl4qwdy0dr9k6c1zhphjr1b00gn-go-1.19.6/share/go/src/net/cgo_unix.go:227 +0x12a
]

goroutine 1 [running]:
gitlab.com/gitlab-org/gitaly/v15/internal/testhelper.mustHaveNoGoroutines()
        /home/pks/Development/gitlab/gdk/gitaly/internal/testhelper/leakage.go:40 +0x3d6
gitlab.com/gitlab-org/gitaly/v15/internal/testhelper.Run.func1({0x0, 0x0, 0x0?}, 0x19c8f65?)
        /home/pks/Development/gitlab/gdk/gitaly/internal/testhelper/configure.go:74 +0x295
gitlab.com/gitlab-org/gitaly/v15/internal/testhelper.Run(0xffffffffffffffff?, {0x0?, 0x442b65?, 0xc00008c718?})
        /home/pks/Development/gitlab/gdk/gitaly/internal/testhelper/configure.go:75 +0x31
gitlab.com/gitlab-org/gitaly/v15/internal/praefect/service/server_test.TestMain(...)
        /home/pks/Development/gitlab/gdk/gitaly/internal/praefect/service/server/testhelper_test.go:10
main.main()
        _testmain.go:53 +0x1db
FAIL	gitlab.com/gitlab-org/gitaly/v15/internal/praefect/service/server	2.326s

So it seems that getaddrinfo(3P) does not return before the test's Goroutine leak check runs, which leaves the resolver Goroutine stuck in the syscall.

Fix this by instead creating a temporary localhost listener that binds to a random TCP port. The listener is then immediately closed and its address is used as the Gitaly server address. This way we know the address is resolvable, but since nothing listens on it anymore it will not be reachable.

Closes #4551 (closed)

Edited by Will Chandler (ex-GitLab)