Gitaly crashes

Gitaly crashed today (09/17) at 05:55:46 UTC with a segmentation violation, followed by a long stack trace of thousands of goroutines, starting with:

fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0x4c101b pc=0x4c101b]

goroutine 215065130 [running]:
runtime.throw(0xd4b229, 0x5)
        /usr/local/go/src/runtime/panic.go:617 +0x72 fp=0xc020736648 sp=0xc020736618 pc=0x42f362
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:397 +0x401 fp=0xc020736678 sp=0xc020736648 pc=0x4448d1
internal/poll.(*pollDesc).waitRead(...)
        /usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc00af1bbc0, 0xc00ea69000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b fp=0xc0207366d0 sp=0xc020736678 pc=0x4c101b
os.(*File).read(...)
        /usr/local/go/src/os/file_unix.go:263
os.(*File).Read(0xc031d2ca78, 0xc00ea69000, 0x1000, 0x1000, 0x44077b, 0xc00bc330e0, 0xc00e6dbf20)
        /usr/local/go/src/os/file.go:108 +0x70 fp=0xc020736740 sp=0xc0207366d0 pc=0x4c80f0
gitlab.com/gitlab-org/gitaly/internal/command.(*Command).Read(0xc0001b42d0, 0xc00ea69000, 0x1000, 0x1000, 0x4310bf, 0xdc6be8, 0xc000044f00)
        /var/cache/omnibus/src/gitaly/internal/command/command.go:101 +0x5a fp=0xc020736788 sp=0xc020736740 pc=0xa1929a
bufio.(*Reader).fill(0xc00583c240)
        /usr/local/go/src/bufio/bufio.go:100 +0x10f fp=0xc0207367d8 sp=0xc020736788 pc=0x554ccf
bufio.(*Reader).ReadSlice(0xc00583c240, 0xc01742fd0a, 0xc000ecd638, 0xc020736880, 0x42e981, 0xdc6d68, 0xc020736890)
        /usr/local/go/src/bufio/bufio.go:356 +0x3d fp=0xc020736820 sp=0xc0207367d8 pc=0x555a1d
bufio.(*Reader).ReadBytes(0xc00583c240, 0xc02073680a, 0xc020736950, 0x474aba, 0x158f5e0, 0xc016ccf180, 0x0)
        /usr/local/go/src/bufio/bufio.go:434 +0x70 fp=0xc0207368e0 sp=0xc020736820 pc=0x555ec0
bufio.(*Reader).ReadString(...)
        /usr/local/go/src/bufio/bufio.go:474
gitlab.com/gitlab-org/gitaly/internal/git/catfile.ParseObjectInfo(0xc00583c240, 0xc031d2ca70, 0xc020736a98, 0x1)
        /var/cache/omnibus/src/gitaly/internal/git/catfile/objectinfo.go:33 +0x49 fp=0xc0207369f0 sp=0xc0207368e0 pc=0xab6b09
gitlab.com/gitlab-org/gitaly/internal/git/catfile.(*batchProcess).reader(0xc00b4482d0, 0xc00ad93440, 0x28, 0xd4bf7d, 0x6, 0x0, 0x0, 0x0, 0x0)
        /var/cache/omnibus/src/gitaly/internal/git/catfile/batch.go:85 +0x26c fp=0xc020736ae8 sp=0xc0207369f0 pc=0xab3a8c
gitlab.com/gitlab-org/gitaly/internal/git/catfile.(*Batch).Commit(0xc00b4484e0, 0xc00ad93440, 0x28, 0xc018bff3b0, 0x0, 0x0, 0x31)
        /var/cache/omnibus/src/gitaly/internal/git/catfile/catfile.go:95 +0xce fp=0xc020736b40 sp=0xc020736ae8 pc=0xab5dee
gitlab.com/gitlab-org/gitaly/internal/git/log.GetCommitCatfile(0xc00b4484e0, 0xc019d0a090, 0x28, 0x28, 0xc019d0a090, 0x28)
        /var/cache/omnibus/src/gitaly/internal/git/log/commit.go:37 +0xb5 fp=0xc020736b90 sp=0xc020736b40 pc=0xab8965
gitlab.com/gitlab-org/gitaly/internal/service/ref.newFindLocalBranchesWriter.func1(0xc0258c6900, 0x14, 0x20, 0xc0208d7301, 0xc0208d7380)
        /var/cache/omnibus/src/gitaly/internal/service/ref/util.go:88 +0x14f fp=0xc020736c80 sp=0xc020736b90 pc=0xac670f

That was followed by a loop of hundreds of Gitaly restarts by the supervisor between 05:55:48 and 06:03:04, because Gitaly always immediately crashed again with another SIGSEGV:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xa9ee4f]

goroutine 151 [running]:
gitlab.com/gitlab-org/gitaly/internal/helper/housekeeping.fixDirectoryPermissions.func1(0xc003d1e230, 0x44, 0x0, 0x0, 0xe971c0, 0xc0015661b0, 0x10, 0xc52cc0)
        /var/cache/omnibus/src/gitaly/internal/helper/housekeeping/housekeeping.go:70 +0x2f
path/filepath.Walk(0xc003d1e230, 0x44, 0xc0065bc020, 0xd0f880, 0xc001566180)
        /usr/local/go/src/path/filepath/path.go:402 +0x6a
gitlab.com/gitlab-org/gitaly/internal/helper/housekeeping.fixDirectoryPermissions(0xc003d1e230, 0x44, 0xc001566180, 0xc003d1e230, 0x44)
        /var/cache/omnibus/src/gitaly/internal/helper/housekeeping/housekeeping.go:69 +0x6f
gitlab.com/gitlab-org/gitaly/internal/helper/housekeeping.FixDirectoryPermissions(...)
        /var/cache/omnibus/src/gitaly/internal/helper/housekeeping/housekeeping.go:63
gitlab.com/gitlab-org/gitaly/internal/tempdir.clean(0xc0001fe080, 0x31, 0x2, 0xc0001fe080)
        /var/cache/omnibus/src/gitaly/internal/tempdir/tempdir.go:142 +0x278
gitlab.com/gitlab-org/gitaly/internal/tempdir.StartCleaning.func1(0xc0005321e2, 0x7, 0xc0005321f3, 0x25)
        /var/cache/omnibus/src/gitaly/internal/tempdir/tempdir.go:102 +0xdf
created by gitlab.com/gitlab-org/gitaly/internal/tempdir.StartCleaning
        /var/cache/omnibus/src/gitaly/internal/tempdir/tempdir.go:100 +0x91
{"gitaly":2886,"level":"warning","msg":"forwarding signal","signal":17,"time":"2019-09-17T05:55:49Z","wrapper":2879}
{"error":"os: process already finished","gitaly":2886,"level":"error","msg":"can't forward the signal","signal":17,"time":"2019-09-17T05:55:49Z","wrapper":2879}
{"gitaly":2886,"level":"error","msg":"wrapper for gitaly shutting down","time":"2019-09-17T05:55:49Z","wrapper":2879}

One thing that stood out on file-33 is that /tmp had wrong permissions (gitlab-com/gl-infra/production#1159 (closed)), but I'm not sure yet whether that is related to the Gitaly crashes.