Reduce memory allocations in diff parser (!3576) · Merge requests · GitLab.org / gitaly

Jacob Vosmaer requested to merge jv-diff-parser-memory into master Jun 09, 2021

I was looking at the continuous profile for Gitaly, in part because of https://gitlab.com/gitlab-com/gl-infra/production/-/issues/4830, and I noticed this profile:

The profile suggests that we allocate a lot of memory for diff parsing. I added a benchmark, which gives the following baseline: about 608MB across 1.3M allocations per operation.

% go test -bench=. -benchmem -memprofile mem.out
goos: linux
goarch: amd64
pkg: gitlab.com/gitlab-org/gitaly/v14/internal/gitaly/diff
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkParser/parse-8         	       4	 254846574 ns/op	608224402 B/op	 1338066 allocs/op
PASS
ok  	gitlab.com/gitlab-org/gitaly/v14/internal/gitaly/diff	2.267s

With the allocation-preventing changes in this MR, those numbers drop to 10.5MB across 178K allocations per operation, and the time per operation falls from about 255ms to about 146ms.

% go test -bench=. -benchmem -memprofile mem.out 
goos: linux
goarch: amd64
pkg: gitlab.com/gitlab-org/gitaly/v14/internal/gitaly/diff
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkParser/parse-8         	       7	 146350076 ns/op	10457980 B/op	  178041 allocs/op
PASS
ok  	gitlab.com/gitlab-org/gitaly/v14/internal/gitaly/diff	1.380s

Reusing memory carries a higher risk of bugs, so this MR only targets the function that the profile highlighted: consumeChunkLine. The RPC that uses this parser loops over the diffs one at a time and sends them off as gRPC messages, meaning the bytes in parser.Diff() get copied after each call to parser.Parse(). Reusing that memory on the next call should therefore be safe for this caller.

Edited Jun 10, 2021 by Jacob Vosmaer
