Zeger-Jan van de Weg authored
FindCommit is called very often, to such an extent that it is currently a problem to have all these requests go through Gitaly. This made the Gitaly team invest a lot of time in client-side N + 1 problems. Even though this was fruitful, the optimizations weren't enough to bring the number of RPC/s down to a level where the RPC could be called 100% of the time.

The current way of obtaining the commit information is by shelling out to the git binary, using `git log -z` with extensive use of format options. Shelling out comes at a runtime cost, and by using a native Golang implementation of git this cost could be avoided. The parent commit introduced src-d/go-git as a dependency. The intent is to swap out the git implementation without the need for any proto or client-side changes, while staying fully compatible with the shell-out implementation.

Things to check before these commits can be merged to master include:

1. Shelling out includes `GIT_OBJECT_DIRECTORY` and `GIT_ALTERNATE_OBJECT_DIRECTORIES`, and sets the values in the execution environment. To what extent FindCommit requires these values, and how to set them when using go-git, are unanswered questions at the moment.
2. The full test suite of GitLab-CE and GitLab-EE should be able to pass with minimal changes to those codebases.

The main reason to swap out implementations is performance, so gitaly-bench was updated to be able to benchmark the FindCommit RPC in: gitlab-org/gitaly-bench!5

Both Gitaly instances were started with the same configuration, apart from the port. The shell-out implementation was listening on :9999, the go-git implementation on :19999. Output is truncated, but the commands are shown in full for reproducibility.
```
$ go version
go version go1.10.1 darwin/amd64

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:9999 find-commit
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 17.3454
  Average QPS: 57.65
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:19999 find-commit
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 0.9546
  Average QPS: 1047.55
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:9999 find-commit -revision "4a24d82dbca5c11c61556f3b35ca472b7463187e"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 17.7700
  Average QPS: 56.27
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:19999 find-commit -revision "4a24d82dbca5c11c61556f3b35ca472b7463187e"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 1.3640
  Average QPS: 733.12
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:9999 find-commit -revision "HEAD~25"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 17.5492
  Average QPS: 56.98
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:19999 find-commit -revision "HEAD~25"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 3.2684
  Average QPS: 305.96
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:9999 find-commit -revision "feature_conflict"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 18.8795
  Average QPS: 52.97
  Errors: 0
  Percent errors: 0.00

$ go run gitaly-bench.go -iterations 100 -repo gitlab-org/gitlab-test.git -host tcp://localhost:19999 find-commit -revision "feature_conflict"
Stats:
  Average: 0.000000
  Total requests: 1000
  Elapsed Time (sec): 1.4138
  Average QPS: 707.33
  Errors: 0
  Percent errors: 0.00
```

The data shows that the go-git implementation is faster than shelling out to the git binary across the board. The size of the speedup varies strongly, however: from 18 times faster to 'just' 5 times faster. The difference can be explained by the type of revision that is passed as an argument to the RPC. The relative revision, HEAD~25, is slowest; my assumption is that go-git's implementation of walking the history is not optimal. The shell-out implementation is highly consistent in its timings.

This change has one notable side effect: logging is greatly reduced, because less shelling out takes place. The internal wrappers around shelling out log heavily, which improves visibility.
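
The speedup factors quoted above can be rechecked from the elapsed times in the benchmark output. The following is a minimal sketch rather than part of gitaly-bench: the `run` type and the `speedup` helper are names made up for this illustration, and the numbers are copied by hand from the output above.

```go
package main

import "fmt"

// run pairs the elapsed times (in seconds, 1000 requests each) reported
// above for one revision argument. Hypothetical type, for illustration only.
type run struct {
	revision string
	shellOut float64 // shell-out implementation, listening on :9999
	goGit    float64 // go-git implementation, listening on :19999
}

// speedup returns how many times faster the go-git run completed.
func speedup(shellOut, goGit float64) float64 {
	return shellOut / goGit
}

func main() {
	runs := []run{
		{"(no -revision flag)", 17.3454, 0.9546},
		{"4a24d82dbca5c11c61556f3b35ca472b7463187e", 17.7700, 1.3640},
		{"HEAD~25", 17.5492, 3.2684},
		{"feature_conflict", 18.8795, 1.4138},
	}
	for _, r := range runs {
		fmt.Printf("%-42s %5.1fx\n", r.revision, speedup(r.shellOut, r.goGit))
	}
}
```

The two extremes are the run without a `-revision` flag (about 18 times faster) and `HEAD~25` (about 5 times faster); the commit SHA and the branch name both land at roughly 13 times.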