Update MergeRequestDiffFile#utf8_diff to not call #diff multiple times
What does this MR do and why?
This MR fixes a performance issue with MergeRequestDiffFile#utf8_diff
which was calling #diff
method 3 times, instead of 1. This is particularly evident when diff is externally stored, and calling utf8_diff
results in us fetching it from object storage (via network requests) 3 times, which triples its execution time.
This issue can be observed by profiling this method (figured from staging rails console):
result = RubyProf.profile do
MergeRequestDiff.find(8184405).merge_request_diff_files.each do |f|
f.utf8_diff
end
end
printer = RubyProf::GraphPrinter.new(result)
printer.print(STDOUT, {})
...
49.673 0.004 0.000 49.670 385/385 Array#each
99.81% 0.01% 49.673 0.004 0.000 49.670 385 MergeRequestDiffFile#utf8_diff /opt/gitlab/embedded/service/gitlab-rails/app/models/merge_request_diff_file.rb:17
49.666 0.011 0.000 49.655 1149/1149 MergeRequestDiffFile#diff
0.003 0.001 0.000 0.001 382/382 Gitlab::EncodingHelper#encode_utf8
0.001 0.001 0.000 0.000 382/44681 Kernel#respond_to?
0.001 0.001 0.000 0.000 385/3895 String#blank?
Note that MergeRequestDiffFile#utf8_diff
is called 385 times, but underlying diff
is called 1149 times.
This change is aimed to improve Project Export performance, which exports MR diffs.
Mentions #227653 (closed)
Screenshots or screen recordings
These are strongly recommended to assist reviewers and reduce the time to merge your change.
How to set up and validate locally
Numbered steps to set up and validate the change are strongly suggested.
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.