Skip to content

Update MergeRequestDiffFile#utf8_diff to not call #diff multiple times

What does this MR do and why?

This MR fixes a performance issue with MergeRequestDiffFile#utf8_diff which was calling #diff method 3 times, instead of 1. This is particularly evident when diff is externally stored, and calling utf8_diff results in us fetching it from object storage (via network requests) 3 times, which triples its execution time.

This issue can be observed by profiling this method (figured from staging rails console):

result = RubyProf.profile do
  MergeRequestDiff.find(8184405).merge_request_diff_files.each do |f|
    f.utf8_diff
  end
end

printer = RubyProf::GraphPrinter.new(result)
printer.print(STDOUT, {})

...

                     49.673      0.004      0.000     49.670          385/385     Array#each
  99.81%   0.01%     49.673      0.004      0.000     49.670              385     MergeRequestDiffFile#utf8_diff /opt/gitlab/embedded/service/gitlab-rails/app/models/merge_request_diff_file.rb:17
                     49.666      0.011      0.000     49.655        1149/1149     MergeRequestDiffFile#diff
                      0.003      0.001      0.000      0.001          382/382     Gitlab::EncodingHelper#encode_utf8
                      0.001      0.001      0.000      0.000        382/44681     Kernel#respond_to?
                      0.001      0.001      0.000      0.000         385/3895     String#blank?

Note that MergeRequestDiffFile#utf8_diff is called 385 times, but underlying diff is called 1149 times.

This change is aimed to improve Project Export performance, which exports MR diffs.

Mentions #227653 (closed)

Screenshots or screen recordings

These are strongly recommended to assist reviewers and reduce the time to merge your change.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by George Koltsov

Merge request reports