
Speed up generation of commit stats by using Rugged native methods

Stan Hu requested to merge sh-optimize-commit-stats into master

The previous implementation iterated over every patch in the diff to count the lines added, deleted, and changed. Rugged already provides a native method, Rugged::Diff#stat, that does this; it appears to be a little faster and to require less RAM than doing the counting ourselves.

Improves performance in #41524

Using Rugged native methods

require 'rugged'

repo = Rugged::Repository.new('.')

commit_object = repo.rev_parse('master')
original_oid = repo.rev_parse('93efff945215a4407afcaf0cba15ac601b56df0d')
diff_commits = commit_object.parents[0].diff(original_oid)

# Rugged::Diff#stat returns [files changed, additions, deletions]
puts diff_commits.stat

Previous implementation

require 'rugged'

repo = Rugged::Repository.new('.')

commit_object = repo.rev_parse('master')
original_oid = repo.rev_parse('93efff945215a4407afcaf0cba15ac601b56df0d')
diff_commits = commit_object.parents[0].diff(original_oid)

additions = 0
deletions = 0
total = 0

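# Walk every patch in the diff and accumulate the per-file counts
diff_commits.each_patch do |p|
  additions += p.stat[0] # p.stat is [additions, deletions]
  deletions += p.stat[1]
  total += p.changes     # additions + deletions for this patch
end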

puts [additions, deletions, total]
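
The two scripts print different triples: Rugged::Diff#stat yields files changed, additions, and deletions, while the previous script printed additions, deletions, and total. A minimal sketch of how the native result could be mapped back to the old shape follows. The CommitStats struct and the commit_stats_from helper are hypothetical names used only for illustration, and the sample array is copied from the native script's output in this description rather than read from a repository.

```ruby
# Hypothetical value object; the name CommitStats and its fields are
# illustrative and not taken from the merge request itself.
CommitStats = Struct.new(:additions, :deletions, :total)

# Converts the [files_changed, additions, deletions] array returned by
# Rugged::Diff#stat into the three counters the previous script printed.
def commit_stats_from(diff_stat)
  _files_changed, additions, deletions = diff_stat
  CommitStats.new(additions, deletions, additions + deletions)
end

# Sample input copied from the native script's output in this description.
stats = commit_stats_from([14251, 6, 1070973])
puts stats.to_a.inspect # prints [6, 1070973, 1070979]
```

With a real repository, the argument would be diff_commits.stat instead of the literal array.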

Performance Comparison

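  • User CPU time: 1.32 s (native) vs. 1.52 s (previous); elapsed wall-clock time was nearly identical (1.94 s vs. 1.96 s)
  • RAM (maximum resident set size): 144992 kbytes (native) vs. 195104 kbytes (previous)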
$ /usr/bin/time -v ruby diff_rugged_native.rb
14251
6
1070973
        Command being timed: "ruby diff_rugged_native.rb"
        User time (seconds): 1.32
        System time (seconds): 0.48
        Percent of CPU this job got: 92%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.94
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 144992
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 5
        Minor (reclaiming a frame) page faults: 17916
        Voluntary context switches: 41
        Involuntary context switches: 194
        Swaps: 0
        File system inputs: 152
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
$ /usr/bin/time -v ruby diff_original.rb
6
1070973
1070979
        Command being timed: "ruby diff_original.rb"
        User time (seconds): 1.52
        System time (seconds): 0.40
        Percent of CPU this job got: 98%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.96
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 195104
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 29048
        Voluntary context switches: 36
        Involuntary context switches: 225
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
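
To put the figures above in relative terms, here is a small sketch that computes the percentage reduction from the two /usr/bin/time runs. The percent_reduction helper is a hypothetical name introduced for this example; the inputs are the user time and maximum resident set size reported above, from a single run each.

```ruby
# Percentage by which `after` is lower than `before`, rounded to one decimal.
# percent_reduction is a hypothetical helper, not part of Rugged or GitLab.
def percent_reduction(before, after)
  ((before - after) / before.to_f * 100).round(1)
end

# Inputs taken from the /usr/bin/time output in this description.
user_time_saving = percent_reduction(1.52, 1.32)       # user CPU seconds
max_rss_saving   = percent_reduction(195_104, 144_992) # kbytes

puts "User time reduced by #{user_time_saving}%"
puts "Maximum RSS reduced by #{max_rss_saving}%"
```

This works out to roughly 13% less user CPU time and roughly 26% less peak memory for this one sample, so the numbers should be read as indicative rather than as a rigorous benchmark.
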
Edited by Stan Hu
