New Gitaly RPC: efficiently retrieve the last commits that touched each entry in a tree
In support of https://gitlab.com/gitlab-org/gitlab-ce/issues/37433
Currently, in the GitLab codebase we have this pattern: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/lib/gitlab/tree_summary.rb#L77
Gitlab::GitalyClient.allow_n_plus_1_calls do
  entries.each do |entry|
    raw_commit = repository.last_commit_for_path(commit.id, entry_path(entry))
    ...
  end
end
This makes one Gitaly call per tree entry, which is an N+1 pattern. I think we need to introduce a new RPC to pull down the last commits in bulk.
A naive extension might look like:
paths = entries.map { |entry| entry_path(entry) }
commit_map = repository.last_commits_for_paths(commit.id, paths)
# {'foo' => <Gitlab::Git::Commit...>, 'bar' => <Gitlab::Git::Commit...> }
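To make the difference in round trips concrete, here is a minimal sketch using a hypothetical in-memory stand-in. `CountingRepository` and its `round_trips` counter are invented for illustration, and `last_commits_for_paths` is only the proposed signature, not an existing method:

```ruby
# Hypothetical stand-in that counts calls; it is not the real Gitaly client.
class CountingRepository
  attr_reader :round_trips

  # last_commits maps a path to the id of the last commit that touched it.
  def initialize(last_commits)
    @last_commits = last_commits
    @round_trips = 0
  end

  # Existing shape: one call (one round trip) per path.
  def last_commit_for_path(_commit_id, path)
    @round_trips += 1
    @last_commits[path]
  end

  # Proposed shape (assumed): one call returns a path => commit map.
  def last_commits_for_paths(_commit_id, paths)
    @round_trips += 1
    paths.map { |path| [path, @last_commits[path]] }.to_h
  end
end

repo = CountingRepository.new('foo' => 'a1', 'bar' => 'b2')
paths = %w[foo bar]

# Per-path lookups cost one round trip per entry.
paths.each { |path| repo.last_commit_for_path('HEAD', path) }
per_path_trips = repo.round_trips

# The bulk lookup costs a single round trip regardless of the number of paths.
commit_map = repo.last_commits_for_paths('HEAD', paths)
bulk_trips = repo.round_trips - per_path_trips
```

With the per-path call the number of round trips grows with the number of entries, while the bulk call stays at one.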
However, we're actually interested in iterating over the entire tree in batches of 25. So perhaps it makes more sense to do something like:
commits = repository.last_commits_for_tree(commit.id, path, offset: n, limit: m) # order: trees, blobs, submodules
# [<Gitlab::Git::Commit...>, <Gitlab::Git::Commit...>]
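As a rough sketch of how a caller might walk a tree in batches with the offset/limit variant: `PagedRepository` and `each_commit_batch` below are hypothetical names, and the stub only mimics the proposed `last_commits_for_tree` signature with an in-memory array.

```ruby
# Hypothetical stand-in for the repository; it is not the real Gitlab::Git API.
class PagedRepository
  attr_reader :calls

  def initialize(commits)
    @commits = commits
    @calls = 0
  end

  # Assumed signature, mirroring the proposal: return up to `limit`
  # commits starting at `offset`, or an empty array past the end.
  def last_commits_for_tree(_commit_id, _path, offset:, limit:)
    @calls += 1
    @commits[offset, limit] || []
  end
end

# Yields successive batches until the tree is exhausted.
# Returns an Enumerator when no block is given.
def each_commit_batch(repository, commit_id, path, batch_size: 25)
  unless block_given?
    return enum_for(:each_commit_batch, repository, commit_id, path, batch_size: batch_size)
  end

  offset = 0
  loop do
    batch = repository.last_commits_for_tree(commit_id, path, offset: offset, limit: batch_size)
    break if batch.empty?

    yield batch
    # A short batch means there is nothing left to fetch.
    break if batch.size < batch_size

    offset += batch_size
  end
end
```

Here the caller only tracks an offset; ordering and slicing of the tree entries are left to the server side, which is the work the second design would move out of gitlab-rails.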
/cc @zj @jacobvosmaer-gitlab @tiagonbotelho for input on which design is preferred. For me, the second does away with more work in gitlab-rails, and is slightly less specific to a single caller, so I prefer it.
Edited by Nick Thomas