Projects::RefsController#log_tree can take a long time to complete
e.g.: https://sentry.gitlap.com/gitlab/gitlabcom-clientside/issues/31136/
In particular, we see this causing problems when a large directory is viewed, like /fdroid/fdroiddata/refs/master/logs_tree/metadata, which corresponds to https://gitlab.com/fdroid/fdroiddata/tree/master/metadata
Here's the controller action:
```ruby
def logs_tree
  @offset = if params[:offset].present?
              params[:offset].to_i
            else
              0
            end
  @limit = 25
  @path = params[:path]

  contents = []
  contents.push(*tree.trees)
  contents.push(*tree.blobs)
  contents.push(*tree.submodules)

  @logs = contents[@offset, @limit].to_a.map do |content|
    file = @path ? File.join(@path, content.name) : content.name
    last_commit = @repo.last_commit_for_path(@commit.id, file)

    {
      file_name: content.name,
      commit: last_commit
    }
  end

  offset = (@offset + @limit)

  if contents.size > offset
    @more_log_url = logs_file_project_ref_path(@project, @ref, @path || '', offset: offset)
  end

  respond_to do |format|
    format.html { render_404 }
    format.js
  end
end
```
I expect the limit and offset logic isn't preventing the tree from loading a large amount of data: we range over the combination of trees, blobs, and submodules, and finding the requested slice necessarily means loading all of that data first.
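To illustrate the cost, here's a minimal sketch (the entry counts are made up, roughly matching a large directory like fdroiddata's `metadata/`): even when only one page of 25 entries is rendered, the full combined list has to be materialised before the slice can be taken.

```ruby
# Hypothetical entry counts; in the real action these come from
# tree.trees, tree.blobs and tree.submodules.
trees      = Array.new(10,   'subdir')
blobs      = Array.new(2990, 'file')
submodules = []

contents = []
contents.push(*trees)
contents.push(*blobs)
contents.push(*submodules)

offset = 25
limit  = 25
page   = contents[offset, limit].to_a

# All 3000 entries were loaded just to render 25 of them.
[contents.size, page.size]  # => [3000, 25]
```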
It may be that we need to restrict this action so you can only specify one type of object at a time, which might allow for more efficient ranging. I've not looked at how it's used clientside yet, though. It's also nonsense to do all this calculation and then simply throw it away for the HTML format.
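One possible shape for that restriction, sketched only: the `type` parameter and the `entries_for` helper are hypothetical, not an existing GitLab API. The point is that only a single collection gets materialised before slicing, rather than all three.

```ruby
# Sketch: restrict the listing to one object kind per request, so the
# offset only ranges over that one collection.
def entries_for(type, tree, offset, limit)
  collection =
    case type
    when 'tree'      then tree.trees
    when 'blob'      then tree.blobs
    when 'submodule' then tree.submodules
    else []
    end

  # Array#[] returns nil when offset is past the end; to_a normalises
  # that to an empty page, matching the existing action's behaviour.
  collection[offset, limit].to_a
end
```

This still loads the whole of one collection, so it's a mitigation rather than a fix; proper pagination would need support further down the stack.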
Eventually this data must come from Gitaly.
/cc @lbennett