Populate paths into changed blobs

Igor Drozdov requested to merge id-extend-changed-blobs-with-paths into master

What

In Introduce ChangedBlobs service for push checks (!142838 - merged) we introduced the ChangedBlobs service, which returns the blobs that were just pushed. This MR adds a with_paths option to the service so that the returned blobs are populated with their paths.

Why

To detect the blobs that have just been pushed, Gitaly uses quarantine repositories. As a result, we end up using something like:

if ignore_alternate_directories?
  project.repository.list_all_blobs(
    bytes_limit: bytes_limit,
    dynamic_timeout: timeout,
    ignore_alternate_object_directories: true
  ).to_a
else
  project.repository.list_blobs(
    ['--not', '--all', '--not'] + revisions,
    bytes_limit: bytes_limit,
    with_paths: true,
    dynamic_timeout: timeout
  ).to_a
end

The advantage of the quarantine-based approach (ListAllBlobs) is that it's much faster because it doesn't iterate through the history. The disadvantage, for the same reason, is that the returned blobs don't contain filenames.

Within this MR:

  • The with_paths option is passed whenever the ListBlobs RPC is called.
  • For ListAllBlobs, we perform an extra RPC call, FindChangedPaths (git-diff-tree), to retrieve the changed paths and associate them with the blobs. FindChangedPaths has been extended to return blob ids for this purpose: Extend FindChangedPaths with blob ids (gitaly!6640 - merged)
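
For illustration only, the association step for ListAllBlobs could look roughly like the sketch below. It is not the code from this MR: Blob, ChangedPath, new_blob_id and populate_paths are hypothetical stand-ins, and the sketch assumes each changed path exposes the id of the blob it now points to.

```ruby
# Hypothetical stand-ins for the objects returned by the two RPCs.
Blob = Struct.new(:id, :path, keyword_init: true)
ChangedPath = Struct.new(:path, :new_blob_id, keyword_init: true)

# Fills in blob.path by matching blob ids against the changed paths.
# If the same blob id appears under several paths, the first one wins.
# Blobs that already have a path, or have no match, are left untouched.
def populate_paths(blobs, changed_paths)
  paths_by_blob_id = changed_paths.each_with_object({}) do |changed, index|
    index[changed.new_blob_id] ||= changed.path
  end

  blobs.each do |blob|
    blob.path = paths_by_blob_id[blob.id] if blob.path.nil?
  end
end

blobs = [Blob.new(id: 'a1'), Blob.new(id: 'b2')]
changed_paths = [ChangedPath.new(path: 'README.md', new_blob_id: 'a1')]

populate_paths(blobs, changed_paths)
puts blobs.map { |blob| "#{blob.id}: #{blob.path.inspect}" }
```

Building the id-to-path index once keeps the lookup linear in the number of blobs plus changed paths, which matters when a push contains many objects.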
Edited by Igor Drozdov
