Refactor blob service to use Go pipelines for LFS objects

Patrick Steinhardt requested to merge pks-blob-lfs-pointer-pipeline into master

In several places, we can improve our existing interfaces to more intelligently filter down to a set of objects, starting from a set of revisions. While we already have such code for LFS pointers, the current implementation is quite inflexible and thus hard to reuse for the other implementations. I painfully did this conversion just to share the same code between finding LFS pointers and blobs, which should have been one of the easiest cases. But even that one was hard to do, and the resulting patches looked awful.

I've thus decided once again to try to convert the code into a Go pipeline. My previous attempts were all a bit naive and thus failed, but this time the result is something I'm actually quite happy with. The pipeline has three different steps which can be plugged together as required:

  1. Walking the object graph starting from a set of revisions via git-rev-list(1).
  2. For each revision, we retrieve the object info without reading the whole object yet. This includes things like object type and object size.
  3. For each object info, we read the corresponding object.

Between these steps it's trivially possible to filter down results via a set of filtering pipeline steps.
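To illustrate the shape of such a pipeline, here is a minimal sketch built on channels. It is not the code from this merge request: every name in it (`fakeRepo`, `revlist`, `objectInfos`, `filterInfos`, `readObjects`, `smallBlobs`) is hypothetical, and an in-memory map stands in for git-rev-list(1) and the object database so that the example stays self-contained.

```go
package main

// objectInfo is what step 2 yields: metadata only, no contents.
type objectInfo struct {
	OID  string
	Type string
	Size int
}

// object is what step 3 yields: metadata plus contents.
type object struct {
	objectInfo
	Data []byte
}

// fakeRepo is a hypothetical in-memory stand-in for a Git object database.
type fakeRepo map[string]object

// revlist stands in for step 1. A real implementation would walk the object
// graph; this sketch simply emits the object IDs it is given.
func revlist(oids []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, oid := range oids {
			out <- oid
		}
	}()
	return out
}

// objectInfos is step 2: it looks up type and size without touching contents.
// Unknown object IDs are silently skipped in this sketch.
func objectInfos(repo fakeRepo, oids <-chan string) <-chan objectInfo {
	out := make(chan objectInfo)
	go func() {
		defer close(out)
		for oid := range oids {
			if obj, ok := repo[oid]; ok {
				out <- obj.objectInfo
			}
		}
	}()
	return out
}

// filterInfos is an optional filtering step that can sit between steps 2 and 3.
func filterInfos(in <-chan objectInfo, keep func(objectInfo) bool) <-chan objectInfo {
	out := make(chan objectInfo)
	go func() {
		defer close(out)
		for info := range in {
			if keep(info) {
				out <- info
			}
		}
	}()
	return out
}

// readObjects is step 3: it reads the contents of each remaining object.
func readObjects(repo fakeRepo, infos <-chan objectInfo) <-chan object {
	out := make(chan object)
	go func() {
		defer close(out)
		for info := range infos {
			out <- repo[info.OID]
		}
	}()
	return out
}

// smallBlobs plugs the steps together: only blobs of at most maxSize bytes
// are read in full. The size limit is an assumption made for illustration.
func smallBlobs(repo fakeRepo, oids []string, maxSize int) []object {
	infos := objectInfos(repo, revlist(oids))
	candidates := filterInfos(infos, func(i objectInfo) bool {
		return i.Type == "blob" && i.Size <= maxSize
	})
	var result []object
	for obj := range readObjects(repo, candidates) {
		result = append(result, obj)
	}
	return result
}
```

Because the filter runs on object info alone, large blobs and non-blob objects are discarded before their contents are ever read, which is the main point of splitting steps 2 and 3.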

This architecture should allow us to filter down arbitrary Git objects based on a set of revisions. As an example, ListLFSPointers() has been converted to make use of this new pipeline and works as expected: no test changes were required. The next step would be to convert the other LFS pointer RPCs, and then to use the pipeline for blobs, commits and trees as required.
