This project is mirrored from https://github.com/git/git.git. Updated .
  1. 22 Nov, 2017 1 commit
    • Jeff Hostetler's avatar
      list-objects: filter objects in traverse_commit_list · 25ec7bca
      Jeff Hostetler authored
      Create traverse_commit_list_filtered() and add filtering
      interface to allow certain objects to be omitted from the
      traversal.
      
      Update traverse_commit_list() to be a wrapper for the above
      with a null filter to minimize the number of callers that
      needed to be changed.
      
      Object filtering will be used in a future commit by rev-list
      and pack-objects for partial clone and fetch to omit unwanted
      objects from the result.
      
      traverse_bitmap_commit_list() does not work with filtering.
      If a packfile bitmap is present, it will not be used.  It
      should be possible to extend such support in the future (at
      least to simple filters that do not require object pathnames),
      but that is beyond the scope of this patch series.
      Signed-off-by: default avatarJeff Hostetler <[email protected]>
      Reviewed-by: default avatarJonathan Tan <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      25ec7bca
  2. 16 Mar, 2016 2 commits
    • Jeff King's avatar
      list-objects: pass full pathname to callbacks · 2824e184
      Jeff King authored
      When we find a blob at "a/b/c", we currently pass this to
      our show_object_fn callbacks as two components: "a/b/" and
      "c". Callbacks which want the full value then call
      path_name(), which concatenates the two. But this is an
      inefficient interface; the path is a strbuf, and we could
      simply append "c" to it temporarily, then roll back the
      length, without creating a new copy.
      
      So we could improve this by teaching the callsites of
      path_name() this trick (and there are only 3). But we can
      also notice that no callback actually cares about the
      broken-down representation, and simply pass each callback
      the full path "a/b/c" as a string. The callback code becomes
      even simpler, then, as we do not have to worry about freeing
      an allocated buffer, nor rolling back our modification to
      the strbuf.
      
      This is theoretically less efficient, as some callbacks
      would not bother to format the final path component. But in
      practice this is not measurable. Since we use the same
      strbuf over and over, our work to grow it is amortized, and
      we really only pay to memcpy a few bytes.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      2824e184
    • Jeff King's avatar
      list-objects: drop name_path entirely · dc06dc88
      Jeff King authored
      In the previous commit, we left name_path as a thin wrapper
      around a strbuf. This patch drops it entirely. As a result,
      every show_object_fn callback needs to be adjusted. However,
      none of their code needs to be changed at all, because the
      only use was to pass it to path_name(), which now handles
      the bare strbuf.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      dc06dc88
  3. 12 Feb, 2016 2 commits
    • Jeff King's avatar
      list-objects: pass full pathname to callbacks · de1e67d0
      Jeff King authored
      When we find a blob at "a/b/c", we currently pass this to
      our show_object_fn callbacks as two components: "a/b/" and
      "c". Callbacks which want the full value then call
      path_name(), which concatenates the two. But this is an
      inefficient interface; the path is a strbuf, and we could
      simply append "c" to it temporarily, then roll back the
      length, without creating a new copy.
      
      So we could improve this by teaching the callsites of
      path_name() this trick (and there are only 3). But we can
      also notice that no callback actually cares about the
      broken-down representation, and simply pass each callback
      the full path "a/b/c" as a string. The callback code becomes
      even simpler, then, as we do not have to worry about freeing
      an allocated buffer, nor rolling back our modification to
      the strbuf.
      
      This is theoretically less efficient, as some callbacks
      would not bother to format the final path component. But in
      practice this is not measurable. Since we use the same
      strbuf over and over, our work to grow it is amortized, and
      we really only pay to memcpy a few bytes.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      de1e67d0
    • Jeff King's avatar
      list-objects: drop name_path entirely · bd64516a
      Jeff King authored
      In the previous commit, we left name_path as a thin wrapper
      around a strbuf. This patch drops it entirely. As a result,
      every show_object_fn callback needs to be adjusted. However,
      none of their code needs to be changed at all, because the
      only use was to pass it to path_name(), which now handles
      the bare strbuf.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      bd64516a
  4. 28 Aug, 2013 1 commit
  5. 01 Sep, 2011 1 commit
    • Junio C Hamano's avatar
      list-objects: pass callback data to show_objects() · 49473672
      Junio C Hamano authored
      The traverse_commit_list() API takes two callback functions, one to show
      commit objects, and the other to show other kinds of objects. Even though
      the former has a callback data parameter, so that the callback does not
      have to rely on global state, the latter does not.
      
      Give the show_objects() callback the same callback data parameter.
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      49473672
  6. 13 Apr, 2009 2 commits
    • Linus Torvalds's avatar
      process_{tree,blob}: show objects without buffering · 8d2dfc49
      Linus Torvalds authored
      Here's a less trivial thing, and slightly more dubious one.
      
      I was looking at that "struct object_array objects", and wondering why we
      do that. I have honestly totally forgotten. Why not just call the "show()"
      function as we encounter the objects? Rather than add the objects to the
      object_array, and then at the very end going through the array and doing a
      'show' on all, just do things more incrementally.
      
      Now, there are possible downsides to this:
      
       - the "buffer using object_array" _can_ in theory result in at least
         better I-cache usage (two tight loops rather than one more spread out
         one). I don't think this is a real issue, but in theory..
      
       - this _does_ change the order of the objects printed. Instead of doing a
         "process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
         over the commits (which puts all the root trees _first_ in the object
         list, this patch just adds them to the list of pending objects, and
         then we'll traverse them in that order (and thus show each root tree
         object together with the objects we discover under it)
      
         I _think_ the new ordering actually makes more sense, but the object
         ordering is actually a subtle thing when it comes to packing
         efficiency, so any change in order is going to have implications for
         packing. Good or bad, I dunno.
      
       - There may be some reason why we did it that odd way with the object
         array, that I have simply forgotten.
      
      Anyway, now that we don't buffer up the objects before showing them
      that may actually result in lower memory usage during that whole
      traverse_commit_list() phase.
      
      This is seriously not very deeply tested. It makes sense to me, it seems
      to pass all the tests, it looks ok, but...
      
      Does anybody remember why we did that "object_array" thing? It used to be
      an "object_list" a long long time ago, but got changed into the array due
      to better memory usage patterns (those linked lists of obejcts are
      horrible from a memory allocation standpoint). But I wonder why we didn't
      do this back then. Maybe there's a reason for it.
      
      Or maybe there _used_ to be a reason, and no longer is.
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      8d2dfc49
    • Linus Torvalds's avatar
      show_object(): push path_name() call further down · cf2ab916
      Linus Torvalds authored
      In particular, pushing the "path_name()" call _into_ the show() function
      would seem to allow
      
       - more clarity into who "owns" the name (ie now when we free the name in
         the show_object callback, it's because we generated it ourselves by
         calling path_name())
      
       - not calling path_name() at all, either because we don't care about the
         name in the first place, or because we are actually happy walking the
         linked list of "struct name_path *" and the last component.
      
      Now, I didn't do that latter optimization, because it would require some
      more coding, but especially looking at "builtin-pack-objects.c", we really
      don't even want the whole pathname, we really would be better off with the
      list of path components.
      
      Why? We use that name for two things:
       - add_preferred_base_object(), which actually _wants_ to traverse the
         path, and now does it by looking for '/' characters!
       - for 'name_hash()', which only cares about the last 16 characters of a
         name, so again, generating the full name seems to be just unnecessary
         work.
      
      Anyway, so I didn't look any closer at those things, but it did convince
      me that the "show_object()" calling convention was crazy, and we're
      actually better off doing _less_ in list-objects.c, and giving people
      access to the internal data structures so that they can decide whether
      they want to generate a path-name or not.
      
      This patch does that, and then for people who did use the name (even if
      they might do something more clever in the future), it just does the
      straightforward "name = path_name(path, component); .. free(name);" thing.
      Signed-off-by: default avatarLinus Torvalds <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      cf2ab916
  7. 08 Apr, 2009 1 commit
    • Christian Couder's avatar
      list-objects: add "void *data" parameter to show functions · 11c211fa
      Christian Couder authored
      The goal of this patch is to get rid of the "static struct rev_info
      revs" static variable in "builtin-rev-list.c".
      
      To do that, we need to pass the revs to the "show_commit" function
      in "builtin-rev-list.c" and this in turn means that the
      "traverse_commit_list" function in "list-objects.c" must be passed
      functions pointers to functions with 2 parameters instead of one.
      
      So we have to change all the callers and all the functions passed
      to "traverse_commit_list".
      
      Anyway this makes the code more clean and more generic, so it
      should be a good thing in the long run.
      Signed-off-by: Christian Couder's avatarChristian Couder <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      11c211fa
  8. 07 Sep, 2006 2 commits