1. 14 Feb, 2018 1 commit
  2. 28 Dec, 2017 1 commit
  3. 22 Nov, 2017 1 commit
    • list-objects: filter objects in traverse_commit_list · 25ec7bca
      Jeff Hostetler authored
      Create traverse_commit_list_filtered() and add filtering
      interface to allow certain objects to be omitted from the
      traversal.
      
      Update traverse_commit_list() to be a wrapper for the above
      with a null filter to minimize the number of callers that
      needed to be changed.
      
      Object filtering will be used in a future commit by rev-list
      and pack-objects for partial clone and fetch to omit unwanted
      objects from the result.
      
      traverse_bitmap_commit_list() does not work with filtering.
      If a packfile bitmap is present, it will not be used.  It
      should be possible to extend such support in the future (at
      least to simple filters that do not require object pathnames),
      but that is beyond the scope of this patch series.
      Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
      Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
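The wrapper arrangement described in this commit can be sketched in miniature. This is a toy under stated assumptions, not Git's actual interface: every `toy_*` name, the integer "objects", and the callback signatures are hypothetical and chosen only to show how an unfiltered entry point can delegate to a filtered one with a null filter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the filtering idea; not Git's real API. */
enum toy_filter_result { TOY_SHOW, TOY_OMIT };

typedef enum toy_filter_result (*toy_filter_fn)(int obj, void *filter_data);
typedef void (*toy_show_fn)(int obj, void *show_data);

/* Walk the list, showing every object that the filter does not omit. */
static void toy_traverse_filtered(const int *objs, size_t nr,
				  toy_show_fn show, void *show_data,
				  toy_filter_fn filter, void *filter_data)
{
	size_t i;

	assert(show != NULL);
	for (i = 0; i < nr; i++) {
		/* A null filter means "omit nothing". */
		if (filter && filter(objs[i], filter_data) == TOY_OMIT)
			continue;
		show(objs[i], show_data);
	}
}

/* The unfiltered entry point stays as a thin wrapper with a null filter. */
static void toy_traverse(const int *objs, size_t nr,
			 toy_show_fn show, void *show_data)
{
	toy_traverse_filtered(objs, nr, show, show_data, NULL, NULL);
}

/* Example callbacks: count what is shown, omit objects above a limit. */
static void toy_count(int obj, void *show_data)
{
	(void)obj;
	++*(size_t *)show_data;
}

static enum toy_filter_result toy_omit_large(int obj, void *filter_data)
{
	return obj > *(const int *)filter_data ? TOY_OMIT : TOY_SHOW;
}
```

Keeping the old entry point as a wrapper means existing callers compile unchanged, which is the motivation the commit message gives for the design.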
  4. 24 Sep, 2017 1 commit
    • object_array: add and use `object_array_pop()` · 71992039
      Martin Ågren authored
      In a couple of places, we pop objects off an object array `foo` by
      decreasing `foo.nr`. We access `foo.nr` in many places, but most if not
      all other times we do so read-only, e.g., as we iterate over the array.
      But when we change `foo.nr` behind the array's back, it feels a bit
      nasty and looks like it might leak memory.
      
      Leaks happen if the popped element has an allocated `name` or `path`.
      At the moment, that is not the case. Still, 1) the object array might
      gain more fields that want to be freed, 2) a code path where we pop
      might start using names or paths, 3) one of these code paths might be
      copied to somewhere where we do, and 4) using a dedicated function for
      popping is conceptually cleaner.
      
      Introduce and use `object_array_pop()` instead. Release memory in the
      new function. Document that popping an object leaves the associated
      elements in limbo.
      
      The converted places were identified by grepping for "\.nr\>" and
      looking for "--".
      
      Make the new function return NULL on an empty array. This is consistent
      with `pop_commit()` and allows the following:
      
      	while ((o = object_array_pop(&foo)) != NULL) {
      		// do something
      	}
      
      But as noted above, we don't need to go out of our way to avoid reading
      `foo.nr`. This is probably more readable:
      
      	while (foo.nr) {
      		... o = object_array_pop(&foo);
      		// do something
      	}
      
      The name of `object_array_pop()` does not quite align with
      `add_object_array()`. That is unfortunate. On the other hand, it matches
      `object_array_clear()`. Arguably it's `add_...` that is the odd one out,
      since it reads like it's used to "add" an "object array". For that
      reason, side with `object_array_clear()`.
      Signed-off-by: Martin Ågren <martin.agren@gmail.com>
      Reviewed-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
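A minimal sketch of the pop behaviour described above, assuming a simplified array whose entries own an optional `name` string. The `toy_*` types and functions are hypothetical stand-ins, not the real `object_array` definitions; the sketch only shows releasing the popped entry's memory and returning NULL when the array is empty.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_object { int id; };

struct toy_entry {
	struct toy_object *item;
	char *name;	/* owned by the array; may be NULL */
};

struct toy_array {
	size_t nr, alloc;
	struct toy_entry *entries;
};

/* Append an entry, copying the name so the array owns its memory. */
static void toy_array_add(struct toy_array *a, struct toy_object *o,
			  const char *name)
{
	char *copy = NULL;

	assert(a != NULL);
	if (a->nr == a->alloc) {
		size_t alloc = a->alloc ? 2 * a->alloc : 4;
		struct toy_entry *entries =
			realloc(a->entries, alloc * sizeof(*entries));
		if (!entries)
			abort();
		a->entries = entries;
		a->alloc = alloc;
	}
	if (name) {
		size_t len = strlen(name) + 1;
		copy = malloc(len);
		if (!copy)
			abort();
		memcpy(copy, name, len);
	}
	a->entries[a->nr].item = o;
	a->entries[a->nr].name = copy;
	a->nr++;
}

/* Remove the last entry, free what it owned, and return its object. */
static struct toy_object *toy_array_pop(struct toy_array *a)
{
	struct toy_object *o;

	if (!a->nr)
		return NULL;	/* empty array, mirroring pop_commit() */
	o = a->entries[a->nr - 1].item;
	free(a->entries[a->nr - 1].name);
	a->nr--;
	return o;
}
```

Because the function returns NULL on an empty array, either loop shape from the commit message works with it.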
  5. 20 Jul, 2017 1 commit
  6. 08 May, 2017 1 commit
    • object: convert parse_object* to take struct object_id · c251c83d
      brian m. carlson authored
      Make parse_object, parse_object_or_die, and parse_object_buffer take a
      pointer to struct object_id.  Remove the temporary variables inserted
      earlier, since they are no longer necessary.  Transform all of the
      callers using the following semantic patch:
      
      @@
      expression E1;
      @@
      - parse_object(E1.hash)
      + parse_object(&E1)
      
      @@
      expression E1;
      @@
      - parse_object(E1->hash)
      + parse_object(E1)
      
      @@
      expression E1, E2;
      @@
      - parse_object_or_die(E1.hash, E2)
      + parse_object_or_die(&E1, E2)
      
      @@
      expression E1, E2;
      @@
      - parse_object_or_die(E1->hash, E2)
      + parse_object_or_die(E1, E2)
      
      @@
      expression E1, E2, E3, E4, E5;
      @@
      - parse_object_buffer(E1.hash, E2, E3, E4, E5)
      + parse_object_buffer(&E1, E2, E3, E4, E5)
      
      @@
      expression E1, E2, E3, E4, E5;
      @@
      - parse_object_buffer(E1->hash, E2, E3, E4, E5)
      + parse_object_buffer(E1, E2, E3, E4, E5)
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  7. 08 Feb, 2017 1 commit
    • fetch-pack: cache results of for_each_alternate_ref · 41a078c6
      Jeff King authored
      We may run for_each_alternate_ref() twice, once in
      find_common() and once in everything_local(). This operation
      can be expensive, because it involves running a sub-process
      which must freshly load all of the alternate's refs from
      disk.
      
      Let's cache and reuse the results between the two calls. We
      can make some optimizations based on the particular use
      pattern in fetch-pack to keep our memory usage down.
      
      The first is that we only care about the sha1s, not the refs
      themselves. So it's OK to store only the sha1s, and to
      suppress duplicates. The natural fit would therefore be a
      sha1_array.
      
      However, sha1_array's de-duplication happens only after it
      has read and sorted all entries. It still stores each
      duplicate. For an alternate with a large number of refs
      pointing to the same commits, this is a needless expense.
      
      Instead, we'd prefer to eliminate duplicates before putting
      them in the cache, which implies using a hash. We can
      further note that fetch-pack will call parse_object() on
      each alternate sha1. We can therefore keep our cache as a
      set of pointers to "struct object". That gives us a place to
      put our "already seen" bit with an optimized hash lookup.
      And as a bonus, the object stores the sha1 for us, so
      pointer-to-object is all we need.
      
      There are two extra optimizations I didn't do here:
      
        - we actually store an array of pointer-to-object.
          Technically we could just walk the obj_hash table
          looking for entries with the ALTERNATE flag set (because
          our use case doesn't care about the order here).
      
          But that hash table may be mostly composed of
          non-ALTERNATE entries, so we'd waste time walking over
          them. So it would be a slight win in memory use, but a
          loss in CPU.
      
        - the items we pull out of the cache are actual "struct
          object"s, but then we feed "obj->sha1" to our
          sub-functions, which promptly call parse_object().
      
          This second parse is cheap, because it starts with
          lookup_object() and will bail immediately when it sees
          we've already parsed the object. We could save the extra
          hash lookup, but it would involve refactoring the
          functions we call. It may or may not be worth the
          trouble.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
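The caching scheme in this commit (an array of object pointers, de-duplicated with an "already seen" flag bit on the object itself) can be sketched roughly as follows. The `toy_*` names and the `TOY_ALTERNATE` bit are hypothetical. The sketch assumes, as the message implies for parse_object(), that the same hash always yields the same object pointer, so the flag lives in one place per object.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_ALTERNATE (1u << 0)	/* hypothetical "already cached" bit */

struct toy_object {
	unsigned flags;
	int id;	/* stands in for the object's sha1 */
};

struct toy_cache {
	struct toy_object **items;
	size_t nr, alloc;
};

/* Add obj unless its flag says it is already cached. Returns 1 if added. */
static int toy_cache_add(struct toy_cache *c, struct toy_object *obj)
{
	assert(c != NULL && obj != NULL);
	if (obj->flags & TOY_ALTERNATE)
		return 0;	/* duplicate: never stored a second time */
	obj->flags |= TOY_ALTERNATE;
	if (c->nr == c->alloc) {
		size_t alloc = c->alloc ? 2 * c->alloc : 8;
		struct toy_object **items =
			realloc(c->items, alloc * sizeof(*items));
		if (!items)
			abort();
		c->items = items;
		c->alloc = alloc;
	}
	c->items[c->nr++] = obj;
	return 1;
}
```

Unlike the sort-then-deduplicate approach the message attributes to sha1_array, duplicates here are rejected before they consume a slot.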
  8. 13 Jun, 2016 1 commit
  9. 20 Nov, 2015 3 commits
  10. 19 Oct, 2014 1 commit
  11. 16 Oct, 2014 2 commits
    • make add_object_array_with_context interface more sane · 9e0c3c4f
      Jeff King authored
      When you resolve a sha1, you can optionally keep any context
      found during the resolution, including the path and mode of
      a tree entry (e.g., when looking up "HEAD:subdir/file.c").
      
      The add_object_array_with_context function lets you then
      attach that context to an entry in a list. Unfortunately,
      the interface for doing so is horrible. The object_context
      structure is large and most object_array users do not use
      it. Therefore we keep a pointer to the structure to avoid
      burdening other users too much. But that means when we do
      use it that we must allocate the struct ourselves. And the
      struct contains a fixed PATH_MAX-sized buffer, which makes
      this wholly unsuitable for any large arrays.
      
      We can observe that there is only a single user of the
      "with_context" variant: builtin/grep.c. And in that use
      case, the only element we care about is the path. We can
      therefore store only the path as a pointer (the context's
      mode field was redundant with the object_array_entry itself,
      and nobody actually cared about the surrounding tree). This
      still requires a strdup of the pathname, but at least we are
      only consuming the minimum amount of memory for each string.
      
      We can also handle the copying ourselves in
      add_object_array_*, and free it as appropriate in
      object_array_release_entry.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • object_array: add a "clear" function · 46be8231
      Jeff King authored
      There's currently no easy way to free the memory associated
      with an object_array (and in most cases, we simply leak the
      memory in a rev_info's pending array). Let's provide a
      helper to make this easier to handle.
      
      We can make use of it in list-objects.c, which does the same
      thing by hand (but fails to free the "name" field of each
      entry, potentially leaking memory).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
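The two commits from this date fit together: entries own small heap copies of their strings, and a clear function releases everything at once. The following is a rough sketch under assumptions, with hypothetical `toy_*` names and an `int` standing in for the object pointer; it is not the real object_array code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_entry {
	int item;	/* stands in for a "struct object *" */
	char *name;	/* owned copy, or NULL */
	char *path;	/* owned copy, or NULL; no fixed-size buffer */
};

struct toy_array {
	size_t nr, alloc;
	struct toy_entry *entries;
};

/* Copy a string so the entry uses only as much memory as it needs. */
static char *toy_copy(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;
	len = strlen(s) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

static void toy_array_add(struct toy_array *a, int item,
			  const char *name, const char *path)
{
	assert(a != NULL);
	if (a->nr == a->alloc) {
		size_t alloc = a->alloc ? 2 * a->alloc : 4;
		struct toy_entry *entries =
			realloc(a->entries, alloc * sizeof(*entries));
		if (!entries)
			abort();
		a->entries = entries;
		a->alloc = alloc;
	}
	a->entries[a->nr].item = item;
	a->entries[a->nr].name = toy_copy(name);
	a->entries[a->nr].path = toy_copy(path);
	a->nr++;
}

/* Free whatever a single entry owns. */
static void toy_release_entry(struct toy_entry *e)
{
	free(e->name);
	free(e->path);
}

/* Free every entry and the array itself, leaving it reusable. */
static void toy_array_clear(struct toy_array *a)
{
	size_t i;

	for (i = 0; i < a->nr; i++)
		toy_release_entry(&a->entries[i]);
	free(a->entries);
	a->entries = NULL;
	a->nr = a->alloc = 0;
}
```

Routing all frees through one per-entry helper is what keeps a hand-rolled cleanup loop from forgetting a field, which is the leak the second commit message mentions.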
  12. 10 Sep, 2014 1 commit
  13. 28 Jul, 2014 2 commits
    • add object_as_type helper for casting objects · c4ad00f8
      Jeff King authored
      When we call lookup_commit, lookup_tree, etc, the logic goes
      something like:
      
        1. Look for an existing object struct. If we don't have
           one, allocate and return a new one.
      
        2. Double check that any object we have is the expected
           type (and complain and return NULL otherwise).
      
        3. Convert an object with type OBJ_NONE (from a prior
           call to lookup_unknown_object) to the expected type.
      
      We can encapsulate steps 2 and 3 in a helper function which
      checks whether we have the expected object type, converts
      OBJ_NONE as appropriate, and returns the object.
      
      Not only does this shorten the code, but it also provides
      one central location for converting OBJ_NONE objects into
      objects of other types. Future patches will use that to
      enforce type-specific invariants.
      
      Since this is a refactoring, we would want it to behave
      exactly as the current code. It takes a little reasoning to
      see that this is the case:
      
        - for lookup_{commit,tree,etc} functions, we are just
          pulling steps 2 and 3 into a function that does the same
          thing.
      
        - for the call in peel_object, we currently only do step 3
          (but we want to consolidate it with the others, as
          mentioned above). However, step 2 is a noop here, as the
          surrounding conditional makes sure we have OBJ_NONE
          (which we want to keep to avoid an extraneous call to
          sha1_object_info).
      
        - for the call in lookup_commit_reference_gently, we are
          currently doing step 2 but not step 3. However, step 3
          is a noop here. The object we got will have just come
          from deref_tag, which must have figured out the type for
          each object in order to know when to stop peeling.
          Therefore the type will never be OBJ_NONE.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
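Steps 2 and 3 from the list above can be illustrated with a small stand-alone sketch. The `toy_*` enum, struct, and function are hypothetical simplifications rather than the real helper, which according to the message also complains when the type does not match.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_NONE = 0, TOY_COMMIT, TOY_TREE, TOY_BLOB };

struct toy_object { enum toy_type type; };

/*
 * Steps 2 and 3 in one place: accept a matching type, convert
 * TOY_NONE to the requested type, and reject anything else.
 */
static struct toy_object *toy_object_as_type(struct toy_object *obj,
					     enum toy_type type)
{
	assert(obj != NULL && type != TOY_NONE);
	if (obj->type == type)
		return obj;
	if (obj->type == TOY_NONE) {
		/* The single central spot where unknown objects gain a type. */
		obj->type = type;
		return obj;
	}
	return NULL;	/* mismatch; this sketch stays silent */
}
```

Having exactly one conversion site is what would let later changes enforce type-specific invariants there.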
    • move setting of object->type to alloc_* functions · fe24d396
      Jeff King authored
      The "struct object" type implements basic object
      polymorphism.  Individual instances are allocated as
      concrete types (or as a union type that can store any
      object), and a "struct object *" can be cast into its real
      type after examining its "type" enum.  This means it is
      dangerous to have a type field that does not match the
      allocation (e.g., setting the type field of a "struct blob"
      to "OBJ_COMMIT" would mean that a reader might read past the
      allocated memory).
      
      In most of the current code this is not a problem; the first
      thing we do after allocating an object is usually to set its
      type field by passing it to create_object. However, the
      virtual commits we create in merge-recursive.c do not ever
      get their type set. This does not seem to have caused
      problems in practice, though (presumably because we always
      pass around a "struct commit" pointer and never even look at
      the type).
      
      We can fix this oversight and also make it harder for future
      code to get it wrong by setting the type directly in the
      object allocation functions.
      
      This will also make it easier to fix problems with commit
      index allocation, as we know that any object allocated by
      alloc_commit_node will meet the invariant that an object
      with an OBJ_COMMIT type field will have a unique index
      number.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
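The invariant described here (the type field always matches the allocation, and every commit gets a unique index) can be sketched as below. This is an illustration under assumptions: the `toy_*` names are hypothetical, and plain calloc() stands in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_type { TOY_NONE = 0, TOY_COMMIT, TOY_BLOB };

struct toy_object { enum toy_type type; };

/* Concrete types embed the generic header as their first member. */
struct toy_blob { struct toy_object object; };
struct toy_commit {
	struct toy_object object;
	unsigned int index;	/* unique per allocated commit */
};

static unsigned int toy_commit_count;

/* Each allocator sets the type, so callers cannot forget to. */
static struct toy_blob *toy_alloc_blob(void)
{
	struct toy_blob *b = calloc(1, sizeof(*b));

	if (!b)
		abort();
	b->object.type = TOY_BLOB;
	return b;
}

static struct toy_commit *toy_alloc_commit(void)
{
	struct toy_commit *c = calloc(1, sizeof(*c));

	if (!c)
		abort();
	c->object.type = TOY_COMMIT;
	c->index = toy_commit_count++;
	return c;
}

/* Downcast that is safe only because the type matches the allocation. */
static struct toy_commit *toy_as_commit(struct toy_object *obj)
{
	assert(obj != NULL);
	return obj->type == TOY_COMMIT ? (struct toy_commit *)obj : NULL;
}
```

If a blob's type field were set to `TOY_COMMIT`, the cast in `toy_as_commit()` would expose `index` beyond the blob's allocation, which is the out-of-bounds read the message warns about.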
  14. 14 Jul, 2014 2 commits
    • add object_as_type helper for casting objects · 8ff226a9
      Jeff King authored
      When we call lookup_commit, lookup_tree, etc, the logic goes
      something like:
      
        1. Look for an existing object struct. If we don't have
           one, allocate and return a new one.
      
        2. Double check that any object we have is the expected
           type (and complain and return NULL otherwise).
      
        3. Convert an object with type OBJ_NONE (from a prior
           call to lookup_unknown_object) to the expected type.
      
      We can encapsulate steps 2 and 3 in a helper function which
      checks whether we have the expected object type, converts
      OBJ_NONE as appropriate, and returns the object.
      
      Not only does this shorten the code, but it also provides
      one central location for converting OBJ_NONE objects into
      objects of other types. Future patches will use that to
      enforce type-specific invariants.
      
      Since this is a refactoring, we would want it to behave
      exactly as the current code. It takes a little reasoning to
      see that this is the case:
      
        - for lookup_{commit,tree,etc} functions, we are just
          pulling steps 2 and 3 into a function that does the same
          thing.
      
        - for the call in peel_object, we currently only do step 3
          (but we want to consolidate it with the others, as
          mentioned above). However, step 2 is a noop here, as the
          surrounding conditional makes sure we have OBJ_NONE
          (which we want to keep to avoid an extraneous call to
          sha1_object_info).
      
        - for the call in lookup_commit_reference_gently, we are
          currently doing step 2 but not step 3. However, step 3
          is a noop here. The object we got will have just come
          from deref_tag, which must have figured out the type for
          each object in order to know when to stop peeling.
          Therefore the type will never be OBJ_NONE.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • move setting of object->type to alloc_* functions · d36f51c1
      Jeff King authored
      The "struct object" type implements basic object
      polymorphism.  Individual instances are allocated as
      concrete types (or as a union type that can store any
      object), and a "struct object *" can be cast into its real
      type after examining its "type" enum.  This means it is
      dangerous to have a type field that does not match the
      allocation (e.g., setting the type field of a "struct blob"
      to "OBJ_COMMIT" would mean that a reader might read past the
      allocated memory).
      
      In most of the current code this is not a problem; the first
      thing we do after allocating an object is usually to set its
      type field by passing it to create_object. However, the
      virtual commits we create in merge-recursive.c do not ever
      get their type set. This does not seem to have caused
      problems in practice, though (presumably because we always
      pass around a "struct commit" pointer and never even look at
      the type).
      
      We can fix this oversight and also make it harder for future
      code to get it wrong by setting the type directly in the
      object allocation functions.
      
      This will also make it easier to fix problems with commit
      index allocation, as we know that any object allocated by
      alloc_commit_node will meet the invariant that an object
      with an OBJ_COMMIT type field will have a unique index
      number.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  15. 25 Mar, 2014 2 commits
  16. 28 Feb, 2014 1 commit
  17. 02 Jun, 2013 1 commit
    • object_array_entry: fix memory handling of the name field · 31faeb20
      Michael Haggerty authored
      Previously, the memory management of the object_array_entry::name
      field was inconsistent and undocumented.  object_array_entries are
      ultimately created by a single function, add_object_array_with_mode(),
      which has an argument "const char *name".  This function used to
      simply set the name field to reference the string pointed to by the
      name parameter, and nobody on the object_array side ever freed the
      memory.  Thus, it assumed that the memory for the name field would be
      managed by the caller, and that the lifetime of that string would be
      at least as long as the lifetime of the object_array_entry.  But
      callers were inconsistent:
      
      * Some passed pointers to constant strings or argv entries, which was
        OK.
      
      * Some passed pointers to newly-allocated memory, but didn't arrange
        for the memory ever to be freed.
      
      * Some passed the return value of sha1_to_hex(), which is a pointer to
        a statically-allocated buffer that can be overwritten at any time.
      
      * Some passed pointers to refnames that they received from a
        for_each_ref()-type iteration, but the lifetime of such refnames is
        not guaranteed by the refs API.
      
      Bring consistency to this mess by changing object_array to make its
      own copy for the object_array_entry::name field and free this memory
      when an object_array_entry is deleted from the array.
      
      Many callers were passing the empty string as the name parameter, so
      as a performance optimization, treat the empty string specially.
      Instead of making a copy, store a pointer to a statically-allocated
      empty string to object_array_entry::name.  When deleting such an
      entry, skip the free().
      
      Change the callers that were already passing copies to
      add_object_array_with_mode() to either skip the copy or (if the
      memory needed to be allocated anyway) free the memory themselves.
      
      A part of this commit effectively reverts
      
          70d26c6e read_revisions_from_stdin: make copies for handle_revision_arg
      
      because the copying introduced by that commit (which is still
      necessary) is now done at a deeper level.
      Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
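The ownership rule this commit introduces (always copy the name, except that an empty name points at a shared static buffer which is never freed) can be sketched as a pair of helpers. The `toy_*` names are hypothetical, and treating a NULL name as NULL is an assumption of this sketch rather than something the message states.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared sentinel: empty names point here and are never freed. */
static char toy_empty_name[] = "";

/* Return storage the entry may keep, independent of the caller's buffer. */
static char *toy_name_dup(const char *name)
{
	size_t len;
	char *copy;

	if (!name)
		return NULL;	/* assumption: NULL stays NULL */
	if (!*name)
		return toy_empty_name;	/* common case, no allocation */
	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, name, len);
	return copy;
}

/* Counterpart used when an entry is deleted from the array. */
static void toy_name_free(char *name)
{
	if (name != toy_empty_name)
		free(name);
}
```

With the copy made at this level, a caller can pass a static buffer or a short-lived refname and overwrite it afterwards without corrupting the stored entry.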
  18. 28 May, 2013 2 commits
  19. 10 May, 2013 1 commit
  20. 17 Mar, 2013 1 commit
    • avoid segfaults on parse_object failure · 75a95490
      Jeff King authored
      Many call-sites of parse_object assume that they will get a
      non-NULL return value; this is not the case if we encounter
      an error while parsing the object.
      
      This patch adds a wrapper function around parse_object that
      handles dying automatically, and uses it anywhere we
      immediately try to access the return value as a non-NULL
      pointer (i.e., anywhere that we would currently segfault).
      
      This wrapper may also be useful in other places. The most
      obvious one is code like:
      
        o = parse_object(sha1);
        if (!o)
                die(...);
      
      However, these should not be mechanically converted to
      parse_object_or_die, as the die message is sometimes
      customized. Later patches can address these sites on a
      case-by-case basis.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
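The "or die" wrapper pattern from this commit looks roughly like the sketch below. Everything named `toy_*` is hypothetical, the parser is a trivial table lookup, and the error text and exit status are illustrative choices rather than a statement of what the real wrapper prints.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct toy_object { int id; };

/* Hypothetical parser: returns NULL for ids outside a small table. */
static struct toy_object *toy_parse(int id)
{
	static struct toy_object table[4];

	if (id < 0 || id >= 4)
		return NULL;
	table[id].id = id;
	return &table[id];
}

/*
 * Wrapper for call sites that would otherwise dereference NULL:
 * on failure it reports the problem and exits instead of returning.
 */
static struct toy_object *toy_parse_or_die(int id, const char *name)
{
	struct toy_object *o = toy_parse(id);

	assert(name != NULL);
	if (!o) {
		fprintf(stderr, "fatal: unable to parse object: %s\n", name);
		exit(128);
	}
	return o;
}
```

Call sites that want a customized failure message would keep calling the plain parser and checking for NULL themselves, as the message notes.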
  21. 30 Mar, 2012 1 commit
  22. 14 Mar, 2011 1 commit
  23. 30 Aug, 2010 1 commit
  24. 18 Jan, 2010 1 commit
  25. 18 Jan, 2009 1 commit
  26. 10 Sep, 2008 1 commit
  27. 26 Feb, 2008 1 commit
  28. 07 Jun, 2007 1 commit
    • War on whitespace · a6080a0a
      Junio C Hamano authored
      This uses "git-apply --whitespace=strip" to fix whitespace errors that have
      crept into our source files over time.  There are a few files that need
      to have trailing whitespace (most notably, test vectors).  The result
      still passes the tests, and the build result in the Documentation/ area
      is unchanged.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  29. 24 Apr, 2007 1 commit
  30. 17 Apr, 2007 1 commit
  31. 16 Apr, 2007 1 commit
  32. 27 Feb, 2007 2 commits
    • get rid of lookup_object_type() · 0ab17950
      Nicolas Pitre authored
      This function is called only once in the whole source tree.  Let's move
      its code inline instead, which is also in the spirit of removing as many
      object type char arrays as possible (not that this patch does anything for
      that, but at least it is now a local matter).
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
    • convert object type handling from a string to a number · 21666f1a
      Nicolas Pitre authored
      We currently have two parallel notations for dealing with object types
      in the code: a string and a numerical value.  One of them is obviously
      redundant, and the most used one requires more stack space and a bunch
      of strcmp() calls all over the place.
      
      This is an initial step for the removal of the version using a char array
      found in object reading code paths.  The patch is unfortunately large but
      there is no sane way to split it into smaller parts without breaking the
      system.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
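As a rough illustration of the numeric notation winning out, the sketch below converts a type string to an enum once at the boundary, so that later code compares integers instead of calling strcmp() repeatedly. The `toy_*` names and the exact enum values are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

enum toy_type {
	TOY_BAD = -1,
	TOY_NONE = 0,
	TOY_COMMIT,
	TOY_TREE,
	TOY_BLOB,
	TOY_TAG
};

/* Indexed by enum value; slot 0 (TOY_NONE) has no name. */
static const char *toy_type_names[] = {
	NULL, "commit", "tree", "blob", "tag"
};

/* Number to string, for the few places that still need to print it. */
static const char *toy_type_name(enum toy_type type)
{
	if (type < TOY_COMMIT || type > TOY_TAG)
		return NULL;
	return toy_type_names[type];
}

/* String to number: the only place that compares strings. */
static enum toy_type toy_type_from_string(const char *str)
{
	int i;

	assert(str != NULL);
	for (i = TOY_COMMIT; i <= TOY_TAG; i++)
		if (!strcmp(str, toy_type_names[i]))
			return (enum toy_type)i;
	return TOY_BAD;
}
```

After the conversion, a check such as `type == TOY_TREE` replaces a string comparison and no per-call char buffer is needed on the stack.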