Skip to content
  • Jeff King's avatar
    cache-tree: reject entries with null sha1 · a96d3cc3
    Jeff King authored and Junio C Hamano's avatar Junio C Hamano committed
    We generally disallow null sha1s from entering the index,
    due to 4337b585 (do not write null sha1s to on-disk index,
    2012-07-28). However, we loosened that in 83bd7437
    
    
    (write_index: optionally allow broken null sha1s,
    2013-08-27) so that tools like filter-branch could be used
    to repair broken history.
    
    However, we should make sure that these broken entries do
    not get propagated into new trees. For most entries, we'd
    catch them with the missing-object check (since presumably
    the null sha1 does not exist in our object database). But
    gitlink entries do not need reachability, so we may blindly
    copy the entry into a bogus tree.
    
    This patch rejects all null sha1s (with the same "invalid
    entry" message that missing objects get) when building trees
    from the index. It does so even for non-gitlinks, and even
    when "write-tree" is given the --missing-ok flag. The null
    sha1 is a special sentinel value that is already rejected in
    trees by fsck; whether the object exists or not, it is an
    error to put it in a tree.
    
    Note that for this to work, we must also avoid reusing an
    existing cache-tree that contains the null sha1. This patch
    does so by just refusing to write out any cache tree when
    the index contains a null sha1. This is blunter than we need
    to be; we could just reject the subtree that contains the
    offending entry. But it's not worth the complexity. The
    behavior is unchanged unless you have a broken index entry,
    and even then we'd refuse the whole index write unless the
    emergency GIT_ALLOW_NULL_SHA1 is in use. And even then the
    end result is only a performance drop (any write-tree will
    have to generate the whole cache-tree from scratch).
    
    The tests bear some explanation.
    
    The existing test in t7009 doesn't catch this problem,
    because our index-filter runs "git rm --cached", which will
    try to rewrite the updated index and barf on the bogus
    entry. So we never even make it to write-tree.  The new test
    there adds a noop index-filter, which does show the problem.
    
    The new tests in t1601 are slightly redundant with what
    filter-branch is doing under the hood in t7009. But as
    they're much more direct, they're easier to reason about.
    And should filter-branch ever change or go away, we'd want
    to make sure that these plumbing commands behave sanely.
    
    Signed-off-by: default avatarJeff King <peff@peff.net>
    Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    a96d3cc3