  • name-hash.c: fix endless loop with core.ignorecase=true · 2092678c
    Karsten Blees authored and Junio C Hamano committed
    
    
    With core.ignorecase=true, name-hash.c builds a case-insensitive index of
    all tracked directories. Currently, the existing cache entry structures are
    added multiple times to the same hashtable (with different name lengths and
    hash codes). However, there's only one dir_next pointer, which gets
    completely messed up in case of hash collisions. In the worst case, this
    causes an endless loop if ce == ce->dir_next (see t7062).
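
The failure mode described above can be reproduced in miniature. The sketch below is not git's code: `struct entry`, `chain_insert()` and the bucket number are hypothetical stand-ins, and it simply assumes that the hashes for "a/" and "a/b/" collide. It shows how inserting the same structure twice through its single chain pointer makes the entry point at itself.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4

/* Hypothetical, simplified stand-in for a cache entry. */
struct entry {
	const char *name;
	struct entry *dir_next;	/* the only chain pointer, shared by every insertion */
};

/* Push an entry onto a bucket chain, overwriting its single dir_next pointer. */
static void chain_insert(struct entry **table, unsigned int bucket, struct entry *e)
{
	assert(bucket < NBUCKETS);
	e->dir_next = table[bucket];
	table[bucket] = e;
}

/* Returns 1 if the entry ends up linked to itself. */
static int shared_pointer_self_loop(void)
{
	struct entry *table[NBUCKETS] = { NULL };
	struct entry ce = { "a/b/file", NULL };

	/*
	 * The same entry is registered once for "a/" and once for "a/b/".
	 * Assumption: both directory hashes land in bucket 1.
	 */
	chain_insert(table, 1, &ce);	/* ce.dir_next = NULL, head = &ce */
	chain_insert(table, 1, &ce);	/* ce.dir_next = head = &ce       */

	return ce.dir_next == &ce;
}
```

Any loop of the form `for (e = table[1]; e; e = e->dir_next)` would never terminate on this chain.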
    
    Use a separate hashtable and separate structures for the directory index
    so that each directory entry has its own next pointer. Use reference
    counting to track which directory entry contains files.
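
As a rough illustration of why separate structures help, the sketch below gives each directory its own node with its own `next` pointer. The names (`struct dir_node`, `dir_insert()`, `chain_length()`) are hypothetical and the collision in bucket 1 is assumed; this is not the layout used in name-hash.c. Two nodes may reference the same cache entry, yet the chain stays well formed.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 4

/* Hypothetical cache entry: it no longer carries a chain pointer. */
struct entry {
	const char *name;
};

/* Hypothetical per-directory node with a chain pointer of its own. */
struct dir_node {
	struct dir_node *next;	/* owned by this node alone */
	struct entry *ce;	/* some entry that supplies the directory name */
	const char *dirname;
};

static void dir_insert(struct dir_node **table, unsigned int bucket, struct dir_node *d)
{
	assert(bucket < NBUCKETS);
	d->next = table[bucket];
	table[bucket] = d;
}

/* Count chain nodes, giving up after `limit` steps in case of a cycle. */
static size_t chain_length(const struct dir_node *d, size_t limit)
{
	size_t n = 0;

	while (d && n < limit) {
		n++;
		d = d->next;
	}
	return n;
}

static size_t separate_nodes_chain_length(void)
{
	struct dir_node *table[NBUCKETS] = { NULL };
	struct entry ce = { "a/b/file" };
	struct dir_node a = { NULL, &ce, "a/" };
	struct dir_node ab = { NULL, &ce, "a/b/" };

	/* Assumption: both directory hashes collide in bucket 1. */
	dir_insert(table, 1, &a);
	dir_insert(table, 1, &ab);

	return chain_length(table[1], 100);
}
```

Both nodes share `ce`, but because each insertion writes to a different `next` field, the walk ends after two nodes instead of cycling.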
    
    There are only slight changes to the name-hash.c API:
    - new free_name_hash() used by read-cache.c::discard_index()
    - remove_name_hash() takes an additional index_state parameter
    - index_name_exists() for a directory (trailing '/') may return a cache
      entry that has been removed (CE_UNHASHED). This is not a problem as the
      return value is only used to check if the directory exists (dir.c) or to
      normalize casing of directory names (read-cache.c).
    
    Getting rid of cache_entry.dir_next reduces memory consumption, especially
    with core.ignorecase=false (which doesn't use that member at all).
    
    With core.ignorecase=true, building the directory index is slightly faster
    as we add / check the parent directory first (instead of going through all
    directory levels for each file in the index). E.g. with WebKit (~200k
    files, ~7k dirs), time spent in lazy_init_name_hash is reduced from 176ms
    to 130ms.
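
The parent-first strategy combined with reference counting could look roughly like the sketch below. It is a simplified model, not the code in name-hash.c: a linked list stands in for the hashtable, case folding is omitted, and `struct dir_rec`, `hash_dir()`, `add_file()`, `remove_file()`, `dir_exists()`, `free_dirs()` and the `lookups` counter are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DIR_NAME_MAX 64

struct dir_rec {
	struct dir_rec *next;	/* list link, one per directory */
	struct dir_rec *parent;	/* enclosing directory, NULL at top level */
	int nr;			/* files directly inside plus active subdirectories */
	size_t len;		/* name length including the trailing '/' */
	char name[DIR_NAME_MAX];
};

static struct dir_rec *dirs;	/* stand-in for the directory hashtable */
static int lookups;		/* number of table lookups performed */

/* Length of the directory part of path[0..len), trailing '/' included. */
static size_t dir_len(const char *path, size_t len)
{
	while (len > 0 && path[len - 1] != '/')
		len--;
	return len;
}

static struct dir_rec *find_dir(const char *path, size_t len)
{
	struct dir_rec *d;

	lookups++;
	for (d = dirs; d; d = d->next)
		if (d->len == len && !memcmp(d->name, path, len))
			return d;
	return NULL;
}

/* Find the directory containing path[0..len); create it and any missing parents. */
static struct dir_rec *hash_dir(const char *path, size_t len)
{
	struct dir_rec *d;

	len = dir_len(path, len);
	if (len == 0 || len >= DIR_NAME_MAX)
		return NULL;
	d = find_dir(path, len);
	if (d)
		return d;	/* known directory: no need to visit its parents */
	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	memcpy(d->name, path, len);
	d->len = len;
	d->next = dirs;
	dirs = d;
	d->parent = hash_dir(path, len - 1);	/* drop the '/' and recurse upward */
	return d;
}

static void add_file(const char *path)
{
	struct dir_rec *d;

	assert(path != NULL);
	d = hash_dir(path, strlen(path));
	/* A directory going from 0 to 1 references also takes one on its parent. */
	while (d && !(d->nr++))
		d = d->parent;
}

static void remove_file(const char *path)
{
	struct dir_rec *d = find_dir(path, dir_len(path, strlen(path)));

	/* A directory dropping to 0 references releases its parent as well. */
	while (d && d->nr && !(--d->nr))
		d = d->parent;
}

static int dir_exists(const char *name)
{
	struct dir_rec *d = find_dir(name, strlen(name));

	return d && d->nr > 0;
}

static void free_dirs(void)
{
	while (dirs) {
		struct dir_rec *next = dirs->next;

		free(dirs);
		dirs = next;
	}
}
```

In this model, adding "a/b/c/1", "a/b/c/2" and "a/b/c/3" costs three lookups for the first file (one per missing level) and one lookup for each later file, because a hit on "a/b/c/" stops the upward walk. Records whose count drops to zero stay in the table and are only reported as absent.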
    
    Signed-off-by: Karsten Blees <blees@dcon.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>