  1. 26 Jun, 2018 1 commit
  2. 02 May, 2018 1 commit
  3. 25 Mar, 2018 1 commit
    • filter-branch: fix errors caused by refs that point at non-committish · f78ab355
      Kuniwak authored
      "git filter-branch -- --all" prints error messages when processing refs that
      point at objects that are not committish. Such refs can be created by
      "git replace" or "git tag" when either is given a tree or a blob.
      
      Filter these problematic refs out early, before they are seen by the logic to
      see which refs have been modified and which have been left intact (which is
      where the unwanted error messages come from), and warn that these refs are left
      unwritten while doing so.
      Signed-off-by: Yuki Kokubun <orga.chem.job@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f78ab355
  4. 19 Mar, 2018 1 commit
    • filter-branch: use printf instead of echo -e · 206a6ae0
      Michele Locati authored
      In order to echo a tab character, it's better to use printf instead of
      "echo -e", because it's more portable (for instance, "echo -e" doesn't work
      as expected on a Mac).
      
      This solves the "fatal: Not a valid object name" error in git-filter-branch
      when using the --state-branch option.
      
      Furthermore, let's switch from "/bin/echo" to just "echo", so that the
      built-in echo command is used where available.
      Signed-off-by: Michele Locati <michele@locati.it>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      206a6ae0
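A minimal sketch of the portability point in this commit, not the patched code itself: printf expands \t in its format string in POSIX shells, while the result of "echo -e" depends on which echo is in use. The helper name and the sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit "<key><TAB><value>" on one line.
# printf expands \t in its format string in POSIX shells,
# whereas some echo implementations print a literal "-e".
tab_pair() {
	printf '%s\t%s\n' "$1" "$2"
}

# Sample values are made up for illustration.
line=$(tab_pair "old-id" "new-id")

# cut splits on tabs by default, so field 2 is the value.
printf '%s\n' "$line" | cut -f 2
```

Keeping the tab in the format string and passing the data as %s arguments also avoids surprises if a value happens to contain a backslash or a percent sign.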
  5. 15 Mar, 2018 1 commit
  6. 18 Oct, 2017 1 commit
  7. 22 Sep, 2017 4 commits
  8. 12 Jun, 2017 2 commits
  9. 12 May, 2017 1 commit
  10. 03 Mar, 2017 1 commit
    • filter-branch: fix --prune-empty on parentless commits · a582a82d
      Devin J. Pohly authored
      Previously, the git_commit_non_empty_tree function would always pass any
      commit with no parents to git-commit-tree, regardless of whether the
      tree was nonempty.  The new commit would then be recorded in the
      filter-branch revision map, and subsequent commits which leave the tree
      untouched would be correctly filtered.
      
      With this change, parentless commits with an empty tree are correctly
      pruned, and an empty file is recorded in the revision map, signifying
      that it was rewritten to "no commits."  This works naturally with the
      parent mapping for subsequent commits.
      Signed-off-by: Devin J. Pohly <djpohly@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      a582a82d
  11. 19 Jan, 2016 1 commit
    • filter-branch: resolve $commit^{tree} in no-index case · 1dc413eb
      Jeff King authored
      Commit 348d4f2f (filter-branch: skip index read/write when
      possible, 2015-11-06) taught filter-branch to optimize out
      the final "git write-tree" when we know we haven't touched
      the tree with any of our filters. It does so by simply putting
      the literal text "$commit^{tree}" into the "$tree" variable,
      avoiding a useless rev-parse call.
      
      However, when we pass this to git_commit_non_empty_tree(),
      it gets confused; it resolves "$commit^{tree}" itself, and
      compares our string to the 40-hex sha1, which obviously
      doesn't match. As a result, "--prune-empty" (or any custom
      filter using git_commit_non_empty_tree) will fail to drop
      an empty commit (when filter-branch is used without a tree
      or index filter).
      
      Let's resolve $tree to the 40-hex ourselves, so that
      git_commit_non_empty_tree can work. Unfortunately, this is a
      bit slower due to the extra process overhead:
      
        $ cd t/perf && ./run 348d4f2f HEAD p7000-filter-branch.sh
        [...]
        Test                  348d4f2f           HEAD
        --------------------------------------------------------------
        7000.2: noop filter   3.76(0.24+0.26)   4.54(0.28+0.24) +20.7%
      
      We could try to make git_commit_non_empty_tree more clever.
      However, the value of $tree here is technically
      user-visible. The user can provide arbitrary shell code at
      this stage, which could itself have a similar assumption to
      what is in git_commit_non_empty_tree. So the conservative
      choice to fix this regression is to take the 20% hit and
      give the pre-348d4f2f behavior. We still end up much faster
      than before the optimization:
      
        $ cd t/perf && ./run 348d4f2f^ HEAD p7000-filter-branch.sh
        [...]
        Test                  348d4f2f^          HEAD
        --------------------------------------------------------------
        7000.2: noop filter   9.51(4.32+0.40)   4.51(0.28+0.23) -52.6%
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      1dc413eb
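The idea of resolving the tree up front can be sketched as below. This is an illustration under assumptions, not the code from the patch: the helper name resolve_tree is hypothetical, and it relies only on "git rev-parse --verify" with the ^{tree} peeling syntax.

```shell
#!/bin/sh
# Hypothetical helper: turn a commit-ish into the object name of its tree,
# so later string comparisons see a hex id rather than the literal
# text "<commit>^{tree}".
resolve_tree() {
	git rev-parse --verify "$1^{tree}"
}

# Usage inside a repository (not run here):
#   tree=$(resolve_tree "$commit")
```

The extra rev-parse process per commit is where the slowdown measured above comes from.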
  12. 24 Nov, 2015 1 commit
    • filter-branch: deal with object name vs. pathname ambiguity in tree-filter · 4d2a3646
      SZEDER Gábor authored
      'git filter-branch' fails complaining about an ambiguous argument, if
      a tree-filter renames a path and the new pathname happens to match an
      existing object name.
      
      After the tree-filter has been applied, 'git filter-branch' looks for
      changed paths by running:
      
        git diff-index -r --name-only --ignore-submodules $commit
      
      which then, because of the lack of disambiguating double-dash, can't
      decide whether to treat '$commit' as revision or path and errors out.
      
      Add that disambiguating double-dash after 'git diff-index's revision
      argument to make sure that '$commit' is interpreted as a revision.
      Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
      Signed-off-by: Jeff King <peff@peff.net>
      4d2a3646
  13. 06 Nov, 2015 1 commit
    • filter-branch: skip index read/write when possible · 348d4f2f
      Jeff King authored
      If the user specifies an index filter but not a tree filter,
      filter-branch cleverly avoids checking out the tree
      entirely. But we don't do the next level of optimization: if
      you have no index or tree filter, we do not need to read the
      index at all.
      
      This can greatly speed up cases where we are only changing
      the commit objects (e.g., cementing a graft into place).
      Here are numbers from the newly-added perf test:
      
        Test                  HEAD^              HEAD
        ---------------------------------------------------------------
        7000.2: noop filter   13.81(4.95+0.83)   5.43(0.42+0.43) -60.7%
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      348d4f2f
  14. 12 Oct, 2015 1 commit
    • filter-branch: remove multi-line headers in msg filter · a5a4b3ff
      James McCoy authored
      df062010 (filter-branch: avoid passing commit message through sed)
      introduced a regression when filtering commits with multi-line headers,
      if the header contains a blank line.  An example of this is a gpg-signed
      commit:
      
        $ git cat-file commit signed-commit
        tree 3d4038e029712da9fc59a72afbfcc90418451630
        parent 110eac945dc1713b27bdf49e74e5805db66971f0
        author A U Thor <author@example.com> 1112912413 -0700
        committer C O Mitter <committer@example.com> 1112912413 -0700
        gpgsig -----BEGIN PGP SIGNATURE-----
         Version: GnuPG v1
      
         iEYEABECAAYFAlYXADwACgkQE7b1Hs3eQw23CACgldB/InRyDgQwyiFyMMm3zFpj
         pUsAnA+f3aMUsd9mNroloSmlOgL6jIMO
         =0Hgm
         -----END PGP SIGNATURE-----
      
        Adding gpg
      
      As a consequence, "filter-branch --msg-filter cat" (which should leave the
      commit message unchanged) spills the signature (after the internal blank
      line) into the original commit message.
      
      The reason is that although the signature is indented, making the line a
      whitespace only line, the "read" call is splitting the line based on
      the shell's IFS, which defaults to <space><tab><newline>.  The leading
      space is consumed and $header_line is empty, causing the "skip header
      lines" loop to exit.
      
      The rest of the commit object is then re-used as the rewritten commit
      message, causing the new message to include the signature of the
      original commit.
      
      Set IFS to an empty string for the "read" call, thus disabling the word
      splitting, which causes $header_line to be set to the non-empty value ' '.
      This allows the loop to fully consume the header lines before
      emitting the original, intact commit message.
      
      [jc: this is literally based on MJG's suggestion]
      Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
      Signed-off-by: James McCoy <vega.james@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      a5a4b3ff
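The header-skipping loop described in this commit can be sketched as follows. The function names and the abbreviated sample object are hypothetical; only the "IFS= read" idiom is taken from the message above.

```shell
#!/bin/sh
# Hypothetical helper: read a raw commit object on stdin, print only its message.
# IFS= disables word splitting, so a continuation line holding a single
# space stays non-empty and does not end the header-skipping loop early.
commit_message() {
	while IFS= read -r header_line && test -n "$header_line"
	do
		: # still inside the header block
	done
	# The rest of the input is the message; hand it to cat unchanged.
	cat
}

# Abbreviated stand-in for the signed commit shown above.
sample_commit() {
	printf '%s\n' \
		'tree 3d4038e029712da9fc59a72afbfcc90418451630' \
		'gpgsig -----BEGIN PGP SIGNATURE-----' \
		' ' \
		' =0Hgm' \
		' -----END PGP SIGNATURE-----' \
		'' \
		'Adding gpg'
}

sample_commit | commit_message
```

With the default IFS, the line holding a single space would be read as an empty string, the loop would stop there, and the remaining signature lines would be emitted as part of the message.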
  15. 21 Sep, 2015 2 commits
  16. 29 Apr, 2015 1 commit
    • filter-branch: avoid passing commit message through sed · df062010
      Jeff King authored
      On some systems (like OS X), if sed encounters input without
      a trailing newline, it will silently add it. As a result,
      "git filter-branch" on such systems may silently rewrite
      commit messages that omit a trailing newline. Even though
      this is not something we generate ourselves with "git
      commit", it's better for filter-branch to preserve the
      original data as closely as possible.
      
      We're using sed here only to strip the header fields from
      the commit object. We can accomplish the same thing with a
      shell loop. Since shell "read" calls are slow (usually one
      syscall per byte), we use "cat" once we've skipped past the
      header. Depending on the size of your commit messages, this
      is probably faster (you pay the cost to fork, but then read
      the data in saner-sized chunks). This idea is shamelessly
      stolen from Junio.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      df062010
  17. 01 Jul, 2014 1 commit
    • filter-branch: eliminate duplicate mapped parents · 79bc4ef3
      CB Bailey authored
      When multiple parents of a merge commit get mapped to the same
      commit, filter-branch used to pass all instances of the parent
      commit to the parent and commit filters and to "git commit-tree" or
      "git_commit_non_empty_tree".
      
      This can often happen when extracting a small project from a large
      repository; merges can join history with no commits on any branch
      which affect the paths being retained.  Once the intermediate
      commits have been filtered out, all the immediate parents of the
      merge commit can end up being mapped to the same commit - either the
      original merge-base or an ancestor of it.
      
      "git commit-tree" would display an error but write the commit with
      the normalized parents in any case.  "git_commit_non_empty_tree"
      would fail to notice that the commit being made was in fact a
      non-merge commit and would retain it even if a further pass with
      "--prune-empty" would discard the commit as empty.
      
      Ensure that duplicate parents are pruned before the parent filter to
      make "--prune-empty" idempotent, removing all empty non-merge
      commits in a single pass.
      Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      79bc4ef3
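One way to drop repeated parents while keeping their order is sketched below. It illustrates the idea only and is not claimed to be the code from the patch; the function name and the placeholder ids are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build a "-p <parent>" list, dropping repeated parents
# while keeping first-seen order.
unique_parent_args() {
	parentstr=
	for parent in "$@"
	do
		# The trailing space in the pattern prevents prefix matches.
		case "$parentstr " in
		*" -p $parent "*)
			;; # already listed
		*)
			parentstr="$parentstr -p $parent"
			;;
		esac
	done
	printf '%s\n' "${parentstr# }"
}

# Placeholder ids; real callers would pass mapped commit ids.
unique_parent_args aaa bbb aaa
```

Once duplicates are gone, a former merge whose parents all mapped to the same commit has a single parent, so an empty-commit check can treat it like any other non-merge commit.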
  18. 12 Sep, 2013 1 commit
    • Allow git-filter-branch to process large repositories with lots of branches. · 3361a548
      Lee Carver authored
      A recommended way to move trees between repositories is to use
      git-filter-branch to revise the history for a single tree:
      
      However, this can lead to "argument list too long" errors when the
      original repository has many retained branches (>6k):
      
          /usr/local/git/libexec/git-core/git-filter-branch: line 270:
          /usr/local/git/libexec/git-core/git: Argument list too long
          Could not get the commits
      
      Saving the output from rev-parse and feeding it into rev-list from
      its standard input avoids this problem, since the rev-parse output
      is not processed as a command line argument.
      Signed-off-by: Lee Carver <Lee.Carver@servicenow.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      3361a548
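The stdin approach can be sketched as below, assuming only "git rev-parse --revs-only" and "git rev-list --stdin". The function name and the temporary file handling are hypothetical and simpler than what filter-branch itself does.

```shell
#!/bin/sh
# Hypothetical helper: list the commits reachable from the given revisions,
# passing the resolved object names on stdin instead of on the command line.
list_commits() {
	revs_file=$(mktemp) || return 1
	git rev-parse --revs-only "$@" >"$revs_file" &&
	git rev-list --reverse --stdin <"$revs_file"
	status=$?
	rm -f "$revs_file"
	return $status
}

# Usage inside a repository (not run here):
#   list_commits --all
```

The number of refs then only affects the size of the temporary file, not the argument list handed to the rev-list process.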
  19. 29 Aug, 2013 1 commit
    • write_index: optionally allow broken null sha1s · 83bd7437
      Jeff King authored
      Commit 4337b585 (do not write null sha1s to on-disk index,
      2012-07-28) added a safety check preventing git from writing
      null sha1s into the index. The intent was to catch errors in
      other parts of the code that might let such an entry slip
      into the index (or worse, a tree).
      
      Some existing repositories may have invalid trees that
      contain null sha1s already, though.  Until 4337b585, a common
      way to clean this up would be to use git-filter-branch's
      index-filter to repair such broken entries.  That now fails
      when filter-branch tries to write out the index.
      
      Introduce a GIT_ALLOW_NULL_SHA1 environment variable to
      relax this check and make it easier to recover from such a
      history.
      
      It is tempting to not involve filter-branch in this commit
      at all, and instead require the user to manually invoke
      
      	GIT_ALLOW_NULL_SHA1=1 git filter-branch ...
      
      to perform an index-filter on a history with trees with null
      sha1s.  That would be slightly safer, but requires some
      specialized knowledge from the user.  So let's set the
      GIT_ALLOW_NULL_SHA1 variable automatically when checking out
      the to-be-filtered trees.  Advice on using filter-branch to
      remove such entries already exists on places like
      stackoverflow, and this patch makes it Just Work again on
      recent versions of git.
      
      Further commands that touch the index will still notice and
      fail, unless they actually remove the broken entries.  A
      filter-branch whose filters do not touch the index at all
      will not error out (since we complain of the null sha1 only
      on writing, not when making a tree out of the index), but
      this is acceptable, as we still print a loud warning, so the
      problem is unlikely to go unnoticed.
      Signed-off-by: Jeff King <peff@peff.net>
      Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      83bd7437
  20. 02 Apr, 2013 1 commit
    • filter-branch: return to original dir after filtering · 97276019
      Jeff King authored
      The first thing filter-branch does is to create a temporary
      directory, either ".git-rewrite" in the current directory
      (which may be the working tree or the repository if bare),
      or in a directory specified by "-d". We then chdir to
      $tempdir/t as our temporary working directory in which to run
      tree filters.
      
      After finishing the filter, we then attempt to go back to
      the original directory with "cd ../..". This works in the
      .git-rewrite case, but if "-d" is used, we end up in a
      random directory. The only thing we do after this chdir is
      to run git-read-tree, but that means that:
      
        1. The working directory is not updated to reflect the
           filtered history.
      
        2. We dump random files into "$tempdir/.." (e.g., if you
           use "-d /tmp/foo", we dump junk into /tmp).
      
      Fix it by recording the full path to the original directory
      and returning there explicitly.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      97276019
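The fix amounts to remembering an absolute path instead of relying on a relative "cd ../..". A minimal sketch, with a mktemp directory standing in for a user-supplied "-d" location (the variable names are illustrative):

```shell
#!/bin/sh
# Record the starting directory as an absolute path before moving away.
orig_dir=$(pwd)

# Stand-in for a temporary directory chosen with "-d"; it can live anywhere.
tempdir=$(mktemp -d) || exit 1
mkdir "$tempdir/t" && cd "$tempdir/t" || exit 1

# ... filtering would happen here ...

# "cd ../.." would land in the parent of $tempdir; return explicitly instead.
cd "$orig_dir" || exit 1
rm -rf "$tempdir"
pwd
```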
  21. 18 Oct, 2012 1 commit
    • filter-branch: use git-sh-setup's ident parsing functions · 3c730fab
      Jeff King authored
      This saves us some code, but it also reduces the number of
      processes we start for each filtered commit. Since we can
      parse both author and committer in the same sed invocation,
      we save one process. And since the new interface avoids tr,
      we save 4 processes.
      
      It also avoids using "tr", which has had some odd
      portability problems reported with Solaris's xpg6
      version.
      
      We also tweak one of the tests in t7003 to double-check that
      we are properly exporting the variables (because test-lib.sh
      exports GIT_AUTHOR_NAME, it will be automatically exported
      in subprograms. We override this to make sure that
      filter-branch handles it properly itself).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      3c730fab
  22. 10 Jul, 2012 1 commit
  23. 15 Sep, 2011 1 commit
    • filter-branch: use require_clean_work_tree · 5347a50f
      Jeff King authored
      Filter-branch already requires that we have a clean work
      tree before starting. However, it failed to refresh the
      index before checking, which means it could be wrong in the
      case of stat-dirtiness.
      
      Instead of simply adding a call to refresh the index, let's
      switch to using the require_clean_work_tree function
      provided by git-sh-setup. It does exactly what we want, and
      with fewer lines of code and more specific output messages.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      5347a50f
  24. 08 Aug, 2011 1 commit
    • filter-branch: Export variable `workdir' for --commit-filter · 0906f6e1
      Michael Witten authored
      According to `git help filter-branch':
      
             --commit-filter <command>
                 ...
                 You can use the _map_ convenience function in this filter,
                 and other convenience functions, too...
                 ...
      
      However, it turns out that `map' hasn't been usable because it depends
      on the variable `workdir', which is not propagated to the environment
      of the shell that runs the commit-filter <command> because the
      shell is created via a simple-command rather than a compound-command
      subshell:
      
       @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
                       $(git write-tree) $parentstr < ../message > ../map/$commit ||
                               die "could not write rewritten commit"
      
      One solution is simply to export `workdir'. However, it seems rather
      heavy-handed to export `workdir' to the environments of all commands,
      so instead this commit exports `workdir' for only the duration of the
      shell command in question:
      
       workdir=$workdir @SHELL_PATH@ -c "$filter_commit" "git commit-tree" \
                       $(git write-tree) $parentstr < ../message > ../map/$commit ||
                               die "could not write rewritten commit"
      Signed-off-by: Michael Witten <mfwitten@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      0906f6e1
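The difference between an unexported shell variable and a per-command assignment prefix can be seen in a small standalone sketch. The path is a placeholder and "sh -c" stands in for @SHELL_PATH@:

```shell
#!/bin/sh
# Start from a plain, unexported shell variable (the path is a placeholder).
unset workdir
workdir=/tmp/example-workdir

# A child shell does not see an unexported variable...
without=$(sh -c 'printf "%s" "${workdir-unset}"')

# ...but a "name=value" prefix places it in that one command's environment.
with=$(workdir=$workdir sh -c 'printf "%s" "${workdir-unset}"')

printf 'without prefix: %s\nwith prefix: %s\n' "$without" "$with"
```

The prefix form leaves the parent shell's variable unexported, so other commands started later still do not inherit it.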
  25. 05 Aug, 2011 2 commits
  26. 27 Aug, 2010 1 commit
  27. 12 Feb, 2010 1 commit
  28. 28 Jan, 2010 1 commit
    • filter-branch: Fix to allow replacing submodules with another content · 03ca8395
      Michal Sojka authored
      When git filter-branch is used to replace a submodule with other
      content, it always fails on the first commit.
      
      Consider a repository with submod directory containing a submodule.  The
      following command to remove the submodule and replace it with a file fails:
      
          git filter-branch --tree-filter 'rm -rf submod &&
                                           git rm -q submod &&
                                           mkdir submod &&
                                           touch submod/file'
      
      with an error:
      
          error: submod: is a directory - add files inside instead
      
      The reason is that git diff-index, which generates the first part of the
      list of files updated by the tree filter, also emits the removed submodule
      even if it was replaced by a real directory.
      Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      03ca8395
  29. 26 Jan, 2010 1 commit
  30. 16 Dec, 2009 1 commit
  31. 24 Nov, 2009 1 commit
    • Protect scripted Porcelains from GREP_OPTIONS insanity · e1622bfc
      Junio C Hamano authored
      If the user has exported the GREP_OPTIONS environment variable, the output
      from "grep" and "egrep" in scripted Porcelains may be different from what
      they expect.  For example, we may want to count number of matching lines,
      by "grep" piped to "wc -l", and GREP_OPTIONS=-C3 will break such use.
      
      The approach taken by this change to address this issue is to protect only
      our own use of grep/egrep.  Because we do not unset it at the beginning of
      our scripts, hook scripts run from the scripted Porcelains are exposed to
      the same insanity this environment variable causes when grep/egrep is used
      to implement logic (e.g. "grep | wc -l"), and it is entirely up to the
      hook scripts to protect themselves.
      
      On the other hand, applypatch-msg hook may want to show offending words in
      the proposed commit log message using grep to the end user, and the user
      might want to set GREP_OPTIONS=--color to paint the match more visibly.
      The approach to protect only our own use without unsetting the environment
      variable globally will allow this use case.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      e1622bfc
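Protecting only the script's own grep calls can be sketched with a small wrapper. The name sane_grep and the sample input are used here for illustration; treat the exact shape as an assumption rather than as the patch itself.

```shell
#!/bin/sh
# Sketch of a wrapper that clears GREP_OPTIONS for a single grep call only.
# The caller's exported value is left in place for hooks and other children.
sane_grep() {
	GREP_OPTIONS= LC_ALL=C grep "$@"
}

# Simulate a user setting that would add context lines to every match.
GREP_OPTIONS=-C3
export GREP_OPTIONS

# Count matching lines the way the message describes: grep piped to wc -l.
count=$(printf '%s\n' one two three | sane_grep two | wc -l | tr -d ' ')
printf '%s\n' "$count"
```

Because the variable is overridden per call rather than unset globally, a hook that wants GREP_OPTIONS=--color for user-facing output still sees the user's setting.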
  32. 13 Nov, 2009 2 commits
    • filter-branch: nearest-ancestor rewriting outside subdir filter · f2f3a6b8
      Thomas Rast authored
      Since a0e46390 (filter-branch: fix ref rewriting with
      --subdirectory-filter, 2008-08-12) git-filter-branch has done
      nearest-ancestor rewriting when using a --subdirectory-filter.
      
      However, that rewriting strategy is also a useful building block in
      other tasks.  For example, if you want to split out a subset of files
      from your history, you would typically call
      
        git filter-branch -- <refs> -- <files>
      
      But this fails for all refs that do not point directly to a commit
      that affects <files>, because their referenced commit will not be
      rewritten and the ref remains untouched.
      
      The code was already there for the --subdirectory-filter case, so just
      introduce an option that enables it independently.
      Signed-off-by: Thomas Rast <trast@student.ethz.ch>
      Signed-off-by: Johannes Sixt <j6t@kdbg.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      f2f3a6b8
    • filter-branch: stop special-casing $filter_subdir argument · 2c1d2d81
      Thomas Rast authored
      Handling $filter_subdir in the usual way requires a separate case at
      every use, because the variable is empty when unused.
      
      Furthermore, --subdirectory-filter supplies its own '--', and if the user
      provided one himself, such as in
      
        git filter-branch --subdirectory-filter subdir -- --all -- subdir/file
      
      an extra '--' was used as path filter in the call to git-rev-list that
      determines the commits that shall be rewritten.
      
      To keep the argument handling sane, we filter $@ to contain only the
      non-revision arguments, and store all revisions in $ref_args.  The
      $ref_args are easy to handle since only the SHA1s are needed; the
      actual branch names have already been stored in $tempdir/heads at this
      point.
      
      An extra separating -- is only required if the user did not provide
      any non-revision arguments, as the latter disambiguate the
      $filter_subdir following after them (or fail earlier because they are
      ambiguous themselves).
      
      Thanks to Johannes Sixt for suggesting this solution.
      Signed-off-by: Thomas Rast <trast@student.ethz.ch>
      Signed-off-by: Johannes Sixt <j6t@kdbg.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      2c1d2d81
  33. 18 Aug, 2009 1 commit