
cleanup: Add RewriteHistory RPC

Will Chandler (ex-GitLab) requested to merge wc/filter-repo into master

Historically, we have advised users who need to rewrite history to do so locally and force push their changes to GitLab. However, upcoming changes may prevent a user from pushing in scenarios where they need to remove a large blob from their repository's history.

To handle this scenario, we introduce a new RewriteHistory RPC which will invoke git-filter-repo(1) on the target repository. filter-repo has a large number of options, but we will support only two:

  • --strip-blobs-with-ids: Given a file containing a list of newline-delimited object ids, rewrite history to remove those blobs from all commits.
  • --replace-text: Given a file of literals and patterns, replace all matching instances in history with '***REMOVED***'.
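As a rough illustration of how these two options could be assembled into a filter-repo invocation, here is a minimal Python sketch. The helper name `build_filter_repo_argv` and the file names `strip-blobs.txt` and `replace-text.txt` are hypothetical and are not taken from this merge request; only the `--strip-blobs-with-ids`, `--replace-text`, and `--force` flags come from filter-repo itself.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence


def build_filter_repo_argv(
    workdir: Path,
    blob_ids: Sequence[str] = (),
    redactions: Sequence[str] = (),
) -> List[str]:
    """Write the input files filter-repo reads and return the argv to run.

    Hypothetical helper: the file names and layout are assumptions.
    """
    if not blob_ids and not redactions:
        raise ValueError("at least one of blob_ids or redactions is required")

    argv = ["git", "filter-repo", "--force"]

    if blob_ids:
        # One object id per line, newline-delimited.
        ids_file = workdir / "strip-blobs.txt"
        ids_file.write_text("".join(f"{oid}\n" for oid in blob_ids))
        argv += ["--strip-blobs-with-ids", str(ids_file)]

    if redactions:
        # One literal or pattern per line; matches are replaced in history.
        text_file = workdir / "replace-text.txt"
        text_file.write_text("".join(f"{expr}\n" for expr in redactions))
        argv += ["--replace-text", str(text_file)]

    return argv


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(build_filter_repo_argv(Path(tmp), blob_ids=["a" * 40]))
```

The sketch only builds the command line; running it against a real repository is left to the caller.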

filter-repo works by fetching the repository contents via git-fast-export(1), making the requested changes, and writing the changes back via git-fast-import(1).

As filter-repo is invoked with the '--force' flag, the repository must be made read-only before calling this RPC.

filter-repo uses git-fast-import(1) to import the rewritten repository history. This will unpack the new objects, then iterate over references serially and update them using reference transactions. This does not atomically update the references, so an interruption during this final stage will result in partially applied changes.
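The difference between serial and atomic reference updates can be shown with a toy model. The following Python sketch is purely illustrative: it models references as a dictionary and does not call Git, and the function names are hypothetical.

```python
from typing import Dict, Optional

Refs = Dict[str, str]  # reference name -> object id


def update_serially(refs: Refs, new: Refs, fail_after: Optional[int] = None) -> Refs:
    """Apply updates one reference at a time; stop early if interrupted."""
    result = dict(refs)
    for i, (name, oid) in enumerate(sorted(new.items())):
        if fail_after is not None and i >= fail_after:
            break  # simulated interruption: later references keep old values
        result[name] = oid
    return result


def update_atomically(refs: Refs, new: Refs, interrupted: bool = False) -> Refs:
    """Apply all updates or none of them."""
    if interrupted:
        return dict(refs)  # nothing was committed
    result = dict(refs)
    result.update(new)
    return result


if __name__ == "__main__":
    old = {"refs/heads/a": "old-a", "refs/heads/b": "old-b"}
    new = {"refs/heads/a": "new-a", "refs/heads/b": "new-b"}
    # Interrupted after the first reference: a mix of old and new history.
    print(update_serially(old, new, fail_after=1))
    # Interrupted atomic update: the repository still shows only old history.
    print(update_atomically(old, new, interrupted=True))
```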

To mitigate this risk, create a temporary staging repository to write the updated history into, then atomically force fetch that into the original repo.

This has the downside of being slower than modifying the repository in-place, but improving safety for a high-risk operation like this is a greater priority.
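A minimal Python sketch of the final step, assuming the rewritten history already sits in a staging repository: it builds a `git fetch` command that pulls every reference from the staging repository into the original one in a single atomic transaction. The helper name, the paths, and the `+refs/*:refs/*` refspec are assumptions for illustration, not the exact invocation used by this change; `--atomic` and `--force` are real git-fetch(1) options, and the sketch assumes bare repositories.

```python
from typing import List


def build_atomic_fetch_argv(original_repo: str, staging_repo: str) -> List[str]:
    """Return a git-fetch command that force-updates all refs atomically.

    Hypothetical helper; the refspec is an assumption, not the author's choice.
    """
    return [
        "git",
        "-C", original_repo,   # run inside the original repository
        "fetch",
        "--atomic",            # update all references in one transaction
        "--force",             # allow non-fast-forward (rewritten) updates
        staging_repo,          # fetch directly from the staging repo's path
        "+refs/*:refs/*",      # assumed refspec: mirror every reference
    ]


if __name__ == "__main__":
    print(" ".join(build_atomic_fetch_argv("/repos/original.git", "/tmp/staging.git")))
```

With `--atomic`, either every reference update succeeds or none is applied, which is the property the in-place import lacks.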

Related to #5730
