
objectpool: Fix conflicting references when fetching into pools

Patrick Steinhardt requested to merge pks-objectpool-fetch-df-conflict into master

In order to update pool repositories we use the FetchIntoObjectPool RPC. This RPC first fetches from the primary pool member into the pool, then rescues any objects that have become dangling because of the reference updates, and finally it kicks off repository maintenance for the pool.

We have now seen multiple cases, though, where we consistently failed to update pool repositories with the following error:

error: cannot lock ref 'refs/remotes/origin/heads/branch/conflict': 'refs/remotes/origin/heads/branch' exists; cannot create 'refs/remotes/origin/heads/branch/conflict'

So there is a reference refs/remotes/origin/heads/branch in the pool repository that obstructs fetching the conflicting reference refs/heads/branch/conflict from the pool member. The root cause is that we never prune references in pools, even when they have been removed on the remote side. In fact, this condition can trigger even if we executed git fetch --prune, because we also use the --atomic flag, which doesn't cope well with a conflicting reference being deleted at the same time as the new reference is added.
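To illustrate the kind of conflict described above, here is a minimal sketch of a check for whether two reference names can coexist. It assumes a reference backend in which each reference is stored at a path matching its name, so that one name cannot be both a file and a directory. The function name refsConflict is hypothetical and not part of Gitaly.

```go
package main

import (
	"fmt"
	"strings"
)

// refsConflict reports whether two reference names cannot coexist because
// one of them is a path-wise parent of the other. This is an illustrative
// helper, not Gitaly's actual implementation.
func refsConflict(a, b string) bool {
	return strings.HasPrefix(a, b+"/") || strings.HasPrefix(b, a+"/")
}

func main() {
	existing := "refs/remotes/origin/heads/branch"
	incoming := "refs/remotes/origin/heads/branch/conflict"

	// The stale reference in the pool blocks creation of the new one.
	fmt.Println(refsConflict(existing, incoming)) // prints "true"
}
```

Note that the trailing slash matters: refs/remotes/origin/heads/branch2 shares a string prefix with the existing reference but does not conflict with it.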

While it sounds a bit scary, pruning references in object pools should be totally fine because of our dangling-references mechanism: after we have fetched changes from the remote, we check whether any force-updates have led to objects becoming unreachable. Because other pool members might still refer to those objects, we must make sure they aren't deleted, so we create dangling references that keep those objects alive. When we start to prune references, we recover the affected objects via such dangling references in exactly the same way as we already do for force-updated references.
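As a rough sketch of how such a rescue step could look, the following turns a list of unreachable objects into commands for git update-ref --stdin. Several things here are assumptions made for illustration: that the unreachable objects are reported in the "dangling <type> <oid>" format printed by git fsck, that the keep-alive references live under refs/dangling/, and the function name danglingRefUpdates itself.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// danglingRefUpdates converts lines of the form "dangling <type> <oid>"
// into "create" commands for `git update-ref --stdin`. The refs/dangling/
// namespace is an assumption made for this sketch.
func danglingRefUpdates(fsckOutput string) string {
	var sb strings.Builder

	scanner := bufio.NewScanner(strings.NewReader(fsckOutput))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 3 || fields[0] != "dangling" {
			// Skip anything that is not a dangling-object report.
			continue
		}

		oid := fields[2]
		fmt.Fprintf(&sb, "create refs/dangling/%s %s\n", oid, oid)
	}

	return sb.String()
}

func main() {
	// Placeholder object ID, not taken from a real repository.
	out := "dangling commit 1111111111111111111111111111111111111111\n"
	fmt.Print(danglingRefUpdates(out))
}
```

Because every rescued object gets its own reference, it stays reachable in the pool regardless of whether it was orphaned by a force-update or by a pruned reference.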

That still leaves the issue that --atomic and --prune don't work together. We can't get rid of --atomic because it's an important optimization: without it, we would execute reference-transaction hooks twice for every changed reference. We can make this a two-step process, though, by first executing git remote prune to prune deleted branches without fetching any objects, and only then fetching any new references. But this has similar ramifications, because git remote prune doesn't support --atomic and may thus perform really slowly when many references are deleted at the same time. So we instead use its dry-run mode, parse its output, and then use git-update-ref(1) to perform the change with manual voting.
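The dry-run approach can be sketched roughly as follows. This assumes that git remote prune --dry-run reports each stale reference on a line starting with " * [would prune] " followed by the reference name with the refs/remotes/ prefix stripped. The function names parsePrunedRefs and updateRefInput are hypothetical, and the voting step is only indicated by a comment.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// wouldPrunePrefix is the assumed prefix of dry-run lines that name a
// reference which would be pruned.
const wouldPrunePrefix = " * [would prune] "

// parsePrunedRefs extracts fully qualified reference names from the
// output of `git remote prune --dry-run`.
func parsePrunedRefs(output string) []string {
	var refs []string

	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, wouldPrunePrefix) {
			continue
		}
		// The dry-run output abbreviates the name, so add the prefix back.
		refs = append(refs, "refs/remotes/"+strings.TrimPrefix(line, wouldPrunePrefix))
	}

	return refs
}

// updateRefInput builds the stdin for `git update-ref --stdin` so that
// all deletions happen in a single reference transaction.
func updateRefInput(refs []string) string {
	var sb strings.Builder

	sb.WriteString("start\n")
	for _, ref := range refs {
		fmt.Fprintf(&sb, "delete %s\n", ref)
	}
	// A real implementation would vote on the change between "prepare"
	// and "commit"; that part is omitted in this sketch.
	sb.WriteString("prepare\ncommit\n")

	return sb.String()
}

func main() {
	dryRun := "Pruning origin\n * [would prune] origin/heads/branch\n"
	fmt.Print(updateRefInput(parsePrunedRefs(dryRun)))
}
```

Deleting the stale references in one transaction before the fetch means the subsequent git fetch --atomic no longer has to delete and create conflicting references at the same time.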

All of this is not exactly ideal or elegant, but it works to fix the original bug as demonstrated by our tests. We should ultimately try to upstream patches to either make --atomic and --prune work nicely together, or to add a --atomic flag to git-remote(1).

Fixes #4373 (closed).
