operations: Fix missing votes on squashed commits

Patrick Steinhardt requested to merge pks-operations-squash-commit-voting into master

The UserSquash RPC computes a squashed commit by first rebasing a range of commits on top of another commit, and then collapsing these into a single commit. This RPC is notably different from almost all of our other RPCs because it never writes any references to disk, nor does it perform any of the access checks that the other User RPCs do. This design causes several problems:

- There is a known race window where the new objects are not
  referenced, so they could be pruned by maintenance calls.

- We accept objects into the repository which may not be sanctioned
  by our access checks.

- Replication jobs cannot replicate the squashed commit because it
  isn't referenced.

- We never perform transactional voting because no references are
  updated.

Together these problems show that the RPC is misdesigned, but fixing the design would require a bigger refactoring to make it work correctly in Rails.

In this commit we fix the last bullet point though: because this RPC never performs transactional voting, Praefect does not observe any transactional votes and we therefore always create replication jobs after the call finishes. As mentioned, we don't have any reference to vote on, so the best we can do is vote on the commit ID of the newly written squash commit to make sure that it is the same across nodes.

This commit does so by introducing a quarantine directory that is used to stage all new objects before they're migrated to the final repository. We then vote on the object ID of the staged squash commit. Only if this vote succeeds do we commit the object to disk.
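
The stage, vote, then migrate flow described above can be illustrated with a minimal sketch. This is not Gitaly's implementation: `objectStore`, `voteFunc`, `acceptVote`, `rejectVote`, and `squashWithVote` are hypothetical names, and an in-memory map stands in for both the quarantine directory and the repository's object directory.

```go
package main

import (
	"errors"
	"fmt"
)

// objectStore is a hypothetical in-memory stand-in for an object directory.
type objectStore struct {
	objects map[string][]byte
}

func newObjectStore() *objectStore {
	return &objectStore{objects: make(map[string][]byte)}
}

func (s *objectStore) has(id string) bool {
	_, ok := s.objects[id]
	return ok
}

// voteFunc is a hypothetical stand-in for casting a transactional vote on
// the ID of the squashed commit. A non-nil error means the vote failed.
type voteFunc func(commitID string) error

func acceptVote(string) error { return nil }

func rejectVote(string) error { return errors.New("nodes disagree on commit ID") }

// squashWithVote stages all new objects in a quarantine store, votes on the
// squash commit ID, and migrates the staged objects into the repository only
// if the vote succeeds. On any error the quarantine is simply dropped, so the
// repository never sees the new objects.
func squashWithVote(repo *objectStore, newObjects map[string][]byte, squashCommitID string, vote voteFunc) error {
	quarantine := newObjectStore()
	for id, data := range newObjects {
		quarantine.objects[id] = data
	}

	if !quarantine.has(squashCommitID) {
		return fmt.Errorf("squash commit %q was not staged", squashCommitID)
	}

	if err := vote(squashCommitID); err != nil {
		return fmt.Errorf("voting on squashed commit: %w", err)
	}

	// The vote succeeded: migrate the quarantined objects.
	for id, data := range quarantine.objects {
		repo.objects[id] = data
	}
	return nil
}

func main() {
	repo := newObjectStore()
	objects := map[string][]byte{
		"commit-1": []byte("squashed commit"),
		"tree-1":   []byte("tree"),
	}

	err := squashWithVote(repo, objects, "commit-1", rejectVote)
	fmt.Println("failed vote:", err, "| object in repo:", repo.has("commit-1"))

	err = squashWithVote(repo, objects, "commit-1", acceptVote)
	fmt.Println("successful vote:", err, "| object in repo:", repo.has("commit-1"))
}
```

In this sketch a failed vote leaves the repository untouched, which mirrors the second effect of the change: objects are discarded whenever the RPC errors.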

This commit thus solves two things: first, it fixes the missing transactional voting; second, it causes us to discard all new objects if the RPC errors.

Changelog: fixed

Fixes #4109 (closed)
