
Derive virtual storage's filesystem id from its name

Sami Hiltunen requested to merge smh-generate-filesystem-id into master

Gitaly storages contain a UUID filesystem ID that Gitaly generates for each of its storages. The ID is used to determine which storages can be accessed by Rails directly when rugged patches are enabled, and to see whether two different storages point to the same directory when doing repository moves.

When repository moves are performed, the worker first checks whether the repository's destination and source storage are the same. If they are, the move is not performed. The check is performed by comparing the filesystem IDs of the storages. As Praefect is currently routing the server info RPC to a random Gitaly node, the filesystem ID can differ between calls, as each Gitaly node has its own ID. This causes the repository moving worker to occasionally delete repositories from the virtual storage, as it receives two different IDs on sequential calls.
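The failure mode above can be illustrated with a small simulation. This is only a sketch: the node names, `make_server_info`, and `same_storage` are hypothetical stand-ins rather than Gitaly, Praefect, or Rails APIs, and round-robin routing is used in place of random routing so the result is deterministic.

```python
import itertools
import uuid

# Hypothetical stand-ins: each Gitaly node behind the virtual storage
# has generated its own filesystem ID.
NODE_IDS = {"gitaly-1": str(uuid.uuid4()), "gitaly-2": str(uuid.uuid4())}


def make_server_info(route):
    """Return a fake server info call that reports the ID of the node `route` picks."""
    def server_info(virtual_storage):
        return NODE_IDS[route()]
    return server_info


def same_storage(server_info, source, destination):
    """The worker's check: storages are the same if their filesystem IDs match."""
    return server_info(source) == server_info(destination)


# Round-robin routing stands in for Praefect's random node selection.
nodes = itertools.cycle(NODE_IDS)
server_info = make_server_info(lambda: next(nodes))

# Source and destination are the same virtual storage, yet the IDs differ.
print(same_storage(server_info, "default", "default"))  # False
```

With the two calls landing on different nodes, the check concludes that the source and destination differ even though both are the same virtual storage.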

The filesystem ID can identify cases when two storages refer to the same directory on a Gitaly node, as the ID is stored in a file in the storage. This is not really possible with Praefect: storages are identified only by the virtual storage's name. If the name changes, we can't correlate the ID between the different names, as Praefect would consider them different storages. Praefect also supports multiple virtual storages, so it's not possible to generate a single ID and use it for all of them. Given this, the approach taken here is to derive a stable filesystem ID from the virtual storage's name. This guarantees calls to a given virtual storage always return the same filesystem ID.
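One way to derive a stable ID from a name is a name-based (version 5) UUID. The sketch below is an illustration in Python, not the code in this merge request: `derive_filesystem_id` is a hypothetical name, and the choice of `uuid.NAMESPACE_OID` as the namespace is an assumption; any fixed namespace would give the same stability property.

```python
import uuid

# Assumption: an arbitrary fixed namespace. It only needs to stay constant
# so that the same name always maps to the same ID.
NAMESPACE = uuid.NAMESPACE_OID


def derive_filesystem_id(virtual_storage: str) -> str:
    """Derive a deterministic UUID from the virtual storage's name."""
    return str(uuid.uuid5(NAMESPACE, virtual_storage))


# The same name yields the same ID on every call; different names yield
# different IDs.
print(derive_filesystem_id("default") == derive_filesystem_id("default"))
print(derive_filesystem_id("default") == derive_filesystem_id("other"))
```

Because the ID depends only on the name, it no longer matters which Gitaly node a call is routed to. The trade-off noted above still applies: renaming a virtual storage changes its ID.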

Configuring two storages that point to the same filesystem should be considered an invalid configuration anyway. Historically, there have been cases where that was done for plain Gitaly nodes. This is not done for Praefect and wouldn't work, as Praefect wouldn't find the repositories under an alternative virtual storage name. With that in mind, we don't have to consider the case where two virtual storages with different names point to the same backing Gitaly storages.

The use cases for the filesystem ID seem to be limited and we may be able to remove it in the future once the rugged patches are removed.

Closes #3752 (closed)
