Split brain might lead to data loss
Summary
When the database ends up with two nodes that both believe they are the primary server (split brain), a user can create a repository on one primary and get ID X for the project. A second user can create a project while connected to the second primary and also get ID X. Because GitLab now uses hashed storage, which derives the repository path from the project_id column, the repository path is then no longer guaranteed to be unique. The same can happen when a backup is inconsistent, for example when the database is more recent than the Git repository snapshot.
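To illustrate the collision described above, here is a minimal sketch. It assumes the hashed storage path is the SHA-256 digest of the project ID laid out under two prefix directories; the function name `hashed_disk_path` and the ID 42 are made up for this example and are not GitLab code.

```python
import hashlib


def hashed_disk_path(project_id: int) -> str:
    """Derive a storage path from the project ID alone (assumed layout)."""
    digest = hashlib.sha256(str(project_id).encode("utf-8")).hexdigest()
    # Two prefix directories taken from the digest, then the full digest.
    return f"@hashed/{digest[:2]}/{digest[2:4]}/{digest}"


# Both primaries hand out the same ID during a split brain.
path_on_primary_a = hashed_disk_path(42)
path_on_primary_b = hashed_disk_path(42)

# The path is a pure function of the ID, so the two projects share one path.
print(path_on_primary_a == path_on_primary_b)  # True
```

Since nothing but the ID goes into the hash, two different projects that were given the same ID end up pointing at the same repository on disk.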
The project repository is an obvious case where this would happen, but there are others. For example, the PoolRepository model has the same naming scheme and also depends on a unique ID. Other assets I can think of that depend on uniqueness of keys are:
- Project Logos
- Notes attachments
- CI Artifacts
There are most probably more.
Possible fixes
Add a random nonce to the project ID before hashing the path, and store the resulting path in the database. Alternatively, don't use an ID as the base of the hashed storage path, but a UUID.
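Both options could look roughly like the sketch below. It is only an illustration under assumptions: the path layout, the function names (`nonce_disk_path`, `new_nonce_path`, `uuid_disk_path`) and the 16-byte nonce size are hypothetical, not an existing GitLab API.

```python
import hashlib
import secrets
import uuid


def nonce_disk_path(project_id: int, nonce: str) -> str:
    """Hash the project ID together with a nonce (assumed path layout)."""
    digest = hashlib.sha256(f"{project_id}:{nonce}".encode("utf-8")).hexdigest()
    return f"@hashed/{digest[:2]}/{digest[2:4]}/{digest}"


def new_nonce_path(project_id: int) -> tuple[str, str]:
    """Generate a fresh random nonce and the path derived from it."""
    nonce = secrets.token_hex(16)  # 16 random bytes, hex encoded
    return nonce, nonce_disk_path(project_id, nonce)


def uuid_disk_path() -> str:
    """Alternative: ignore the ID entirely and base the path on a UUID."""
    key = uuid.uuid4().hex
    return f"@hashed/{key[:2]}/{key[2:4]}/{key}"


# Two projects that were both given ID 42 now get different paths.
_, path_a = new_nonce_path(42)
_, path_b = new_nonce_path(42)
print(path_a != path_b)
```

In both variants the path can no longer be recomputed from the project ID alone, which is why it has to be stored in the database.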