Preventing a filesystem metadata registry from enabling the database before import
Context
As we move towards Container Registry Self-Managed Rollout::BlocksOn by Default, it becomes increasingly important to have a safety mechanism that stops a filesystem metadata registry from enabling the metadata database before the import/migration process has completed.
Problem
Currently, the import tool does not ensure data consistency. The user must manage their import procedure and reconfiguration of the registry appropriately to ensure safe data access.
This means that an admin can accidentally enable the metadata database before running the import. The risk grows as we move to automate the provisioning process. We therefore need a safety mechanism that protects the registry from data loss or corruption caused by settings being enabled in the wrong order or at the wrong stage of the import process.
Solution
In Investigate Using Lock Files to Preserve Data C... (#918 - closed), we introduced the concept of lock files. They prevent the registry from running in filesystem metadata mode when a database-in-use file already exists.
One possible solution using the lock files could follow the pseudo-code below:
- if the database is enabled:
  - Does the database-in-use file exist?
    - If yes -> proceed with booting normally.
    - Else ->
      - Does the storage backend contain data? (needs functionality)
        - If yes -> stop the registry from booting with an explicit error "data has not been imported"
        - Else -> Likely a new installation, proceed!
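The decision tree above can be sketched in a few lines. This is only an illustration in Python, not the registry's actual implementation. It assumes the storage backend is a local directory and that the lock file sits at its root; the function names, the return strings, and the emptiness check (which stands in for the "needs functionality" step) are all hypothetical.

```python
from pathlib import Path

LOCK_FILE_NAME = "database-in-use"  # lock file name taken from the text


class DataNotImportedError(RuntimeError):
    """Raised when the database is enabled but existing data was never imported."""


def storage_contains_data(root: Path) -> bool:
    # Hypothetical stand-in for the "needs functionality" check: any entry
    # other than the lock file counts as existing registry data.
    return any(entry.name != LOCK_FILE_NAME for entry in root.iterdir())


def check_database_boot(root: Path) -> str:
    """Boot check for a registry whose metadata database is enabled.

    Returns a short reason string when booting may proceed and raises
    DataNotImportedError otherwise.
    """
    if (root / LOCK_FILE_NAME).exists():
        return "lock file present"  # import finished, boot normally
    if storage_contains_data(root):
        raise DataNotImportedError("data has not been imported")
    return "new installation"  # nothing to import, safe to proceed
```

Calling `check_database_boot(Path("/var/lib/registry"))` (an example path, not a documented default) would then either return the reason booting is allowed or abort startup with the explicit error.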
The lock file is written at the end of import step 2, once all the tags have been imported. Its presence therefore means that the metadata database can be enabled safely.
New installations have empty filesystem metadata and no lock file, so we should test that these registries can boot without complications when lock files are enforced.
Iterations
While working on this change, I realized that it breaks many tests, which will need to be adjusted. I decided to gate the change via REGISTRY_FF_ENFORCE_LOCKFILES so that we can fix the tests in smaller chunks, as the original MR !1771 (merged) was growing out of control.
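As a rough illustration of this kind of gating, the sketch below reads the flag from the environment. It is written in Python for brevity and is not the registry's code. The accepted values and the default-off behaviour are assumptions; only the variable name comes from this issue.

```python
import os

FLAG = "REGISTRY_FF_ENFORCE_LOCKFILES"  # environment variable named in the text


def lockfiles_enforced(env=os.environ) -> bool:
    # Assumption: the flag is on only for an explicit truthy value and is
    # off when unset, so the new check stays disabled until opted in.
    return env.get(FLAG, "").strip().lower() in {"1", "true"}
```

With a default-off flag like this, tests that have not yet been adjusted keep running against the old behaviour, while fixed tests can opt in by setting the variable.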
The aim is to have the following MRs:
- implementation !1771 (merged)
- docs.gitlab.com gitlab!168613 (merged)
- fix remaining tests in (any number) smaller MRs