repocleaner: Log warning message for repositories not known to praefect

Pavlo Strokov requested to merge ps-crowler into master

The problem of repositories missing from the praefect database is becoming more pressing as more customers migrate to a cluster setup. To verify the state of the cluster, this MR implements a new background job.

The job walks all repositories on all healthy gitaly storages and checks whether each one exists in the praefect database. Repositories that do not exist there are passed to an "action" to be performed on them. This is done in batches; the batch size is configurable with the repositories_cleanup.repositories_in_batch option.
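The batching described above can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the `Repo` type, the `missingInBatches` function, and the `known` and `action` callbacks are hypothetical names, and the database lookup is replaced by an in-memory map.

```go
package main

import "fmt"

// Repo identifies one repository replica on a gitaly storage.
// (Hypothetical type, used only for this sketch.)
type Repo struct {
	VirtualStorage string
	Storage        string
	RelativePath   string
}

// missingInBatches walks repos in batches of at most batchSize. For each
// batch it collects the repositories for which known returns false and,
// if there are any, passes them to action.
func missingInBatches(repos []Repo, batchSize int, known func(Repo) bool, action func([]Repo)) {
	if batchSize <= 0 {
		batchSize = 1 // guard against a misconfigured batch size
	}
	for start := 0; start < len(repos); start += batchSize {
		end := start + batchSize
		if end > len(repos) {
			end = len(repos)
		}
		var missing []Repo
		for _, r := range repos[start:end] {
			if !known(r) {
				missing = append(missing, r)
			}
		}
		if len(missing) > 0 {
			action(missing)
		}
	}
}

func main() {
	repos := []Repo{
		{VirtualStorage: "default", Storage: "gitaly-1", RelativePath: "a.git"},
		{VirtualStorage: "default", Storage: "gitaly-1", RelativePath: "b.git"},
		{VirtualStorage: "default", Storage: "gitaly-1", RelativePath: "c.git"},
	}
	// Stand-in for the praefect database lookup.
	inDB := map[string]bool{"a.git": true}

	missingInBatches(repos, 2,
		func(r Repo) bool { return inDB[r.RelativePath] },
		func(batch []Repo) { fmt.Println("not known to praefect:", batch) },
	)
}
```

Checking a whole batch at once presumably keeps the number of database round trips proportional to the number of batches rather than the number of repositories.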

In this first iteration the "action" is a warning log message with the repository location: virtual storage, gitaly storage and relative path to the repository. Based on the log output, the administrator can either remove the repository from the storage or manually create the missing data in the database so that praefect can serve the repository.
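For illustration only, a log-only action could look something like the sketch below. The message text and the field names (`virtual_storage`, `storage`, `relative_path`) are assumptions made for this example, and it uses the standard `log` package rather than whatever logger praefect actually uses.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Repo identifies one repository replica on a gitaly storage.
// (Hypothetical type, used only for this sketch.)
type Repo struct {
	VirtualStorage string
	Storage        string
	RelativePath   string
}

// formatWarning renders the three location fields an administrator needs
// in order to find the repository on disk. The format is an assumption.
func formatWarning(r Repo) string {
	return fmt.Sprintf(
		"level=warning msg=%q virtual_storage=%q storage=%q relative_path=%q",
		"repository is not managed by praefect",
		r.VirtualStorage, r.Storage, r.RelativePath,
	)
}

func main() {
	logger := log.New(os.Stderr, "", log.LstdFlags)
	logger.Println(formatWarning(Repo{
		VirtualStorage: "default",
		Storage:        "gitaly-1",
		RelativePath:   "path/to/repo.git",
	}))
}
```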

Because multiple praefect instances run in parallel in the cluster to support HA, the scan process needs to be coordinated. It makes no sense for multiple instances to scan the same storage and put additional load on the cluster.
The new storage_cleanups table was introduced to manage this process. It stores information about scan activity for each gitaly storage. The implementation is based on continuously updated timestamps. While a scan is in progress, the triggered_at column is set to the current timestamp. The value is updated during the scan and is used to tell whether the process is still running or was terminated for some reason. Once processing completes, the value is set to NULL, which means no processing is happening for this storage and it can be picked up for the next round.

The background task runs with a configurable period (repositories_cleanup.check_interval option), but the scan itself happens only once a day (configurable with repositories_cleanup.run_interval). This is done to avoid the situation where an instance is restarted before the scan happens.

Closes: #3719 (closed)

Edited by Pavlo Strokov
