Resume Import Enumerations: Step Three

Context

Repository enumerations start at the beginning and continue to the end, with no option to skip any repositories along the way.

Problems

Unlike the solution in step one, there is no underlying skip logic that we can surface for blob enumeration. Additionally, users may perform this step while the registry is operating with the DB. Therefore, the most recently inserted blob will almost certainly be one inserted via the API and, as such, will not be the correct point for the importer to resume.

Solution

Read the most recently inserted repository in the database and resume the import starting with that repository.
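
As a rough illustration of the resume step, the sketch below picks the starting point from an ordered list of repository paths. It is not the importer's actual code: the function name resumeFrom, the sample paths, and the assumption that the enumeration visits repositories in lexical path order are all hypothetical, and the lookup of the most recently inserted repository is represented by a plain string argument.

```go
package main

import (
	"fmt"
	"sort"
)

// resumeFrom returns the repositories that still need to be enumerated.
// paths must be sorted lexically (assumed here to match enumeration order).
// last is the path of the most recently inserted repository, as read from
// the database by some lookup that is not shown (hypothetical).
// The repository named by last is included again, since its import may
// have been interrupted part-way through.
func resumeFrom(paths []string, last string) []string {
	if last == "" {
		// Nothing has been imported yet: start from the beginning.
		return paths
	}
	// Index of the first path >= last. If last is no longer present in
	// the list, this resumes at the next path after where it would be.
	i := sort.SearchStrings(paths, last)
	return paths[i:]
}

func main() {
	repos := []string{"a/app", "b/base", "c/cache", "d/db"}
	fmt.Println(resumeFrom(repos, "c/cache")) // prints [c/cache d/db]
}
```

Starting again at the last inserted repository, rather than the one after it, trades a small amount of repeated work for not having to know whether that repository finished importing.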

Discussion

This step carries relatively low risk to implement: we're only importing dangling blobs, making them available to the online garbage collector. Blobs that are expected to be needed by API calls should already have been imported by this point.

Compared to the other steps, we have less to gain from implementing this. However, we will still benefit from this work, as it would enable automatic retries from a resume point during one-shot import. For the multi-step import process, this step can be run while the registry is operational with the DB, so there is less urgency for the step to complete.

Edited Dec 02, 2023 by Hayley Swimelar