Resume Import Enumerations: Step Two

Context

Repository Enumerations start at the beginning and continue to the end, with no option to skip any repositories along the way.

Problems

Unlike the solution in step one, here we need to be concerned with users splitting step two across several periods of downtime, with writes being made in between those periods.

Solution

This will likely be similar to Resume Import Enumerations: Step One (#1162, closed), building on some of the more fundamental changes to repository enumeration we need to make there.

Discussion

This step is the riskiest of the three to implement: unintentionally skipping over data here will lead to data loss.

We need to investigate and describe what happens when a user does try to break up this step over multiple periods of downtime, with new writes between. Unfortunately, there are a number of complex interactions here for each individual write:

Write Inserted Before or After Resume Point	Write to DB or FS	New Tag or Tag Swap	Result
Before	FS	New	Tag Not Present in DB
Before	FS	Swap	Tag Points to old manifest in DB ❗
Before	DB	New	✅
Before	DB	Swap	✅
After	FS	New	✅
After	FS	Swap	✅
After	DB	New	❓
After	DB	Swap	❓
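
The filesystem rows of the table above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the registry's importer: `import_tags`, its `start_after` and `stop_after` parameters, the repository names, and the digests are all hypothetical. It assumes repositories are enumerated in sorted order and that the importer copies each repository's tags from the filesystem into the database. The rows marked ❓ are deliberately left out, since their outcome depends on how the real importer treats rows that already exist in the database.

```python
def import_tags(fs, db, start_after=None, stop_after=None):
    """Copy tags from the filesystem view into the database view.

    Repositories are visited in sorted order. Hypothetical parameters:
    start_after skips everything up to and including the resume point,
    stop_after ends the session early to simulate an interrupted import.
    """
    for repo in sorted(fs):
        if start_after is not None and repo <= start_after:
            continue
        if stop_after is not None and repo > stop_after:
            break
        db.setdefault(repo, {}).update(fs[repo])


def simulate_fs_writes():
    fs = {
        "a/app": {"latest": "sha256:old"},
        "z/app": {"latest": "sha256:old"},
    }
    db = {}

    # First downtime window: only "a/app" is imported before the session ends.
    import_tags(fs, db, stop_after="a/app")

    # The registry is back online and, in this scenario, writes go to the filesystem.
    fs["a/app"]["v2"] = "sha256:new"      # before resume point, new tag
    fs["a/app"]["latest"] = "sha256:new"  # before resume point, tag swap
    fs["z/app"]["v2"] = "sha256:new"      # after resume point, new tag
    fs["z/app"]["latest"] = "sha256:new"  # after resume point, tag swap

    # Second downtime window: resume after "a/app".
    import_tags(fs, db, start_after="a/app")
    return fs, db


if __name__ == "__main__":
    fs, db = simulate_fs_writes()
    print("a/app in DB:", db["a/app"])
    print("z/app in DB:", db["z/app"])
```

In this model, `a/app` ends up without `v2` in the database and with `latest` still on the old manifest, while `z/app` matches the filesystem, which lines up with the four FS rows.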

Similar for tag deletes:

Delete Before or After Resume Point	Delete on DB or FS	Result
Before	FS	Importer Restores the Tag
Before	DB	✅
After	FS	✅
After	DB	Importer Restores the Tag, Potentially Points it at Old Manifest ❗
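
The delete cases can be walked through with the same kind of toy model. Again, this is a sketch rather than the real importer: `import_tags`, its parameters, the repository names, and the digests are hypothetical. It additionally assumes that `z/app:v2` already exists in the database pointing at a newer manifest (for example from an earlier database-side write), which is what makes the "old manifest" outcome in the last row possible.

```python
def import_tags(fs, db, start_after=None, stop_after=None):
    """Copy tags from the filesystem view into the database view.

    Hypothetical model: repositories are visited in sorted order,
    start_after skips up to and including the resume point, and
    stop_after ends the session early to simulate an interruption.
    """
    for repo in sorted(fs):
        if start_after is not None and repo <= start_after:
            continue
        if stop_after is not None and repo > stop_after:
            break
        db.setdefault(repo, {}).update(fs[repo])


def simulate_deletes():
    fs = {
        "a/app": {"v1": "sha256:old", "v2": "sha256:old"},
        "z/app": {"v1": "sha256:old", "v2": "sha256:old"},
    }
    # Assumption: this tag was already written on the database side
    # and points at a newer manifest than the filesystem copy.
    db = {"z/app": {"v2": "sha256:new"}}

    # First downtime window: only "a/app" is imported.
    import_tags(fs, db, stop_after="a/app")

    # Deletes made while the registry is back online.
    del fs["a/app"]["v1"]  # before resume point, delete on FS
    del db["a/app"]["v2"]  # before resume point, delete on DB
    del fs["z/app"]["v1"]  # after resume point, delete on FS
    del db["z/app"]["v2"]  # after resume point, delete on DB

    # Second downtime window: resume after "a/app".
    import_tags(fs, db, start_after="a/app")
    return db


if __name__ == "__main__":
    for repo, tags in simulate_deletes().items():
        print(repo, tags)
```

In this model, `a/app:v1` survives in the database even though it was deleted on the filesystem, and `z/app:v2` comes back pointing at the old manifest, while the other two deletes hold.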

Among these errors, pointing a tag at a previously referenced manifest is deeply insidious, since it may go unnoticed for some time and could have any number of ill effects. For example, a service that relies on a single tag which is updated on the registry side could undergo a spontaneous, unplanned downgrade. Not pinning a specific tag version is considered a bad practice due in part to this possibility.
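
To illustrate why a moving tag is exposed to this failure, here is a minimal sketch of reference resolution. The `resolve` helper and the tag table are hypothetical. The sketch also interprets "pinning" as referencing an image by digest, which is an assumption on top of the text above.

```python
def resolve(reference, tags):
    """Return the manifest digest that a reference resolves to.

    Hypothetical helper: "repo@digest" is pinned and never consults the
    tag table, while "repo:tag" follows whatever the tag points at now.
    """
    name, sep, digest = reference.partition("@")
    if sep:
        return digest
    repo, _, tag = name.rpartition(":")
    return tags[repo][tag]


# Tag table before and after a faulty resume that rolls "latest" back.
before = {"a/app": {"latest": "sha256:new"}}
after = {"a/app": {"latest": "sha256:old"}}

by_tag_before = resolve("a/app:latest", before)
by_tag_after = resolve("a/app:latest", after)
by_digest_after = resolve("a/app@sha256:new", after)
```

Here the tag reference silently changes from the new manifest to the old one, while the digest reference keeps resolving to the same manifest.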

Edited Dec 02, 2023 by Hayley Swimelar