Handle bad link files gracefully during garbage collection
This MR explores the possibility of ignoring bad link files during the mark stage of garbage collection. Blobs related to those links that are not referenced elsewhere should then be swept away automatically during the subsequent sweep stage.
Related to gitlab#37611 (closed) and gitlab#32907.
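The behavior described above can be approximated outside the registry with a small shell sketch. This is not the registry's Go implementation. It assumes the filesystem layout used later in this description (`_manifests/revisions/sha256/<digest>/link`) and assumes that a valid link file contains `sha256:` followed by 64 lowercase hex digits. The function names `is_valid_link` and `mark_revisions` are hypothetical.

```shell
# is_valid_link FILE: succeed if FILE holds exactly one sha256 digest.
# Assumption: a valid link file is "sha256:" plus 64 lowercase hex digits.
is_valid_link() {
  [ -f "$1" ] || return 1
  # Strip newlines so the whole file content is matched as a single line.
  content=$(tr -d '\n' < "$1")
  printf '%s\n' "$content" | grep -Eq '^sha256:[a-f0-9]{64}$'
}

# mark_revisions ROOT: print the digest of every readable manifest revision
# under ROOT. A bad link file is reported on stderr and skipped, so the walk
# continues instead of aborting.
mark_revisions() {
  find "$1" -type f -path '*/_manifests/revisions/sha256/*/link' |
  LC_ALL=C sort |
  while IFS= read -r link; do
    if is_valid_link "$link"; then
      tr -d '\n' < "$link"
      printf '\n'
    else
      printf 'skipping bad link file: %s\n' "$link" >&2
    fi
  done
}
```

In this sketch, a revision whose link file is skipped never contributes its blobs to the marked set, which is why a later sweep would treat them as unreferenced unless another revision still uses them.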
Validation
To validate the change, I did the following (proper tests are still needed if and when we agree on this approach).
Setup
- Build binaries;
- Deploy the registry server using the local filesystem driver;
- Set the following environment variables:
  - `REGISTRY_FS_PATH`: Path to the registry filesystem (make sure the repository is empty);
  - `REGISTRY_ADDR`: The `host:port` where the registry server is listening for requests;
  - `REGISTRY_CONFIG`: Path to the registry config file.
- Tag and push sample images:
docker pull alpine
docker tag alpine $REGISTRY_ADDR/myorg/myproj/myimg:1.0.0
docker push $REGISTRY_ADDR/myorg/myproj/myimg:1.0.0
docker pull golang:alpine
docker tag golang:alpine $REGISTRY_ADDR/myorg/myproj/myimg:2.0.0
docker push $REGISTRY_ADDR/myorg/myproj/myimg:2.0.0
docker pull redis:alpine
docker tag redis:alpine $REGISTRY_ADDR/myorg/myproj/myimg2
docker push $REGISTRY_ADDR/myorg/myproj/myimg2
docker pull nginx
docker tag nginx $REGISTRY_ADDR/myorg/myproj2/myimg
docker push $REGISTRY_ADDR/myorg/myproj2/myimg
docker pull ruby
docker tag ruby $REGISTRY_ADDR/myorg2/myproj/myimg
docker push $REGISTRY_ADDR/myorg2/myproj/myimg
- After this, we should have the following images in the registry:
  - `myorg/myproj/myimg`, tags `1.0.0` and `2.0.0`
  - `myorg/myproj/myimg2`, tag `latest`
  - `myorg/myproj2/myimg`, tag `latest`
  - `myorg2/myproj/myimg`, tag `latest`
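Before pushing any images, the environment from the setup steps can be sanity-checked with a short helper. This is only a sketch: the function names `require_var` and `check_registry_env` are hypothetical, and a missing `REGISTRY_FS_PATH` directory is assumed to count as an empty repository.

```shell
# require_var NAME VALUE: fail with a message if VALUE is empty.
require_var() {
  [ -n "$2" ] && return 0
  printf 'missing environment variable: %s\n' "$1" >&2
  return 1
}

# check_registry_env: succeed only if the three variables are set and the
# registry filesystem has no entries yet.
check_registry_env() {
  require_var REGISTRY_FS_PATH "${REGISTRY_FS_PATH:-}" || return 1
  require_var REGISTRY_ADDR "${REGISTRY_ADDR:-}" || return 1
  require_var REGISTRY_CONFIG "${REGISTRY_CONFIG:-}" || return 1
  if [ -d "$REGISTRY_FS_PATH" ] && [ -n "$(ls -A "$REGISTRY_FS_PATH")" ]; then
    printf 'registry filesystem is not empty: %s\n' "$REGISTRY_FS_PATH" >&2
    return 1
  fi
}

# Hypothetical example values, shown only for illustration:
#   export REGISTRY_FS_PATH=/tmp/registry-data
#   export REGISTRY_ADDR=localhost:5000
#   export REGISTRY_CONFIG=/tmp/registry-config.yml
#   check_registry_env && echo "ready to push images"
```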
I’ve used multiple nesting levels in the repository paths so that we can see the path walk during the mark stage going through all of them without being interrupted.
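To confirm that all nested repositories exist on disk before running the collector, the filesystem can be walked directly. The sketch below assumes that repositories live under `docker/registry/v2/repositories` (the prefix that appears in the link file path used in the procedure) and that any directory containing a `_manifests` folder is a repository. `list_repositories` is a hypothetical helper name.

```shell
# list_repositories FS_PATH: print each repository path found under FS_PATH,
# relative to the repositories root.
# Assumption: a repository is any directory that contains "_manifests".
list_repositories() {
  repos_root="$1/docker/registry/v2/repositories"
  find "$repos_root" -type d -name _manifests |
  LC_ALL=C sort |
  while IFS= read -r dir; do
    repo=${dir#"$repos_root"/}            # drop the root prefix
    printf '%s\n' "${repo%/_manifests}"   # drop the trailing folder name
  done
}
```

With the images pushed during setup, `list_repositories "$REGISTRY_FS_PATH"` would be expected to print the four repository paths listed above.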
Procedure
Note: For demonstration purposes, and to make the garbage collection easier to visualize, I’ve added some temporary prints (which start with `WALK:`) and a log file.
- Run the garbage collector. There are no bad link files in the repository at this point, so everything should work:
./bin/registry garbage-collect $REGISTRY_CONFIG -m -d
This is the output that you should see. No manifests or blobs were marked as eligible for deletion, as expected.
- Empty the link file of the first manifest revision of `myorg/myproj/myimg`, which corresponds to the `1.0.0` tag. This simulates the `invalid checksum digest format` error scenario:
: > $REGISTRY_FS_PATH/docker/registry/v2/repositories/myorg/myproj/myimg/_manifests/revisions/sha256/ddf407284440a94889dc139bbe1be1daa19d99e166d6b1f2dfc6919846810b4e/link
- Rerun the garbage collector (same command as above). This time we can see that the program doesn’t break with the `invalid checksum digest format` error and the blobs associated with the bad revision are marked as eligible for deletion.
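As a quick check between the two collector runs, the emptied link files can be located directly on disk. The `: >` redirection used in the procedure truncates the file to zero length without removing it, so a size test is enough. `find_empty_links` is a hypothetical helper name, and the sketch assumes that only files named `link` are of interest.

```shell
# find_empty_links FS_PATH: print every zero-length file named "link".
# "-size 0" matches only empty files, because find rounds sizes up to
# whole 512-byte blocks.
find_empty_links() {
  find "$1" -type f -name link -size 0 | LC_ALL=C sort
}
```

Running `find_empty_links "$REGISTRY_FS_PATH"` after the truncation step should list only the link file of the revision that was deliberately broken.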
Results
Here you can see the difference between the two runs. The blobs used exclusively by the broken image revision (`myorg/myproj/myimg:1.0.0`, `ddf40728`) are marked as eligible for deletion. Running the garbage collector without the dry-run flag (`-d`) will erase them.
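If the output of each run is saved to a file, the comparison can also be scripted. The sketch below assumes that every relevant output line contains the phrase `eligible for deletion` next to a full `sha256:` digest. The exact log format is not shown here, so treat that pattern as an assumption. The helper names are hypothetical as well.

```shell
# eligible_digests LOG: print the sorted, unique sha256 digests found on
# lines of LOG that mention "eligible for deletion".
# Assumption: the collector prints that phrase next to each digest.
eligible_digests() {
  grep 'eligible for deletion' "$1" |
  grep -Eo 'sha256:[a-f0-9]{64}' |
  LC_ALL=C sort -u
}

# newly_eligible BEFORE AFTER: digests eligible in AFTER but not in BEFORE.
newly_eligible() {
  work_dir=$(mktemp -d) || return 1
  eligible_digests "$1" > "$work_dir/before"
  eligible_digests "$2" > "$work_dir/after"
  # comm -13 keeps only the lines unique to the second (sorted) file.
  LC_ALL=C comm -13 "$work_dir/before" "$work_dir/after"
  rm -rf "$work_dir"
}
```

For example, `newly_eligible run1.log run2.log` (hypothetical file names) would print the digests that became eligible only after the link file was emptied.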