Handle bad link files gracefully during garbage collection

João Pereira requested to merge 37611-gc-handle-bad-link-files into release/2.7-gitlab

This MR explores the possibility of ignoring bad link files during the mark stage of garbage collection. The related blobs, if not referenced anywhere else, should then be removed automatically during the subsequent sweep stage.

Related to gitlab#37611 (closed) and gitlab#32907.

Validation

To validate the change, I did the following (proper tests are still required if and when we agree on this approach).

Setup

  1. Build binaries;
  2. Deploy the registry server using the local filesystem driver;
  3. Set the following environment variables:
    • REGISTRY_FS_PATH: Path to the registry filesystem (make sure the repository is empty);
    • REGISTRY_ADDR: The host:port where the registry server is listening for requests;
    • REGISTRY_CONFIG: Path to the registry config file.
  4. Tag and push sample images:
docker pull alpine
docker tag alpine $REGISTRY_ADDR/myorg/myproj/myimg:1.0.0
docker push $REGISTRY_ADDR/myorg/myproj/myimg:1.0.0

docker pull golang:alpine
docker tag golang:alpine $REGISTRY_ADDR/myorg/myproj/myimg:2.0.0
docker push $REGISTRY_ADDR/myorg/myproj/myimg:2.0.0

docker pull redis:alpine
docker tag redis:alpine $REGISTRY_ADDR/myorg/myproj/myimg2
docker push $REGISTRY_ADDR/myorg/myproj/myimg2

docker pull nginx
docker tag nginx $REGISTRY_ADDR/myorg/myproj2/myimg
docker push $REGISTRY_ADDR/myorg/myproj2/myimg

docker pull ruby
docker tag ruby $REGISTRY_ADDR/myorg2/myproj/myimg
docker push $REGISTRY_ADDR/myorg2/myproj/myimg
  5. After this, we should have the following images in the registry:
    • myorg/myproj/myimg, tags 1.0.0 and 2.0.0
    • myorg/myproj/myimg2, tag latest
    • myorg/myproj2/myimg, tag latest
    • myorg2/myproj/myimg, tag latest

I’ve used several organization and project levels so that we can see the path walk of the mark stage going through all of them without being interrupted.
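
As a quick sanity check before running the garbage collector, the repositories and tags that the mark stage should walk can be listed straight from the filesystem. This is only a sketch: list_repo_tags is a hypothetical helper, and it assumes the local filesystem driver's usual layout (docker/registry/v2/repositories/<name>/_manifests/tags/<tag>).

```shell
# list_repo_tags ROOT
# Hypothetical helper: prints one "repository:tag" line for every tag
# directory found under ROOT (ROOT plays the role of $REGISTRY_FS_PATH).
list_repo_tags() {
  repos="$1/docker/registry/v2/repositories"
  # Keep only the tag directories themselves, not their children
  # (current/, index/, ...).
  find "$repos" -type d -path '*/_manifests/tags/*' \
       ! -path '*/_manifests/tags/*/*' |
    sed -e "s|^$repos/||" -e 's|/_manifests/tags/|:|' |
    LC_ALL=C sort
}

# Usage (assumes REGISTRY_FS_PATH is set as described in Setup):
#   list_repo_tags "$REGISTRY_FS_PATH"
```

With the images pushed above, the output should contain five lines, one per repository and tag pair.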

Procedure

Note: To make the garbage collection easier to follow in this demonstration, I’ve temporarily added some print statements (prefixed with WALK:) and a log file.

  1. Run the garbage collector. There are no bad link files in the repository at this point, so everything should work:
./bin/registry garbage-collect $REGISTRY_CONFIG -m -d

This is the output that you should see. No manifests or blobs were marked as eligible for deletion, as expected.

  2. Empty the link file of the first manifest revision of myorg/myproj/myimg, which corresponds to the 1.0.0 tag. This simulates the invalid checksum digest format error scenario:
: > $REGISTRY_FS_PATH/docker/registry/v2/repositories/myorg/myproj/myimg/_manifests/revisions/sha256/ddf407284440a94889dc139bbe1be1daa19d99e166d6b1f2dfc6919846810b4e/link
  3. Rerun the garbage collector (same command as above). This time the program no longer aborts with the invalid checksum digest format error, and the blobs associated with the bad revision are marked as eligible for deletion.
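
The revision digest in step 2 is specific to the images I pulled, so it will differ in other environments. The sketch below shows one way to look up the revision a tag points to and to list link files that would trigger the error. Both helpers are hypothetical, and they assume the standard filesystem layout and sha256 digests only.

```shell
# tag_revision ROOT REPO TAG
# Hypothetical helper: prints the digest that TAG of REPO currently points to.
tag_revision() {
  cat "$1/docker/registry/v2/repositories/$2/_manifests/tags/$3/current/link"
}

# find_bad_links ROOT
# Hypothetical helper: prints every link file whose content is not exactly
# "sha256:<64 hex characters>" (assumption: only sha256 digests are in use).
find_bad_links() {
  find "$1/docker/registry/v2/repositories" -type f -name link |
    while IFS= read -r f; do
      grep -Eqx 'sha256:[0-9a-f]{64}' "$f" || printf '%s\n' "$f"
    done
}

# Usage (assumes REGISTRY_FS_PATH is set as described in Setup):
#   tag_revision "$REGISTRY_FS_PATH" myorg/myproj/myimg 1.0.0
#   find_bad_links "$REGISTRY_FS_PATH"
```

After step 2, find_bad_links should report only the emptied link file.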

Results

Here you can see the difference between the two runs. The blobs used exclusively by the broken image revision (myorg/myproj/myimg:1.0.0, ddf40728) are marked as eligible for deletion. Running the garbage collector without the dry-run flag (-d) will erase them.
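
To reproduce the comparison locally, the output of each dry run can be saved to a file and filtered. This is a minimal sketch: new_candidates and the log file names are hypothetical, and it assumes the relevant output lines contain the phrase "eligible for deletion" and carry no per-run fields such as timestamps.

```shell
# eligible FILE
# Hypothetical helper: prints the sorted, unique lines of FILE that
# mention deletion eligibility (prints nothing if there are none).
eligible() {
  { grep -F 'eligible for deletion' "$1" || true; } | LC_ALL=C sort -u
}

# new_candidates BEFORE AFTER
# Hypothetical helper: prints the eligibility lines that appear in AFTER
# but not in BEFORE.
new_candidates() {
  before="$(mktemp)"
  after="$(mktemp)"
  eligible "$1" > "$before"
  eligible "$2" > "$after"
  LC_ALL=C comm -13 "$before" "$after"
  rm -f "$before" "$after"
}

# Usage (run1.log and run2.log are example file names):
#   ./bin/registry garbage-collect "$REGISTRY_CONFIG" -m -d > run1.log 2>&1
#   ... empty the link file as in step 2 ...
#   ./bin/registry garbage-collect "$REGISTRY_CONFIG" -m -d > run2.log 2>&1
#   new_candidates run1.log run2.log
```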

Edited by João Pereira