Add a "janitor" daemon to clean up orphaned blobs from S3

Adam Coldrick requested to merge sotk/features/cas-janitor into master

Description

This MR adds a bgd janitor start command, which runs a daemon that compares the contents of the matching S3 buckets with the contents of an SQL or Redis CAS index and removes from the buckets anything that is not in the index. This can be used to clean up any blobs in S3 which slip past the regular cleanup daemon for whatever reason. It is particularly useful with a Redis index, where the regular cleaner removes keys regardless of whether the S3 deletion succeeded.
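The comparison described above amounts to a set difference between the bucket listing and the index. The following is a minimal sketch of that idea only, not the code in this MR: find_orphans and janitor_pass are hypothetical names, and a plain dict and set stand in for the S3 bucket and the CAS index, assuming both are keyed by the same blob identifier.

```python
from typing import Dict, Iterable, List, Set


def find_orphans(bucket_keys: Iterable[str], indexed_keys: Set[str]) -> List[str]:
    """Return the keys that exist in the bucket but not in the index."""
    return [key for key in bucket_keys if key not in indexed_keys]


def janitor_pass(bucket: Dict[str, bytes], index: Set[str]) -> List[str]:
    """Run one cleanup pass and return the keys that were removed.

    `bucket` is an in-memory stand-in for an S3 bucket (hypothetical);
    a real daemon would list and delete objects through an S3 client.
    """
    # Snapshot the keys first so the dict is not mutated while iterating.
    orphans = find_orphans(list(bucket), index)
    for key in orphans:
        del bucket[key]
    return orphans


if __name__ == "__main__":
    bucket = {"aaa_3": b"foo", "bbb_5": b"hello", "ccc_1": b"x"}
    index = {"aaa_3", "ccc_1"}
    print(janitor_pass(bucket, index))
```

A daemon would repeat a pass like this on an interval. Note that only blobs missing from the index are touched, so a pass is safe to re-run.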

Validation

Use the provided docker compose examples to validate the janitor's behaviour.

SQL

docker compose -f docker-compose-examples/s3-cas.yml up --build

Upload something to CAS using casupload, trexe, or the bgd CLI. For example:

tox -e venv -- bgd execute --remote-cas=http://localhost:50052 command data ls

Head to http://localhost:9001/ (credentials are minioadmin:minioadmin) and see the contents of the buildgrid bucket.

Connect to PostgreSQL (bgd:insecure@localhost:5432) using your preferred client, and delete some rows from the index table. Within 10 seconds the janitor should delete the corresponding blobs from S3, which can be checked in the MinIO UI.

Redis

docker compose -f docker-compose-examples/redis-index.yml up --build

Upload something to CAS and check that the objects are in MinIO, as described in the SQL section.

View the Redis index using the redis-commander UI at http://localhost:8081/. Delete some keys there, and the janitor should delete the corresponding blobs from S3.
