Implement LRU cache expiry
Overview
buildbox-casd manages a local cache of CAS blobs. Cache expiry is thus designed for local filesystems and not for distributed blob storage (the BuildGrid CAS server is expected to cover the latter use case).
- Blob-based LRU expiry strategy: Delete the blobs with the oldest timestamps without taking references into account. That is, there is no special handling for file and subdirectory objects referenced by directory objects. All code must be able to handle dangling references.
- Configurable low/high watermark for disk usage of the cache: Trigger expiry when cache disk usage reaches the high watermark. Delete the oldest blobs until disk usage falls below the low watermark.
- Keep track of disk usage in memory (atomic integer). This is possible as only a single process (casd) is allowed to create or delete files in the cache.
- Use filesystem mtime as blob timestamp (atime is not consistently available). No separate database/index.
- Update blob timestamp on every operation that refers to a particular blob
  - REAPI methods FindMissingBlobs, BatchReadBlobs, BatchUpdateBlobs, and ByteStream Read/Write
  - LocalCAS methods FetchMissingBlobs, UploadMissingBlobs, FetchTree, UploadTree, StageTree, CaptureTree, and CaptureFiles internally build on top of the functionality provided by the REAPI methods and will also update blob timestamps.
- casd clients that read blobs directly from the filesystem (bypassing gRPC) are not expected to update timestamps on access. However, they are expected to call FetchMissingBlobs or FetchTree before direct blob access.
- Synchronization/locking: No locking across processes (clients); casd-internal synchronization is to be determined.
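
The expiry scheme listed above can be illustrated with a minimal Python sketch. This is not the buildbox-casd implementation (which is not Python); the `BlobCache` class, its method names, and the flat directory layout are all hypothetical and chosen only for illustration. A lock-protected counter stands in for the atomic integer, and watermarks are given in bytes.

```python
import os
import threading
from pathlib import Path


class BlobCache:
    """Hypothetical sketch: mtime-based LRU expiry with low/high watermarks."""

    def __init__(self, root, low_watermark, high_watermark):
        if not 0 <= low_watermark <= high_watermark:
            raise ValueError("expected 0 <= low_watermark <= high_watermark")
        self.root = Path(root)  # assumption: one flat directory of blob files
        self.low_watermark = low_watermark
        self.high_watermark = high_watermark
        self._lock = threading.Lock()  # stands in for an atomic integer
        # Scan once at startup; afterwards usage is tracked in memory only,
        # which works because this process is the only writer.
        self._usage = sum(p.stat().st_size for p in self._blobs())

    def _blobs(self):
        return [p for p in self.root.iterdir() if p.is_file()]

    @property
    def usage(self):
        with self._lock:
            return self._usage

    def touch(self, name):
        """Refresh a blob's mtime; return False if the blob is missing."""
        try:
            os.utime(self.root / name)  # sets atime and mtime to "now"
            return True
        except FileNotFoundError:
            return False  # possibly expired: callers must tolerate this

    def add_blob(self, name, data):
        path = self.root / name
        if path.exists():
            self.touch(name)  # existing blob: only refresh its timestamp
            return
        path.write_bytes(data)
        with self._lock:
            self._usage += len(data)
        if self.usage >= self.high_watermark:
            self.expire()

    def expire(self):
        """Delete oldest blobs until usage falls below the low watermark."""
        entries = []
        for path in self._blobs():
            st = path.stat()
            entries.append((st.st_mtime_ns, st.st_size, path))
        entries.sort(key=lambda entry: entry[0])  # oldest mtime first
        for _, size, path in entries:
            if self.usage < self.low_watermark:
                break
            path.unlink()  # references to this blob may now dangle
            with self._lock:
                self._usage -= size
```

In this sketch, every operation that refers to a blob would call `touch`, and a `False` return signals a blob that is missing, for example because it has been expired, mirroring the requirement that all code handle dangling references. No index is kept; the filesystem mtime is the only ordering information.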
Related work
The planned approach is similar to what is currently implemented in BuildStream's bst-artifact-server, which works much better than the previous approach of deleting the oldest top-level references followed by (expensive) pruning of unreachable objects. BuildStream local cache expiry still uses that previous approach. The goal is to migrate the BuildStream local cache to buildbox-casd and thus use the approach described above.
Edited by Jürg Billeter