Improve buildbox-casd startup times
Context
See https://lists.buildgrid.build/pipermail/buildgrid/2020-March/000194.html and its followups for more context and related discussion.
To summarize, buildbox-casd currently calculates disk usage during startup, which may take quite a while depending on the size of the cache. This causes issues with clients (like BuildStream) that expect to connect without a long wait. See this BuildStream issue for details.
Task Description
One idea proposed on the list was to write a metadata file at the root of the store. Here's how I think this could work:
- If the file is there, casd trusts it and assumes the disk usage is what's written in the file.
- If the file is not there, casd calculates disk usage, as it does today.
- To protect against corruption or ungraceful shutdowns, casd can delete the metadata file on the first write to the store, so that a stale value is never trusted after a crash.
- When casd is shutting down (gracefully), it atomically writes the metadata file.
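The lifecycle described in the list above could be sketched roughly as follows. This is only an illustration in Python, not casd's actual implementation: the file name `disk_usage.json`, the JSON key `disk_usage_bytes`, and the `DiskUsageTracker` class are all hypothetical, and the full scan is passed in as a callable so the sketch stays independent of how usage is really computed.

```python
import json
import os
import tempfile
from pathlib import Path

METADATA_NAME = "disk_usage.json"  # hypothetical name, not defined by casd


class DiskUsageTracker:
    """Illustrative only: trust the metadata file if present, else scan."""

    def __init__(self, root, scan):
        self.root = Path(root)
        self.meta = self.root / METADATA_NAME
        self.dirty = False
        try:
            # Happy path: the previous process shut down gracefully.
            data = json.loads(self.meta.read_text())
            self.usage = int(data["disk_usage_bytes"])
            self.trusted = True
        except (OSError, ValueError, KeyError, TypeError):
            # Missing or unreadable file: fall back to the slow full scan.
            self.usage = scan(self.root)
            self.trusted = False

    def record_write(self, nbytes):
        # The first write invalidates the file, so a crash after this
        # point forces a rescan on the next startup.
        if not self.dirty:
            self.meta.unlink(missing_ok=True)  # requires Python 3.8+
            self.dirty = True
        self.usage += nbytes

    def shutdown(self):
        # Write to a temporary file in the same directory, then rename it
        # over the final name so readers never see a partial file.
        fd, tmp = tempfile.mkstemp(dir=self.root, prefix=".usage-")
        try:
            with os.fdopen(fd, "w") as f:
                json.dump({"disk_usage_bytes": self.usage}, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, self.meta)
        except BaseException:
            os.unlink(tmp)
            raise
```

The temporary file is created in the store root on purpose: `os.replace` is only atomic when source and destination are on the same filesystem.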
Separately, the mechanism to calculate disk usage and write the metadata file can be made into a standalone script as well.
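A standalone version of that mechanism might look something like the sketch below. It rests on several assumptions that would need checking against casd: the metadata file name and JSON key are made up for illustration, usage is taken to be the sum of apparent file sizes (so hard-linked files are counted once per link), and the script is assumed to run only while no casd process is writing to the store.

```python
import json
import os

METADATA_NAME = "disk_usage.json"  # hypothetical name, not defined by casd


def compute_disk_usage(root):
    """Sum the apparent sizes of all files under root, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Do not count the metadata file itself.
            if dirpath == root and name == METADATA_NAME:
                continue
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except FileNotFoundError:
                pass  # file vanished while scanning
    return total


def write_metadata(root, usage):
    """Write the metadata file via a temporary file and an atomic rename."""
    final = os.path.join(root, METADATA_NAME)
    tmp = final + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"disk_usage_bytes": usage}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, final)


def refresh_metadata(root):
    """Recompute disk usage for root and record it; returns the byte count."""
    usage = compute_disk_usage(root)
    write_metadata(root, usage)
    return usage
```

Calling `refresh_metadata("/path/to/store")` from a small command-line wrapper would let an administrator regenerate the file after an ungraceful shutdown, so that the next casd startup can take the fast path.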
Acceptance Criteria
buildbox-casd startup is almost instantaneous in the happy path (i.e. when the previous casd process shut down gracefully).