Improve buildbox-casd startup times

Context

See https://lists.buildgrid.build/pipermail/buildgrid/2020-March/000194.html and its followups for more context and related discussion.

To summarize, buildbox-casd currently calculates disk usage during startup, which can take a long time when the cache is large. This causes problems for clients (like BuildStream) that expect to connect without a long wait. See this BuildStream issue for details on that.

Task Description

One idea proposed on the list was to write a metadata file at the root of the store. Here's how I think this could work:

  • If the file is there, casd trusts it and assumes the disk usage is what's written in the file.
  • If the file is not there, casd calculates disk usage the same way it does today.
  • To protect against corruption or ungraceful shutdowns, casd deletes the metadata file on the first write to the store, so a crash never leaves a stale value behind.
  • When casd shuts down gracefully, it atomically writes the metadata file.
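
The lifecycle above could be sketched roughly as follows. This is an illustration in Python, not casd's actual code: the file name `.disk_usage.json`, the JSON key `disk_usage_bytes`, and the `DiskUsageTracker` class are all made up for the example, and falling back to a scan when the file is unreadable is an extra assumption on top of the proposal.

```python
import json
import os
import tempfile
from pathlib import Path

METADATA_NAME = ".disk_usage.json"  # hypothetical file name, not defined by casd


def scan_disk_usage(root: Path) -> int:
    """Slow path: sum the apparent sizes of all files under root."""
    metadata_path = root / METADATA_NAME
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path == metadata_path:
                continue  # the metadata file is not part of the cache
            total += os.lstat(path).st_size
    return total


def write_metadata_atomically(root: Path, usage: int) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp = tempfile.mkstemp(dir=root, prefix=".disk_usage.", suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"disk_usage_bytes": usage}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, root / METADATA_NAME)  # atomic on the same filesystem


class DiskUsageTracker:
    """Hypothetical helper showing the startup / first write / shutdown steps."""

    def __init__(self, root):
        self.root = Path(root)
        self.metadata_path = self.root / METADATA_NAME
        self._metadata_removed = False
        self.usage = self._load_or_scan()

    def _load_or_scan(self) -> int:
        try:
            with open(self.metadata_path) as f:
                return int(json.load(f)["disk_usage_bytes"])  # fast path: trust the file
        except (FileNotFoundError, ValueError, KeyError, TypeError):
            # Missing or unreadable file (assumption: treat both the same way).
            return scan_disk_usage(self.root)

    def record_write(self, nbytes: int) -> None:
        if not self._metadata_removed:
            # First write: the stored value is about to become stale, so drop it.
            self.metadata_path.unlink(missing_ok=True)
            self._metadata_removed = True
        self.usage += nbytes

    def shutdown(self) -> None:
        """Graceful shutdown: persist the current value for the next startup."""
        write_metadata_atomically(self.root, self.usage)
```

If the process dies between the first write and `shutdown()`, the file is absent on the next start and the full scan runs, which matches the non-happy path described above.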

Separately, the mechanism to calculate disk usage and write the metadata file can be made into a standalone script as well.
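
A standalone version might look something like the sketch below. It is only an illustration: the file name `.disk_usage.json`, the `disk_usage_bytes` key, and the command-line interface are assumptions made for the example, and it sums apparent file sizes (`st_size`), which may differ from how casd itself measures usage.

```python
#!/usr/bin/env python3
import argparse
import contextlib
import json
import os
import tempfile

METADATA_NAME = ".disk_usage.json"  # hypothetical file name


def compute_disk_usage(root: str) -> int:
    """Walk the store and sum the apparent size of every file except the metadata file."""
    metadata_path = os.path.abspath(os.path.join(root, METADATA_NAME))
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            if path == metadata_path:
                continue
            with contextlib.suppress(FileNotFoundError):  # file removed mid-walk
                total += os.lstat(path).st_size
    return total


def write_metadata(root: str, usage: int) -> None:
    """Write the metadata file via a temp file and an atomic rename."""
    fd, tmp = tempfile.mkstemp(dir=root, prefix=".disk_usage.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"disk_usage_bytes": usage}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, os.path.join(root, METADATA_NAME))
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp)  # do not leave a partial temp file behind
        raise


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(
        description="Compute disk usage of a store and write the metadata file."
    )
    parser.add_argument("store_root", help="path to the root of the store")
    args = parser.parse_args(argv)

    usage = compute_disk_usage(args.store_root)
    write_metadata(args.store_root, usage)
    print(f"{usage} bytes")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Such a script would only be safe to run while casd is stopped, since a running casd could change the store after the scan and leave the written value stale.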

Acceptance Criteria

buildbox-casd startup is almost instantaneous in the happy path (i.e. when the previous casd process shut down gracefully).