Strengthen the public cache

Context

Unfortunately the public cache is currently fragile, unable to handle even a single freedesktop-sdk pipeline. I think this is mostly because the cache is empty and freedesktop-sdk caches so much that a single pipeline pushes upwards of 16GiB at once, but we should investigate the cause and strengthen things.

The current behaviour is that 1) pushing is very slow, and 2) bb-storage instances get evicted due to memory pressure. At first we thought this was a bug in bb-storage, but patching what we thought was responsible did not help. The current working hypothesis is that poor load balancing (almost everything is routed to the same bb-storage) and long S3 push times conspire to eventually destroy the cache under high workloads.

I suspect that this won't happen once the cache is populated a bit, as the throughput won't need to be as high.
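
To make the load-balancing part of the hypothesis concrete, here is a minimal sketch of digest-based sharding: if blobs are assigned to storage instances by their content hash, the load should spread roughly evenly instead of landing on one instance. This is an illustration only, not Buildbarn's actual routing code; `pick_shard`, `simulate`, the blob names and the shard count of 4 are all hypothetical.

```python
import hashlib
from collections import Counter


def pick_shard(digest_hex: str, shard_count: int) -> int:
    """Map a hex content digest to a shard index (hypothetical helper).

    Uses the leading 64 bits of the digest, which are effectively uniform
    for a cryptographic hash such as SHA-256.
    """
    return int(digest_hex[:16], 16) % shard_count


def simulate(blob_count: int, shard_count: int) -> Counter:
    """Count how many synthetic blobs land on each shard."""
    counts = Counter()
    for i in range(blob_count):
        # Synthetic blob contents; real digests would come from the CAS.
        digest = hashlib.sha256(f"blob-{i}".encode()).hexdigest()
        counts[pick_shard(digest, shard_count)] += 1
    return counts


if __name__ == "__main__":
    counts = simulate(10_000, 4)
    for shard in sorted(counts):
        print(f"shard {shard}: {counts[shard]} blobs")
```

If the real distribution across bb-storage instances looks very different from the near-even split this produces, that would support the idea that routing, rather than bb-storage itself, is the problem.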

Requirements

A cache solution needs to provide the following:

  • High throughput.
  • High availability.
  • High capacity.

Also see https://github.com/buildbarn/bb-adrs/blob/master/0002-storage.md for a comprehensive discussion from a buildbarn standpoint.

Existing cache solutions for freedesktop connect to a single fast (SSD?) disk. This fulfills the throughput and capacity requirements but not availability. It represents a single point of failure and will not scale when we move to a situation with high numbers of clients.

Edited by Chris Phang