Add `read_at` data to the dependency proxy
Proposal
Based on the discussion linked below, we will:
- Add a `read_at` column to `dependency_proxy_blobs` and `dependency_proxy_manifests`.
- Update the dependency proxy services to update this value anytime an image is pulled.
- Update the cleanup workers to expire based off of `read_at` instead of `updated_at`.
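As a rough illustration of the three steps above, here is a minimal, language-agnostic sketch written in Python. It is not GitLab's actual implementation. The names `CachedEntry`, `record_pull`, and `expired_entries` are hypothetical, `ttl_days` is an assumed parameter modelled on the `ttl` mentioned in the discussion, and initialising `read_at` to the creation time is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class CachedEntry:
    """Hypothetical stand-in for a row in dependency_proxy_blobs
    or dependency_proxy_manifests."""
    name: str
    created_at: datetime
    read_at: datetime  # proposed column: last time the entry was pulled


def record_pull(entry: CachedEntry, now: datetime) -> None:
    """Hypothetical service hook: refresh read_at whenever an image is pulled."""
    entry.read_at = now


def expired_entries(
    entries: List[CachedEntry], ttl_days: int, now: datetime
) -> List[CachedEntry]:
    """Hypothetical cleanup query: entries not read within the last ttl_days."""
    cutoff = now - timedelta(days=ttl_days)
    return [e for e in entries if e.read_at < cutoff]


# Arbitrary sample data: two entries cached 120 days ago.
now = datetime(2021, 10, 1)
old = now - timedelta(days=120)
busy = CachedEntry("busy-blob", created_at=old, read_at=old)  # assumed default
idle = CachedEntry("idle-blob", created_at=old, read_at=old)  # assumed default

# Only one of them was pulled recently (for example by a CI pipeline).
record_pull(busy, now - timedelta(days=1))

stale = expired_entries([busy, idle], ttl_days=90, now=now)
print([e.name for e in stale])  # ['idle-blob']
```

Both entries are equally old, but only the one that has not been pulled within the TTL is selected for cleanup. A creation-time check would have expired both.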
The following discussion from !70029 (merged) should be addressed:
- @10io started a discussion: (+4 comments)

  > Qualified blobs/manifests are based on their `created_at` attribute.
  >
  > Shouldn't we take into account the last time they were read?
  >
  > Imagine the situation of a single blob accessed over and over all the time by CI pipelines. Why would we want to remove that blob from the cache?
  >
  > It seems more accurate to remove blobs that have not been read for `ttl` days. That means that each time a blob/manifest is read we should update a timestamp. Ideally, we should have a `read_at` attribute for that but we could also use the `updated_at` for that although it's not really an update that we're doing.
  >
  > WDYT?
Edited by Tim Rizzi