Caching in cleanup policy background jobs
⛲ Context
With time, the container registry accumulates container image tags, and those tags take up physical space on object storage.
To counter that, we implemented cleanup policies: users create a set of filters that allow the backend to distinguish between a tag that needs to be kept and a tag that can be destroyed.
Cleanup policies are executed periodically by background jobs. There is a cadence value that users can set but, to keep things simple, let's say that the backend runs the policies daily.
Now, container image tags don't live in the rails backend database. Those objects live directly in the container registry. So, when the backend runs a policy against a container image, it has to contact the container registry (through an API) to get all the information about its tags.
Let's detail these interactions (simplified):
```mermaid
sequenceDiagram
autonumber
rails cleanup tags service->>rails cleanup tags service: Run the cleanup policy on this container image X
rails cleanup tags service->>container registry: Give me all the tags of container image X
container registry->>rails cleanup tags service: Array of tag names
rails cleanup tags service->>rails cleanup tags service: Apply filters F1 on the array of tag names
loop **For each tag name**
rails cleanup tags service->>container registry: get the created at timestamp
container registry->>rails cleanup tags service: return the created at timestamp
end
rails cleanup tags service->>rails cleanup tags service: Apply filters F2 on tags with created at
rails cleanup tags service->>rails delete tags service: hey delete these tags!
rails delete tags service->>rails cleanup tags service: ok
```
- F1: this set of filters applies to tag names (such as a regex).
- F2: these filters need the `created_at` field, as they work on the tag list ordered by `created_at`.

You can see in the interactions above that F2 triggers a loop where the backend makes one API call per tag to get its `created_at`. These are the constraints we need to work with. At some point, container registry updates will unlock evolutions such as returning the list of tags with their names and their `created_at` fields in a single API call. Until we have those updates, we need to work with two API endpoints: one to get the list of tags (names only) and one to get the `created_at` of a single tag.
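To make the constraint concrete, here is a minimal Ruby sketch of the current flow. All names here (`CleanupTagsSketch`, `tag_names`, `created_at`) are illustrative stand-ins, not the actual service or registry client API:

```ruby
# Illustrative sketch only: `registry` is a stand-in for the real
# container registry API client (method names are assumptions).
class CleanupTagsSketch
  def initialize(registry)
    @registry = registry
  end

  # F1 works on names alone; F2 needs created_at, which currently
  # costs one extra API call per tag -- the loop we want to reduce.
  def tags_to_destroy(image, name_regex:, older_than:)
    names = @registry.tag_names(image)          # 1 API call: names only
    matching = names.grep(name_regex)           # F1: name-based filters
    tags = matching.map do |name|
      { name: name,
        created_at: @registry.created_at(image, name) } # N API calls
    end
    tags.select { |tag| tag[:created_at] < Time.now - older_than } # F2
  end
end
```

With 200 tags surviving F1, that is up to 201 registry calls for a single container image, even when F2 ends up selecting nothing.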
To complete this context, understand that container image tags are mutable: users can delete a tag and make it point to a different image. The constraint here is that they do this by interacting directly with the container registry. As such, the rails backend has no clue about what happened to a tag.
💥 Problem
The main issue is that F2 filters out so many tags that the resulting list of tags (i.e. the list of tags to destroy) is empty.

See this Kibana dashboard. We're not going into the details of that dashboard, but look at the `Deleting tags` section: its ratio relative to the rest is tiny. In other words, in the majority of job executions, we make all these API calls to the container registry for nothing.

This is not efficient at all.

Here is the p95 of `external_http_count` over the last 24 hours for jobs that didn't destroy any tag.
🔨 Solution
The proposed solution aims to reduce the loop of pings to get `created_at` values. This will make the background job more efficient, and so the backend will make better use of its background resources.

The idea is to have a cache for some tags (not all of them). We need to be extra cautious about what we cache because, as presented in the context, tags are mutable objects and the rails backend is not notified when a tag is "updated". In other words, caching a tag can be dangerous because we could easily end up with stale data in the cache.
Let's see if we can work around those limitations. Here are the updated steps around F2 in the `cleanup_tags_service.rb`:

1. Receive the result of F1 (array of tag names).
2. Read the `older_than` parameter from the cleanup policy.
3. Remove from the cache all entries older than `older_than`.
4. For each tag, check the cache and fill in the `created_at`.
5. For each tag without a `created_at`, ping the container registry.
6. Apply the F2 filters.
7. For each tag filtered out, create a new cache entry for the tag with its `created_at` if a cache entry doesn't already exist.
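The steps above could be sketched like this. To be clear, this is a hedged sketch and not the real `cleanup_tags_service.rb`: `cache` (`#get`/`#set`) and `registry` (`#created_at`) are assumed duck types, and the key format anticipates the one proposed in the implementation section:

```ruby
require 'time'

# Hedged sketch of steps 1-7; `cache` and `registry` are assumed
# duck types, not real GitLab classes.
def filter_tags_to_destroy(image_id, tag_names, cache:, registry:, older_than:)
  cutoff = Time.now - older_than                     # step 2: policy parameter

  tags = tag_names.map do |name|                     # step 1: result of F1
    key = "container_repository:{#{image_id}}:cleanup_tag:#{name}"
    cached = cache.get(key)                          # step 4: cache lookup
    created_at = cached && Time.parse(cached)
    # step 3: ignore entries older than older_than (a redis TTL would
    # normally have evicted them already)
    created_at = nil if created_at && created_at < cutoff
    created_at ||= registry.created_at(image_id, name) # step 5: registry call
    { name: name, key: key, created_at: created_at }
  end

  to_destroy, to_keep = tags.partition { |t| t[:created_at] < cutoff } # step 6: F2

  to_keep.each do |t|                                # step 7: cache only kept tags
    cache.set(t[:key], t[:created_at].iso8601, ttl: older_than)
  end

  to_destroy
end
```

On the second run over the same tags, only the tags that were candidates for destruction trigger a registry call; the kept tags are served from the cache.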
I will emphasize here that cache entries are only created for tags that are kept, i.e. filtered out by F2. Why? Because there is no sense in creating a cache entry for a tag that is about to be destroyed.
A few things to note:
- Cache entries have a TTL. I chose to use the cleanup policy `older_than` parameter, but we could choose something shorter, such as `older_than / 2`.
- The job itself will not evict cache entries.
- Optional: the job could log the cache hit ratio.
Why does this custom cache logic work with the mutable nature of tags? Because we snapshot the `created_at` field only for tags that we want to keep, and when it's time to destroy a tag, it's not the tag's cache entry we rely on: the entry has been removed, which "forces" the job to ping the container registry for the latest `created_at`. In other words, the job destroys tags using information that comes only from the container registry.
By avoiding the stale data, the job supports any tag mutation. Here is an example (assuming `older_than` = 90 days):

- `T`: the tag `t` is created.
- `T + 1.day`: the job runs and caches `T` for tag `t`.
- `T + 10.days`: the tag is destroyed.
- `T + 20.days`: the tag is recreated (same name).
- `T + 91.days`:
  - the job runs and removes the cache entry for `t` (`T` is older than `older_than`),
  - the job pings the container registry for `t` and receives `T + 20.days`,
  - the job filters out `t` (`T + 20.days` is within `older_than`),
  - the job creates a new cache entry with `T + 20.days` for tag `t`.
The tag mutation didn't lead the job to a wrong decision. The cache was properly updated with the new `created_at`.
💡 Technical details ideas
The cache itself can be implemented in two ways:

- redis
- a regular database table

It has to support the following operations:

1. Create an entry with: container image id, tag name, `created_at`.
2. Get the entry for a given container image id and tag name.
3. Given a container image id, remove all the cache entries where `created_at` is `< Time.now - policy.older_than`.
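To pin the contract down before picking a backend, here is a minimal in-memory model of those three operations. This is a sketch under assumed method names (`put`/`fetch`/`evict_older_than`), not a proposal for the final API:

```ruby
# In-memory model of the three cache operations; names are assumptions.
class TagCreatedAtCache
  def initialize
    @entries = {} # [container_image_id, tag_name] => created_at (Time)
  end

  # Operation (1.): create an entry.
  def put(image_id, tag_name, created_at)
    @entries[[image_id, tag_name]] = created_at
  end

  # Operation (2.): read the entry for one image id + tag name.
  def fetch(image_id, tag_name)
    @entries[[image_id, tag_name]]
  end

  # Operation (3.): drop every entry of one image whose created_at is
  # older than the policy window. With redis this becomes a key TTL.
  def evict_older_than(image_id, older_than)
    cutoff = Time.now - older_than
    @entries.delete_if { |(id, _name), created_at| id == image_id && created_at < cutoff }
  end
end
```

A database table would implement operation (3.) as a `DELETE ... WHERE` pass that the job (or a separate job) has to run; redis can do the equivalent for free, as discussed next.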
Here are some numbers from gitlab.com:

- ~50K container images are processed by cleanup policies
  - keep in mind that we're rolling out cleanup policies to all projects, so this number will only increase
- before applying F1, we truncate the list of tags to a maximum of 200
- so, the worst-case scenario is handling 50K * 200 = ~10M cache entries
Redis feels better geared to this problem than a database. Operation (3.) can be implemented with a TTL on the redis key directly: the job doesn't have to evict cache entries, redis will do that for us automatically.

The amount of data to be cached can be a concern. As such, I would suggest using a feature flag to "mark" the projects (or namespaces) that can use this cache feature.
⚙ Implementation
Following #339129 (comment 660056458), redis is a better candidate for what we want to do.
- Step (3.): there is no need for this step with redis. Using `SET`, we can set an expiration time. In other words, redis itself will handle this part for us.
- Step (4.): to get the value from the cache, we can use `GET`. Nothing special here.
- Step (7.): to set a value, we can use `SET` as described above.
  - Regarding the TTL, the entry should expire at `created_at + older_than`.
- The key for `SET` and `GET` should be `container_repository:{<id>}:cleanup_tag:<tag name>`. The container image id and the tag name must be part of the key. The value is, well, the `created_at` value.
- Avoid n+1 redis operations since we're looping on a list of tags. For this, use redis pipelines.
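Putting the redis pieces together, the pipelined reads and writes could look like the sketch below. It assumes a redis-rb style client (`GET`, `SET` with the `ex:` expiry option, and the block form of `pipelined`); the helper names are mine, not existing code:

```ruby
require 'time'

# Key format from above; image id and tag name are both part of the key.
def cleanup_cache_key(image_id, tag_name)
  "container_repository:{#{image_id}}:cleanup_tag:#{tag_name}"
end

# Step (4.): fetch all cached created_at values in one round trip.
def read_cached_created_ats(redis, image_id, tag_names)
  values = redis.pipelined do |pipeline|
    tag_names.each { |name| pipeline.get(cleanup_cache_key(image_id, name)) }
  end
  tag_names.zip(values).to_h { |name, value| [name, value && Time.parse(value)] }
end

# Step (7.): write the kept tags in one round trip. The EX option makes
# each entry expire at created_at + older_than, which covers step (3.).
def write_cache_entries(redis, image_id, tags_with_created_at, older_than)
  redis.pipelined do |pipeline|
    tags_with_created_at.each do |name, created_at|
      ttl = (created_at + older_than - Time.now).to_i
      next if ttl <= 0 # already past the window: nothing to cache
      pipeline.set(cleanup_cache_key(image_id, name), created_at.iso8601, ex: ttl)
    end
  end
end
```

With pipelining, a job handling 200 tags issues two network round trips to redis (one batch of `GET`s, one batch of `SET`s) instead of up to 400 sequential commands.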