Make CI build chunks in Redis compatible with Redis Cluster
https://docs.gitlab.com/ee/administration/job_logs.html#new-incremental-logging-architecture describes a new logging architecture, where we append CI job logs in Redis while the job is running, then archive from Redis to object storage when it finishes.
This is not currently enabled on GitLab.com, but we're working on that in #34781 (closed) and the related issues.
Currently, the key names are not compatible with Redis Cluster. We aren't currently planning to use Redis Cluster (gitlab-com/gl-infra&211 (comment 364207522)), but if we do want that option in future, it would be good to get new large uses of multi-key commands addressed early (we are adding a validator for this in gitlab-com/gl-infra/scalability#305 (closed)). In this case, as the feature is not live yet, it could be a good candidate.
The problem with Redis Cluster compatibility here is that we perform multi-key operations on keys that would hash to different slots. One example is in Ci::BuildTraceChunks::Redis#delete_keys
, where we can delete multiple keys at once. Those keys look like this:
"gitlab:ci:trace:#{build_id.to_i}:chunks:#{chunk_index.to_i}"
It's actually not clear to me when we'd call this with multiple keys, apart from in specs. If the keys are always for the same build ID (as in the specs), to make this compatible with Redis Cluster would just be a case of wrapping the common elements in curly braces:
"{gitlab:ci:trace:#{build_id.to_i}}:chunks:#{chunk_index.to_i}"
The potential issues there are:
- Do we need to migrate any existing keys to this new format if we change it? My hope is that the off-by-default nature of this means we don't, but maybe that's too optimistic.
- Is my assumption that this is always for the same build ID correct?