
CI trace chunks using lots of memory in redis-persistent

On gitlab.com we recently saw memory utilization grow on redis-persistent: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13044.

The analysis there found that a large portion of memory is being consumed by gitlab:ci:trace:$ID:chunks. We suspect, but have not confirmed, that these keys are responsible for the recent growth.

The top key patterns, with sizes in bytes, show about 7 GB consumed by trace chunks (source):

gitlab:ci:trace:$ID:chunks	7156296774
session:gitlab:2::$ID	2717414786
session:user:gitlab:$PATTERN	646131774
projects/$ID/pushes_since_gc	502382174
projects/$ID/fetches_since_gc	212051119
etag:$PATH	64249403
...
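
To put those byte counts in perspective, here is a minimal Python sketch that converts them to GB and computes each pattern's share. The shares are relative to the six rows listed above only, not to total Redis memory, because the source list is truncated. The names `USAGE_BYTES` and `summarize` are illustrative.

```python
# Byte counts copied from the key-pattern list above.
# The list is truncated in the source, so totals cover these six rows only.
USAGE_BYTES = {
    "gitlab:ci:trace:$ID:chunks": 7156296774,
    "session:gitlab:2::$ID": 2717414786,
    "session:user:gitlab:$PATTERN": 646131774,
    "projects/$ID/pushes_since_gc": 502382174,
    "projects/$ID/fetches_since_gc": 212051119,
    "etag:$PATH": 64249403,
}


def summarize(usage):
    """Return (pattern, gigabytes, share_of_listed_total) rows, largest first."""
    total = sum(usage.values())
    rows = [(pattern, size / 1e9, size / total) for pattern, size in usage.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for pattern, gigabytes, share in summarize(USAGE_BYTES):
        print(f"{pattern:<32} {gigabytes:6.2f} GB  {share:6.1%}")
```

Among the listed patterns, trace chunks account for well over half of the bytes, which is why they are the first suspect.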

We need to figure out:

  • Are CI trace chunks getting stuck in Redis?
  • Are we at risk of going OOM, and are there short-term mitigations we can apply to avoid disaster?
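
One way to approach the first question is to sample trace chunk keys and inspect their TTL and size. The following is a minimal sketch, not GitLab's own tooling. It assumes a redis-py style client exposing `scan_iter`, `ttl`, and `memory_usage`; it assumes the concrete keys match the glob `gitlab:ci:trace:*:chunks*` (inferred from the pattern name above, so verify it first); and it treats a key without an expiry as a candidate for being stuck, which is itself an assumption about how chunks are meant to be cleaned up. The names `ChunkStat`, `sample_chunks`, and `without_expiry` are hypothetical.

```python
from typing import List, NamedTuple


class ChunkStat(NamedTuple):
    key: str
    ttl_seconds: int  # -1 means the key has no expiry set
    size_bytes: int


def sample_chunks(client, pattern="gitlab:ci:trace:*:chunks*", limit=1000) -> List[ChunkStat]:
    """Collect TTL and memory usage for up to `limit` keys matching `pattern`.

    `client` is assumed to behave like redis-py's `redis.Redis`.
    The default `pattern` is a guess and should be checked against real keys.
    """
    stats = []
    for raw_key in client.scan_iter(match=pattern, count=1000):
        if len(stats) >= limit:
            break
        ttl = client.ttl(raw_key)
        if ttl == -2:
            # The key disappeared between SCAN and TTL; skip it.
            continue
        size = client.memory_usage(raw_key) or 0
        key = raw_key.decode() if isinstance(raw_key, bytes) else raw_key
        stats.append(ChunkStat(key, ttl, size))
    return stats


def without_expiry(stats: List[ChunkStat]) -> List[ChunkStat]:
    """Return sampled keys that have no TTL, largest first."""
    stuck = [s for s in stats if s.ttl_seconds == -1]
    return sorted(stuck, key=lambda s: s.size_bytes, reverse=True)
```

If this were run for real, pointing the client at a replica rather than the primary would likely be safer, since SCAN and MEMORY USAGE add load to an instance that is already under memory pressure.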

And as follow-up items:

cc @andrewn @grzesiek