context deadline exceeded for Sidekiq pod performing background migration

On a pod that is performing a background migration, we are seeing a repeating sequence of readiness probe failures followed by the container being recreated.

https://log.gprd.gitlab.net/goto/995225d65af4f15873a5db29795fd151

  • Aug 30, 2021 @ 08:38:59 Started container Sidekiq
  • readiness probe failures ...
  • Aug 30, 2021 @ 09:50 Started container Sidekiq
  • readiness probe failures ...
  • Aug 30, 2021 @ 11:28 Started container Sidekiq
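
To make the restart cadence in the timeline above easier to see, here is a minimal sketch that computes the gap between consecutive container starts. The `restart_intervals` helper is hypothetical, and the seconds are assumed to be `:00` for the two entries where only hours and minutes were recorded.

```python
from datetime import datetime

# Container start times copied from the timeline above.
# Assumption: seconds are :00 where only hours and minutes were given.
starts = [
    datetime(2021, 8, 30, 8, 38, 59),
    datetime(2021, 8, 30, 9, 50, 0),
    datetime(2021, 8, 30, 11, 28, 0),
]


def restart_intervals(times):
    """Return the gaps between consecutive container starts."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


for gap in restart_intervals(starts):
    print(gap)
# prints:
# 1:11:01
# 1:38:00
```

Under those assumptions, the container ran for a little over an hour each time before being recreated.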

The background migration worker is unable to do anything:

https://log.gprd.gitlab.net/goto/9c37d069df86c54d24cde49fe24f3ceb

We don't have any limits on CPU, but we do on memory.

Based on this dashboard panel:

https://dashboards.gitlab.net/d/sidekiq-kube-containers/sidekiq-kube-containers-detail?viewPanel=9&orgId=1&from=now-10h&to=now&var-PROMETHEUS_DS=Global&var-environment=gprd&var-stage=main

and the configured limits:

https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/4f3c388a8b1decbabe78543b942c180e7b6cce80/releases/gitlab/values/gprd.yaml.gotmpl#L231-232

should we increase the memory requests/limits?