Kubernetes evicting job pods because of memory consumption
Summary
Kubernetes (v1.12.6) has recently started evicting pods from the backup cron job after they have been running for a while, because of their memory consumption. I just scaled up the cluster, and this keeps happening even though the pod is placed on a node with enough free memory.
Steps to reproduce
Run GitLab on Kubernetes, backing up to Azure storage through MinIO as a gateway, with the task-runner cron job enabled.
Configuration used
...
task-runner:
  backups:
    objectStorage:
      config:
        secret: gitlab-secrets
        key: s3-config
    cron:
      enabled: true
      schedule: "0 */8 * * *"
...
Current behavior
The backup cron job's pod is evicted because the node runs low on memory, so the job fails.
Expected behavior
The job runs to completion and uploads the backup.
Versions
- Chart: 2.2.6
- Platform:
  - Cloud: AKS
- Kubernetes: (kubectl version)
  - Client: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T12:36:28Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"darwin/amd64"}
  - Server: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.6", GitCommit:"ab91afd7062d4240e95e51ac00a18bd58fddd365", GitTreeState:"clean", BuildDate:"2019-02-26T12:49:28Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
- Helm: (helm version)
  - Client: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
  - Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Relevant logs
$ kubectl describe po gitlab-task-runner-backup-1568592000-shkw6 -n gitlab
...
Status: Failed
Reason: Evicted
Message: The node was low on resource: memory. Container task-runner-backup was using 3715668Ki, which exceeds its request of 350M.
...
Events:
  Type     Reason   Age  From                               Message
  ----     ------   ---- ----                               -------
  Warning  Evicted  68m  kubelet, aks-opsminion-38596230-5  The node was low on resource: memory. Container task-runner-backup was using 3715668Ki, which exceeds its request of 350M.
  Normal   Killing  68m  kubelet, aks-opsminion-38596230-5  Killing container with id docker://task-runner-backup:Need to kill Pod
Workaround
Manually increased cronjob memory.
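For anyone who would rather keep this in the Helm values than edit the CronJob by hand, the following is a rough sketch of what that might look like. It assumes the chart version in use exposes a `resources` block under `backups.cron` (worth checking against the chart's own values file; I have not confirmed it for 2.2.6), and the `4Gi` figure is only an illustrative number chosen to sit above the roughly 3.5 GiB (3715668Ki) reported in the eviction message. The nesting mirrors the configuration shown above.

```yaml
...
task-runner:
  backups:
    cron:
      enabled: true
      schedule: "0 */8 * * *"
      # Assumption: this key is honored by the chart version in use.
      resources:
        requests:
          # Illustrative value, sized above the observed 3715668Ki peak.
          # With a request this large the scheduler only places the pod on
          # a node that has that much unreserved memory, and the pod no
          # longer ranks first for eviction by exceeding its request.
          memory: 4Gi
...
```

The idea behind raising the request rather than only adding nodes is that, under memory pressure, the kubelet evicts pods whose usage exceeds their request first, so a 350M request on a job that uses several GiB makes it the first candidate regardless of how much memory the node has.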
Edited by Pierre