
Many defunct processes on Sidekiq/Webservice Pods running on Kubernetes Infrastructure

Problem Statement

While researching Sidekiq-related memory usage, I stumbled across an odd behavior that is present across the entire catchall fleet of Sidekiq Pods running GitLab.com. Over time, we accumulate a rather large number of defunct processes. Here's an example:

F S UID          PID    PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
0 Z git           98       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgconf] <defunct>
4 Z git          102       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpg] <defunct>
4 Z git          104       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgsm] <defunct>
5 Z git          146       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg-agent] <defunct>
Here is the full output from that Pod:
F S UID          PID    PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S git            1       0  0  80   0 - 38785 do_sys 14:42 ?        00:00:01 ruby /srv/gitlab/bin/sidekiq-cluster -r /srv/gitlab -e production --min-concurrency 15 --max-concurrency 15 -t 25 default,mailers,project_import_schedule
4 S git           32       1  0  80   0 - 176474 futex_ 14:42 ?       00:00:33 /usr/local/bin/gitlab-logger /var/log/gitlab
4 S git           37       1 43  80   0 - 1205987 do_sel 14:42 ?      00:33:58 sidekiq 6.4.0 queues:default,mailers,project_import_schedule [0 of 15 busy]
5 S git           39       1  1  80   0 - 66971 do_sys 14:42 ?        00:01:18 sidekiq_exporter
0 Z git           98       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgconf] <defunct>
4 Z git          100       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgconf] <defunct>
4 Z git          102       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpg] <defunct>
4 Z git          104       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgsm] <defunct>
0 Z git          106       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpgconf] <defunct>
0 Z git          108       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpg] <defunct>
4 Z git          110       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpg] <defunct>
0 Z git          112       1  0  80   0 -     0 -      14:43 ?        00:00:00 [gpg] <defunct>
0 Z git          133       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
4 Z git          135       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
4 Z git          137       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
5 Z git          139       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          142       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
4 Z git          144       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
5 Z git          146       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          149       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
0 Z git          151       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
4 Z git          153       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
0 Z git          155       1  0  80   0 -     0 -      14:45 ?        00:00:00 [gpg] <defunct>
0 Z git          209       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          211       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          213       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
5 Z git          215       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          218       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          220       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
5 Z git          222       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          225       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          227       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          229       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          231       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          233       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          235       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          237       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
5 Z git          239       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          242       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          244       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
5 Z git          246       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          249       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          251       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
4 Z git          253       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          255       1  0  80   0 -     0 -      14:49 ?        00:00:00 [gpg] <defunct>
0 Z git          311       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          313       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          315       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          317       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          320       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          322       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          324       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          327       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          329       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          331       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          333       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          335       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          337       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          339       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          341       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          344       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          346       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          348       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          351       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          353       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          355       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          357       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          359       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          361       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          363       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          365       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          368       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          370       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          372       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          375       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          377       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          379       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          382       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          386       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          388       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          390       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          392       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          395       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          397       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
5 Z git          399       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          402       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          404       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
4 Z git          406       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          408       1  0  80   0 -     0 -      14:53 ?        00:00:00 [gpg] <defunct>
0 Z git          423       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
4 Z git          425       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
4 Z git          427       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
5 Z git          429       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          432       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
4 Z git          434       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
5 Z git          436       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          439       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
0 Z git          441       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
4 Z git          443       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
0 Z git          445       1  0  80   0 -     0 -      14:54 ?        00:00:00 [gpg] <defunct>
0 Z git          955       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
4 Z git          957       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
4 Z git          959       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
5 Z git          961       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg-agent] <defunct>
0 Z git          964       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
4 Z git          966       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
5 Z git          968       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg-agent] <defunct>
4 Z git          971       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
0 Z git          973       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
4 Z git          975       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
0 Z git          977       1  0  80   0 -     0 -      15:30 ?        00:00:00 [gpg] <defunct>
0 Z git         1022       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1024       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1026       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
5 Z git         1028       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg-agent] <defunct>
0 Z git         1031       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1033       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
5 Z git         1035       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg-agent] <defunct>
4 Z git         1038       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
0 Z git         1040       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1042       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
0 Z git         1044       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
0 Z git         1048       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1050       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1052       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
5 Z git         1054       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg-agent] <defunct>
0 Z git         1057       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1059       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
5 Z git         1061       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg-agent] <defunct>
4 Z git         1064       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
0 Z git         1066       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 Z git         1068       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
0 Z git         1070       1  0  80   0 -     0 -      15:33 ?        00:00:00 [gpg] <defunct>
4 R git         1454       0  0  80   0 -  2146 -      16:00 ?        00:00:00 ps -efl

At the time of this writing, catchall uses the following configuration for routing Sidekiq jobs:

While defunct processes are normally safe to ignore, I wonder whether Sidekiq is doing something wrong, such that we are hiding errors or neglecting a subprocess that was spawned. This should be investigated to determine whether it is indeed harmless or whether something is going wrong that we need to manage better. Occasional OOMKill events are common for us and could lead to this, but we do not trigger enough of them to explain the high number of defunct processes.
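
For background, a process shows up as defunct (state `Z`) after it has exited but before its parent has collected its exit status. In the listing above, every defunct entry has PPID 1, which in this Pod is the `sidekiq-cluster` process. The sketch below is a generic, Linux-only illustration of that mechanism and not a diagnosis of what Sidekiq does: the helper names `proc_state` and `demo_zombie` are made up for this example, and it assumes `/proc` is available.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state of a process from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before reading the state field.
    return stat.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once and observe it before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Parent: deliberately do not wait yet. Poll until the kernel reports
    # the exited child as a zombie (give up after 5 seconds).
    state = proc_state(pid)
    deadline = time.monotonic() + 5.0
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    # Reaping the child removes its process-table entry.
    os.waitpid(pid, 0)
    gone = not os.path.exists(f"/proc/{pid}")
    return state, gone


if __name__ == "__main__":
    state, gone = demo_zombie()
    print(f"state before wait: {state}, entry removed after wait: {gone}")
```

A zombie holds no memory of its own, only a process-table entry, which is why small numbers are usually harmless. Entries that are never reaped stay around until the parent exits.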

For scale, the output above is from a single Pod. At the time of this investigation, we were running 188 Pods of this workload, with 41,783 defunct processes across them.

% grep -c PID output.txt
188
% grep -c defunct output.txt
41783
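
To go beyond raw counts, the same captured output could be broken down by command and by start time, which would show which binaries are left behind and whether they arrive in bursts. This is a minimal sketch, assuming the text is in the `ps -efl` layout shown above (15 whitespace-separated columns with `CMD` last); the function name `summarize_defunct` is hypothetical.

```python
from collections import Counter


def summarize_defunct(ps_output):
    """Count defunct (state Z) processes by command name and by STIME."""
    by_cmd = Counter()
    by_stime = Counter()
    for line in ps_output.splitlines():
        # Split into at most 15 fields so CMD keeps its embedded spaces.
        fields = line.split(None, 14)
        if len(fields) < 15 or fields[1] != "Z":
            continue  # header lines, live processes, or malformed lines
        # CMD looks like "[gpg] <defunct>"; reduce it to the bare name.
        cmd = fields[14].replace("<defunct>", "").strip().strip("[]")
        by_cmd[cmd] += 1
        by_stime[fields[11]] += 1
    return by_cmd, by_stime


if __name__ == "__main__":
    with open("output.txt") as f:
        by_cmd, by_stime = summarize_defunct(f.read())
    for cmd, count in by_cmd.most_common():
        print(f"{count:6d}  {cmd}")
    # Fleet-wide average from the figures reported above.
    print(f"average per Pod: {41783 / 188:.2f}")
```

With the figures reported above, the average works out to 41,783 / 188 ≈ 222 defunct processes per Pod.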
Edited by John Skarbek