rake gettext:regenerate is slow and inefficient
Regenerating gitlab.pot
takes minutes to complete. (e.g. 360s in CI). We should improve the performance of this commonly run developer task.
Original description
Regenerating gitlab.pot
takes minutes to complete. To see where all the time was going, I ran:
strace -ttt -Tvvv -yyy -f -o /tmp/gettext.txt -s 2048 bin/rake gettext:regenerate
Then used https://gitlab.com/gitlab-com/support/toolbox/json_stats to see this data:
Details of Top 25 PIDs by Active Time
-----------
PID 31746
1031593 syscalls, active time: 51411.633ms, user time: 274978.656ms, total time: 327644.125ms
syscall count total (ms) max (ms) avg (ms) min (ms) errors
----------------- -------- ---------- ---------- ---------- ---------- --------
read 47176 30574.146 11326.794 0.648 0.007
openat 217801 5918.407 15.908 0.027 0.009 ENOENT: 95779 ENOTDIR: 132
getdents 164298 3214.238 9.525 0.020 0.006
close 121948 2442.468 23.501 0.020 0.007
fcntl 164259 2289.536 7.665 0.014 0.006 EBADF: 3
lstat 94787 2072.847 11.641 0.022 0.007 ENOENT: 343
fstat 106126 1677.676 5.720 0.016 0.006
stat 63493 1324.402 10.123 0.021 0.008 ENOENT: 2620
futex 9196 1251.504 9.059 0.136 0.007 EAGAIN: 1353
getcwd 7902 500.389 7.864 0.063 0.007
ioctl 16669 498.672 13.389 0.030 0.006 EBADF: 3 ENOTTY: 16655
clone 1478 223.597 12.202 0.151 0.043
lseek 8405 220.484 14.319 0.026 0.007
brk 2542 157.645 5.456 0.062 0.008
write 861 90.731 1.279 0.105 0.010
vfork 3 46.957 17.449 15.652 13.756
getpid 1730 42.586 1.397 0.025 0.006
execve 31 20.943 14.457 0.676 0.018
mprotect 341 14.177 0.287 0.042 0.013
mmap 465 13.722 0.201 0.030 0.008
unlink 140 12.129 0.322 0.087 0.017 ENOENT: 1
sched_yield 149 9.607 1.501 0.064 0.014
chdir 140 5.116 2.722 0.037 0.011
getuid 241 4.903 1.055 0.020 0.008
getgid 237 4.851 1.091 0.020 0.007
munmap 77 4.672 0.632 0.061 0.023
geteuid 288 4.365 0.079 0.015 0.007
writev 46 3.669 0.207 0.080 0.029
getegid 238 3.661 0.144 0.015 0.007
access 104 2.719 0.135 0.026 0.009 ENOENT: 86
wait4 13 2.303 0.821 0.177 0.011 ECHILD: 5
rt_sigaction 85 2.052 0.153 0.024 0.010
utimensat 46 2.000 0.145 0.043 0.012 ENOENT: 46
rt_sigprocmask 57 1.646 0.154 0.029 0.007
connect 32 1.464 0.165 0.046 0.030 ENOENT: 32
newfstatat 73 1.389 0.067 0.019 0.012 ENOENT: 37
socket 32 0.861 0.058 0.027 0.017
getrandom 10 0.329 0.064 0.033 0.016
pipe2 8 0.315 0.082 0.039 0.014
arch_prctl 8 0.282 0.095 0.035 0.011
pipe 5 0.268 0.125 0.054 0.023
prlimit64 9 0.254 0.090 0.028 0.011
dup2 7 0.192 0.076 0.027 0.012
mremap 2 0.173 0.154 0.086 0.019
rt_sigreturn 5 0.166 0.058 0.033 0.016
getppid 3 0.147 0.062 0.049 0.023
getpgrp 3 0.145 0.103 0.048 0.014
sysinfo 4 0.141 0.085 0.035 0.016
uname 6 0.140 0.070 0.023 0.008
getresgid 3 0.096 0.057 0.032 0.019
getresuid 3 0.090 0.062 0.030 0.011
sigaltstack 1 0.063 0.063 0.063 0.063
sched_getaffinity 2 0.045 0.026 0.023 0.019
poll 2 0.039 0.021 0.019 0.018
clock_gettime 1 0.026 0.026 0.026 0.026
set_robust_list 1 0.016 0.016 0.016 0.016
set_tid_address 1 0.016 0.016 0.016 0.016
exit_group 1 n/a n/a n/a n/a
---------------
Program Executed: /home/stanhu/.rbenv/versions/2.5.3/bin/ruby
Args: ["bin/rake" "gettext:regenerate"] = 0
Child PIDs: 31757, 31759, 31760, 31770, 31772, 31773, 32485, 1323, 1324, 1330 and 1471 more...
PID 1875
44519 syscalls, active time: 1375.683ms, user time: 8744.612ms, total time: 11369.586ms
syscall count total (ms) max (ms) avg (ms) min (ms) errors
----------------- -------- ---------- ---------- ---------- ---------- --------
epoll_wait 4077 918.708 17.142 0.225 0.010
getcwd 17104 590.539 3.308 0.035 0.007
futex 5598 330.306 3.664 0.059 0.009 EAGAIN: 78
lstat 3069 162.048 2.246 0.053 0.010
read 4234 156.204 10.810 0.037 0.008
openat 2376 122.347 1.336 0.051 0.010 ENOENT: 14
getdents 1656 88.887 4.194 0.054 0.008
fstat 2328 75.723 1.770 0.033 0.007
close 2361 74.860 0.868 0.032 0.008
stat 345 35.572 0.859 0.103 0.009 ENOENT: 163
mmap 346 19.486 0.312 0.056 0.009
munmap 251 14.597 1.026 0.058 0.008
brk 83 11.792 0.643 0.142 0.016
write 82 5.448 0.210 0.066 0.011
fcntl 254 4.542 0.218 0.018 0.011 EBADF: 243
pread64 37 2.809 0.439 0.076 0.008
getpid 61 2.027 0.468 0.033 0.007
rt_sigaction 97 1.848 0.118 0.019 0.012 EINVAL: 2
mprotect 44 1.502 0.124 0.034 0.011
ioctl 28 1.423 0.225 0.051 0.014 EBADF: 8 EINVAL: 1 ENOTTY: 3
clone 9 1.039 0.282 0.115 0.032
execve 1 0.571 0.571 0.571 0.571
access 9 0.476 0.167 0.053 0.018 ENOENT: 9
rt_sigprocmask 8 0.385 0.165 0.048 0.011
prlimit64 11 0.378 0.070 0.034 0.014
epoll_ctl 3 0.212 0.100 0.071 0.018
getuid 9 0.204 0.056 0.023 0.017
readlink 2 0.196 0.142 0.098 0.054
getgid 8 0.154 0.028 0.019 0.016
getegid 8 0.147 0.022 0.018 0.017
geteuid 8 0.147 0.021 0.018 0.017
eventfd2 1 0.084 0.084 0.084 0.084
epoll_create1 1 0.069 0.069 0.069 0.069
poll 1 0.065 0.065 0.065 0.065
pipe2 2 0.052 0.028 0.026 0.024
arch_prctl 1 0.022 0.022 0.022 0.022
clock_getres 1 0.020 0.020 0.020 0.020
dup3 1 0.018 0.018 0.018 0.018
sysinfo 1 0.017 0.017 0.017 0.017
set_robust_list 1 0.017 0.017 0.017 0.017
dup2 1 0.016 0.016 0.016 0.016
set_tid_address 1 0.016 0.016 0.016 0.016
exit_group 1 n/a n/a n/a n/a
---------------
Program Executed: /usr/bin/node
Args: ["scripts/frontend/extract_gettext_all.js" "--all"] <unfinished
Parent PID: 31746
Child PIDs: 1876, 1877, 1878, 1879, 1880, 1935, 1936, 1937, 1938
<snip>
Edited by Lukas Eipert