Most BCC tools broken on Ubuntu 18.04 when running kernel 5.4 or above
Problem summary
Many BCC tools are broken with kernel 5.4+ and libbcc 0.10
Our Chef-managed fleet currently runs Ubuntu 16.04 (Xenial), 18.04 (Bionic), and 20.04 (Focal).
When we initially added BCC, we chose the bcc-tools package from iovisor because it provided a newer build than bpfcc-tools, the equivalent Ubuntu-maintained package. Unfortunately, the iovisor builds of bcc-tools have been broken for quite some time, so on Ubuntu 20.04 Ubuntu's bpfcc-tools package is now the newer of the two. Once we merge https://gitlab.com/gitlab-cookbooks/gitlab-server/-/merge_requests/243, we will start using bpfcc-tools on Ubuntu 20.04 hosts and continue using bcc-tools on Ubuntu 16.04 and 18.04.
However, both packages are broken on Ubuntu 18.04 when it runs kernel 5.4 or later.
Background
Kernel 5.4 introduced the asm_inline C macro. libbcc (a dependency of the BCC tools) handles that macro as of version 0.12, but older versions do not.
The bcc-tools package from iovisor is still stuck on libbcc 0.10, so it can't work properly with kernel 5.4 and above. The bpfcc-tools package from the Ubuntu maintainers similarly uses an old libbcc version on Ubuntu 16.04 and 18.04, but on Ubuntu 20.04 it uses the newer libbcc 0.12, which is why the BCC tools work on Ubuntu 20.04.
This isn't a problem on Ubuntu 16.04 because it runs kernels older than 5.4.
However, Ubuntu 18.04 can optionally run kernel 5.4. And on Ubuntu 18.04, neither bcc-tools nor bpfcc-tools pulls in a new enough version of libbcc. Consequently, any BCC tool whose BPF C program includes a kernel header that uses the asm_inline macro is likely to fail to compile.
For reference, here's the definition of the asm_inline macro, copied from include/linux/compiler_types.h in the kernel source tree, where it was added by commit eb111869301e15b737315a46c913ae82bd19eb9d:
#ifdef CONFIG_CC_HAS_ASM_INLINE
#define asm_inline asm __inline
#else
#define asm_inline asm
#endif
What does "broken" look like?
By "broken", I mean many (but not all) of the BPF programs fail to compile:
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ sudo /usr/share/bcc/tools/ext4slower 1
In file included from /virtual/main.c:2:
In file included from include/uapi/linux/ptrace.h:142:
In file included from ./arch/x86/include/asm/ptrace.h:5:
./arch/x86/include/asm/segment.h:266:2: error: expected '(' after 'asm'
alternative_io ("lsl %[seg],%[p]",
^
./arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 'alternative_io'
asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
^
include/linux/compiler_types.h:210:24: note: expanded from macro 'asm_inline'
#define asm_inline asm __inline
^
...
Which hosts are currently affected?
Currently only 4 of our Chef-managed hosts run Ubuntu 18.04 with kernel 5.4. I think this is probably the only way we'd be running the broken combination of a kernel >= 5.4 and an old libbcc version that can't handle the __inline keyword that asm_inline injects.
$ knife ssh -C 10 'lsb_release:18.04 AND os_version:5.4*' 'uname -r' 2> /dev/null
fe-registry-01-lb-pre.c.gitlab-pre.internal 5.4.0-1029-gcp
fe-01-lb-gstg.c.gitlab-staging-1.internal 5.4.0-1021-gcp
fe-01-lb-pre.c.gitlab-pre.internal 5.4.0-1025-gcp
camoproxy-01-sv-gstg.c.gitlab-staging-1.internal 5.4.0-1028-gcp
Hacky work-around
On affected hosts where the kernel >= 5.4 and libbcc < 0.12, we can work around it by overriding the C macro in the BPF utility's C script. For example, here's the work-around for BCC's ext4slower utility:
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ uname -r
5.4.0-1021-gcp
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ dpkg-query --show libbcc
libbcc 0.10.0-1
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ cp -p -i /usr/share/bcc/tools/ext4slower ~/
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ vim ext4slower
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ diff -U 2 /usr/share/bcc/tools/ext4slower ~/ext4slower
--- /usr/share/bcc/tools/ext4slower 2019-05-28 17:00:00.000000000 +0000
+++ /home/msmiley/ext4slower 2020-12-02 18:34:44.662992194 +0000
@@ -62,4 +62,6 @@
# define BPF program
bpf_text = """
+#define asm_inline asm
+
#include <uapi/linux/ptrace.h>
#include <linux/fs.h>
This works (with warnings about overriding the macro), but it's gross and requires hacking the packaged code.
msmiley@fe-01-lb-gstg.c.gitlab-staging-1.internal:~$ sudo ~/ext4slower 1
/virtual/main.c:2:9: warning: 'asm_inline' macro redefined [-Wmacro-redefined]
#define asm_inline asm
^
include/linux/compiler_types.h:210:9: note: previous definition is here
#define asm_inline asm __inline
^
1 warning generated.
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
18:39:58 mtail 8911 R 122 15327 1.17 haproxy.log
18:40:03 event_loop 2676 R 120 15383 1.39 haproxy.log
^C
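A slightly less noisy variant of the same override is sketched below. The compiler warning above shows that asm_inline is already defined by the time the tool's BPF text is parsed, so guarding the override with #ifdef and #undef should avoid the -Wmacro-redefined warning and leave older kernels untouched. The helper name with_asm_inline_override and the constant ASM_INLINE_OVERRIDE are hypothetical, not part of BCC, and I have not run this on the affected hosts; it only shows how the text would be patched before it is handed to BPF(text=...).

```python
# Hypothetical helper (not part of BCC): prepend a guarded override of
# asm_inline to a tool's BPF C source text.
ASM_INLINE_OVERRIDE = """\
#ifdef asm_inline
#undef asm_inline
#define asm_inline asm
#endif
"""


def with_asm_inline_override(bpf_text: str) -> str:
    """Return bpf_text with the guarded asm_inline override prepended.

    The call is idempotent: text that already carries an override
    (either this guarded form or the plain #define) is returned unchanged.
    """
    if "#define asm_inline asm" in bpf_text:
        return bpf_text
    return ASM_INLINE_OVERRIDE + "\n" + bpf_text
```

In a copied tool such as ~/ext4slower, this would be applied to bpf_text just before the BPF object is created. It is still a hack on top of the packaged code, just one that does not need a hand-edited diff per tool.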
Again, this hack is only required on hosts that are running the broken combination of:
- kernel 5.4 or above
- libbcc older than 0.12 (for example, the 0.10 builds described above)
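The two conditions above can be checked mechanically. Here is a minimal sketch, assuming the kernel release comes from uname -r (for example 5.4.0-1021-gcp) and the libbcc version from dpkg-query --show libbcc (for example 0.10.0-1), and that the package version carries no Debian epoch prefix. The function names are hypothetical.

```python
import re
from typing import Tuple


def _major_minor(version: str) -> Tuple[int, int]:
    """Extract the leading MAJOR.MINOR pair from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def needs_asm_inline_workaround(kernel_release: str, libbcc_version: str) -> bool:
    """True when the kernel is 5.4 or above and libbcc is older than 0.12."""
    kernel_is_new = _major_minor(kernel_release) >= (5, 4)
    libbcc_is_old = _major_minor(libbcc_version) < (0, 12)
    return kernel_is_new and libbcc_is_old
```

For the staging host shown earlier, needs_asm_inline_workaround("5.4.0-1021-gcp", "0.10.0-1") evaluates to True, while a 20.04 host with libbcc 0.12 would not need the hack.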