[RFC] Revert "sched: Add support for lazy preemption"
Hi.
This merge request reverts a commit that had bad side effects on BTF generation.
Moreover, I think the reverted commit is an RT one that was added to a non-RT kernel, i.e. kernel-5.14.0-362.el9.
Because of the problem in BTF, some eBPF loads were placed at wrong offsets, and the programs were rejected by the verifier:
root@vm-amd64:~# /rhel-bug-reproducer
2024/02/07 07:46:01 opening tracepoint: cannot create bpf perf link: permission denied
root@vm-amd64:~# uname -a
Linux vm-amd64 5.14.0-dirty #10 SMP PREEMPT_DYNAMIC Wed Feb 7 14:41:42 +07 2024 x86_64 GNU/Linux
Indeed, if we compare the BTF of this RHEL kernel with that of a kernel without the problem, we can see that the new field introduces padding, which causes trouble:
$ grep trace_event_raw_sys_exit -A 10 /tmp/good /tmp/bad
/tmp/good:struct trace_event_raw_sys_exit {
/tmp/good- struct trace_entry ent; /* 0 8 */
/tmp/good- long int id; /* 8 8 */
/tmp/good- long int ret; /* 16 8 */
/tmp/good- char __data[]; /* 24 0 */
/tmp/good-
/tmp/good- /* size: 24, cachelines: 1, members: 4 */
/tmp/good- /* last cacheline: 24 bytes */
/tmp/good-};
/tmp/good-struct trace_event_data_offsets_sys_enter {
/tmp/good-
--
/tmp/bad:struct trace_event_raw_sys_exit {
/tmp/bad- struct trace_entry ent; /* 0 12 */
/tmp/bad-
/tmp/bad- /* XXX last struct has 3 bytes of padding */
/tmp/bad- /* XXX 4 bytes hole, try to pack */
/tmp/bad-
/tmp/bad- long int id; /* 16 8 */
/tmp/bad- long int ret; /* 24 8 */
/tmp/bad- char __data[]; /* 32 0 */
/tmp/bad-
/tmp/bad- /* size: 32, cachelines: 1, members: 4 */
Indeed, the RT patch adds a field to trace_entry: preempt_lazy_count.
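The layout change can be reproduced outside the kernel with a small sketch. The struct names below are hypothetical stand-ins, and the trace_entry field list (including the one-byte type of preempt_lazy_count) is my assumption of what the kernel struct looks like; only the sizes and offsets come from the pahole output above. It also assumes an LP64 ABI such as x86_64, where long is 8 bytes:

```c
#include <stddef.h>

/* Hypothetical mirror of trace_entry without the RT field: 8 bytes. */
struct entry_good {
	unsigned short type;
	unsigned char flags;
	unsigned char preempt_count;
	int pid;
};

/* Same fields plus the extra one (assumed to be one byte wide).
 * The struct is 4-byte aligned, so 9 bytes round up to 12. */
struct entry_bad {
	unsigned short type;
	unsigned char flags;
	unsigned char preempt_count;
	int pid;
	unsigned char preempt_lazy_count;
};

/* Stand-ins for trace_event_raw_sys_exit on each kernel. */
struct sys_exit_good {
	struct entry_good ent;
	long id;
	long ret;
	char data[];
};

struct sys_exit_bad {
	struct entry_bad ent;	/* 12 bytes, then a 4-byte hole */
	long id;		/* pushed from offset 8 to 16 */
	long ret;		/* pushed from offset 16 to 24 */
	char data[];
};

/* Compile-time checks against the numbers reported by pahole. */
_Static_assert(sizeof(struct entry_good) == 8, "good ent size");
_Static_assert(sizeof(struct entry_bad) == 12, "bad ent size");
_Static_assert(offsetof(struct sys_exit_good, id) == 8, "good id offset");
_Static_assert(offsetof(struct sys_exit_bad, id) == 16, "bad id offset");
_Static_assert(sizeof(struct sys_exit_good) == 24, "good total size");
_Static_assert(sizeof(struct sys_exit_bad) == 32, "bad total size");
```

Compiling this with `cc -std=c11 -c` on x86_64 succeeds only if the layouts match the two pahole dumps, which shows that a single extra byte in the embedded entry is enough to shift every following field by 8.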
As a consequence, the verifier rejects the program here:
https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/blob/kernel-5.14.0-362.el9/kernel/trace/trace_events.c#L209
because the program's max_offset was set to 32, instead of 24 as on a kernel without this problem.
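A minimal model of that rejection, assuming the check boils down to comparing the highest context offset the program reads with the size the tracepoint reports. The function name and the 24-byte limit are illustrative assumptions, not copied from the kernel source; the EACCES return value is chosen to match the "permission denied" message in the reproducer output:

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the attach-time check: refuse a program
 * whose context accesses reach past the end of the tracepoint record.
 */
int check_ctx_access(int max_ctx_offset, int event_size)
{
	if (max_ctx_offset > event_size)
		return -EACCES;	/* surfaces as "permission denied" */
	return 0;
}
```

With these assumptions, `check_ctx_access(32, 24)` returns `-EACCES`, which corresponds to the failing case, while `check_ctx_access(24, 24)` returns 0, which corresponds to a kernel without the extra field.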
This was reported in the Inspektor Gadget repository by https://github.com/matthyx and first investigated by https://github.com/mauriciovasquezbernal: https://github.com/inspektor-gadget/inspektor-gadget/issues/2444
With the patch reverted, we can now load eBPF programs:
root@vm-amd64:~# /rhel-bug-reproducer
Run sudo cat /sys/kernel/debug/tracing/trace_pipe in another terminal to see the output
Press Ctrl+C to close: root@vm-amd64:~#
Of course, the proposed solution may be too drastic if you really want to keep this patch. But with the problem understood, we can start discussing possible solutions.
Best regards and thank you in advance.