
x86/retbleed: Call depth tracking mitigation

Waiman Long requested to merge llong1/centos-stream-9:bz2190342_retstuff into main

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2190342
MR: !2448 (merged)
Omitted-fix: 51c4f2bf5397 ("tools headers cpufeatures: Sync with the kernel sources") This patch contains additional changes not covered by this MR.
Omitted-fix: a39818a3fb2b ("objtool/powerpc: Implement arch_pc_relative_reloc()") RHEL9 doesn't have commit e52ec98c5ab1 ("objtool/powerpc: Enable objtool to be built on ppc") which enabled objtool build in PPC and added tools/objtool/arch/powerpc directory.

Tested: See below

Prior to these patches, function alignment is basically non-existent, so any instruction fetch for the first instructions of a function will have (on average) half the fetch window filled with whatever comes before. Pushing the alignment up to 16 bytes improves matters for chips that have a 16-byte i-fetch window (Intel) while not making matters worse for chips that have a larger 32-byte i-fetch window (AMD Zen). In fact, it improves the worst case for Zen from 31 bytes of garbage to 16 bytes of garbage.
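The worst-case figures above follow from simple arithmetic. The sketch below is a simplified model, not code from the patches: it assumes a naturally aligned fetch window and power-of-two sizes, and the helper name worst_case_garbage() is made up for illustration.

```c
#include <assert.h>

/*
 * Worst-case number of bytes in the first instruction fetch of a function
 * that belong to whatever precedes the function.
 *
 * Assumptions of this model (not taken from the patches): the fetch window
 * is naturally aligned and `window` bytes wide, function entries are aligned
 * to `align` bytes, and both values are powers of two.
 */
static unsigned int worst_case_garbage(unsigned int window, unsigned int align)
{
	assert(window && !(window & (window - 1)));
	assert(align && !(align & (align - 1)));

	/* An entry aligned to at least the window size starts a fresh window. */
	if (align >= window)
		return 0;

	/* Otherwise the entry can sit in the last aligned slot of the window. */
	return window - align;
}
```

Under these assumptions, worst_case_garbage(32, 1) gives the 31 bytes quoted for Zen without alignment, worst_case_garbage(32, 16) gives 16 bytes, and worst_case_garbage(16, 16) gives 0 for a 16-byte window.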

As such, the first several patches of the series fix up a lot of alignment quirks.

Because the compiler managed to place two adjacent (in code) DEFINE_PER_CPU() variables in random cachelines (it is absolutely free to do so), the introduction of the per-cpu x86_call_depth variable sometimes introduced significant additional cache pressure, while other times it would sit nicely in the same line with preempt_count and not show up at all.

To alleviate this problem, introduce struct pcpu_hot and collect a number of hot per-cpu variables in a way the compiler can't mess up.
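The idea of grouping hot variables can be illustrated with a small user-space sketch. This is not the kernel's definition of struct pcpu_hot: the struct name pcpu_hot_sketch, the field list, and the 64-byte cache line size are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64	/* assumed cache line size */

/*
 * Hypothetical stand-in for struct pcpu_hot. Because the hot fields are
 * members of one struct, their relative placement is fixed by the struct
 * layout instead of being left to the compiler and linker.
 */
struct pcpu_hot_sketch {
	_Alignas(CACHELINE_BYTES) void *current_task;
	int32_t   preempt_count;
	int32_t   cpu_number;
	uint64_t  call_depth;
	uintptr_t top_of_stack;
};

/* The struct is padded to exactly one cache line in this sketch. */
_Static_assert(sizeof(struct pcpu_hot_sketch) == CACHELINE_BYTES,
	       "hot fields must fit in a single cache line");

/* Returns nonzero when two byte offsets fall into the same cache line. */
static int same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / CACHELINE_BYTES == off_b / CACHELINE_BYTES;
}
```

With this layout, preempt_count and call_depth always share a cache line, which is the property the separate DEFINE_PER_CPU() variables could not guarantee.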

Aside from these changes, the core of the depth tracking is:

  • objtool creates a list of functions and a list of function call sites.

  • for every function the padding is overwritten with the call accounting thunk; for every call site the call target is adjusted to point to this thunk.

  • the retbleed return thunk mechanism is used for a custom return thunk that includes return accounting and does RSB stuffing when required.

This ensures no new compiler is required and avoids almost all overhead for non-affected machines. This new option can still be selected using:

"retbleed=stuff"

on the kernel command line.
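The call-site adjustment in the second step of the core mechanism can be sketched as displacement arithmetic. This is a minimal illustration, not the patching code from the series: it assumes the accounting thunk occupies the 16 padding bytes immediately before the function entry, and the names THUNK_PAD_BYTES, thunk_address() and retarget_to_thunk() are hypothetical. The only fixed fact used is that an x86-64 direct call (opcode 0xE8) is 5 bytes long and encodes its target relative to the end of the instruction.

```c
#include <assert.h>
#include <stdint.h>

#define CALL_INSN_SIZE  5	/* x86-64 "call rel32": 0xE8 + 4-byte displacement */
#define THUNK_PAD_BYTES 16	/* hypothetical: thunk fills the 16 bytes before the entry */

/* Displacement encoded in a direct call at `site` that transfers to `target`. */
static int32_t call_rel32(uint64_t site, uint64_t target)
{
	int64_t disp = (int64_t)target - (int64_t)site - CALL_INSN_SIZE;

	assert(disp >= INT32_MIN && disp <= INT32_MAX);
	return (int32_t)disp;
}

/* Assumed location of the accounting thunk for the function at `func`. */
static uint64_t thunk_address(uint64_t func)
{
	return func - THUNK_PAD_BYTES;
}

/* New displacement for a call site once it is redirected to the thunk. */
static int32_t retarget_to_thunk(uint64_t site, uint64_t func)
{
	return call_rel32(site, thunk_address(func));
}
```

In this model the thunk falls through into the function it precedes, so redirecting a call only changes its displacement by the padding size and leaves the function body untouched.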

The Return-Stack-Buffer (RSB) is a 16-deep stack that is filled on every call. On the return path, speculation will "pop" an entry and take that as the return target. Once the RSB is empty, the CPU falls back to other predictors, e.g. the Branch History Buffer, which can be mistrained by user space and misguide the (return) speculation path to a disclosure gadget of your choice, as described in the retbleed paper.

Call depth tracking is designed to break this speculation path by stuffing speculation trap calls into the RSB whenever the RSB is running low. This way the speculation stalls and never falls back to other predictors.

The assumption is that stuffing at the 12th return is sufficient to break the speculation before it hits the underflow and the fallback to the other predictors. Testing confirms that it works. Johannes, one of the retbleed researchers, tried to attack this approach and confirmed that it brings the signal to noise ratio down to the crystal ball level.
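The accounting idea can be modelled with two plain counters. The sketch below is a deliberately simplified model and not the kernel's implementation or its counter encoding: the struct depth_model, its fields, and the model_*() helpers are hypothetical, and the only values taken from the description are the 16-entry RSB and the stuffing period of 12 returns.

```c
#include <assert.h>
#include <stdbool.h>

#define RSB_ENTRIES  16	/* RSB depth stated in the description */
#define STUFF_CREDIT 12	/* returns allowed between stuffing events */

struct depth_model {
	bool stuffing;			/* models the mitigation being on or off */
	unsigned int rsb;		/* usable RSB entries in the model */
	unsigned int credit;		/* returns left before stuffing is due */
	unsigned int stuffs;		/* number of stuffing events */
	unsigned int underflows;	/* returns that found the RSB empty */
};

static void model_init(struct depth_model *m, bool stuffing)
{
	m->stuffing = stuffing;
	m->rsb = 0;
	m->credit = 0;
	m->stuffs = 0;
	m->underflows = 0;
}

/* A call pushes one RSB entry and earns one return of credit, both capped. */
static void model_call(struct depth_model *m)
{
	if (m->rsb < RSB_ENTRIES)
		m->rsb++;
	if (m->credit < STUFF_CREDIT)
		m->credit++;
}

/* A return refills the RSB first when the credit is used up, then pops. */
static void model_return(struct depth_model *m)
{
	if (m->stuffing && m->credit == 0) {
		m->rsb = RSB_ENTRIES;
		m->credit = STUFF_CREDIT;
		m->stuffs++;
	}
	if (m->rsb == 0)
		m->underflows++;
	else
		m->rsb--;
	if (m->credit > 0)
		m->credit--;

	/* The credit never exceeds what the modelled RSB can serve. */
	assert(m->credit <= m->rsb);
}

/* Convenience helper: `calls` nested calls followed by `returns` returns. */
static void model_run(struct depth_model *m, unsigned int calls,
		      unsigned int returns)
{
	while (calls--)
		model_call(m);
	while (returns--)
		model_return(m);
}
```

Running 20 nested calls followed by 20 returns with stuffing enabled produces one stuffing event and no underflow in this model, while the same sequence with stuffing disabled underflows four times, which is the situation the mitigation is meant to prevent.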

Because of the forced 16-byte function alignment, the size of vmlinux increases from 50,539,033 bytes to 52,634,338 bytes, which is about 4%.

After booting the CKI artifact kernel with the "retbleed=stuff" command line option on a dual-socket Skylake system, the spectre_v2 and retbleed vulnerability files were respectively:

Mitigation: Retpolines, IBPB: conditional, IBRS_FW, RSB filling, PBRSB-eIBRS: Not affected
Mitigation: Stuffing

Without the "retbleed" boot command line parameter, the vulnerability files were:

Mitigation: IBRS, IBPB: conditional, RSB filling, PBRSB-eIBRS: Not affected
Mitigation: IBRS
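A check like the one above can be automated by reading the status strings from sysfs. The sketch below assumes the standard location /sys/devices/system/cpu/vulnerabilities/ for the vulnerability files; the helpers read_vuln() and uses_stuffing() are hypothetical names, and the string match relies only on the "Mitigation: Stuffing" text shown in the test output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VULN_DIR "/sys/devices/system/cpu/vulnerabilities"

/*
 * Read the one-line status of a vulnerability entry such as "retbleed" or
 * "spectre_v2" into buf, without the trailing newline.
 * Returns 0 on success and -1 if the entry cannot be read.
 */
static int read_vuln(const char *name, char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int n;

	assert(name && buf);
	n = snprintf(path, sizeof(path), "%s/%s", VULN_DIR, name);
	if (n < 0 || (size_t)n >= sizeof(path) || len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Nonzero when a retbleed status line reports the stuffing mitigation. */
static int uses_stuffing(const char *status)
{
	return strstr(status, "Stuffing") != NULL;
}
```

A caller would pass "retbleed" to read_vuln() and feed the result to uses_stuffing(); on a kernel without these files read_vuln() simply returns -1.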

Signed-off-by: Waiman Long <longman@redhat.com>

Edited by Waiman Long
