x86/boot: Ignore NMI during early boot
JIRA: https://issues.redhat.com/browse/RHEL-9380
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
When an NMI-induced panic occurs, a new NMI arriving while the console is being flushed risks a deadlock on a different CPU that is waiting for an atomic lock such as the printk_cpu_sync_owner lock.
When this happens, the kernel cannot complete the setup of the kdump kernel, and no vmcore is generated during this flushing/panic phase.
To reproduce the issue:
- Obtain an IPMI-supported machine (acpi_ipmi, ipmi_devintf, ipmi_msghandler, ipmi_si, ipmi_ssif)
- Tune sysctl.conf (set kernel.panic_on_io_nmi, kernel.panic_on_unrecovered_nmi, kernel.unknown_nmi_panic and kernel.panic to 1)
- Download, build and load the KMOD reproducer
- Trigger a panic via NMI with "ipmitool power diag"
Please refer to the JIRA issue for the included KMOD files & reproducer specifics.
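As a rough sketch, the sysctl tuning from the steps above could look like the fragment below. The file location (/etc/sysctl.conf) and the use of "sysctl -p" to load it are assumptions; the commit itself only names the four keys and the value 1.

```
# Assumed location: /etc/sysctl.conf (load with "sysctl -p")
# Panic when an I/O check NMI is received
kernel.panic_on_io_nmi = 1
# Panic on an NMI the kernel cannot recover from
kernel.panic_on_unrecovered_nmi = 1
# Panic on an NMI with no known source
kernel.unknown_nmi_panic = 1
# Reboot 1 second after a panic
kernel.panic = 1
```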
This NEC-provided patch ignores NMIs at early boot, insulating the kernel from NMIs until it is properly prepared to handle them.
This has the unwanted side effect of losing NMIs during early boot, but since those would not be handled properly anyway, the empty handler do_boot_nmi_trap() is included to ignore them.
Signed-off-by: Derek Barbosa <debarbos@redhat.com>