x86/boot: Ignore NMI during early boot

Derek Barbosa requested to merge debarbos/centos-stream-9:nec_patch into main

JIRA: https://issues.redhat.com/browse/RHEL-9380

Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

When an NMI-induced panic occurs, a new NMI that arrives while the console is being flushed can deadlock a different CPU that is waiting on an atomic lock, such as the printk_cpu_sync_owner lock.

When this happens, the kernel cannot complete the setup of the kdump kernel. It has also been observed that no vmcores are generated during this flushing/panic phase.

To reproduce the issue:

  1. Obtain an IPMI-supported machine (acpi_ipmi, ipmi_devintf, ipmi_msghandler, ipmi_si, ipmi_ssif)
  2. Tune sysctl.conf (kernel.panic_on_io_nmi, kernel.panic_on_unrecovered_nmi, kernel.unknown_nmi_panic and kernel.panic all set to 1)
  3. Download, build and load the KMOD reproducer
  4. Trigger a panic via NMI with ipmitool power diag.
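
For reference, the sysctl settings from step 2 might look like the fragment below. This is only a sketch: the /etc/sysctl.conf location and applying the values with `sysctl -p` are assumptions, and the JIRA issue remains the authoritative source for the reproducer setup.

```conf
# /etc/sysctl.conf (assumed location); apply with: sysctl -p

# Panic when an I/O check NMI is received
kernel.panic_on_io_nmi = 1

# Panic when an NMI cannot be recovered from
kernel.panic_on_unrecovered_nmi = 1

# Panic when an NMI of unknown origin is received
kernel.unknown_nmi_panic = 1

# Reboot 1 second after a panic
kernel.panic = 1
```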

Please refer to the JIRA issue for the included KMOD files & reproducer specifics.

This NEC-provided patch ignores NMIs at early boot, insulating the kernel from NMIs until it is properly prepared to handle them.

This has the unwanted side effect of losing NMIs during early boot, but since those NMIs would not be handled properly anyway, we include the empty handler do_boot_nmi_trap() to ignore them.

Signed-off-by: Derek Barbosa <debarbos@redhat.com>
