public inbox for linux-kernel@vger.kernel.org
* [PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes
@ 2023-02-02  1:40 Zeng Heng
  2023-02-02  9:09 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Zeng Heng @ 2023-02-02  1:40 UTC
  To: mingo, bp, jroedel, vbabka, hpa, tglx, eric.devolder, bhe, tiwai,
	keescook, dave.hansen
  Cc: linux-kernel, x86, liwei391, xiexiuqi

If the CPU panics in NMI context, further NMIs may
be pending in the background: the processor blocks
them until the next IRET instruction executes, which
is what prevents nested execution of the NMI handler.

If an IRET executes during the kdump reboot while no
proper NMI handler is registered (for example in the
EFI loader), the blocked NMIs are released with
nothing to handle them. These blocked NMIs therefore
need to be handled in advance to avoid kdump crashes.

Because asm_exc_nmi() can handle nested NMIs, call
iret_to_self() here to execute an IRET instruction,
so that any blocked NMIs are triggered and handled
before the IDT is invalidated.
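
The ordering argument above can be illustrated with a small user-space
toy model. This is only a sketch under stated assumptions, not kernel
code: struct cpu_model, raise_nmi(), model_iret() and run_crash_path()
are hypothetical names invented for illustration, and the model reduces
the hardware to a single "blocked" flag plus one latched pending NMI.

```c
#include <stdbool.h>

/* Toy model of NMI blocking; every name here is hypothetical. */
struct cpu_model {
	bool nmi_blocked;	/* set while in NMI context, cleared by IRET */
	bool nmi_pending;	/* an NMI latched while NMIs were blocked */
	bool idt_valid;		/* a usable NMI handler is still installed */
	int handled;		/* NMIs serviced by a valid handler */
	int crashed;		/* NMIs delivered with no valid handler */
};

/* Deliver one NMI: it is either handled or it crashes the machine. */
static void deliver_nmi(struct cpu_model *cpu)
{
	if (cpu->idt_valid)
		cpu->handled++;
	else
		cpu->crashed++;
}

/* An NMI source (e.g. the watchdog) fires. */
static void raise_nmi(struct cpu_model *cpu)
{
	if (cpu->nmi_blocked) {
		cpu->nmi_pending = true;	/* held back until the next IRET */
		return;
	}
	/* Assumed well-behaved handler: runs and returns with its own IRET. */
	deliver_nmi(cpu);
}

/* Model of an IRET: unblocks NMIs and releases a latched one. */
static void model_iret(struct cpu_model *cpu)
{
	cpu->nmi_blocked = false;
	if (cpu->nmi_pending) {
		cpu->nmi_pending = false;
		deliver_nmi(cpu);
	}
}

/*
 * Replay the crash path: panic in NMI context, a second NMI arrives,
 * then the handler goes away and an IRET executes later on.
 * Returns the number of NMIs delivered without a valid handler.
 */
static int run_crash_path(bool flush_first)
{
	struct cpu_model cpu = { .nmi_blocked = true, .idt_valid = true };

	raise_nmi(&cpu);		/* watchdog NMI during the panic */

	if (flush_first)
		model_iret(&cpu);	/* stands in for iret_to_self() */

	cpu.idt_valid = false;		/* IDT invalidated for the reboot */
	model_iret(&cpu);		/* IRET executed later, e.g. in the loader */

	return cpu.crashed;
}
```

In this model, run_crash_path(false) delivers the latched NMI only after
the handler is gone, while run_crash_path(true) drains it while the
handler is still valid, which is the ordering the patch relies on.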

Here is a test case that reproduces the issue:
  1. # cat uncorrected
     CPU 1 BANK 4
     STATUS uncorrected 0xc0
     MCGSTATUS  EIPV MCIP
     ADDR 0x1234
     RIP 0xdeadbabe
     RAISINGCPU 0
     MCGCAP SER CMCI TES 0x6
  2. # modprobe mce_inject
  3. # mce-inject uncorrected

mce-inject triggers a kernel panic in NMI context.
In addition, another NMI (for example from the
watchdog) must be raised while the panic is in
progress. Set a suitable watchdog threshold and/or
add an artificial delay so that the watchdog NMI
fires during the panic procedure; the issue then
occurs.

Fixes: ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
---
  v1: add dummy NMI interrupt handler in EFI loader
  v2: tidy up changelog, add comments (by Ingo Molnar)
  v3: add iret_to_self() to deal with blocked NMIs in advance

 arch/x86/kernel/crash.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 305514431f26..3aaca680a639 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -41,6 +41,7 @@
 #include <asm/intel_pt.h>
 #include <asm/crash.h>
 #include <asm/cmdline.h>
+#include <asm/sync_core.h>

 /* Used while preparing memory map entries for second kernel */
 struct crash_memmap_data {
@@ -143,6 +144,19 @@ void native_machine_crash_shutdown(struct pt_regs *regs)

 	crash_smp_send_stop();

+	/*
+	 * If the CPU panicked in NMI context, there may be
+	 * pending NMIs which the processor keeps blocked
+	 * until the next IRET instruction executes.
+	 *
+	 * An IRET may execute during the kdump reboot at a
+	 * point where no proper NMI handler is registered.
+	 * Execute an IRET here, while the NMI handler is
+	 * still valid, so that blocked NMIs are triggered
+	 * and handled in advance and do not crash kdump.
+	 */
+	iret_to_self();
+
 	/*
 	 * VMCLEAR VMCSs loaded on this cpu if needed.
 	 */
--
2.25.1



end of thread, other threads:[~2023-02-15  3:05 UTC | newest]

Thread overview: 6+ messages
2023-02-02  1:40 [PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes Zeng Heng
2023-02-02  9:09 ` Peter Zijlstra
2023-02-14  9:30   ` Zeng Heng
2023-02-14  9:49     ` Peter Zijlstra
2023-02-15  1:01       ` Baoquan He
2023-02-15  3:05         ` Zeng Heng
