Re: [RFC PATCH v4] x86/kdump: terminate watchdog NMI interrupt to avoid kdump crashes

From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Zeng Heng <zengheng4@huawei.com>,
	alexander.shishkin@linux.intel.com, tglx@linutronix.de,
	tiwai@suse.de, jolsa@kernel.org, vbabka@suse.cz,
	keescook@chromium.org, mingo@redhat.com, acme@kernel.org,
	namhyung@kernel.org, bp@alien8.de, bhe@redhat.com,
	eric.devolder@oracle.com, hpa@zytor.com, jroedel@suse.de,
	dave.hansen@linux.intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org, liwei391@huawei.com,
	x86@kernel.org, xiexiuqi@huawei.com
Subject: Re: [RFC PATCH v4] x86/kdump: terminate watchdog NMI interrupt to avoid kdump crashes
Date: Wed, 22 Feb 2023 12:39:22 -0600
Message-ID: <87r0uh5yud.fsf@email.froward.int.ebiederm.org>
In-Reply-To: <Y/ZMEesgPnRR3LsG@hirez.programming.kicks-ass.net> (Peter Zijlstra's message of "Wed, 22 Feb 2023 18:08:33 +0100")

Peter Zijlstra <peterz@infradead.org> writes:

> On Fri, Feb 17, 2023 at 08:06:04PM +0800, Zeng Heng wrote:
>> If the CPU panics in NMI context, there could be pending NMIs that the
>> processor keeps blocked until the next IRET instruction executes. That
>> blocking is what prevents nested NMI handler execution.
>> 
>> If an IRET executes during the kdump reboot while no proper NMI handler
>> is registered (such as during the EFI loader)

EFI loader?  kexec on panic is supposed to be kernel to kernel.
If someone is getting EFI involved, that is a bug.

>>, we need to ensure the
>> watchdog no longer fires, or kdump would crash later. So call
>> perf_event_exit_cpu() at the very last moment in the panic shutdown.

Why can't the crash recovery kernel handle this?

There very much are cases where the crash recovery kernel cannot
handle something and the dying kernel can.  But every line of code
added to the path the crashing kernel takes increases the
probability that something will go wrong and the crash will not be
captured.

>> !! I know that calling perf_event_exit_cpu() in NMI context is not
>> allowed, because of mutex_lock, smp_call_function and so on.
>> Does anyone know of a similar function that may be called from atomic
>> context? (In my testing, neither x86_pmu_disable() nor
>> x86_pmu_disable_all() works.)
>> 
>> Thank you in advance.
>> 
>> Here is one test case that reproduces the issue:
>>   1. # cat uncorrected
>>      CPU 1 BANK 4
>>      STATUS uncorrected 0xc0
>>      MCGSTATUS  EIPV MCIP
>>      ADDR 0x1234
>>      RIP 0xdeadbabe
>>      RAISINGCPU 0
>>      MCGCAP SER CMCI TES 0x6
>>   2. # modprobe mce_inject
>>   3. # mce-inject uncorrected
>> 
>> mce-inject triggers a kernel panic in NMI context. In addition,
>> another NMI (such as one from the watchdog) must be raised while the
>> panic is in progress. Set a suitable watchdog threshold and/or add an
>> artificial delay so that the watchdog NMI fires during the panic
>> procedure, and the issue will occur.
>> 
>> Fixes: ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table")
>> Signed-off-by: Zeng Heng <zengheng4@huawei.com>
>> ---
>>   v1: add dummy NMI interrupt handler in EFI loader
>>   v2: tidy up changelog, add comments (by Ingo Molnar)
>>   v3: add iret_to_self() to deal with blocked NMIs in advance
>>   v4: call perf_event_exit_cpu() to terminate watchdog in panic shutdown
>> 
>>  arch/x86/kernel/crash.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>> 
>> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
>> index 305514431f26..f46df94bbdad 100644
>> --- a/arch/x86/kernel/crash.c
>> +++ b/arch/x86/kernel/crash.c
>> @@ -25,6 +25,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/vmalloc.h>
>>  #include <linux/memblock.h>
>> +#include <linux/perf_event.h>
>> 
>>  #include <asm/processor.h>
>>  #include <asm/hardirq.h>
>> @@ -170,6 +171,15 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
>>  #ifdef CONFIG_HPET_TIMER
>>  	hpet_disable();
>>  #endif
>> +
>> +	/*
>> +	 * If the CPU panics in NMI context, make sure no NMI is
>> +	 * left pending (blocked by the processor). Otherwise an
>> +	 * IRET on the kdump path could deliver it while no proper
>> +	 * NMI handler is registered. So terminate the watchdog
>> +	 * here, in the panic shutdown.
>> +	 */
>> +	perf_event_exit_cpu(smp_processor_id());
>
> This kills all of perf, including but not limited to the hardware
> watchdog. However, it does nothing to external NMI sources like the NMI
> button found on some HP machines.
>
> Still I suppose it is sufficient for the normal case.

Except the architecture appears to be wrong.  I don't see any
explanation, and I can't think of one, for why we don't just leave
NMIs deliberately disabled until the crash recovery kernel has
figured out how to enable them safely.
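
The hazard under discussion can be illustrated with a toy user-space
model.  This is only a sketch under stated assumptions, not kernel code:
struct cpu_model, raise_nmi(), iret() and the two scenario functions are
hypothetical names invented for illustration.  It models the behaviour
the patch description relies on: delivering an NMI blocks further NMIs
until the next IRET, and an NMI that arrives in the meantime is held
pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of NMI blocking on one CPU. All names are hypothetical. */
typedef void (*nmi_handler_t)(void);

struct cpu_model {
	bool nmi_blocked;      /* set on NMI delivery, cleared by IRET */
	bool nmi_pending;      /* one NMI latched while delivery is blocked */
	nmi_handler_t handler; /* NULL models "no proper NMI handler" */
	int handled;           /* NMIs delivered to a handler */
	int crashed;           /* NMIs delivered with no handler installed */
};

static void noop_handler(void) { }

/* Deliver an NMI: further NMIs stay blocked until iret() is called. */
static void deliver(struct cpu_model *cpu)
{
	cpu->nmi_blocked = true;
	if (cpu->handler) {
		cpu->handler();
		cpu->handled++;
	} else {
		cpu->crashed++;
	}
}

void raise_nmi(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	if (cpu->nmi_blocked)
		cpu->nmi_pending = true; /* held back, not lost */
	else
		deliver(cpu);
}

/* IRET unblocks NMIs; a pending NMI is delivered right away. */
void iret(struct cpu_model *cpu)
{
	assert(cpu != NULL);
	cpu->nmi_blocked = false;
	if (cpu->nmi_pending) {
		cpu->nmi_pending = false;
		deliver(cpu);
	}
}

/* Panic in NMI context, watchdog NMI latched, IRET before a handler exists. */
int scenario_early_iret(void)
{
	struct cpu_model cpu = { .handler = noop_handler };

	raise_nmi(&cpu);    /* first NMI: the panic starts inside it */
	raise_nmi(&cpu);    /* watchdog NMI during the panic: latched */
	cpu.handler = NULL; /* the old kernel's handlers are gone */
	iret(&cpu);         /* stray IRET releases the latched NMI */
	return cpu.crashed;
}

/* Same sequence, but no IRET until the next kernel has a handler. */
int scenario_deferred_iret(void)
{
	struct cpu_model cpu = { .handler = noop_handler };

	raise_nmi(&cpu);
	raise_nmi(&cpu);
	cpu.handler = noop_handler; /* crash kernel installs its handler */
	iret(&cpu);
	return cpu.crashed;
}
```

In this model scenario_early_iret() returns 1, because the latched NMI
is delivered with no handler installed, while scenario_deferred_iret()
returns 0.  The second scenario corresponds to leaving NMIs blocked
until the crash recovery kernel is ready for them.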

Eric


>>  	crash_save_cpu(regs, safe_smp_processor_id());
>>  }
>> 
>> --
>> 2.25.1
>>

