From: Waiman Long <llong@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
"H. Peter Anvin" <hpa@zytor.com>, Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH v4] x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()
Date: Tue, 4 Feb 2025 23:03:22 -0500
Message-ID: <50e7cfb4-4edd-4fa0-ba3e-b22d8d324b69@redhat.com>
In-Reply-To: <20241219150653.349177-1-longman@redhat.com>
On 12/19/24 10:06 AM, Waiman Long wrote:
> Depending on the type of panic, the __register_nmi_handler() function
> can be called in NMI context from nmi_shootdown_cpus(), leading to a
> lockdep splat like the following.
>
> [ 1123.133573] ================================
> [ 1123.137845] WARNING: inconsistent lock state
> [ 1123.142118] 6.12.0-31.el10.x86_64+debug #1 Not tainted
> [ 1123.147257] --------------------------------
> [ 1123.151529] inconsistent {INITIAL USE} -> {IN-NMI} usage.
> :
> [ 1123.261544] Possible unsafe locking scenario:
> [ 1123.261544]
> [ 1123.267463] CPU0
> [ 1123.269915] ----
> [ 1123.272368] lock(&nmi_desc[0].lock);
> [ 1123.276122] <Interrupt>
> [ 1123.278746] lock(&nmi_desc[0].lock);
> [ 1123.282671]
> [ 1123.282671] *** DEADLOCK ***
> :
> [ 1123.314088] Call Trace:
> [ 1123.316542] <NMI>
> [ 1123.318562] dump_stack_lvl+0x6f/0xb0
> [ 1123.322230] print_usage_bug.part.0+0x3d3/0x610
> [ 1123.330618] lock_acquire.part.0+0x2e6/0x360
> [ 1123.357217] _raw_spin_lock_irqsave+0x46/0x90
> [ 1123.366193] __register_nmi_handler+0x8f/0x3a0
> [ 1123.374401] nmi_shootdown_cpus+0x95/0x120
> [ 1123.378509] kdump_nmi_shootdown_cpus+0x15/0x20
> [ 1123.383040] native_machine_crash_shutdown+0x54/0x160
> [ 1123.388095] __crash_kexec+0x10f/0x1f0
> [ 1123.421465] ? __ghes_panic.cold+0x4f/0x5d
> [ 1123.482648] </NMI>
>
> In this particular case, the following panic message was printed earlier.
>
> [ 1122.808188] Kernel panic - not syncing: Fatal hardware error!
>
> This message appears to have been emitted by __ghes_panic() running in
> NMI context.
>
> The __register_nmi_handler() function, which takes the nmi_desc lock
> with IRQs disabled, shouldn't be called from NMI context as this can
> lead to deadlock.
>
> The nmi_shootdown_cpus() function can only be invoked once. After the
> first invocation, all other CPUs should be stuck in the newly added
> crash_nmi_callback() and cannot respond to a second NMI.
>
> One way to address this problem is to remove all the panic() calls from
> NMI context, but that can be too restrictive.
>
> Another way to fix this problem while allowing panic() calls from
> NMI context is to add a new emergency NMI handler to the nmi_desc
> structure and provide a new set_emergency_nmi_handler() helper to
> atomically set crash_nmi_callback() in any context. The new emergency
> handler will preempt other handlers in the linked list. That will
> eliminate the need to take any lock and serve the panic-in-NMI use case.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> Acked-by: Rik van Riel <riel@surriel.com>
> ---
> arch/x86/include/asm/nmi.h | 2 ++
> arch/x86/kernel/nmi.c | 45 ++++++++++++++++++++++++++++++++++++++
> arch/x86/kernel/reboot.c | 11 ++++------
> 3 files changed, 51 insertions(+), 7 deletions(-)
>
> [v4] Just tweak the C comments, no code change from v3.
>
> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> index 41a0ebb699ec..6715c123eff4 100644
> --- a/arch/x86/include/asm/nmi.h
> +++ b/arch/x86/include/asm/nmi.h
> @@ -56,6 +56,8 @@ int __register_nmi_handler(unsigned int, struct nmiaction *);
>
> void unregister_nmi_handler(unsigned int, const char *);
>
> +int set_emergency_nmi_handler(unsigned int type, nmi_handler_t handler);
> +
> void stop_nmi(void);
> void restart_nmi(void);
> void local_touch_nmi(void);
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index ed163c8c8604..2cb75a53c0c4 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -40,8 +40,12 @@
> #define CREATE_TRACE_POINTS
> #include <trace/events/nmi.h>
>
> +/*
> + * An emergency handler can be set in any context including NMI
> + */
> struct nmi_desc {
> raw_spinlock_t lock;
> + nmi_handler_t emerg_handler; /* Emergency handler */
> struct list_head head;
> };
>
> @@ -132,9 +136,22 @@ static void nmi_check_duration(struct nmiaction *action, u64 duration)
> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> {
> struct nmi_desc *desc = nmi_to_desc(type);
> + nmi_handler_t ehandler;
> struct nmiaction *a;
> int handled=0;
>
> + /*
> + * Call the emergency handler, if set
> + *
> + * In the case of crash_nmi_callback() emergency handler, it will
> + * return in the case of the crashing CPU to enable it to complete
> + * other necessary crashing actions ASAP. Other handlers in the
> + * linked list won't need to be run.
> + */
> + ehandler = READ_ONCE(desc->emerg_handler);
> + if (ehandler)
> + return ehandler(type, regs);
> +
> rcu_read_lock();
>
> /*
> @@ -224,6 +241,34 @@ void unregister_nmi_handler(unsigned int type, const char *name)
> }
> EXPORT_SYMBOL_GPL(unregister_nmi_handler);
>
> +/**
> + * set_emergency_nmi_handler - Set emergency handler
> + * @type: NMI type
> + * @handler: the emergency handler to be stored
> + * Return: 0 if success, -EEXIST if a handler had been stored
> + *
> + * Atomically set an emergency NMI handler which, if set, will preempt all
> + * the other handlers in the linked list. If a NULL handler is passed in,
> + * it will clear it.
> + */
> +int set_emergency_nmi_handler(unsigned int type, nmi_handler_t handler)
> +{
> +	struct nmi_desc *desc = nmi_to_desc(type);
> +	nmi_handler_t orig = NULL;
> +
> +	if (!handler) {
> +		orig = READ_ONCE(desc->emerg_handler);
> +		WARN_ON_ONCE(!orig);
> +	}
> +
> +	if (try_cmpxchg(&desc->emerg_handler, &orig, handler))
> +		return 0;
> +	if (WARN_ON_ONCE(orig == handler))
> +		return 0;
> +	WARN_ONCE(1, "%s: failed to set emergency NMI handler!\n", __func__);
> +	return -EEXIST;
> +}
> +
> static void
> pci_serr_error(unsigned char reason, struct pt_regs *regs)
> {
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 615922838c51..12df9e402d3c 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -926,15 +926,12 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback)
> shootdown_callback = callback;
>
> atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1);
> - /* Would it be better to replace the trap vector here? */
> - if (register_nmi_handler(NMI_LOCAL, crash_nmi_callback,
> - NMI_FLAG_FIRST, "crash"))
> - return; /* Return what? */
> +
> /*
> - * Ensure the new callback function is set before sending
> - * out the NMI
> + * Atomically set emergency handler to preempt other handlers.
> + * The action shouldn't fail or a warning will be printed.
> */
> - wmb();
> + set_emergency_nmi_handler(NMI_LOCAL, crash_nmi_callback);
>
> apic_send_IPI_allbutself(NMI_VECTOR);
>
Ping! Are any further changes needed for this patch?
Thanks,
Longman
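
For readers who want to experiment with the idea outside the kernel, the
lockless set-once slot described in the patch can be approximated in
userspace. This is only a rough sketch under stated assumptions: it uses
C11 <stdatomic.h> in place of the kernel's READ_ONCE()/try_cmpxchg(), it
drops the WARN_*() calls, and the names handler_fn, emerg_slot,
set_emergency_handler(), dispatch(), crash_cb() and other_cb() are
hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's nmi_handler_t. */
typedef int (*handler_fn)(unsigned int type, void *regs);

/* One lock-free slot, playing the role of nmi_desc.emerg_handler. */
static _Atomic(handler_fn) emerg_slot;

/*
 * Install (or, with NULL, clear) the emergency handler without a lock.
 * Returns 0 on success or if the slot already holds 'handler', and
 * -EEXIST if a different handler is already installed.
 */
int set_emergency_handler(handler_fn handler)
{
	handler_fn expected = NULL;

	/* When clearing, expect whatever is currently installed. */
	if (!handler)
		expected = atomic_load(&emerg_slot);

	if (atomic_compare_exchange_strong(&emerg_slot, &expected, handler))
		return 0;
	/* The exchange failed, but the slot already holds our handler. */
	if (expected == handler)
		return 0;
	return -EEXIST;
}

/*
 * Dispatch: an installed emergency handler preempts the regular path.
 * '*ran_list' is set to 1 only when the regular path would have run.
 */
int dispatch(unsigned int type, void *regs, int *ran_list)
{
	handler_fn h = atomic_load(&emerg_slot);

	if (h)
		return h(type, regs);
	*ran_list = 1;	/* stands in for walking the handler list */
	return 0;
}

/* Two dummy handlers, distinguishable by their return values. */
int crash_cb(unsigned int type, void *regs)
{
	(void)type; (void)regs;
	return 1;
}

int other_cb(unsigned int type, void *regs)
{
	(void)type; (void)regs;
	return 2;
}

/* Usage walk-through; each assert documents the expected outcome. */
void example_usage(void)
{
	int ran_list = 0;

	/* No emergency handler yet: the regular path runs. */
	assert(dispatch(0, NULL, &ran_list) == 0 && ran_list == 1);

	assert(set_emergency_handler(crash_cb) == 0);
	assert(set_emergency_handler(crash_cb) == 0);        /* same handler */
	assert(set_emergency_handler(other_cb) == -EEXIST);  /* slot is taken */

	/* The emergency handler now preempts the regular path. */
	ran_list = 0;
	assert(dispatch(0, NULL, &ran_list) == 1 && ran_list == 0);

	assert(set_emergency_handler(NULL) == 0);            /* clear the slot */
}
```

Because installation is a single compare-and-exchange on one pointer, the
setter never spins on a lock that the interrupted context might already
hold, which is the property the patch relies on.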