From: Kiryl Shutsemau <kirill@shutemov.name>
To: Puranjay Mohan <puranjay12@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
Marc Zyngier <maz@kernel.org>,
Doug Anderson <dianders@chromium.org>,
Petr Mladek <pmladek@suse.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>, Usama Arif <usama.arif@linux.dev>,
Breno Leitao <leitao@debian.org>,
Julien Thierry <julien.thierry.kdev@gmail.com>,
Lecopzer Chen <lecopzer@gmail.com>,
Sumit Garg <sumit.garg@kernel.org>,
kernel-team@meta.com, kexec@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64
Date: Mon, 15 Jun 2026 14:15:40 +0100
Message-ID: <ai_6pkm7fA8MLuV1@thinkstation>
In-Reply-To: <CANk7y0iBTYuhzLTJAX0yf8Cp8cyThOVpAQYMTDN9LZUKGThqJQ@mail.gmail.com>

On Mon, Jun 15, 2026 at 12:18:10PM +0200, Puranjay Mohan wrote:
> On Mon, Jun 15, 2026 at 4:35 AM Kiryl Shutsemau <kirill@shutemov.name> wrote:
> >
> > From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
> >
> > Deliver an NMI-like event to an interrupt-masked arm64 CPU via the
> > standard SDEI software-signalled event (event 0), without the pseudo-NMI
> > hot-path cost: register a handler for event 0 and poke a target with
> > sdei_event_signal(0, mpidr).
> >
> > First user is arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> > hung-task/soft-lockup dumps), which otherwise rides an IPI that can't
> > reach a masked CPU. Falls back to the IPI path when SDEI is absent; no
> > watchdog backend yet, so the stock detector is untouched.
> >
> > Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
> > Reviewed-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >  MAINTAINERS                     |   2 +-
> >  arch/arm64/include/asm/nmi.h    |  24 +++++
> >  arch/arm64/kernel/smp.c         |  11 +++
> >  drivers/firmware/Kconfig        |  19 ++++
> >  drivers/firmware/Makefile       |   1 +
> >  drivers/firmware/arm_sdei_nmi.c | 149 ++++++++++++++++++++++++++++++++
> >  6 files changed, 205 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/nmi.h
> >  create mode 100644 drivers/firmware/arm_sdei_nmi.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index c8d4b913f26c..b5ddfb85dce9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -24797,7 +24797,7 @@ M: James Morse <james.morse@arm.com>
> > L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
> > S: Maintained
> > F: Documentation/devicetree/bindings/arm/firmware/sdei.txt
> > -F: drivers/firmware/arm_sdei.c
> > +F: drivers/firmware/arm_sdei*
> > F: include/linux/arm_sdei.h
> > F: include/uapi/linux/arm_sdei.h
> >
> > diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
> > new file mode 100644
> > index 000000000000..9366be419d18
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/nmi.h
> > @@ -0,0 +1,24 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __ASM_NMI_H
> > +#define __ASM_NMI_H
> > +
> > +#include <linux/cpumask.h>
> > +
> > +/*
> > + * Cross-CPU NMI provider hooks, consulted by the arm64 arch code before
> > + * its regular-IRQ / pseudo-NMI IPI paths. The SDEI provider in
> > + * drivers/firmware/arm_sdei_nmi.c implements them when active; a future
> > + * FEAT_NMI provider could slot in here too. The stubs let callers stay
> > + * unconditional when ARM_SDEI_NMI is off.
> > + */
> > +#ifdef CONFIG_ARM_SDEI_NMI
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu);
> > +#else
> > +static inline bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
> > +						      int exclude_cpu)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> > +#endif /* __ASM_NMI_H */
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 1aa324104afb..a670434a8cae 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -45,6 +45,7 @@
> > #include <asm/daifflags.h>
> > #include <asm/kvm_mmu.h>
> > #include <asm/mmu_context.h>
> > +#include <asm/nmi.h>
> > #include <asm/numa.h>
> > #include <asm/processor.h>
> > #include <asm/smp_plat.h>
> > @@ -927,6 +928,16 @@ static void arm64_backtrace_ipi(cpumask_t *mask)
> >
> > void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> > {
> > +	/*
> > +	 * Prefer the SDEI cross-CPU NMI provider when active: firmware
> > +	 * dispatches the event out of EL3 and reaches CPUs that have
> > +	 * interrupts locally masked, without the per-IRQ-mask cost that
> > +	 * pseudo-NMI pays for the same reach. The plain IPI path below
> > +	 * can't reach such a CPU unless pseudo-NMI is enabled.
> > +	 */
> > +	if (sdei_nmi_trigger_cpumask_backtrace(mask, exclude_cpu))
> > +		return;
> > +
> > /*
> > * NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name,
> > * nothing about it truly needs to be implemented using an NMI, it's
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index bbd2155d8483..6501087ff90d 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -36,6 +36,25 @@ config ARM_SDE_INTERFACE
> > standard for registering callbacks from the platform firmware
> > into the OS. This is typically used to implement RAS notifications.
> >
> > +config ARM_SDEI_NMI
> > +	bool "SDEI-based cross-CPU NMI service (arm64)"
> > +	depends on ARM64 && ARM_SDE_INTERFACE
> > +	help
> > +	  Provides SDEI-based cross-CPU NMI delivery for hooks that need
> > +	  to reach interrupt-masked CPUs on silicon that lacks FEAT_NMI:
> > +
> > +	    - arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> > +	      hardlockup_all_cpu_backtrace, soft-lockup secondary dumps,
> > +	      hung-task auxiliary dumps)
> > +
> > +	  The driver registers a handler for the SDEI software-signalled
> > +	  event (event 0) and reaches a target CPU by signalling it with
> > +	  SDEI_EVENT_SIGNAL. Firmware delivers the event out of EL3
> > +	  regardless of the target's PSTATE.DAIF -- forced delivery into a
> > +	  CPU wedged with interrupts locally masked.
> > +
> > +	  If unsure, say N.
> > +
> > config EDD
> > tristate "BIOS Enhanced Disk Drive calls determine boot disk"
> > depends on X86
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 4ddec2820c96..be46f1e1dc77 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -4,6 +4,7 @@
> > #
> > obj-$(CONFIG_ARM_SCPI_PROTOCOL) += arm_scpi.o
> > obj-$(CONFIG_ARM_SDE_INTERFACE) += arm_sdei.o
> > +obj-$(CONFIG_ARM_SDEI_NMI) += arm_sdei_nmi.o
> > obj-$(CONFIG_DMI) += dmi_scan.o
> > obj-$(CONFIG_DMI_SYSFS) += dmi-sysfs.o
> > obj-$(CONFIG_EDD) += edd.o
> > diff --git a/drivers/firmware/arm_sdei_nmi.c b/drivers/firmware/arm_sdei_nmi.c
> > new file mode 100644
> > index 000000000000..a82776e7b55a
> > --- /dev/null
> > +++ b/drivers/firmware/arm_sdei_nmi.c
> > @@ -0,0 +1,149 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arm64 SDEI-based cross-CPU NMI service.
> > + *
> > + * Delivering an "NMI-shaped" event to an EL1 context that has locally
> > + * masked interrupts, on silicon without FEAT_NMI, can be done two ways:
> > + *
> > + * - pseudo-NMI: mask "interrupts" via the GIC priority register
> > + * (ICC_PMR_EL1) instead of PSTATE.DAIF, leaving a high-priority band
> > + * deliverable. Functionally this works -- but it reimplements every
> > + * local_irq_disable()/enable() and exception entry/exit as a PMR
> > + * write plus synchronisation, a cost paid on that hot path forever,
> > + * whether or not an NMI is ever delivered.
> > + *
> > + * - SDEI: leave interrupt masking as the cheap PSTATE.DAIF operation
> > + * and have the firmware bounce an EL3-routed Group-0 SGI back to
> > + * NS-EL1 as an event callback. The cost is a firmware round-trip,
> > + * but only at the rare moment delivery is actually needed.
> > + *
> > + * This driver takes the second path: it keeps the IRQ-mask hot path
> > + * free and pays only when it fires, which is what makes cross-CPU NMI
> > + * affordable on hardware where the pseudo-NMI tax isn't, until FEAT_NMI
> > + * makes NMI masking cheap in the architecture itself.
> > + *
> > + * Capabilities provided:
> > + *
> > + * - sdei_nmi_trigger_cpumask_backtrace() — override for arm64's
> > + * arch_trigger_cpumask_backtrace(), so sysrq-l, RCU stall dumps,
> > + * hardlockup_all_cpu_backtrace, soft-lockup/hung-task secondary
> > + * dumps all reach interrupt-masked CPUs.
> > + *
> > + * Delivery uses the standard SDEI software-signalled event (event 0) and
> > + * SDEI_EVENT_SIGNAL. We register a handler for event 0, enable it, and
> > + * poke a target CPU with sdei_event_signal(0, mpidr): firmware makes
> > + * event 0 pending on that PE and dispatches the handler NMI-like,
> > + * regardless of the target's DAIF.
> > + * Availability is simply whether event 0 registers and enables -- if SDEI
> > + * and its software-signalled event are present we use it, otherwise the
> > + * driver stays inert.
> > + */
> > +
> > +#define pr_fmt(fmt) "sdei_nmi: " fmt
> > +
> > +#include <linux/arm_sdei.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/init.h>
> > +#include <linux/kernel.h>
> > +#include <linux/kprobes.h>
> > +#include <linux/nmi.h>
> > +#include <linux/printk.h>
> > +#include <linux/ptrace.h>
> > +#include <linux/smp.h>
> > +#include <linux/types.h>
> > +
> > +#include <asm/nmi.h>
> > +#include <asm/smp_plat.h>
> > +
> > +static bool sdei_nmi_available;
> > +
> > +#define SDEI_NMI_EVENT 0
> > +
> > +static int sdei_nmi_handler(u32 event, struct pt_regs *regs, void *arg)
> > +{
> > +	/*
> > +	 * nmi_cpu_backtrace() no-ops unless this CPU's bit is set in the
> > +	 * global backtrace mask (driven by nmi_trigger_cpumask_backtrace()),
> > +	 * so a fire that reaches a CPU not being backtraced is harmless.
> > +	 */
> > +	nmi_cpu_backtrace(regs);
> > +	return SDEI_EV_HANDLED;
> > +}
> > +NOKPROBE_SYMBOL(sdei_nmi_handler);
> > +
> > +static void sdei_nmi_fire(unsigned int target_cpu)
> > +{
> > +	int err = sdei_event_signal(SDEI_NMI_EVENT, cpu_logical_map(target_cpu));
> > +
> > +	if (err)
> > +		pr_warn("SDEI_EVENT_SIGNAL to CPU %u failed: %d\n",
> > +			target_cpu, err);
> > +}
> > +
> > +/*
> > + * Raise callback for nmi_trigger_cpumask_backtrace(): signal event 0
> > + * at every CPU still pending in @mask. The framework excludes the local
> > + * CPU from @mask before calling us.
> > + */
> > +static void sdei_nmi_raise_backtrace(cpumask_t *mask)
> > +{
> > +	unsigned int cpu;
> > +
> > +	for_each_cpu(cpu, mask)
> > +		sdei_nmi_fire(cpu);
> > +}
> > +
> > +/*
> > + * Override hook for arch_trigger_cpumask_backtrace() (see
> > + * arch/arm64/kernel/smp.c). Returns true when SDEI handled the request,
> > + * which is the case whenever SDEI is active; on a false return the arch
> > + * falls back to its regular-IRQ (or pseudo-NMI, if enabled) IPI.
> > + *
> > + * On a kernel built without paying the pseudo-NMI hot-path cost (the
> > + * usual case for this driver's target), the IPI can't reach a CPU that
> > + * has interrupts masked -- so the backtrace of the one CPU you care
> > + * about comes back empty. SDEI is dispatched out of EL3 and lands
> > + * regardless of the target's DAIF, without taxing the IRQ-mask path.
> > + */
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> > +{
> > +	if (!sdei_nmi_available)
> > +		return false;
> > +
> > +	nmi_trigger_cpumask_backtrace(mask, exclude_cpu,
> > +				      sdei_nmi_raise_backtrace);
> > +	return true;
> > +}
> > +
> > +/*
> > + * device_initcall (after arch_initcall(sdei_init), so the SDEI subsystem
> > + * is up): probe the firmware, register the event, and turn on the
> > + * cross-CPU service. If the probe fails the driver stays inert and the
> > + * override hooks decline, leaving the arch's own paths in place.
> > + */
> > +static int __init sdei_nmi_init(void)
> > +{
> > +	int err;
> > +
> > +	err = sdei_event_register(SDEI_NMI_EVENT, sdei_nmi_handler, NULL);
> > +	if (err) {
> > +		pr_err("sdei_event_register(%u) failed: %d\n",
> > +		       SDEI_NMI_EVENT, err);
> > +		return 0;
> > +	}
>
> This initcall runs unconditionally whenever ARM_SDEI_NMI is built in,
> which includes the many arm64 systems that have no SDEI at all. On
> those, sdei_event_register() -> sdei_event_create() ->
> invoke_sdei_fn() returns -EIO, and the core already complains:
> 	pr_warn("Failed to create event %u: %d\n", event_num, err);

Fair enough. I will add sdei_is_present() and gate sdei_nmi_init() on
it.
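
A rough user-space model of that gating, for illustration only and not
the actual follow-up patch. sdei_is_present() is the helper still to be
added, so its body here is an assumption; mock_firmware_has_sdei,
mock_register_calls and mock_sdei_event_register() are hypothetical
stand-ins for the firmware and the SDEI core, and the event-enable step
is left out:

```c
/*
 * User-space model of the planned gating; NOT kernel code.
 * Everything prefixed mock_ is a hypothetical stand-in, and the body of
 * sdei_is_present() is assumed (the helper does not exist yet).
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static bool mock_firmware_has_sdei;	/* models whether firmware exposes SDEI */
static int mock_register_calls;		/* counts registration attempts */
static bool sdei_nmi_available;

/* Proposed helper: reports whether the SDEI core found usable firmware. */
static bool sdei_is_present(void)
{
	return mock_firmware_has_sdei;
}

/* Stand-in for sdei_event_register(): fails with -EIO without firmware. */
static int mock_sdei_event_register(unsigned int event)
{
	(void)event;
	mock_register_calls++;
	return mock_firmware_has_sdei ? 0 : -EIO;
}

/* Gated init: return quietly before touching the SDEI core at all. */
static int sdei_nmi_init(void)
{
	int err;

	if (!sdei_is_present())
		return 0;

	err = mock_sdei_event_register(0);
	if (err) {
		fprintf(stderr, "sdei_nmi: register failed: %d\n", err);
		return 0;
	}

	sdei_nmi_available = true;
	return 0;
}

/* Hook consulted by the arch code: false means "use the IPI path". */
static bool sdei_nmi_trigger_cpumask_backtrace(void)
{
	return sdei_nmi_available;
}
```

With mock_firmware_has_sdei false, sdei_nmi_init() returns before any
registration attempt, so nothing is logged and the hook keeps returning
false, which leaves the arch code on its IPI path.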
--
Kiryl Shutsemau / Kirill A. Shutemov
Thread overview: 8+ messages
2026-06-15 2:35 [PATCH v3 0/3] arm64: cross-CPU NMI via SDEI Kiryl Shutsemau
2026-06-15 2:35 ` [PATCH v3 1/3] firmware: arm_sdei: add SDEI_EVENT_SIGNAL support Kiryl Shutsemau
2026-06-15 2:35 ` [PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64 Kiryl Shutsemau
2026-06-15 10:18 ` Puranjay Mohan
2026-06-15 13:15 ` Kiryl Shutsemau [this message]
2026-06-15 2:35 ` [PATCH v3 3/3] arm64: escalate smp_send_stop() to an SDEI NMI as a last resort Kiryl Shutsemau
2026-06-15 10:25 ` Puranjay Mohan
2026-06-15 12:46 ` Kiryl Shutsemau