The Linux Kernel Mailing List
From: Kiryl Shutsemau <kirill@shutemov.name>
To: Puranjay Mohan <puranjay12@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Doug Anderson <dianders@chromium.org>,
	Petr Mladek <pmladek@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Usama Arif <usama.arif@linux.dev>,
	Breno Leitao <leitao@debian.org>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Lecopzer Chen <lecopzer@gmail.com>,
	Sumit Garg <sumit.garg@kernel.org>,
	kernel-team@meta.com, kexec@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64
Date: Mon, 15 Jun 2026 14:15:40 +0100
Message-ID: <ai_6pkm7fA8MLuV1@thinkstation>
In-Reply-To: <CANk7y0iBTYuhzLTJAX0yf8Cp8cyThOVpAQYMTDN9LZUKGThqJQ@mail.gmail.com>

On Mon, Jun 15, 2026 at 12:18:10PM +0200, Puranjay Mohan wrote:
> On Mon, Jun 15, 2026 at 4:35 AM Kiryl Shutsemau <kirill@shutemov.name> wrote:
> >
> > From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
> >
> > Deliver an NMI-like event to an interrupt-masked arm64 CPU via the
> > standard SDEI software-signalled event (event 0), without the pseudo-NMI
> > hot-path cost: register a handler for event 0 and poke a target with
> > sdei_event_signal(0, mpidr).
> >
> > First user is arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> > hung-task/soft-lockup dumps), which otherwise rides an IPI that can't
> > reach a masked CPU. Falls back to the IPI path when SDEI is absent; no
> > watchdog backend yet, so the stock detector is untouched.
> >
> > Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
> > Reviewed-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >  MAINTAINERS                     |   2 +-
> >  arch/arm64/include/asm/nmi.h    |  24 +++++
> >  arch/arm64/kernel/smp.c         |  11 +++
> >  drivers/firmware/Kconfig        |  19 ++++
> >  drivers/firmware/Makefile       |   1 +
> >  drivers/firmware/arm_sdei_nmi.c | 149 ++++++++++++++++++++++++++++++++
> >  6 files changed, 205 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/nmi.h
> >  create mode 100644 drivers/firmware/arm_sdei_nmi.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index c8d4b913f26c..b5ddfb85dce9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -24797,7 +24797,7 @@ M:      James Morse <james.morse@arm.com>
> >  L:     linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
> >  S:     Maintained
> >  F:     Documentation/devicetree/bindings/arm/firmware/sdei.txt
> > -F:     drivers/firmware/arm_sdei.c
> > +F:     drivers/firmware/arm_sdei*
> >  F:     include/linux/arm_sdei.h
> >  F:     include/uapi/linux/arm_sdei.h
> >
> > diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
> > new file mode 100644
> > index 000000000000..9366be419d18
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/nmi.h
> > @@ -0,0 +1,24 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __ASM_NMI_H
> > +#define __ASM_NMI_H
> > +
> > +#include <linux/cpumask.h>
> > +
> > +/*
> > + * Cross-CPU NMI provider hooks, consulted by the arm64 arch code before
> > + * its regular-IRQ / pseudo-NMI IPI paths. The SDEI provider in
> > + * drivers/firmware/arm_sdei_nmi.c implements them when active; a future
> > + * FEAT_NMI provider could slot in here too. The stubs let callers stay
> > + * unconditional when ARM_SDEI_NMI is off.
> > + */
> > +#ifdef CONFIG_ARM_SDEI_NMI
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu);
> > +#else
> > +static inline bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
> > +                                                     int exclude_cpu)
> > +{
> > +       return false;
> > +}
> > +#endif
> > +
> > +#endif /* __ASM_NMI_H */
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 1aa324104afb..a670434a8cae 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -45,6 +45,7 @@
> >  #include <asm/daifflags.h>
> >  #include <asm/kvm_mmu.h>
> >  #include <asm/mmu_context.h>
> > +#include <asm/nmi.h>
> >  #include <asm/numa.h>
> >  #include <asm/processor.h>
> >  #include <asm/smp_plat.h>
> > @@ -927,6 +928,16 @@ static void arm64_backtrace_ipi(cpumask_t *mask)
> >
> >  void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> >  {
> > +       /*
> > +        * Prefer the SDEI cross-CPU NMI provider when active: firmware
> > +        * dispatches the event out of EL3 and reaches CPUs that have
> > +        * interrupts locally masked, without the per-IRQ-mask cost that
> > +        * pseudo-NMI pays for the same reach. The plain IPI path below
> > +        * can't reach such a CPU unless pseudo-NMI is enabled.
> > +        */
> > +       if (sdei_nmi_trigger_cpumask_backtrace(mask, exclude_cpu))
> > +               return;
> > +
> >         /*
> >          * NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name,
> >          * nothing about it truly needs to be implemented using an NMI, it's
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index bbd2155d8483..6501087ff90d 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -36,6 +36,25 @@ config ARM_SDE_INTERFACE
> >           standard for registering callbacks from the platform firmware
> >           into the OS. This is typically used to implement RAS notifications.
> >
> > +config ARM_SDEI_NMI
> > +       bool "SDEI-based cross-CPU NMI service (arm64)"
> > +       depends on ARM64 && ARM_SDE_INTERFACE
> > +       help
> > +         Provides SDEI-based cross-CPU NMI delivery for hooks that need
> > +         to reach interrupt-masked CPUs on silicon that lacks FEAT_NMI:
> > +
> > +           - arch_trigger_cpumask_backtrace()  (sysrq-l, RCU stalls,
> > +             hardlockup_all_cpu_backtrace, soft-lockup secondary dumps,
> > +             hung-task auxiliary dumps)
> > +
> > +         The driver registers a handler for the SDEI software-signalled
> > +         event (event 0) and reaches a target CPU by signalling it with
> > +         SDEI_EVENT_SIGNAL. Firmware delivers the event out of EL3
> > +         regardless of the target's PSTATE.DAIF -- forced delivery into a
> > +         CPU wedged with interrupts locally masked.
> > +
> > +         If unsure, say N.
> > +
> >  config EDD
> >         tristate "BIOS Enhanced Disk Drive calls determine boot disk"
> >         depends on X86
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 4ddec2820c96..be46f1e1dc77 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -4,6 +4,7 @@
> >  #
> >  obj-$(CONFIG_ARM_SCPI_PROTOCOL)        += arm_scpi.o
> >  obj-$(CONFIG_ARM_SDE_INTERFACE)        += arm_sdei.o
> > +obj-$(CONFIG_ARM_SDEI_NMI)     += arm_sdei_nmi.o
> >  obj-$(CONFIG_DMI)              += dmi_scan.o
> >  obj-$(CONFIG_DMI_SYSFS)                += dmi-sysfs.o
> >  obj-$(CONFIG_EDD)              += edd.o
> > diff --git a/drivers/firmware/arm_sdei_nmi.c b/drivers/firmware/arm_sdei_nmi.c
> > new file mode 100644
> > index 000000000000..a82776e7b55a
> > --- /dev/null
> > +++ b/drivers/firmware/arm_sdei_nmi.c
> > @@ -0,0 +1,149 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arm64 SDEI-based cross-CPU NMI service.
> > + *
> > + * Delivering an "NMI-shaped" event to an EL1 context that has locally
> > + * masked interrupts, on silicon without FEAT_NMI, can be done two ways:
> > + *
> > + *   - pseudo-NMI: mask "interrupts" via the GIC priority register
> > + *     (ICC_PMR_EL1) instead of PSTATE.DAIF, leaving a high-priority band
> > + *     deliverable. Functionally this works -- but it reimplements every
> > + *     local_irq_disable()/enable() and exception entry/exit as a PMR
> > + *     write plus synchronisation, a cost paid on that hot path forever,
> > + *     whether or not an NMI is ever delivered.
> > + *
> > + *   - SDEI: leave interrupt masking as the cheap PSTATE.DAIF operation
> > + *     and have the firmware bounce an EL3-routed Group-0 SGI back to
> > + *     NS-EL1 as an event callback. The cost is a firmware round-trip,
> > + *     but only at the rare moment delivery is actually needed.
> > + *
> > + * This driver takes the second path: it keeps the IRQ-mask hot path
> > + * free and pays only when it fires, which is what makes cross-CPU NMI
> > + * affordable on hardware where the pseudo-NMI tax isn't, until FEAT_NMI
> > + * makes NMI masking cheap in the architecture itself.
> > + *
> > + * Capabilities provided:
> > + *
> > + *   - sdei_nmi_trigger_cpumask_backtrace() — override for arm64's
> > + *     arch_trigger_cpumask_backtrace(), so sysrq-l, RCU stall dumps,
> > + *     hardlockup_all_cpu_backtrace, soft-lockup/hung-task secondary
> > + *     dumps all reach interrupt-masked CPUs.
> > + *
> > + * Delivery uses the standard SDEI software-signalled event (event 0) and
> > + * SDEI_EVENT_SIGNAL. We register a handler for event 0, enable it, and
> > + * poke a target CPU with sdei_event_signal(0, mpidr): firmware makes
> > + * event 0 pending on that PE and dispatches the handler NMI-like,
> > + * regardless of the target's DAIF.
> > + * Availability is simply whether event 0 registers and enables -- if SDEI
> > + * and its software-signalled event are present we use it, otherwise the
> > + * driver stays inert.
> > + */
> > +
> > +#define pr_fmt(fmt) "sdei_nmi: " fmt
> > +
> > +#include <linux/arm_sdei.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/init.h>
> > +#include <linux/kernel.h>
> > +#include <linux/kprobes.h>
> > +#include <linux/nmi.h>
> > +#include <linux/printk.h>
> > +#include <linux/ptrace.h>
> > +#include <linux/smp.h>
> > +#include <linux/types.h>
> > +
> > +#include <asm/nmi.h>
> > +#include <asm/smp_plat.h>
> > +
> > +static bool sdei_nmi_available;
> > +
> > +#define SDEI_NMI_EVENT                 0
> > +
> > +static int sdei_nmi_handler(u32 event, struct pt_regs *regs, void *arg)
> > +{
> > +       /*
> > +        * nmi_cpu_backtrace() no-ops unless this CPU's bit is set in the
> > +        * global backtrace mask (driven by nmi_trigger_cpumask_backtrace()),
> > +        * so a fire that reaches a CPU not being backtraced is harmless.
> > +        */
> > +       nmi_cpu_backtrace(regs);
> > +       return SDEI_EV_HANDLED;
> > +}
> > +NOKPROBE_SYMBOL(sdei_nmi_handler);
> > +
> > +static void sdei_nmi_fire(unsigned int target_cpu)
> > +{
> > +       int err = sdei_event_signal(SDEI_NMI_EVENT, cpu_logical_map(target_cpu));
> > +
> > +       if (err)
> > +               pr_warn("SDEI_EVENT_SIGNAL to CPU %u failed: %d\n",
> > +                       target_cpu, err);
> > +}
> > +
> > +/*
> > + * Raise callback for nmi_trigger_cpumask_backtrace(): signal event 0
> > + * at every CPU still pending in @mask. The framework excludes the local
> > + * CPU from @mask before calling us.
> > + */
> > +static void sdei_nmi_raise_backtrace(cpumask_t *mask)
> > +{
> > +       unsigned int cpu;
> > +
> > +       for_each_cpu(cpu, mask)
> > +               sdei_nmi_fire(cpu);
> > +}
> > +
> > +/*
> > + * Override hook for arch_trigger_cpumask_backtrace() (see
> > + * arch/arm64/kernel/smp.c). Returns true when SDEI handled the request,
> > + * which is the case whenever SDEI is active; on a false return the arch
> > + * falls back to its regular-IRQ (or pseudo-NMI, if enabled) IPI.
> > + *
> > + * On a kernel built without paying the pseudo-NMI hot-path cost (the
> > + * usual case for this driver's target), the IPI can't reach a CPU that
> > + * has interrupts masked -- so the backtrace of the one CPU you care
> > + * about comes back empty. SDEI is dispatched out of EL3 and lands
> > + * regardless of the target's DAIF, without taxing the IRQ-mask path.
> > + */
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> > +{
> > +       if (!sdei_nmi_available)
> > +               return false;
> > +
> > +       nmi_trigger_cpumask_backtrace(mask, exclude_cpu,
> > +                                     sdei_nmi_raise_backtrace);
> > +       return true;
> > +}
> > +
> > +/*
> > + * device_initcall (after arch_initcall(sdei_init), so the SDEI subsystem
> > + * is up): probe the firmware, register the event, and turn on the
> > + * cross-CPU service. If the probe fails the driver stays inert and the
> > + * override hooks decline, leaving the arch's own paths in place.
> > + */
> > +static int __init sdei_nmi_init(void)
> > +{
> > +       int err;
> > +
> > +       err = sdei_event_register(SDEI_NMI_EVENT, sdei_nmi_handler, NULL);
> > +       if (err) {
> > +               pr_err("sdei_event_register(%u) failed: %d\n",
> > +                      SDEI_NMI_EVENT, err);
> > +               return 0;
> > +       }
> 
> This initcall runs unconditionally whenever ARM_SDEI_NMI is built in,
> which includes the many arm64 systems that have no SDEI at all. On
> those, sdei_event_register() -> sdei_event_create() ->
> invoke_sdei_fn() returns -EIO, and the core already complains:
>     pr_warn("Failed to create event %u: %d\n", event_num, err);

Fair enough. I will add sdei_is_present() and gate sdei_nmi_init() on
it.
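
A minimal user-space sketch of that gating, for illustration only. It
assumes sdei_is_present() simply reports whether the SDEI core found
usable firmware; that helper is only proposed above and is not part of
the posted patch. The mock_* names are hypothetical stand-ins for kernel
state and for sdei_event_register(), and the event-enable step that
follows registration in the real driver is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* User-space mocks standing in for kernel state (hypothetical names). */
static bool mock_firmware_has_sdei;	/* did the SDEI core probe firmware? */
static int mock_register_calls;		/* attempts to register event 0 */
static bool sdei_nmi_available;

/* Proposed helper from the reply; assumed semantics, not an existing API. */
static bool sdei_is_present(void)
{
	return mock_firmware_has_sdei;
}

/* Mock of sdei_event_register(): warns and fails with -EIO without firmware. */
static int mock_sdei_event_register(unsigned int event)
{
	mock_register_calls++;
	if (!mock_firmware_has_sdei) {
		fprintf(stderr, "Failed to create event %u: %d\n", event, -EIO);
		return -EIO;
	}
	return 0;
}

/* Gated init: return quietly before touching the SDEI core at all. */
static int sdei_nmi_init(void)
{
	if (!sdei_is_present())
		return 0;

	if (mock_sdei_event_register(0))
		return 0;

	sdei_nmi_available = true;
	return 0;
}
```

With the gate in place, a system without SDEI never reaches the
registration call, so neither the core's warning nor the driver's own
pr_err() is printed, and sdei_nmi_available stays false so the arch
falls back to its IPI path.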

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Thread overview: 8+ messages
2026-06-15  2:35 [PATCH v3 0/3] arm64: cross-CPU NMI via SDEI Kiryl Shutsemau
2026-06-15  2:35 ` [PATCH v3 1/3] firmware: arm_sdei: add SDEI_EVENT_SIGNAL support Kiryl Shutsemau
2026-06-15  2:35 ` [PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64 Kiryl Shutsemau
2026-06-15 10:18   ` Puranjay Mohan
2026-06-15 13:15     ` Kiryl Shutsemau [this message]
2026-06-15  2:35 ` [PATCH v3 3/3] arm64: escalate smp_send_stop() to an SDEI NMI as a last resort Kiryl Shutsemau
2026-06-15 10:25   ` Puranjay Mohan
2026-06-15 12:46     ` Kiryl Shutsemau
