Date: Mon, 15 Jun 2026 14:15:40 +0100
From: Kiryl Shutsemau
To: Puranjay Mohan
Cc: Catalin Marinas, Will Deacon, James Morse, Mark Rutland,
	Marc Zyngier, Doug Anderson, Petr Mladek, Thomas Gleixner,
	Andrew Morton, Baoquan He, Usama Arif, Breno Leitao,
	Julien Thierry, Lecopzer Chen, Sumit Garg, kernel-team@meta.com,
	kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 2/3] drivers/firmware: add SDEI cross-CPU NMI service for arm64
References: <704b467d5b320da9cf49fc9bb4a6814063986f3b.1781490440.git.kas@kernel.org>

On Mon, Jun 15, 2026 at 12:18:10PM +0200, Puranjay Mohan wrote:
> On Mon, Jun 15, 2026 at 4:35 AM Kiryl Shutsemau wrote:
> >
> > From: "Kiryl Shutsemau (Meta)"
> >
> > Deliver an NMI-like event to an interrupt-masked arm64 CPU via the
> > standard SDEI software-signalled event (event 0), without the pseudo-NMI
> > hot-path cost: register a handler for event 0 and poke a target with
> > sdei_event_signal(0, mpidr).
> >
> > First user is arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> > hung-task/soft-lockup dumps), which otherwise rides an IPI that can't
> > reach a masked CPU. Falls back to the IPI path when SDEI is absent; no
> > watchdog backend yet, so the stock detector is untouched.
> >
> > Signed-off-by: Kiryl Shutsemau (Meta)
> > Reviewed-by: Douglas Anderson
> > ---
> >  MAINTAINERS                     |   2 +-
> >  arch/arm64/include/asm/nmi.h    |  24 +++++
> >  arch/arm64/kernel/smp.c         |  11 +++
> >  drivers/firmware/Kconfig        |  19 ++++
> >  drivers/firmware/Makefile       |   1 +
> >  drivers/firmware/arm_sdei_nmi.c | 149 ++++++++++++++++++++++++++++++++
> >  6 files changed, 205 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/nmi.h
> >  create mode 100644 drivers/firmware/arm_sdei_nmi.c
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index c8d4b913f26c..b5ddfb85dce9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -24797,7 +24797,7 @@ M:	James Morse
> >  L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
> >  S:	Maintained
> >  F:	Documentation/devicetree/bindings/arm/firmware/sdei.txt
> > -F:	drivers/firmware/arm_sdei.c
> > +F:	drivers/firmware/arm_sdei*
> >  F:	include/linux/arm_sdei.h
> >  F:	include/uapi/linux/arm_sdei.h
> >
> > diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
> > new file mode 100644
> > index 000000000000..9366be419d18
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/nmi.h
> > @@ -0,0 +1,24 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __ASM_NMI_H
> > +#define __ASM_NMI_H
> > +
> > +#include
> > +
> > +/*
> > + * Cross-CPU NMI provider hooks, consulted by the arm64 arch code before
> > + * its regular-IRQ / pseudo-NMI IPI paths. The SDEI provider in
> > + * drivers/firmware/arm_sdei_nmi.c implements them when active; a future
> > + * FEAT_NMI provider could slot in here too. The stubs let callers stay
> > + * unconditional when ARM_SDEI_NMI is off.
> > + */
> > +#ifdef CONFIG_ARM_SDEI_NMI
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu);
> > +#else
> > +static inline bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
> > +						      int exclude_cpu)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> > +#endif /* __ASM_NMI_H */
> > diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> > index 1aa324104afb..a670434a8cae 100644
> > --- a/arch/arm64/kernel/smp.c
> > +++ b/arch/arm64/kernel/smp.c
> > @@ -45,6 +45,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -927,6 +928,16 @@ static void arm64_backtrace_ipi(cpumask_t *mask)
> >
> >  void arch_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> >  {
> > +	/*
> > +	 * Prefer the SDEI cross-CPU NMI provider when active: firmware
> > +	 * dispatches the event out of EL3 and reaches CPUs that have
> > +	 * interrupts locally masked, without the per-IRQ-mask cost that
> > +	 * pseudo-NMI pays for the same reach. The plain IPI path below
> > +	 * can't reach such a CPU unless pseudo-NMI is enabled.
> > +	 */
> > +	if (sdei_nmi_trigger_cpumask_backtrace(mask, exclude_cpu))
> > +		return;
> > +
> >  	/*
> >  	 * NOTE: though nmi_trigger_cpumask_backtrace() has "nmi_" in the name,
> >  	 * nothing about it truly needs to be implemented using an NMI, it's
> > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> > index bbd2155d8483..6501087ff90d 100644
> > --- a/drivers/firmware/Kconfig
> > +++ b/drivers/firmware/Kconfig
> > @@ -36,6 +36,25 @@ config ARM_SDE_INTERFACE
> >  	  standard for registering callbacks from the platform firmware
> >  	  into the OS. This is typically used to implement RAS notifications.
> >
> > +config ARM_SDEI_NMI
> > +	bool "SDEI-based cross-CPU NMI service (arm64)"
> > +	depends on ARM64 && ARM_SDE_INTERFACE
> > +	help
> > +	  Provides SDEI-based cross-CPU NMI delivery for hooks that need
> > +	  to reach interrupt-masked CPUs on silicon that lacks FEAT_NMI:
> > +
> > +	    - arch_trigger_cpumask_backtrace() (sysrq-l, RCU stalls,
> > +	      hardlockup_all_cpu_backtrace, soft-lockup secondary dumps,
> > +	      hung-task auxiliary dumps)
> > +
> > +	  The driver registers a handler for the SDEI software-signalled
> > +	  event (event 0) and reaches a target CPU by signalling it with
> > +	  SDEI_EVENT_SIGNAL. Firmware delivers the event out of EL3
> > +	  regardless of the target's PSTATE.DAIF -- forced delivery into a
> > +	  CPU wedged with interrupts locally masked.
> > +
> > +	  If unsure, say N.
> > +
> >  config EDD
> >  	tristate "BIOS Enhanced Disk Drive calls determine boot disk"
> >  	depends on X86
> > diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> > index 4ddec2820c96..be46f1e1dc77 100644
> > --- a/drivers/firmware/Makefile
> > +++ b/drivers/firmware/Makefile
> > @@ -4,6 +4,7 @@
> >  #
> >  obj-$(CONFIG_ARM_SCPI_PROTOCOL)	+= arm_scpi.o
> >  obj-$(CONFIG_ARM_SDE_INTERFACE)	+= arm_sdei.o
> > +obj-$(CONFIG_ARM_SDEI_NMI)	+= arm_sdei_nmi.o
> >  obj-$(CONFIG_DMI)		+= dmi_scan.o
> >  obj-$(CONFIG_DMI_SYSFS)		+= dmi-sysfs.o
> >  obj-$(CONFIG_EDD)		+= edd.o
> > diff --git a/drivers/firmware/arm_sdei_nmi.c b/drivers/firmware/arm_sdei_nmi.c
> > new file mode 100644
> > index 000000000000..a82776e7b55a
> > --- /dev/null
> > +++ b/drivers/firmware/arm_sdei_nmi.c
> > @@ -0,0 +1,149 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arm64 SDEI-based cross-CPU NMI service.
> > + *
> > + * Delivering an "NMI-shaped" event to an EL1 context that has locally
> > + * masked interrupts, on silicon without FEAT_NMI, can be done two ways:
> > + *
> > + *  - pseudo-NMI: mask "interrupts" via the GIC priority register
> > + *    (ICC_PMR_EL1) instead of PSTATE.DAIF, leaving a high-priority band
> > + *    deliverable. Functionally this works -- but it reimplements every
> > + *    local_irq_disable()/enable() and exception entry/exit as a PMR
> > + *    write plus synchronisation, a cost paid on that hot path forever,
> > + *    whether or not an NMI is ever delivered.
> > + *
> > + *  - SDEI: leave interrupt masking as the cheap PSTATE.DAIF operation
> > + *    and have the firmware bounce an EL3-routed Group-0 SGI back to
> > + *    NS-EL1 as an event callback. The cost is a firmware round-trip,
> > + *    but only at the rare moment delivery is actually needed.
> > + *
> > + * This driver takes the second path: it keeps the IRQ-mask hot path
> > + * free and pays only when it fires, which is what makes cross-CPU NMI
> > + * affordable on hardware where the pseudo-NMI tax isn't, until FEAT_NMI
> > + * makes NMI masking cheap in the architecture itself.
> > + *
> > + * Capabilities provided:
> > + *
> > + *  - sdei_nmi_trigger_cpumask_backtrace() -- override for arm64's
> > + *    arch_trigger_cpumask_backtrace(), so sysrq-l, RCU stall dumps,
> > + *    hardlockup_all_cpu_backtrace, soft-lockup/hung-task secondary
> > + *    dumps all reach interrupt-masked CPUs.
> > + *
> > + * Delivery uses the standard SDEI software-signalled event (event 0) and
> > + * SDEI_EVENT_SIGNAL. We register a handler for event 0, enable it, and
> > + * poke a target CPU with sdei_event_signal(0, mpidr): firmware makes
> > + * event 0 pending on that PE and dispatches the handler NMI-like,
> > + * regardless of the target's DAIF.
> > + * Availability is simply whether event 0 registers and enables -- if SDEI
> > + * and its software-signalled event are present we use it, otherwise the
> > + * driver stays inert.
> > + */
> > +
> > +#define pr_fmt(fmt) "sdei_nmi: " fmt
> > +
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +#include
> > +
> > +#include
> > +#include
> > +
> > +static bool sdei_nmi_available;
> > +
> > +#define SDEI_NMI_EVENT	0
> > +
> > +static int sdei_nmi_handler(u32 event, struct pt_regs *regs, void *arg)
> > +{
> > +	/*
> > +	 * nmi_cpu_backtrace() no-ops unless this CPU's bit is set in the
> > +	 * global backtrace mask (driven by nmi_trigger_cpumask_backtrace()),
> > +	 * so a fire that reaches a CPU not being backtraced is harmless.
> > +	 */
> > +	nmi_cpu_backtrace(regs);
> > +	return SDEI_EV_HANDLED;
> > +}
> > +NOKPROBE_SYMBOL(sdei_nmi_handler);
> > +
> > +static void sdei_nmi_fire(unsigned int target_cpu)
> > +{
> > +	int err = sdei_event_signal(SDEI_NMI_EVENT, cpu_logical_map(target_cpu));
> > +
> > +	if (err)
> > +		pr_warn("SDEI_EVENT_SIGNAL to CPU %u failed: %d\n",
> > +			target_cpu, err);
> > +}
> > +
> > +/*
> > + * Raise callback for nmi_trigger_cpumask_backtrace(): signal event 0
> > + * at every CPU still pending in @mask. The framework excludes the local
> > + * CPU from @mask before calling us.
> > + */
> > +static void sdei_nmi_raise_backtrace(cpumask_t *mask)
> > +{
> > +	unsigned int cpu;
> > +
> > +	for_each_cpu(cpu, mask)
> > +		sdei_nmi_fire(cpu);
> > +}
> > +
> > +/*
> > + * Override hook for arch_trigger_cpumask_backtrace() (see
> > + * arch/arm64/kernel/smp.c). Returns true when SDEI handled the request,
> > + * which is the case whenever SDEI is active; on a false return the arch
> > + * falls back to its regular-IRQ (or pseudo-NMI, if enabled) IPI.
> > + *
> > + * On a kernel built without paying the pseudo-NMI hot-path cost (the
> > + * usual case for this driver's target), the IPI can't reach a CPU that
> > + * has interrupts masked -- so the backtrace of the one CPU you care
> > + * about comes back empty. SDEI is dispatched out of EL3 and lands
> > + * regardless of the target's DAIF, without taxing the IRQ-mask path.
> > + */
> > +bool sdei_nmi_trigger_cpumask_backtrace(const cpumask_t *mask, int exclude_cpu)
> > +{
> > +	if (!sdei_nmi_available)
> > +		return false;
> > +
> > +	nmi_trigger_cpumask_backtrace(mask, exclude_cpu,
> > +				      sdei_nmi_raise_backtrace);
> > +	return true;
> > +}
> > +
> > +/*
> > + * device_initcall (after arch_initcall(sdei_init), so the SDEI subsystem
> > + * is up): probe the firmware, register the event, and turn on the
> > + * cross-CPU service. If the probe fails the driver stays inert and the
> > + * override hooks decline, leaving the arch's own paths in place.
> > + */
> > +static int __init sdei_nmi_init(void)
> > +{
> > +	int err;
> > +
> > +	err = sdei_event_register(SDEI_NMI_EVENT, sdei_nmi_handler, NULL);
> > +	if (err) {
> > +		pr_err("sdei_event_register(%u) failed: %d\n",
> > +		       SDEI_NMI_EVENT, err);
> > +		return 0;
> > +	}
>
> This initcall runs unconditionally whenever ARM_SDEI_NMI is built in,
> which includes the many arm64 systems that have no SDEI at all. On
> those, sdei_event_register() -> sdei_event_create() ->
> invoke_sdei_fn() returns -EIO, and the core already complains:
>     pr_warn("Failed to create event %u: %d\n", event_num, err);

Fair enough. I will add sdei_is_present() and gate sdei_nmi_init() on it.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
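
For illustration only, the gating proposed above could look roughly like
the sketch below. It is a userspace model, not kernel code and not the
eventual patch: sdei_is_present() does not exist yet, so its body here is
an assumption, and fake_sdei_present, register_calls and
mock_sdei_event_register() are hypothetical stand-ins for firmware state
and for sdei_event_register(). The only point it shows is that the init
path returns before touching event 0 when SDEI is absent, so neither the
core's warning nor the driver's pr_err() fires on non-SDEI systems.

```c
/*
 * Userspace model of the proposed gating; not kernel code.
 * sdei_is_present() is the helper proposed in the reply and does not
 * exist yet. Its body, fake_sdei_present, register_calls and
 * mock_sdei_event_register() are assumptions made for illustration.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define SDEI_NMI_EVENT	0

static bool fake_sdei_present;	/* stands in for the firmware probe result */
static int register_calls;	/* counts attempts to register event 0 */
static bool sdei_nmi_available;

/* Hypothetical helper: true when the SDEI core found usable firmware. */
static bool sdei_is_present(void)
{
	return fake_sdei_present;
}

/* Mock of sdei_event_register(): warns and fails with -EIO without firmware. */
static int mock_sdei_event_register(unsigned int event)
{
	register_calls++;
	if (!fake_sdei_present) {
		fprintf(stderr, "Failed to create event %u: %d\n", event, -EIO);
		return -EIO;
	}
	return 0;
}

/* Gated init: return quietly before registering anything on non-SDEI systems. */
static int sdei_nmi_init(void)
{
	int err;

	if (!sdei_is_present())
		return 0;

	err = mock_sdei_event_register(SDEI_NMI_EVENT);
	if (err)
		return 0;

	/* Simplification: the real driver does more before going active. */
	sdei_nmi_available = true;
	return 0;
}
```

With fake_sdei_present left false, sdei_nmi_init() returns 0 without ever
calling the register mock; setting it to true lets registration proceed
and flips sdei_nmi_available.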