From: Kiryl Shutsemau <kirill@shutemov.name>
To: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>,
Marc Zyngier <maz@kernel.org>,
Doug Anderson <dianders@chromium.org>,
Petr Mladek <pmladek@suse.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>, Puranjay Mohan <puranjay@kernel.org>,
Usama Arif <usama.arif@linux.dev>,
Breno Leitao <leitao@debian.org>,
Julien Thierry <julien.thierry.kdev@gmail.com>,
Lecopzer Chen <lecopzer.chen@mediatek.com>,
Sumit Garg <sumit.garg@kernel.org>,
kernel-team@meta.com, kexec@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org,
"Kiryl Shutsemau (Meta)" <kas@kernel.org>
Subject: [PATCH 0/4] arm64: cross-CPU NMI via SDEI
Date: Wed, 3 Jun 2026 15:36:31 +0100
Message-ID: <cover.1780496779.git.kas@kernel.org>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
A class of debug/observability features needs to interrupt a CPU that has
its interrupts locally masked: hard-lockup detection, the all-CPU
backtrace behind sysrq-l / RCU-stall / hung-task dumps, and
crash_smp_send_stop() capturing a stuck CPU's state into the vmcore. On
arm64 these need a mechanism that reaches a CPU spinning with DAIF masked,
which a normal IPI cannot.
arm64 has two such mechanisms today:
- GICv3 pseudo-NMI (interrupt priority masking). This is the preferred
path and what the perf-based hard-lockup detector
(HAVE_HARDLOCKUP_DETECTOR_PERF) is built on. Its cost, however, is on
the interrupt mask/unmask hot path: local_irq_enable() becomes an
ICC_PMR_EL1 write plus a synchronising barrier, and exception
entry/exit save and restore the PMR, paid on every CPU whether or not
an NMI is ever delivered.
In our measurements, enabling pseudo-NMI costs up to ~5% on real
workloads, and ~66% on a syscall-in-a-loop microbenchmark that
maximises exception entry/exit (where pseudo-NMI adds the PMR
save/restore). A fleet-wide ~5% regression is not acceptable, so these
systems run with pseudo-NMI disabled — and therefore have no
hard-lockup detector and degraded backtrace/crash-stop today.
- FEAT_NMI (Armv8.8) — the architectural fix, but absent from deployed
silicon and from most of the fleet for years to come.
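The syscall-in-a-loop microbenchmark behind the ~66% figure is not part of this posting. A minimal sketch of that kind of measurement, assuming a raw getppid() syscall as the cheapest way to exercise exception entry/exit, might look like the following. The bench_* names and the choice of syscall are assumptions made for illustration, not the author's harness.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic timestamp in nanoseconds. */
uint64_t bench_now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Average cost in nanoseconds of one trivial syscall over 'iters' calls.
 * Each call is one exception entry plus one exception return, which is
 * the path where pseudo-NMI adds the PMR save/restore.
 */
double bench_syscall_ns(unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	assert(iters > 0);
	start = bench_now_ns();
	for (i = 0; i < iters; i++)
		syscall(SYS_getppid);	/* raw syscall, no libc-side caching */
	end = bench_now_ns();

	return (double)(end - start) / (double)iters;
}
```

Running bench_syscall_ns() with a large iteration count on the same machine, booted once with irqchip.gicv3_pseudo_nmi=0 and once with irqchip.gicv3_pseudo_nmi=1, gives a per-syscall figure to compare.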
For deployments that do not run pseudo-NMI (to avoid that standing
hot-path cost), the hard-lockup detector and the backtrace/crash paths
are degraded: a plain IPI can't reach the masked CPU, so the lockup goes
undetected, the backtrace of the CPU you care about comes back empty, and
the kdump is missing the culprit's registers.
This series adds a third delivery backend that costs nothing on the hot
path: SDEI. Firmware delivers an SDEI event into a CPU regardless of its
DAIF state, so interrupt masking stays the cheap PSTATE.DAIF operation and
the firmware round-trip is paid only at the rare moment a CPU must be
interrupted.
Mechanism
=========
The series uses the standard SDEI software-signalled event (event 0) and the
SDEI_EVENT_SIGNAL call (DEN0054) — a spec-defined cross-PE signal, not a
vendor extension. The driver registers a handler for event 0 and pokes a
target CPU with sdei_event_signal(0, target_mpidr); firmware makes event 0
pending on that PE and dispatches the handler NMI-like.
No firmware change is required beyond SDEI being enabled, which
firmware-first RAS (APEI/GHES) deployments already have; the only
SDEI-core addition is a thin sdei_event_signal() wrapper over the standard
call.
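To make the delivery path concrete, here is a minimal userspace model in C. It is not the driver code from this series. It assumes the function ID 0xC400002F for SDEI_EVENT_SIGNAL (derived from the SDEI function numbering and worth checking against DEN0054), uses a plain PE index where the real call takes an MPIDR, and every model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_PES		4
#define MODEL_EVENT_SIGNAL_FN	0xC400002FULL	/* assumed SDEI_EVENT_SIGNAL ID */
#define MODEL_SUCCESS		0
#define MODEL_INVALID_PARAMS	(-2)

struct model_pe {
	bool irqs_masked;	/* stands in for PSTATE.DAIF being set */
	bool event0_pending;	/* set by "firmware" on SDEI_EVENT_SIGNAL */
	unsigned int ipis_seen;
	unsigned int events_seen;
};

struct model_pe model_pes[MODEL_NR_PES];

/* A normal IPI is only taken when the target has interrupts unmasked. */
bool model_send_ipi(unsigned int pe)
{
	assert(pe < MODEL_NR_PES);
	if (model_pes[pe].irqs_masked)
		return false;
	model_pes[pe].ipis_seen++;
	return true;
}

/* Firmware side: validate the call and mark event 0 pending on the target. */
int64_t model_firmware_call(uint64_t fn, uint32_t event, uint64_t target)
{
	if (fn != MODEL_EVENT_SIGNAL_FN || event != 0 || target >= MODEL_NR_PES)
		return MODEL_INVALID_PARAMS;
	model_pes[target].event0_pending = true;
	return MODEL_SUCCESS;
}

/* Kernel-side thin wrapper, playing the role of sdei_event_signal(). */
int model_sdei_event_signal(uint32_t event, uint64_t target)
{
	return (int)model_firmware_call(MODEL_EVENT_SIGNAL_FN, event, target);
}

/*
 * Firmware dispatches pending events without consulting the mask state.
 * Returns the number of PEs whose event-0 handler ran.
 */
unsigned int model_dispatch(void)
{
	unsigned int pe, delivered = 0;

	for (pe = 0; pe < MODEL_NR_PES; pe++) {
		if (!model_pes[pe].event0_pending)
			continue;
		model_pes[pe].event0_pending = false;
		model_pes[pe].events_seen++;
		delivered++;
	}
	return delivered;
}
```

In this model a PE with irqs_masked set never sees the IPI, yet still runs the event-0 handler after model_sdei_event_signal(0, pe) and model_dispatch(). That difference is the property the series relies on.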
Clean kdump when a CPU panics from inside the SDEI handler (the
hard-lockup case) is handled by the already-merged sdei_handler_abort(),
which crash_smp_send_stop() calls: it issues SDEI_EVENT_COMPLETE_AND_RESUME
so the firmware-side priority is dropped before the capture kernel boots.
Prior SDEI watchdog work
========================
Out-of-tree SDEI hard-lockup watchdogs exist (e.g. in the openEuler and
Anolis kernels). They use a different mechanism: they bind the secure
physical timer as an SDEI event, so firmware delivers a periodic self-CPU
tick that drives the detector. That requires a new SDEI interrupt-binding
API, pushes the watchdog period (watchdog_thresh) into firmware, and adds
secure-timer EOI handling on the kexec path.
This series instead uses only the standard software-signalled event 0:
the kernel keeps the timing (a per-CPU hrtimer with a buddy heartbeat
check) and firmware does nothing but deliver the cross-CPU poke when a
buddy looks stalled. The result is a smaller, far less firmware-coupled
change — no secure-timer dependency, no new SDEI API, no period in
firmware — and the same delivery primitive serves the backtrace and
crash-stop users, not just the watchdog.
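The buddy heartbeat check can be illustrated with a small userspace model in C. This is a sketch under assumptions, not the code in patch 3/4: it assumes each CPU watches the next CPU in a ring and treats a heartbeat that has not advanced between two consecutive ticks as a stall. The real detector's buddy selection and threshold handling may differ, and all model_wd_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_wd_cpu {
	unsigned long heartbeat;	/* bumped by this CPU's own timer tick */
	unsigned long buddy_last_seen;	/* buddy's heartbeat at the previous check */
};

struct model_wd_cpu model_wd[MODEL_NR_CPUS];

/* Start heartbeats at 1 so the very first check never reports a stall. */
void model_wd_init(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		model_wd[cpu].heartbeat = 1;
		model_wd[cpu].buddy_last_seen = 0;
	}
}

/* Assumed policy: each CPU watches the next one, wrapping around. */
unsigned int model_wd_buddy_of(unsigned int cpu)
{
	assert(cpu < MODEL_NR_CPUS);
	return (cpu + 1) % MODEL_NR_CPUS;
}

/*
 * One timer tick on 'cpu': bump our own heartbeat and check the buddy's.
 * Returns true when the buddy's heartbeat has not moved since the last
 * check, which is the point where the kernel would ask firmware to
 * deliver the cross-CPU poke.
 */
bool model_wd_tick(unsigned int cpu)
{
	unsigned int buddy = model_wd_buddy_of(cpu);
	unsigned long seen = model_wd[buddy].heartbeat;
	bool stalled = (seen == model_wd[cpu].buddy_last_seen);

	model_wd[cpu].heartbeat++;
	model_wd[cpu].buddy_last_seen = seen;
	return stalled;
}
```

The timing decision stays entirely on the kernel side here. Firmware only becomes involved once model_wd_tick() returns true, which matches the division of labour described above.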
Testing
=======
Developed on QEMU (Trusted Firmware-A with SDEI enabled) and
validated on NVIDIA Grace (Neoverse V2) hardware, under
irqchip.gicv3_pseudo_nmi=0:
- hard lockup (LKDTM) caught by the SDEI watchdog, which panicked with
the stack pointing at the wedged code;
- sysrq-l backtrace of an interrupt-masked CPU returning its real stack;
- kdump via crash_smp_send_stop() with a wedged CPU, and via a watchdog
panic from inside the event-0 handler — sdei_handler_abort() fires and
the capture kernel boots to userspace on the formerly-wedged CPU, with
its registers present in the vmcore.
Series
======
[1/4] firmware: arm_sdei: add SDEI_EVENT_SIGNAL support
Thin sdei_event_signal() wrapper over the standard call; NMI/crash
safe (no locks).
[2/4] drivers/firmware: add SDEI cross-CPU NMI service for arm64
Register event 0; first user, arch_trigger_cpumask_backtrace().
[3/4] arm64: wire SDEI NMI into the hardlockup watchdog
HAVE_HARDLOCKUP_DETECTOR_ARCH backend; boot-time source selection
with perf-NMI fallback.
[4/4] arm64: route crash_smp_send_stop() last resort through SDEI
SDEI as the final escalation rung for CPUs that ignored the normal
and pseudo-NMI stop IPIs.
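The escalation order described for [4/4] can be sketched as a small decision function in C. This is an illustrative model only: the real crash_smp_send_stop() path waits on each rung before escalating, and the structure, enum and function names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_stop_rung {
	MODEL_STOP_BY_IPI,		/* normal stop IPI was taken */
	MODEL_STOP_BY_PSEUDO_NMI,	/* pseudo-NMI stop IPI was taken */
	MODEL_STOP_BY_SDEI,		/* SDEI event 0 was delivered */
	MODEL_STOP_FAILED,		/* nothing could reach the CPU */
};

struct model_stop_target {
	bool irqs_masked;	/* target is spinning with DAIF masked */
	bool pseudo_nmi_on;	/* booted with irqchip.gicv3_pseudo_nmi=1 */
	bool sdei_available;	/* firmware provides SDEI and event 0 is registered */
};

/* Try the cheapest mechanism first; SDEI is the last resort. */
enum model_stop_rung model_escalate_stop(const struct model_stop_target *t)
{
	assert(t != NULL);

	if (!t->irqs_masked)
		return MODEL_STOP_BY_IPI;
	if (t->pseudo_nmi_on)
		return MODEL_STOP_BY_PSEUDO_NMI;
	if (t->sdei_available)
		return MODEL_STOP_BY_SDEI;
	return MODEL_STOP_FAILED;
}
```

On the configuration this series targets (pseudo-NMI disabled, SDEI enabled), a wedged CPU falls through the first two rungs and is stopped by the SDEI rung.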
Also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git sdei-nmi
 arch/arm64/Kconfig            |   1 +
 arch/arm64/include/asm/nmi.h  |  30 ++
 arch/arm64/kernel/smp.c       |  33 +++
 drivers/firmware/Kconfig      |  23 ++
 drivers/firmware/Makefile     |   1 +
 drivers/firmware/arm_sdei.c   |  12 +
 drivers/firmware/sdei_nmi.c   | 523 ++++++++++++++++++++++++++++++++++
 include/linux/arm_sdei.h      |   6 +
 include/uapi/linux/arm_sdei.h |   1 +
 9 files changed, 630 insertions(+)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 drivers/firmware/sdei_nmi.c
base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d
--
2.54.0