From: Kiryl Shutsemau <kirill@shutemov.name>
To: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
Doug Anderson <dianders@chromium.org>,
Petr Mladek <pmladek@suse.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Baoquan He <bhe@redhat.com>,
Puranjay Mohan <puranjay@kernel.org>,
Usama Arif <usama.arif@linux.dev>,
Breno Leitao <leitao@debian.org>,
Julien Thierry <julien.thierry.kdev@gmail.com>,
Lecopzer Chen <lecopzer@gmail.com>,
Sumit Garg <sumit.garg@kernel.org>,
kernel-team@meta.com, kexec@lists.infradead.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/4] arm64: cross-CPU NMI via SDEI
Date: Mon, 22 Jun 2026 14:56:16 +0100
Message-ID: <ajk-Vge2qhaY-TwJ@thinkstation>
In-Reply-To: <868q8asj1u.wl-maz@kernel.org>
On Fri, Jun 19, 2026 at 03:26:21PM +0100, Marc Zyngier wrote:
> > Does your firmware set ICC_CTLR_EL1.PMHE? I'd be curious to see the
> > numbers if the DSB was omitted on the enable path.
>
> I certainly don't observe this sort of overhead on the HW I have
> access to, and would like to understand where this is coming from with
> actual profiling data.
Full disclosure: the ~66% figures come from internal testing about a year ago.
I no longer have the details of the machine it ran on and can't confirm whether
ICC_CTLR_EL1.PMHE was set there -- it may well have been. I shouldn't have
carried those numbers forward without being able to stand behind them, so
please disregard them.
Here are fresh numbers from NVIDIA Grace (Neoverse V2). Importantly, this
box reports:

  GICv3: Pseudo-NMIs enabled using relaxed ICC_PMR_EL1 synchronisation

i.e. PMHE == 0, so the synchronising DSB on the unmask path is already
patched to a NOP (ARM64_HAS_GIC_PRIO_RELAXED_SYNC). What's left is the
floor cost of PMR-based masking itself plus the PMR save/restore on
exception entry/exit -- not the DSB. So this is the case Catalin asked
about (DSB omitted), and there is still a measurable cost.
A trivial single-threaded gettid() loop (1e6 calls, median of 5,
performance governor, ASLR off):

  pseudo_nmi=0 (DAIF):  178.4 ns/call
  pseudo_nmi=1 (PMR):   252.5 ns/call
  delta:                +74.1 ns/call (~230-250 cycles)
                        +41.5% wall time / 0.706 throughput
--- u-bench.c ---
#include <unistd.h>
#include <sys/syscall.h>
#include <time.h>
#include <stdio.h>

int main(void)
{
	struct timespec a, b;

	clock_gettime(CLOCK_MONOTONIC, &a);
	for (long i = 0; i < 1000000; i++)
		syscall(SYS_gettid);
	clock_gettime(CLOCK_MONOTONIC, &b);

	/* Total elapsed ns for all 1e6 calls; divide by 1e6 for ns/call. */
	printf("%f ns\n", (b.tv_sec-a.tv_sec)*1e9 + (b.tv_nsec-a.tv_nsec));
	return 0;
}
will-it-scale agrees independently. sched_yield (ops/s, median of 5):

                  1 task       72 tasks
  pseudo_nmi=0    3,195,656    230,824,534
  pseudo_nmi=1    2,253,753    163,914,837
  ratio           0.705        0.710
The ratio is flat across the whole 1-to-72 sweep, so -- relevant to the
scalability question -- it's a constant per-syscall tax, not a contention
effect. The impact tracks syscall/exception density: page_fault1, a more
realistic workload, stays within ~5%.
> The direction of travel is to deprecate SDEI. I wouldn't add more stuff
> on top of this interface.
I understand FEAT_NMI is the long-term answer, and I'm not arguing against
deprecating SDEI. My concern is the gap in between. By our estimate it's
10+ years before the last non-FEAT_NMI machine retires from the fleet --
for scale, we're still running Skylake today. So there's roughly a
decade where a large installed base has neither FEAT_NMI nor affordable
pseudo-NMI, and no way to reach a DAIF-masked CPU for an all-CPU
backtrace or to capture a wedged CPU in a crash dump. That's the
functional gap this series tries to cover.
Given the deprecation direction, I deliberately kept the SDEI footprint as
small as I could. The series adds no new firmware interface and no vendor
SMC -- it uses only the standard software-signalled event (event 0) via
SDEI_EVENT_SIGNAL, which is already present on these systems for
firmware-first RAS (APEI/GHES). And SDEI is only ever invoked in a "bad
state": to deliver a backtrace signal to a CPU that a normal IPI can't
reach, or to stop a CPU that ignored the stop IPIs. Nothing on any hot or
steady-state path touches it.
If even that minimal use is unacceptable on a deprecated interface, I'd
rather know now and redirect the effort -- but I'd appreciate a pointer to
what should cover this gap for existing silicon in the meantime.
--
Kiryl Shutsemau / Kirill A. Shutemov