From: Puranjay Mohan <puranjay@kernel.org>
To: leitao@debian.org
Cc: catalin.marinas@arm.com, kernel-team@meta.com,
linux-arm-kernel@lists.infradead.org, mark.rutland@arm.com,
paulmck@kernel.org, rmikey@meta.com, will@kernel.org,
Puranjay Mohan <puranjay@kernel.org>
Subject: Re: Overhead of arm64 LSE per-CPU atomics?
Date: Tue, 4 Nov 2025 20:57:53 +0000
Message-ID: <20251104205753.42224-1-puranjay@kernel.org>
In-Reply-To: <ahkk2r22peni4s7j6c7tnv3uajvwiaeg3vwyusppblcokpvgjw@zuuzipntgu7x>
Hi Breno,
I ran your benchmark on two AWS Graviton platforms:
On EC2 c8g.metal-24xl (96 CPUs, Neoverse-V2, AWS Graviton 4):
With ldadd, the results were stable and LSE was always faster than LL/SC.
With stadd, however, I saw spikes in p95 and p99 on some CPUs (CPU 30 below, but not CPU 28):
CPU: 28 - Latency Percentiles:
====================
LL/SC: p50: 6.61 ns p95: 6.61 ns p99: 6.62 ns
LSE : p50: 4.64 ns p95: 4.65 ns p99: 4.65 ns
CPU: 30 - Latency Percentiles:
====================
LL/SC: p50: 6.61 ns p95: 6.61 ns p99: 6.62 ns
LSE : p50: 4.64 ns p95: 14.24 ns ***p99: 27.74 ns***
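For readers comparing the p50 and p99 columns, here is a minimal sketch of how nearest-rank percentiles behave over per-iteration latencies. This is not the benchmark's code: the `percentile` helper and the sample values are made up for illustration. It shows how a handful of slow iterations out of 100 can leave p50 untouched while moving p95 and p99.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct% of all
    samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Multiply before dividing so the rank is computed without
    # floating-point rounding surprises for integer pct.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical data: 100 iterations, 94 fast and 6 slow (values in ns).
samples = [4.64] * 94 + [14.0] * 4 + [27.0] * 2

p50 = percentile(samples, 50)
p95 = percentile(samples, 95)
p99 = percentile(samples, 99)
print(f"p50: {p50:.2f} ns p95: {p95:.2f} ns p99: {p99:.2f} ns")
```

With only six slow iterations in a hundred, the median stays at the fast value while both tail percentiles land on slow samples, which is the same shape as the CPU 30 line above.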
On EC2 m6g.metal (64 CPUs, Neoverse-N1, AWS Graviton 2):
Here both stadd and ldadd were stable, and LSE was always faster than LL/SC.
With ldadd:
ARM64 Per-CPU Atomic Add Benchmark
===================================
Running percentile measurements (100 iterations)...
Detected 64 CPUs
CPU: 0 - Latency Percentiles:
====================
LL/SC: p50: 8.40 ns p95: 8.40 ns p99: 8.42 ns
LSE : p50: 5.60 ns p95: 5.60 ns p99: 5.61 ns
CPU: 1 - Latency Percentiles:
====================
LL/SC: p50: 8.40 ns p95: 8.40 ns p99: 8.41 ns
LSE : p50: 5.60 ns p95: 5.60 ns p99: 5.61 ns
[....]
CPU: 62 - Latency Percentiles:
====================
LL/SC: p50: 8.40 ns p95: 8.40 ns p99: 8.40 ns
LSE : p50: 5.60 ns p95: 5.60 ns p99: 5.60 ns
CPU: 63 - Latency Percentiles:
====================
LL/SC: p50: 8.40 ns p95: 8.40 ns p99: 8.41 ns
LSE : p50: 5.60 ns p95: 5.60 ns p99: 5.60 ns
=== Benchmark Complete ===
With stadd:
ARM64 Per-CPU Atomic Add Benchmark
===================================
Running percentile measurements (100 iterations)...
Detected 64 CPUs
CPU: 0 - Latency Percentiles:
====================
LL/SC: p50: 8.00 ns p95: 8.01 ns p99: 8.02 ns
LSE : p50: 5.20 ns p95: 5.21 ns p99: 5.21 ns
CPU: 1 - Latency Percentiles:
====================
LL/SC: p50: 8.00 ns p95: 8.01 ns p99: 8.01 ns
LSE : p50: 5.20 ns p95: 5.21 ns p99: 5.22 ns
[.....]
CPU: 62 - Latency Percentiles:
====================
LL/SC: p50: 8.00 ns p95: 8.01 ns p99: 8.14 ns
LSE : p50: 5.20 ns p95: 5.21 ns p99: 5.21 ns
CPU: 63 - Latency Percentiles:
====================
LL/SC: p50: 8.00 ns p95: 8.01 ns p99: 8.01 ns
LSE : p50: 5.20 ns p95: 5.20 ns p99: 5.20 ns
=== Benchmark Complete ===
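As a quick summary of the medians, the sketch below computes the LL/SC to LSE p50 ratio for each run shown in this message. The p50 figures are copied from the output above; the dictionary layout and the labels are only for illustration.

```python
# p50 latencies in ns, copied from the benchmark output in this message.
# Each entry is (LL/SC p50, LSE p50).
p50_ns = {
    "Graviton 4, stadd": (6.61, 4.64),
    "Graviton 2, ldadd": (8.40, 5.60),
    "Graviton 2, stadd": (8.00, 5.20),
}

# Ratio above 1.0 means LSE has the lower median latency.
ratios = {name: llsc / lse for name, (llsc, lse) in p50_ns.items()}

for name, ratio in ratios.items():
    print(f"{name}: LL/SC p50 is {ratio:.2f}x the LSE p50")
```

On these medians LL/SC takes roughly 1.4x to 1.5x as long as LSE. The ratio says nothing about the tail, which is where the Graviton 4 stadd run differs.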