From: "Paul E. McKenney" <paulmck@kernel.org>
To: rcu@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com,
	rostedt@goodmis.org, "Paul E. McKenney" <paulmck@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-arm-kernel@lists.infradead.org, bpf@vger.kernel.org
Subject: [PATCH v2 15/16] srcu: Optimize SRCU-fast-updown for arm64
Date: Wed, 5 Nov 2025 12:32:15 -0800
Message-ID: <20251105203216.2701005-15-paulmck@kernel.org>
In-Reply-To: <bb177afd-eea8-4a2a-9600-e36ada26a500@paulmck-laptop>

Some arm64 platforms have slow per-CPU atomic operations, for example,
the Neoverse V2.  This commit therefore moves SRCU-fast-updown from per-CPU
atomic operations to interrupt-disabled non-read-modify-write-atomic
atomic_long_read()/atomic_long_set() operations.  This works because
SRCU-fast-updown is not invoked from NMI handlers, which means that
srcu_read_lock_fast_updown() and srcu_read_unlock_fast_updown() can
exclude themselves and each other by disabling interrupts.

This reduces the overhead of calls to srcu_read_lock_fast_updown() and
srcu_read_unlock_fast_updown() from about 100ns to about 12ns on an ARM
Neoverse V2.  Although this is not excellent compared to about 2ns on x86,
it sure beats 100ns.

This command was used to measure the overhead:

tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --configs NOPREEMPT --kconfig "CONFIG_NR_CPUS=64 CONFIG_TASKS_TRACE_RCU=y" --bootargs "refscale.loops=100000 refscale.guest_os_delay=5 refscale.nreaders=64 refscale.holdoff=30 torture.disable_onoff_at_boot refscale.scale_type=srcu-fast-updown refscale.verbose_batched=8 torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=8 refscale.nruns=100" --trust-make

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <bpf@vger.kernel.org>
---
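
Not part of the patch: a minimal userspace sketch of why a non-RMW
load-then-store increment is safe only when the handler that might
interleave with it is excluded.  It assumes POSIX signals as a stand-in
for interrupts and sigprocmask() as a stand-in for
local_irq_save()/local_irq_restore().  The names demo_counter,
demo_inc_na(), demo_handler() and demo_run() are hypothetical and do not
appear in the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one CPU's srcu_locks counter. */
static atomic_long demo_counter;

/* Non-RMW increment: a separate load and store, not one atomic add. */
static void demo_inc_na(void)
{
	long v = atomic_load_explicit(&demo_counter, memory_order_relaxed);

	atomic_store_explicit(&demo_counter, v + 1, memory_order_relaxed);
}

/* Plays the role of an interrupt handler that also bumps the counter. */
static void demo_handler(int sig)
{
	(void)sig;
	demo_inc_na();
}

/*
 * Perform "iterations" non-RMW increments, raising SIGUSR1 between the
 * load and the store of each one.  If block_signal is nonzero, SIGUSR1
 * is blocked around the increment, so the handler runs only after the
 * store has completed.  Returns the final counter value.
 */
long demo_run(int iterations, int block_signal)
{
	struct sigaction sa = { 0 };
	sigset_t mask, old;

	sa.sa_handler = demo_handler;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGUSR1, &sa, NULL);
	sigemptyset(&old);
	sigemptyset(&mask);
	sigaddset(&mask, SIGUSR1);
	atomic_store(&demo_counter, 0);

	for (int i = 0; i < iterations; i++) {
		long v;

		if (block_signal)
			sigprocmask(SIG_BLOCK, &mask, &old);
		v = atomic_load_explicit(&demo_counter, memory_order_relaxed);
		raise(SIGUSR1);	/* Runs the handler now unless blocked. */
		atomic_store_explicit(&demo_counter, v + 1, memory_order_relaxed);
		if (block_signal)
			sigprocmask(SIG_SETMASK, &old, NULL); /* Pending handler runs here. */
	}
	return atomic_load(&demo_counter);
}
```

In a single-threaded program, demo_run(n, 1) counts both the loop's and
the handler's increments, whereas demo_run(n, 0) lets each store
overwrite the handler's update.  The sketch has no analog of an NMI,
which is the case that the interrupt-disabled approach cannot exclude.
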
include/linux/srcutree.h | 51 +++++++++++++++++++++++++++++++++++++---
1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index d6f978b50472..0e06f87e1d7c 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -253,6 +253,34 @@ static inline struct srcu_ctr __percpu *__srcu_ctr_to_ptr(struct srcu_struct *ss
 	return &ssp->sda->srcu_ctrs[idx];
 }
 
+/*
+ * Non-atomic manipulation of SRCU lock counters.
+ */
+static inline struct srcu_ctr __percpu notrace *__srcu_read_lock_fast_na(struct srcu_struct *ssp)
+{
+	atomic_long_t *scnp;
+	struct srcu_ctr __percpu *scp;
+
+	lockdep_assert_preemption_disabled();
+	scp = READ_ONCE(ssp->srcu_ctrp);
+	scnp = raw_cpu_ptr(&scp->srcu_locks);
+	atomic_long_set(scnp, atomic_long_read(scnp) + 1);
+	return scp;
+}
+
+/*
+ * Non-atomic manipulation of SRCU unlock counters.
+ */
+static inline void notrace
+__srcu_read_unlock_fast_na(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
+{
+	atomic_long_t *scnp;
+
+	lockdep_assert_preemption_disabled();
+	scnp = raw_cpu_ptr(&scp->srcu_unlocks);
+	atomic_long_set(scnp, atomic_long_read(scnp) + 1);
+}
+
 /*
  * Counts the new reader in the appropriate per-CPU element of the
  * srcu_struct. Returns a pointer that must be passed to the matching
@@ -327,8 +355,18 @@ __srcu_read_unlock_fast(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
 static inline
 struct srcu_ctr __percpu notrace *__srcu_read_lock_fast_updown(struct srcu_struct *ssp)
 {
-	struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
+	struct srcu_ctr __percpu *scp;
 
+	if (IS_ENABLED(CONFIG_ARM64) && IS_ENABLED(CONFIG_ARM64_USE_LSE_PERCPU_ATOMICS)) {
+		unsigned long flags;
+
+		local_irq_save(flags);
+		scp = __srcu_read_lock_fast_na(ssp);
+		local_irq_restore(flags); /* Avoids leaking the critical section. */
+		return scp;
+	}
+
+	scp = READ_ONCE(ssp->srcu_ctrp);
 	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
 		this_cpu_inc(scp->srcu_locks.counter); // Y, and implicit RCU reader.
 	else
@@ -350,10 +388,17 @@ static inline void notrace
 __srcu_read_unlock_fast_updown(struct srcu_struct *ssp, struct srcu_ctr __percpu *scp)
 {
 	barrier(); /* Avoid leaking the critical section. */
-	if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
+	if (IS_ENABLED(CONFIG_ARM64)) {
+		unsigned long flags;
+
+		local_irq_save(flags);
+		__srcu_read_unlock_fast_na(ssp, scp);
+		local_irq_restore(flags);
+	} else if (!IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE)) {
 		this_cpu_inc(scp->srcu_unlocks.counter); // Z, and implicit RCU reader.
-	else
+	} else {
 		atomic_long_inc(raw_cpu_ptr(&scp->srcu_unlocks)); // Z, and implicit RCU reader.
+	}
 }
 
 void __srcu_check_read_flavor(struct srcu_struct *ssp, int read_flavor);
--
2.40.1