* [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
@ 2025-05-02 8:52 Ankur Arora
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
` (8 more replies)
0 siblings, 9 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Hi,
This series adds waited variants of the smp_cond_load() primitives:
smp_cond_load_relaxed_timewait(), and smp_cond_load_acquire_timewait().
There are two known users for these interfaces:
- poll_idle() [1]
- resilient queued spinlocks [2]
For both of these cases we want to wait on a condition but also want
to terminate the wait based on a timeout.
Before describing how v2 implements these interfaces, let me recap the
problems in v1 (Catalin outlined most of these in [3]):
smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, time_limit_ns)
took four arguments, with ptr and cond_expr doing the usual smp_cond_load()
things and time_expr_ns and time_limit_ns being used to decide the
terminating condition.
There were some problems in the timekeeping:
1. How often do we do the (relatively expensive) time-check?
The choice made was once every 200 spin-wait iterations, with each
iteration trying to idle the pipeline by executing cpu_relax().
The choice of 200 was, of course, arbitrary and somewhat meaningless
across architectures. On recent x86, cpu_relax()/PAUSE takes ~20-30
cycles, but on (non-SMT) arm64 cpu_relax()/YIELD is effectively
just a NOP.
Even if each architecture had its own limit, this will also vary
across microarchitectures.
2. On arm64, which can do better than just cpu_relax(), for instance,
by waiting for a store on an address (WFE), the implementation
exclusively used WFE, with the spin-wait only used as a fallback
for when the event-stream was disabled.
One problem with this was that the worst case timeout overshoot
with WFE is ARCH_TIMER_EVT_STREAM_PERIOD_US (100us) and so there's
a vast gulf between that and a potentially much smaller granularity
with the spin-wait versions. In addition, the interface provided
no way for the caller to specify or limit the overshoot.
Non-timekeeping issues:
3. The interface was useful for poll_idle()-like users but was not
usable if the caller needed to do any work. For instance,
rqspinlock uses it thus:
smp_cond_load_acquire_timewait(v, c, 0, 1)
Here the time-check always evaluates to false and all of the logic
(ex. deadlock checking) is folded into the conditional.
With that foundation, the new interface is:
smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy,
time_expr, time_end)
The added parameter, wait_policy, provides a mechanism for the caller
to apportion time spent spinning or, where supported, in a wait.
This is somewhat inspired by the queue_poll() mechanism used
with smp_cond_load() in arm-smmu-v3 [4].
It addresses (1) by deciding the time-check granularity based on a
time interval instead of spinning for a fixed number of iterations.
(2) is addressed by the wait_policy allowing for different slack
values. The implemented versions of wait_policy allow for a coarse
or a fine grained slack. A user defined wait_policy could choose
its own wait parameter. This would also address (3).
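For illustration, a poll_idle()-style caller of the new interface might
look roughly like the sketch below (hypothetical; time_limit_us is a
placeholder, and the default policies expect the time expression and
limit to be in microseconds):
	flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
					       VAL & _TIF_NEED_RESCHED,
					       __smp_cond_timewait_coarse,
					       local_clock() / NSEC_PER_USEC,
					       time_limit_us);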
With that, patches 1-5, add the generic and arm64 logic:
"asm-generic: barrier: add smp_cond_load_relaxed_timewait()",
"asm-generic: barrier: add wait_policy handlers"
"arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait()"
"arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait()"
"arm64: barrier: add fine wait for smp_cond_load_relaxed_timewait()".
And, patch 6, adds the acquire variant:
"asm-generic: barrier: add smp_cond_load_acquire_timewait()"
And, finally patch 7 lays out how this could be used for rqspinlock:
"bpf: rqspinlock: add rqspinlock policy handler for arm64".
Any comments appreciated!
Ankur
[1] https://lore.kernel.org/lkml/20241107190818.522639-3-ankur.a.arora@oracle.com/
[2] Uses the smp_cond_load_acquire_timewait() from v1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/rqspinlock.h
[3] https://lore.kernel.org/lkml/Z8dRalfxYcJIcLGj@arm.com/
[4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c#n223
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org
Ankur Arora (7):
asm-generic: barrier: add smp_cond_load_relaxed_timewait()
asm-generic: barrier: add wait_policy handlers
arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait()
arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait()
arm64: barrier: add fine wait for smp_cond_load_relaxed_timewait()
asm-generic: barrier: add smp_cond_load_acquire_timewait()
bpf: rqspinlock: add rqspinlock policy handler for arm64
arch/arm64/include/asm/barrier.h | 82 +++++++++++++++
arch/arm64/include/asm/rqspinlock.h | 96 ++++--------------
include/asm-generic/barrier.h | 150 ++++++++++++++++++++++++++++
3 files changed, 251 insertions(+), 77 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-21 18:37 ` Catalin Marinas
2025-05-02 8:52 ` [PATCH v2 2/7] asm-generic: barrier: add wait_policy handlers Ankur Arora
` (7 subsequent siblings)
8 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Add smp_cond_load_relaxed_timewait(), which extends the non-timeout
variant for cases where we don't want to wait indefinitely.
The interface adds parameters to allow timeout checks and a policy
that decides how exactly to wait for the condition to change.
The waiting is done via the usual cpu_relax() spin-wait around the
conditional variable with periodic evaluation of the time-check
expression, and optionally by architectural primitives that allow
for cheaper mechanisms such as waiting on stores to a memory address
with an out-of-band timeout mechanism.
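As a rough illustration of the expected wait_policy shape (a made-up
example, not one of the stock policies, which are added in the next
patch), a policy that never uses the architectural wait primitive and
re-checks the time expression every 64 spins could look like:
	/*
	 * Returns 0 once the deadline has passed, which terminates the
	 * wait; otherwise returns 'now' so the caller can track progress.
	 */
	static inline u64 example_wait_policy(u64 now, u64 prev, u64 end,
					      u32 *spin, bool *wait)
	{
		if (now >= end)
			return 0;
		*wait = false;	/* spin only, no __smp_timewait_store() */
		*spin = 64;	/* re-evaluate time_expr every 64 iterations */
		return now;
	}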
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/asm-generic/barrier.h | 58 +++++++++++++++++++++++++++++++++++
1 file changed, 58 insertions(+)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index d4f581c1e21d..a7be98e906f4 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -273,6 +273,64 @@ do { \
})
#endif
+/*
+ * Non-spin primitive that allows waiting for stores to an address,
+ * with support for a timeout. This works in conjunction with an
+ * architecturally defined wait_policy.
+ */
+#ifndef __smp_timewait_store
+#define __smp_timewait_store(ptr, val) do { } while (0)
+#endif
+
+#ifndef __smp_cond_load_relaxed_timewait
+#define __smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
+ time_expr, time_end) ({ \
+ typeof(ptr) __PTR = (ptr); \
+ __unqual_scalar_typeof(*ptr) VAL; \
+ u32 __n = 0, __spin = 0; \
+ u64 __prev = 0, __end = (time_end); \
+ bool __wait = false; \
+ \
+ for (;;) { \
+ VAL = READ_ONCE(*__PTR); \
+ if (cond_expr) \
+ break; \
+ cpu_relax(); \
+ if (++__n < __spin) \
+ continue; \
+ if (!(__prev = wait_policy((time_expr), __prev, __end, \
+ &__spin, &__wait))) \
+ break; \
+ if (__wait) \
+ __smp_timewait_store(__PTR, VAL); \
+ __n = 0; \
+ } \
+ (typeof(*ptr))VAL; \
+})
+#endif
+
+/**
+ * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
+ * guarantees until a timeout expires.
+ * @ptr: pointer to the variable to wait on
+ * @cond: boolean expression to wait for
+ * @wait_policy: policy handler that adjusts the number of times we spin or
+ * wait for cacheline to change (depends on architecture, not supported in
+ * generic code.) before evaluating the time-expr.
+ * @time_expr: monotonic expression that evaluates to the current time
+ * @time_end: compared against time_expr
+ *
+ * Equivalent to using READ_ONCE() on the condition variable.
+ */
+#define smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
+ time_expr, time_end) ({ \
+ __unqual_scalar_typeof(*ptr) _val;; \
+ _val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
+ wait_policy, time_expr, \
+ time_end); \
+ (typeof(*ptr))_val; \
+})
+
/*
* pmem_wmb() ensures that all stores for which the modification
* are written to persistent storage by preceding instructions have
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 2/7] asm-generic: barrier: add wait_policy handlers
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 3/7] arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait() Ankur Arora
` (6 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
smp_cond_load_relaxed_timewait() waits on a conditional variable,
either by spinning or via some architectural primitive, while
also watching the clock.
The generic code presents the simple case where the waiting is done
exclusively via a cpu_relax() spin-wait loop. To keep the pipeline
as idle as possible, we want to do the time-check only intermittently.
How often the time-check is done -- which also determines how much we
overshoot the timeout by -- is configured via the __smp_cond_timewait_coarse()
and __smp_cond_timewait_fine() wait policies.
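As an example of how the spin-count adapts with the fine policy
(SMP_TIMEWAIT_SLACK_FINE_US == 2, SMP_TIMEWAIT_SPIN_BASE == 16): the
first pass clamps the count up to 16; while a batch of 16, 32, 64, ...
spins completes in under 2us the count keeps doubling; once a batch
takes 2us or longer the count is scaled back to roughly three quarters
of its value (but never below 16), so the time-check settles at about
one evaluation every couple of microseconds.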
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/asm-generic/barrier.h | 66 +++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index a7be98e906f4..76124683be4b 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -15,6 +15,7 @@
#include <linux/compiler.h>
#include <linux/kcsan-checks.h>
+#include <linux/minmax.h>
#include <asm/rwonce.h>
#ifndef nop
@@ -273,6 +274,64 @@ do { \
})
#endif
+#ifndef SMP_TIMEWAIT_SPIN_BASE
+#define SMP_TIMEWAIT_SPIN_BASE 16
+#endif
+
+static inline u64 ___cond_spinwait(u64 now, u64 prev, u64 end,
+ u32 *spin, bool *wait, u64 slack)
+{
+ if (now >= end)
+ return 0;
+
+ *wait = false;
+
+ /*
+ * Scale the spin-count up or down so we evaluate the time-expr every
+ * slack unit of time or so.
+ */
+ if ((now - prev) < slack)
+ *spin <<= 1;
+ else
+ /*
+ * Ensure the spin-count is at least SMP_TIMEWAIT_SPIN_BASE
+ * when scaling down to guard against artificially low values
+ * due to interrupts etc. Clamping down also handles the case
+ * of the first iteration (*spin == 0).
+ */
+ *spin = max((*spin >> 1) + (*spin >> 2), SMP_TIMEWAIT_SPIN_BASE);
+
+ return now;
+}
+
+#ifndef SMP_TIMEWAIT_SLACK_FINE_US
+#define SMP_TIMEWAIT_SLACK_FINE_US 2UL
+#endif
+
+#ifndef SMP_TIMEWAIT_SLACK_COARSE_US
+#define SMP_TIMEWAIT_SLACK_COARSE_US 5UL
+#endif
+
+/*
+ * wait_policy: to minimize how often we do the (typically) expensive
+ * time-check, expect a slack duration which would vary based on
+ * architecture.
+ *
+ * For the generic variant, the fine and coarse variants have a slack
+ * duration of SMP_TIMEWAIT_SLACK_FINE_US and SMP_TIMEWAIT_SLACK_COARSE_US.
+ */
+#ifndef __smp_cond_timewait_fine
+#define __smp_cond_timewait_fine(now, prev, end, spin, wait) \
+ ___cond_spinwait(now, prev, end, spin, wait, \
+ SMP_TIMEWAIT_SLACK_FINE_US)
+#endif
+
+#ifndef __smp_cond_timewait_coarse
+#define __smp_cond_timewait_coarse(now, prev, end, spin, wait) \
+ ___cond_spinwait(now, prev, end, spin, wait, \
+ SMP_TIMEWAIT_SLACK_COARSE_US)
+#endif
+
/*
* Non-spin primitive that allows waiting for stores to an address,
* with support for a timeout. This works in conjunction with an
@@ -320,11 +379,18 @@ do { \
* @time_expr: monotonic expression that evaluates to the current time
* @time_end: compared against time_expr
*
+ * The default policies (__smp_cond_timewait_coarse, __smp_cond_timewait_fine)
+ * assume that time_expr and time_end evaluate to time in us (both with a user
+ * specified precision.)
+ * With a user specified policy, any units and precision can be used.
+ *
* Equivalent to using READ_ONCE() on the condition variable.
*/
#define smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
time_expr, time_end) ({ \
__unqual_scalar_typeof(*ptr) _val;; \
+ BUILD_BUG_ON_MSG(!__same_type(typeof(time_expr), u64), \
+ "incompatible time units"); \
_val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
wait_policy, time_expr, \
time_end); \
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 3/7] arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 2/7] asm-generic: barrier: add wait_policy handlers Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 4/7] arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait() Ankur Arora
` (5 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Define __smp_timewait_store() to support waiting in
smp_cond_load_relaxed_timewait(). This uses __cmpwait_relaxed() to
wait in WFE for stores to the target address, with the event-stream
periodically ensuring that we don't wait forever in the failure
case.
In the unlikely case the event-stream is unavailable, the wait_policy
is expected to just fall back to the generic spin-wait implementation.
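(For context: __cmpwait_relaxed() re-loads the target address with a
load-exclusive and only executes WFE if the value still matches, so a
subsequent store to the address -- or an event-stream tick -- wakes the
CPU back up.)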
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/barrier.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 1ca947d5c939..eaeb78dd48c0 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -216,6 +216,9 @@ do { \
(typeof(*ptr))VAL; \
})
+#define __smp_timewait_store(ptr, val) \
+ __cmpwait_relaxed(ptr, val)
+
#include <asm-generic/barrier.h>
#endif /* __ASSEMBLY__ */
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 4/7] arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (2 preceding siblings ...)
2025-05-02 8:52 ` [PATCH v2 3/7] arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait() Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 5/7] arm64: barrier: add fine " Ankur Arora
` (4 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
smp_cond_load_relaxed_timewait() waits on a conditional variable
until a timeout expires. This waiting is done via some mix of spinning
while dereferencing an address, or waiting in WFE until the CPU
gets an event due to a store to that address, or because of periodic
events from the event-stream.
Define __smp_cond_timewait_coarse() for use cases where the caller can
tolerate a relatively large overshoot. This allows us to minimize the
time spent spinning at the cost of spending extra time in the WFE
state.
This would result in a worst case delay of ARCH_TIMER_EVT_STREAM_PERIOD_US
and a spin period of no more than SMP_TIMEWAIT_CHECK_US.
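As a worked example: with the event-stream running at its usual 100us
period and, say, 30us left until the deadline, remaining + slack =
30 + 100 > 100, so the coarse policy opts for WFE and the wait may
overshoot the deadline by up to ~100us. Only when neither the
event-stream nor WFET is usable does it fall back to spinning, with a
time-check roughly every SMP_TIMEWAIT_CHECK_US.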
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/barrier.h | 66 ++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index eaeb78dd48c0..f4a184a96933 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -10,6 +10,7 @@
#ifndef __ASSEMBLY__
#include <linux/kasan-checks.h>
+#include <linux/minmax.h>
#include <asm/alternative-macros.h>
@@ -219,6 +220,71 @@ do { \
#define __smp_timewait_store(ptr, val) \
__cmpwait_relaxed(ptr, val)
+/*
+ * Redefine ARCH_TIMER_EVT_STREAM_PERIOD_US locally to avoid include hell.
+ */
+#define __ARCH_TIMER_EVT_STREAM_PERIOD_US 100UL
+extern bool arch_timer_evtstrm_available(void);
+
+/*
+ * For coarse grained waits, allow overshoot by the event-stream period.
+ * Defined without reference to ARCH_TIMER_EVT_STREAM_PERIOD_US to avoid
+ * include hell.
+ */
+#define SMP_TIMEWAIT_SLACK_COARSE_US __ARCH_TIMER_EVT_STREAM_PERIOD_US
+
+#define SMP_TIMEWAIT_SPIN_BASE 16
+#define SMP_TIMEWAIT_CHECK_US 2UL
+
+static inline u64 ___cond_timewait(u64 now, u64 prev, u64 end,
+ u32 *spin, bool *wait, u64 slack)
+{
+ bool wfet = alternative_has_cap_unlikely(ARM64_HAS_WFXT);
+ bool wfe, ev = arch_timer_evtstrm_available();
+ u64 evt_period = __ARCH_TIMER_EVT_STREAM_PERIOD_US;
+ u64 remaining = end - now;
+
+ if (now >= end)
+ return 0;
+
+ /*
+ * Use WFE if there's enough slack to get an event-stream wakeup even
+ * if we don't come out of the WFE due to natural causes.
+ */
+ wfe = ev && ((remaining + slack) > evt_period);
+
+ if (wfe || wfet) {
+ *wait = true;
+ *spin = 0;
+ return now;
+ }
+
+ /*
+ * Our wait period is shorter than our best granularity. Spin.
+ *
+ * A time-check is expensive but not too expensive. Scale the
+ * spin-count so we stay close to the fine-grained slack period.
+ */
+ *wait = false;
+ if ((now - prev) < SMP_TIMEWAIT_CHECK_US)
+ *spin <<= 1;
+ else
+ *spin = max((*spin >> 1) + (*spin >> 2), SMP_TIMEWAIT_SPIN_BASE);
+ return now;
+}
+
+/*
+ * Coarse wait_policy: minimizes the duration spent spinning at the cost of
+ * potentially spending the available slack in a WFE wait state.
+ *
+ * The resultant worst case timeout delay is SMP_TIMEWAIT_SLACK_COARSE_US
+ * (same as ARCH_TIMER_EVT_STREAM_PERIOD_US) and a spin period of no more
+ * than SMP_TIMEWAIT_CHECK_US.
+ */
+#define __smp_cond_timewait_coarse(now, prev, end, spin, wait) \
+ ___cond_timewait(now, prev, end, spin, wait, \
+ SMP_TIMEWAIT_SLACK_COARSE_US)
+
#include <asm-generic/barrier.h>
#endif /* __ASSEMBLY__ */
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 5/7] arm64: barrier: add fine wait for smp_cond_load_relaxed_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (3 preceding siblings ...)
2025-05-02 8:52 ` [PATCH v2 4/7] arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait() Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 6/7] asm-generic: barrier: add smp_cond_load_acquire_timewait() Ankur Arora
` (3 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Define __smp_cond_timewait_fine() for callers which need a fine grained
timeout.
To do this, use a narrowing timeout slack, equal to the remaining
duration. This allows us to optimistically wait in WFE until
the remaining duration drops below ARCH_TIMER_EVT_STREAM_PERIOD_US/2.
Once we reach that point, we go into the spin-wait state.
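As a worked example (assuming no WFET): with 80us remaining the slack
is also 80us, so remaining + slack = 160us > 100us and we wait in WFE;
with 40us remaining, 40 + 40 = 80us < 100us, so an event-stream wakeup
can no longer be relied upon in time and we spin instead, re-checking
the clock roughly every SMP_TIMEWAIT_CHECK_US (2us).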
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/barrier.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index f4a184a96933..e4abb8f5dd97 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -247,6 +247,9 @@ static inline u64 ___cond_timewait(u64 now, u64 prev, u64 end,
if (now >= end)
return 0;
+ if (slack == 0)
+ slack = max(remaining, SMP_TIMEWAIT_CHECK_US);
+
/*
* Use WFE if there's enough slack to get an event-stream wakeup even
* if we don't come out of the WFE due to natural causes.
@@ -273,6 +276,16 @@ static inline u64 ___cond_timewait(u64 now, u64 prev, u64 end,
return now;
}
+/*
+ * Fine wait_policy: minimize the timeout delay while balancing against the
+ * time spent in the WFE wait state.
+ *
+ * The worst case timeout delay is ARCH_TIMER_EVT_STREAM_PERIOD_US/2, which
+ * would also be the worst case spin period.
+ */
+#define __smp_cond_timewait_fine(now, prev, end, spin, wait) \
+ ___cond_timewait(now, prev, end, spin, wait, 0)
/*
* Coarse wait_policy: minimizes the duration spent spinning at the cost of
* potentially spending the available slack in a WFE wait state.
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 6/7] asm-generic: barrier: add smp_cond_load_acquire_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (4 preceding siblings ...)
2025-05-02 8:52 ` [PATCH v2 5/7] arm64: barrier: add fine " Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 7/7] bpf: rqspinlock: add rqspinlock policy handler for arm64 Ankur Arora
` (2 subsequent siblings)
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Add the acquire variant of smp_cond_load_relaxed_timewait(). This
reuses the relaxed variant, with the additional LOAD->LOAD
ordering via smp_acquire__after_ctrl_dep().
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/asm-generic/barrier.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 76124683be4b..2d52dc5b82fe 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -397,6 +397,32 @@ static inline u64 ___cond_spinwait(u64 now, u64 prev, u64 end,
(typeof(*ptr))_val; \
})
+/**
+ * smp_cond_load_acquire_timewait() - (Spin) wait for cond with ACQUIRE ordering
+ * until a timeout expires.
+ * @ptr: pointer to the variable to wait on
+ * @cond: boolean expression to wait for
+ * @wait_policy: policy handler that adjusts how much we spin before evaluating
+ * the timeout, and if we drop into a wait for cacheline to change (depending
+ * on architecture support.)
+ * @time_expr: monotonic expression that evaluates to the current time
+ * @time_end: compared against time_expr
+ *
+ * Equivalent to using smp_cond_load_acquire() on the condition variable with
+ * a timeout.
+ */
+#ifndef smp_cond_load_acquire_timewait
+#define smp_cond_load_acquire_timewait(ptr, cond_expr, wait_policy, \
+ time_expr, time_end) ({ \
+ __unqual_scalar_typeof(*ptr) _val; \
+ _val = smp_cond_load_relaxed_timewait(ptr, cond_expr, \
+ wait_policy, time_expr, \
+ time_end); \
+ /* Depends on the control dependency of the wait above. */ \
+ smp_acquire__after_ctrl_dep(); \
+ (typeof(*ptr))_val; \
+})
+#endif
/*
* pmem_wmb() ensures that all stores for which the modification
* are written to persistent storage by preceding instructions have
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v2 7/7] bpf: rqspinlock: add rqspinlock policy handler for arm64
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (5 preceding siblings ...)
2025-05-02 8:52 ` [PATCH v2 6/7] asm-generic: barrier: add smp_cond_load_acquire_timewait() Ankur Arora
@ 2025-05-02 8:52 ` Ankur Arora
2025-05-02 16:42 ` [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Christoph Lameter (Ampere)
2025-05-16 22:50 ` Okanovic, Haris
8 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 8:52 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
The local copy of smp_cond_load_acquire_timewait() (from [1]) is only
usable for rqspinlock timeout and deadlock checking in a degenerate
fashion by overloading the evaluation of the condvar.
Update smp_cond_load_acquire_timewait(). Move the timeout and deadlock
handling (partially stubbed) to the wait policy handler.
[1] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
Note: This patch is missing all the important bits. Just wanted to check
if the interface is workable before plugging in the deadlock
checking etc.
arch/arm64/include/asm/rqspinlock.h | 96 ++++++-----------------------
1 file changed, 19 insertions(+), 77 deletions(-)
diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
index 9ea0a74e5892..27138b591e31 100644
--- a/arch/arm64/include/asm/rqspinlock.h
+++ b/arch/arm64/include/asm/rqspinlock.h
@@ -4,89 +4,31 @@
#include <asm/barrier.h>
-/*
- * Hardcode res_smp_cond_load_acquire implementations for arm64 to a custom
- * version based on [0]. In rqspinlock code, our conditional expression involves
- * checking the value _and_ additionally a timeout. However, on arm64, the
- * WFE-based implementation may never spin again if no stores occur to the
- * locked byte in the lock word. As such, we may be stuck forever if
- * event-stream based unblocking is not available on the platform for WFE spin
- * loops (arch_timer_evtstrm_available).
- *
- * Once support for smp_cond_load_acquire_timewait [0] lands, we can drop this
- * copy-paste.
- *
- * While we rely on the implementation to amortize the cost of sampling
- * cond_expr for us, it will not happen when event stream support is
- * unavailable, time_expr check is amortized. This is not the common case, and
- * it would be difficult to fit our logic in the time_expr_ns >= time_limit_ns
- * comparison, hence just let it be. In case of event-stream, the loop is woken
- * up at microsecond granularity.
- *
- * [0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com
- */
+#define RES_DEF_SPIN_COUNT (32 * 1024)
-#ifndef smp_cond_load_acquire_timewait
-
-#define smp_cond_time_check_count 200
-
-#define __smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, \
- time_limit_ns) ({ \
- typeof(ptr) __PTR = (ptr); \
- __unqual_scalar_typeof(*ptr) VAL; \
- unsigned int __count = 0; \
- for (;;) { \
- VAL = READ_ONCE(*__PTR); \
- if (cond_expr) \
- break; \
- cpu_relax(); \
- if (__count++ < smp_cond_time_check_count) \
- continue; \
- if ((time_expr_ns) >= (time_limit_ns)) \
- break; \
- __count = 0; \
- } \
- (typeof(*ptr))VAL; \
-})
-
-#define __smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, time_limit_ns) \
-({ \
- typeof(ptr) __PTR = (ptr); \
- __unqual_scalar_typeof(*ptr) VAL; \
- for (;;) { \
- VAL = smp_load_acquire(__PTR); \
- if (cond_expr) \
- break; \
- __cmpwait_relaxed(__PTR, VAL); \
- if ((time_expr_ns) >= (time_limit_ns)) \
- break; \
- } \
- (typeof(*ptr))VAL; \
-})
-
-#define smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, time_limit_ns) \
-({ \
- __unqual_scalar_typeof(*ptr) _val; \
- int __wfe = arch_timer_evtstrm_available(); \
+#define rqspinlock_cond_timewait(now, prev, end, spin, wait) ({ \
+ bool __ev = arch_timer_evtstrm_available(); \
+ bool __wfet = alternative_has_cap_unlikely(ARM64_HAS_WFXT); \
+ u64 __ret; \
\
- if (likely(__wfe)) { \
- _val = __smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, \
- time_limit_ns); \
+ *wait = false; \
+ /* TODO Handle deadlock check. */ \
+ if (end >= now) { \
+ __ret = 0; \
} else { \
- _val = __smp_cond_load_relaxed_spinwait(ptr, cond_expr, \
- time_expr_ns, \
- time_limit_ns); \
- smp_acquire__after_ctrl_dep(); \
+ if (__ev || __wfet) \
+ *wait = true; \
+ else \
+ *spin = RES_DEF_SPIN_COUNT; \
+ __ret = now; \
} \
- (typeof(*ptr))_val; \
+ \
+ __ret; \
})
-#endif
-
-#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire_timewait(v, c, 0, 1)
+#define res_smp_cond_load_acquire(v, c) \
+ smp_cond_load_acquire_timewait(v, c, rqspinlock_cond_timewait, \
+ ktime_get_mono_fast_ns(), (u64)RES_DEF_TIMEOUT)
#include <asm-generic/rqspinlock.h>
--
2.43.5
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (6 preceding siblings ...)
2025-05-02 8:52 ` [PATCH v2 7/7] bpf: rqspinlock: add rqspinlock policy handler for arm64 Ankur Arora
@ 2025-05-02 16:42 ` Christoph Lameter (Ampere)
2025-05-02 20:05 ` Ankur Arora
2025-05-16 22:50 ` Okanovic, Haris
8 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-05-02 16:42 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd,
catalin.marinas, will, peterz, akpm, mark.rutland, harisokn, ast,
memxor, zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, 2 May 2025, Ankur Arora wrote:
> smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, time_limit_ns)
> took four arguments, with ptr and cond_expr doing the usual smp_cond_load()
> things and time_expr_ns and time_limit_ns being used to decide the
> terminating condition.
>
> There were some problems in the timekeeping:
>
> 1. How often do we do the (relatively expensive) time-check?
Is this really important? We have instructions that wait on an event and
terminate at cycle counter values like WFET on arm64
The case where we need to perform time checks is only needed if the
processor does not support WFET but must use an event stream or does not
even have that available.
So the best approach is to have a simple interface where we specify the
cycle count when the wait is to be terminated and where we can cover that
with one WFET instruction.
The other cases then are degenerate forms of that. If only WFE is
available then only use that if the timeout is larger than the event
stream granularity. Or if both are not available them do the relax /
loop thing.
So the interface could be much simpler:
__smp_cond_load_relaxed_wait(ptr, timeout_cycle_count)
with a wrapper
smp_cond_relaxed_wait_expr(ptr, expr, timeout cycle count)
where we check the expression too and retry if the expression is not true.
The fallbacks with the spins and relax logic would be architecture
specific.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-02 16:42 ` [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Christoph Lameter (Ampere)
@ 2025-05-02 20:05 ` Ankur Arora
2025-05-05 16:13 ` Christoph Lameter (Ampere)
0 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2025-05-02 20:05 UTC (permalink / raw)
To: Christoph Lameter (Ampere)
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Christoph Lameter (Ampere) <cl@gentwo.org> writes:
> On Fri, 2 May 2025, Ankur Arora wrote:
>
>> smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, time_limit_ns)
>> took four arguments, with ptr and cond_expr doing the usual smp_cond_load()
>> things and time_expr_ns and time_limit_ns being used to decide the
>> terminating condition.
>>
>> There were some problems in the timekeeping:
>>
>> 1. How often do we do the (relatively expensive) time-check?
>
> Is this really important? We have instructions that wait on an event and
> terminate at cycle counter values like WFET on arm64
>
> The case where we need to perform time checks is only needed if the
> processor does not support WFET but must use an event stream or does not
> even have that available.
AFAICT the vast majority of arm64 processors in the wild don't yet
support WFET. For instance I haven't been able to find a single one
to test my WFET changes with ;).
The other part is that this needs to be in common code and x86 primarily
uses PAUSE.
So, supporting both configurations -- WFE + spin on arm64 and PAUSE on x86 --
needs a way of rate-limiting the time-check. Otherwise you run into
issues like this one:
commit 4dc2375c1a4e88ed2701f6961e0e4f9a7696ad3c
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Tue Mar 27 23:58:45 2018 +0200
cpuidle: poll_state: Avoid invoking local_clock() too often
Rik reports that he sees an increase in CPU use in one benchmark
due to commit 612f1a22f067 "cpuidle: poll_state: Add time limit to
poll_idle()" that caused poll_idle() to call local_clock() in every
iteration of the loop. Utilization increase generally means more
non-idle time with respect to total CPU time (on the average) which
implies reduced CPU frequency.
Doug reports that limiting the rate of local_clock() invocations
in there causes much less power to be drawn during a CPU-intensive
parallel workload (with idle states 1 and 2 disabled to enforce more
state 0 residency).
These two reports together suggest that executing local_clock() on
multiple CPUs in parallel at a high rate may cause chips to get hot
and trigger thermal/power limits on them to kick in, so reduce the
rate of local_clock() invocations in poll_idle() to avoid that issue.
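For reference, the resulting poll_idle() loop only does the time-check
once every POLL_IDLE_RELAX_COUNT iterations, roughly like this
(paraphrasing drivers/cpuidle/poll_state.c, details vary across kernel
versions):
	u64 time_start = local_clock();
	u64 limit = cpuidle_poll_time(drv, dev);
	unsigned int loop_count = 0;
	while (!need_resched()) {
		cpu_relax();
		if (loop_count++ < POLL_IDLE_RELAX_COUNT)
			continue;
		loop_count = 0;
		if (local_clock() - time_start > limit) {
			dev->poll_time_limit = true;
			break;
		}
	}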
> So the best approach is to have a simple interface where we specify the
> cycle count when the wait is to be terminated and where we can cover that
> with one WFET instruction.
>
> The other cases then are degenerate forms of that. If only WFE is
> available then only use that if the timeout is larger than the event
> stream granularity. Or if both are not available them do the relax /
> loop thing.
That's what I had proposed for v1. But as Catalin pointed out, that's
not very useful when the caller wants to limit the overshoot.
This version tries to optimistically use WFE where possible while
minimizing the spin time.
> So the interface could be much simpler:
>
> __smp_cond_load_relaxed_wait(ptr, timeout_cycle_count)
>
> with a wrapper
>
> smp_cond_relaxed_wait_expr(ptr, expr, timeout cycle count)
Oh, I would have absolutely liked to keep the interface simple, but
couldn't see a way to do that while managing the other constraints.
For instance, different users want different clocks: poll_idle() can do
with an imprecise clock but rqspinlock needs ktime_get_mono().
I think using standard clock types is also better instead of using
arm64 specific cycles or tsc or whatever.
> where we check the expression too and retry if the expression is not true.
>
> The fallbacks with the spins and relax logic would be architecture
> specific.
Even if they were all architecture specific, I suspect there's quite a
lot of variation in cpu_relax() across microarchitectures. For
instance YIELD is a nop on non-SMT arm64, but probably heavier on
SMT arm64.
Thanks for the quick review!
Ankur
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-02 20:05 ` Ankur Arora
@ 2025-05-05 16:13 ` Christoph Lameter (Ampere)
2025-05-05 17:08 ` Ankur Arora
0 siblings, 1 reply; 16+ messages in thread
From: Christoph Lameter (Ampere) @ 2025-05-05 16:13 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd,
catalin.marinas, will, peterz, akpm, mark.rutland, harisokn, ast,
memxor, zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, 2 May 2025, Ankur Arora wrote:
> AFAICT the vast majority of arm64 processors in the wild don't yet
> support WFET. For instance I haven't been able to find a single one
> to test my WFET changes with ;).
Ok then for patch 1-6:
Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-05 16:13 ` Christoph Lameter (Ampere)
@ 2025-05-05 17:08 ` Ankur Arora
0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-05 17:08 UTC (permalink / raw)
To: Christoph Lameter (Ampere)
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Christoph Lameter (Ampere) <cl@gentwo.org> writes:
> On Fri, 2 May 2025, Ankur Arora wrote:
>
>> AFAICT the vast majority of arm64 processors in the wild don't yet
>> support WFET. For instance I haven't been able to find a single one
>> to test my WFET changes with ;).
>
> Ok then for patch 1-6:
>
> Reviewed-by: Christoph Lameter (Ampere) <cl@gentwo.org>
Great. Thanks Christoph!
--
ankur
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
` (7 preceding siblings ...)
2025-05-02 16:42 ` [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Christoph Lameter (Ampere)
@ 2025-05-16 22:50 ` Okanovic, Haris
2025-05-17 1:16 ` Ankur Arora
8 siblings, 1 reply; 16+ messages in thread
From: Okanovic, Haris @ 2025-05-16 22:50 UTC (permalink / raw)
To: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
ankur.a.arora@oracle.com, bpf@vger.kernel.org
Cc: Okanovic, Haris, cl@gentwo.org, joao.m.martins@oracle.com,
akpm@linux-foundation.org, peterz@infradead.org,
mark.rutland@arm.com, memxor@gmail.com, catalin.marinas@arm.com,
arnd@arndb.de, will@kernel.org, zhenglifeng1@huawei.com,
ast@kernel.org, xueshuai@linux.alibaba.com,
konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
On Fri, 2025-05-02 at 01:52 -0700, Ankur Arora wrote:
> [... full quote of the cover letter trimmed ...]
Tested on AWS Graviton (ARM64 Neoverse V1) with your V10 haltpoll
changes, atop master 83a896549f.
Reviewed-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Haris Okanovic <harisokn@amazon.com>
Regards,
Haris Okanovic
AWS Graviton Software
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait()
2025-05-16 22:50 ` Okanovic, Haris
@ 2025-05-17 1:16 ` Ankur Arora
0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-17 1:16 UTC (permalink / raw)
To: Okanovic, Haris
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
ankur.a.arora@oracle.com, bpf@vger.kernel.org, cl@gentwo.org,
joao.m.martins@oracle.com, akpm@linux-foundation.org,
peterz@infradead.org, mark.rutland@arm.com, memxor@gmail.com,
catalin.marinas@arm.com, arnd@arndb.de, will@kernel.org,
zhenglifeng1@huawei.com, ast@kernel.org,
xueshuai@linux.alibaba.com, konrad.wilk@oracle.com,
boris.ostrovsky@oracle.com
Okanovic, Haris <harisokn@amazon.com> writes:
> On Fri, 2025-05-02 at 01:52 -0700, Ankur Arora wrote:
>> [... full quote of the cover letter trimmed ...]
>
> Tested on AWS Graviton (ARM64 Neoverse V1) with your V10 haltpoll
> changes, atop master 83a896549f.
>
> Reviewed-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Haris Okanovic <harisokn@amazon.com>
Thanks for the review (and the testing)!
--
ankur
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait()
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
@ 2025-05-21 18:37 ` Catalin Marinas
2025-05-24 3:22 ` Ankur Arora
0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2025-05-21 18:37 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Hi Ankur,
Sorry, it took me some time to get back to this series (well, I tried
once and got stuck on what wait_policy is supposed to mean, so decided
to wait until I had more coffee ;)).
On Fri, May 02, 2025 at 01:52:17AM -0700, Ankur Arora wrote:
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index d4f581c1e21d..a7be98e906f4 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -273,6 +273,64 @@ do { \
> })
> #endif
>
> +/*
> + * Non-spin primitive that allows waiting for stores to an address,
> + * with support for a timeout. This works in conjunction with an
> + * architecturally defined wait_policy.
> + */
> +#ifndef __smp_timewait_store
> +#define __smp_timewait_store(ptr, val) do { } while (0)
> +#endif
> +
> +#ifndef __smp_cond_load_relaxed_timewait
> +#define __smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
> + time_expr, time_end) ({ \
> + typeof(ptr) __PTR = (ptr); \
> + __unqual_scalar_typeof(*ptr) VAL; \
> + u32 __n = 0, __spin = 0; \
> + u64 __prev = 0, __end = (time_end); \
> + bool __wait = false; \
> + \
> + for (;;) { \
> + VAL = READ_ONCE(*__PTR); \
> + if (cond_expr) \
> + break; \
> + cpu_relax(); \
> + if (++__n < __spin) \
> + continue; \
> + if (!(__prev = wait_policy((time_expr), __prev, __end, \
> + &__spin, &__wait))) \
> + break; \
> + if (__wait) \
> + __smp_timewait_store(__PTR, VAL); \
> + __n = 0; \
> + } \
> + (typeof(*ptr))VAL; \
> +})
> +#endif
> +
> +/**
> + * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
> + * guarantees until a timeout expires.
> + * @ptr: pointer to the variable to wait on
> + * @cond: boolean expression to wait for
> + * @wait_policy: policy handler that adjusts the number of times we spin or
> + * wait for cacheline to change (depends on architecture, not supported in
> + * generic code.) before evaluating the time-expr.
> + * @time_expr: monotonic expression that evaluates to the current time
> + * @time_end: compared against time_expr
> + *
> + * Equivalent to using READ_ONCE() on the condition variable.
> + */
> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
> + time_expr, time_end) ({ \
> + __unqual_scalar_typeof(*ptr) _val;; \
> + _val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
> + wait_policy, time_expr, \
> + time_end); \
> + (typeof(*ptr))_val; \
> +})
IIUC, a generic user of this interface would need a wait_policy() that
is aware of the arch details (event stream, WFET etc.), given the
__smp_timewait_store() implementation in patch 3. This becomes clearer
in patch 7 where one needs to create rqspinlock_cond_timewait().
The __spin count can be arch specific, not part of some wait_policy,
even if such policy is most likely implemented in the arch code (as the
generic caller has no clue what it means). As for the __wait decision, I
don't think the caller of this API should be the one deciding how to handle
it; that's something internal to the API implementation based on whether the
event stream (or later WFET) is available.
The ___cond_timewait() implementation in patch 4 sets __wait if either
the event stream or WFET is available. However, __smp_timewait_store()
only uses WFE as per the __cmpwait_relaxed() implementation. So you
can't really decouple wait_policy() from how the spinning is done, in an
arch-specific way. In this implementation, wait_policy() would need to
say how to wait - WFE, WFET. That's not captured (and I don't think it
should, we can't expand the API every time we have a new method of
waiting).
I still think this interface can be simpler and fairly generic, not with
wait_policy specific to rqspinlock or poll_idle. Maybe you can keep a
policy argument for an internal __smp_cond_load_relaxed_timewait() if
it's easier to structure the code this way but definitely not for
smp_cond_*().
Another aspect I'm not keen on is the arbitrary fine/coarse constants.
Can we not have the caller pass a slack value (in ns or 0 if it doesn't
care) to smp_cond_load_relaxed_timewait() and let the arch code decide
which policy to use?
In summary, I see the API something like:
#define smp_cond_load_relaxed_timewait(ptr, cond_expr,
time_expr, time_end, slack_ns)
We can even drop time_end if we capture it in time_expr returning a bool
(like we do with cond_expr).
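For illustration, a poll_idle()-style caller of that form could then look
roughly like this (a sketch only; 'limit_ns', the flags variable and the
surrounding context are assumptions on my part):

        unsigned long flags;
        u64 time_end = local_clock() + limit_ns;  /* limit_ns: caller's timeout in ns */

        flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
                                               (VAL & _TIF_NEED_RESCHED),
                                               local_clock(), time_end,
                                               0 /* no particular slack */);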
Thanks.
--
Catalin
* Re: [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait()
2025-05-21 18:37 ` Catalin Marinas
@ 2025-05-24 3:22 ` Ankur Arora
0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2025-05-24 3:22 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> Hi Ankur,
>
> Sorry, it took me some time to get back to this series (well, I tried
> once and got stuck on what wait_policy is supposed to mean, so decided
> to wait until I had more coffee ;)).
I suppose that's as good a sign as any that the wait_policy stuff needs
to change ;).
> On Fri, May 02, 2025 at 01:52:17AM -0700, Ankur Arora wrote:
>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>> index d4f581c1e21d..a7be98e906f4 100644
>> --- a/include/asm-generic/barrier.h
>> +++ b/include/asm-generic/barrier.h
>> @@ -273,6 +273,64 @@ do { \
>> })
>> #endif
>>
>> +/*
>> + * Non-spin primitive that allows waiting for stores to an address,
>> + * with support for a timeout. This works in conjunction with an
>> + * architecturally defined wait_policy.
>> + */
>> +#ifndef __smp_timewait_store
>> +#define __smp_timewait_store(ptr, val) do { } while (0)
>> +#endif
>> +
>> +#ifndef __smp_cond_load_relaxed_timewait
>> +#define __smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
>> + time_expr, time_end) ({ \
>> + typeof(ptr) __PTR = (ptr); \
>> + __unqual_scalar_typeof(*ptr) VAL; \
>> + u32 __n = 0, __spin = 0; \
>> + u64 __prev = 0, __end = (time_end); \
>> + bool __wait = false; \
>> + \
>> + for (;;) { \
>> + VAL = READ_ONCE(*__PTR); \
>> + if (cond_expr) \
>> + break; \
>> + cpu_relax(); \
>> + if (++__n < __spin) \
>> + continue; \
>> + if (!(__prev = wait_policy((time_expr), __prev, __end, \
>> + &__spin, &__wait))) \
>> + break; \
>> + if (__wait) \
>> + __smp_timewait_store(__PTR, VAL); \
>> + __n = 0; \
>> + } \
>> + (typeof(*ptr))VAL; \
>> +})
>> +#endif
>> +
>> +/**
>> + * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
>> + * guarantees until a timeout expires.
>> + * @ptr: pointer to the variable to wait on
>> + * @cond_expr: boolean expression to wait for
>> + * @wait_policy: policy handler that adjusts the number of times we spin or
>> + * wait for the cacheline to change (architecture dependent, not supported
>> + * in generic code) before evaluating time_expr.
>> + * @time_expr: monotonic expression that evaluates to the current time
>> + * @time_end: compared against time_expr
>> + *
>> + * Equivalent to using READ_ONCE() on the condition variable.
>> + */
>> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
>> + time_expr, time_end) ({ \
>> + __unqual_scalar_typeof(*ptr) _val; \
>> + _val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
>> + wait_policy, time_expr, \
>> + time_end); \
>> + (typeof(*ptr))_val; \
>> +})
>
> IIUC, a generic user of this interface would need a wait_policy() that
> is aware of the arch details (event stream, WFET etc.), given the
> __smp_timewait_store() implementation in patch 3. This becomes clearer
> in patch 7 where one needs to create rqspinlock_cond_timewait().
Yes, if a caller can't work with __smp_cond_timewait_coarse() etc., they
would need to know the mechanics of waiting on each arch. I meant the two
policies to be reasonably generic, but having to know the internals is a
problem.
> The __spin count can be arch specific, not part of some wait_policy,
> even if such policy is most likely implemented in the arch code (as the
> generic caller has no clue what it means). The __wait decision, again, I
> don't think it should be the caller of this API to decide how to handle,
> it's something internal to the API implementation based on whether the
> event stream (or later WFET) is available.
>
> The ___cond_timewait() implementation in patch 4 sets __wait if either
> the event stream or WFET is available. However, __smp_timewait_store()
> only uses WFE as per the __cmpwait_relaxed() implementation. So you
> can't really decouple wait_policy() from how the spinning is done, in an
> arch-specific way.
Agreed.
> In this implementation, wait_policy() would need to
> say how to wait - WFE, WFET. That's not captured (and I don't think it
> should, we can't expand the API every time we have a new method of
> waiting).
The idea was that both the wait_policy and the arch-specific interface
would evolve together, so once __cmpwait_relaxed() supports WFET, the
wait_policy would change alongside it.
However, as you say, for users that define their own wait_policy, the
interface becomes a mess to maintain.
> I still think this interface can be simpler and fairly generic, not with
> wait_policy specific to rqspinlock or poll_idle. Maybe you can keep a
> policy argument for an internal __smp_cond_load_relaxed_timewait() if
> it's easier to structure the code this way but definitely not for
> smp_cond_*().
Yeah. I think that's probably the way to do this. The main reason I felt
we needed an explicit wait_policy was to address the rqspinlock case,
but as you point out, that makes the interface unmaintainable.
So, with one proviso (see below), this should work for most users:
#define smp_cond_load_relaxed_timewait(ptr, cond_expr,
time_expr, time_end, slack_us)
(Though, I would use slack_us instead of slack_ns and also keep time_expr
and time_end denominated in us.)
And users like rqspinlock could use __smp_cond_load_relaxed_timewait()
with a policy argument that combines the rqspinlock policy with the
common wait policy, so they wouldn't need to know the internals of the
waiting mechanisms.
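Something along these lines, very roughly (a sketch of the idea only; the
deadlock-check helper is a placeholder, this is not what patch 7 actually
does, and I'm assuming __smp_cond_timewait_coarse() keeps the wait_policy
signature):

static inline u64 rqspinlock_cond_timewait(u64 now, u64 prev, u64 end,
                                           u32 *spin, bool *wait)
{
        /* rqspinlock-specific work, e.g. deadlock detection (placeholder) */
        if (rqspinlock_deadlock_check())
                return 0;               /* give up on the wait */

        /* defer the spin/wait decision to the common policy */
        return __smp_cond_timewait_coarse(now, prev, end, spin, wait);
}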
> Another aspect I'm not keen on is the arbitrary fine/coarse constants.
> Can we not have the caller pass a slack value (in ns or 0 if it doesn't
> care) to smp_cond_load_relaxed_timewait() and let the arch code decide
> which policy to use?
Yeah, as you probably noticed, that's pretty much how they are
implemented internally already.
> In summary, I see the API something like:
>
> #define smp_cond_load_relaxed_timewait(ptr, cond_expr,
> time_expr, time_end, slack_ns)
Ack.
> We can even drop time_end if we capture it in time_expr returning a bool
> (like we do with cond_expr).
I'm not sure we can combine time_expr and time_end. Given that we have
two mechanisms, spinning and waiting, each with a different granularity,
just a binary check won't suffice.
To switch between waiting and spinning we would also need to compare the
granularity of each mechanism, derive the time remaining, check it
against the slack, etc.
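Conceptually something like the fragment below (illustrative only; all the
names are made up):

static inline bool choose_wait(u64 now, u64 end, u64 slack_ns,
                               u64 wait_granularity_ns, bool wait_ok)
{
        u64 remaining = end > now ? end - now : 0;

        /*
         * Only wait if the worst-case overshoot of the wait mechanism
         * (e.g. ~100us with the event stream) fits within both the
         * remaining time and the slack the caller asked for.
         */
        return wait_ok && remaining > wait_granularity_ns &&
               slack_ns >= wait_granularity_ns;
}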
Thanks for the comments. Most helpful.
--
ankur
Thread overview: 16+ messages
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-21 18:37 ` Catalin Marinas
2025-05-24 3:22 ` Ankur Arora
2025-05-02 8:52 ` [PATCH v2 2/7] asm-generic: barrier: add wait_policy handlers Ankur Arora
2025-05-02 8:52 ` [PATCH v2 3/7] arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 4/7] arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 5/7] arm64: barrier: add fine " Ankur Arora
2025-05-02 8:52 ` [PATCH v2 6/7] asm-generic: barrier: add smp_cond_load_acquire_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 7/7] bpf: rqspinlock: add rqspinlock policy handler for arm64 Ankur Arora
2025-05-02 16:42 ` [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Christoph Lameter (Ampere)
2025-05-02 20:05 ` Ankur Arora
2025-05-05 16:13 ` Christoph Lameter (Ampere)
2025-05-05 17:08 ` Ankur Arora
2025-05-16 22:50 ` Okanovic, Haris
2025-05-17 1:16 ` Ankur Arora