* [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Hi,
This series adds waited variants of the smp_cond_load() primitives:
smp_cond_load_relaxed_timewait(), and smp_cond_load_acquire_timewait().
Why? As the names suggest, the new interfaces are meant for contexts
where you want to wait on a condition variable for a finite duration.
This is easy enough to do with a loop around cpu_relax(); however, some
architectures (e.g. arm64) also allow waiting on a cacheline, so these
interfaces handle a mixture of spinning and waiting, with a
smp_cond_load() thrown in.
There are two known users for these interfaces:
- poll_idle() [1]
- resilient queued spinlocks [2]
The interfaces are:
smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr)
smp_cond_load_acquire_timewait(ptr, cond_expr, time_check_expr)
The added parameter, time_check_expr, determines the bail-out condition.
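As a rough usage sketch (an editor's illustration, not from the series;
the helper, flag, and timeout names are invented, and the deadline
scheme mirrors the poll_idle() rework discussed later in this thread):

    /* needs <linux/sched/clock.h> and <asm/barrier.h> */
    static inline u8 wait_for_flag(u8 *flagp, u64 timeout_ns)
    {
            u64 deadline = local_clock_noinstr() + timeout_ns;

            /*
             * VAL names the most recently loaded value of *flagp inside
             * cond_expr, per the smp_cond_load*() convention. The wait
             * terminates once the flag becomes non-zero or the clock
             * crosses the deadline, whichever comes first.
             */
            return smp_cond_load_relaxed_timewait(flagp, VAL != 0,
                            local_clock_noinstr() >= deadline);
    }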
Changelog:
v3 [3]:
- further interface simplifications (suggested by Catalin Marinas)
v2 [4]:
- simplified the interface (suggested by Catalin Marinas)
- get rid of wait_policy and a multitude of constants
- add a slack parameter
This helped remove a fair amount of code duplication and, in hindsight,
unnecessary constants.
v1 [5]:
- add wait_policy (coarse and fine)
- derive spin-count etc at runtime instead of using arbitrary
constants.
Haris Okanovic tested an earlier version of this series with the
poll_idle()/haltpoll patches. [6]
Any comments appreciated!
Thanks!
Ankur
[1] https://lore.kernel.org/lkml/20241107190818.522639-3-ankur.a.arora@oracle.com/
[2] Uses the smp_cond_load_acquire_timewait() from v1
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/include/asm/rqspinlock.h
[3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@oracle.com/
[4] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@oracle.com/
[5] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com/
[6] https://lore.kernel.org/lkml/f2f5d09e79539754ced085ed89865787fa668695.camel@amazon.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org
Ankur Arora (5):
asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
arm64: barrier: Add smp_cond_load_relaxed_timewait()
arm64: rqspinlock: Remove private copy of
smp_cond_load_acquire_timewait
asm-generic: barrier: Add smp_cond_load_acquire_timewait()
rqspinlock: use smp_cond_load_acquire_timewait()
arch/arm64/include/asm/barrier.h | 22 ++++++++
arch/arm64/include/asm/rqspinlock.h | 84 +----------------------------
include/asm-generic/barrier.h | 57 ++++++++++++++++++++
include/asm-generic/rqspinlock.h | 4 ++
kernel/bpf/rqspinlock.c | 25 ++++-----
5 files changed, 93 insertions(+), 99 deletions(-)
--
2.31.1
* [PATCH v4 1/5] asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Add smp_cond_load_relaxed_timewait(), which extends
smp_cond_load_relaxed() to allow waiting for a finite duration.
The additional parameter allows for the timeout check.
The waiting is done via the usual cpu_relax() spin-wait around the
condition variable with periodic evaluation of the time-check.
The number of times we spin is defined by SMP_TIMEWAIT_SPIN_COUNT
(chosen to be 200 by default) which, assuming each cpu_relax()
iteration takes around 20-30 cycles (measured on a variety of x86
platforms), amounts to around 4000-6000 cycles.
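Since the constant is guarded by an #ifndef, an architecture whose
cpu_relax() burns a different number of cycles can override it from its
asm/barrier.h before including asm-generic/barrier.h. A hypothetical
override (the value is made up for illustration):

    /* arch/foo/include/asm/barrier.h */
    #define SMP_TIMEWAIT_SPIN_COUNT	100

    #include <asm-generic/barrier.h>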
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/asm-generic/barrier.h | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index d4f581c1e21d..c87d6fd8746f 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -273,6 +273,41 @@ do { \
})
#endif
+#ifndef SMP_TIMEWAIT_SPIN_COUNT
+#define SMP_TIMEWAIT_SPIN_COUNT 200
+#endif
+
+/**
+ * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
+ * guarantees until a timeout expires.
+ * @ptr: pointer to the variable to wait on
+ * @cond_expr: boolean expression to wait for
+ * @time_check_expr: expression to decide when to bail out
+ *
+ * Equivalent to using READ_ONCE() on the condition variable.
+ */
+#ifndef smp_cond_load_relaxed_timewait
+#define smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr) \
+({ \
+ typeof(ptr) __PTR = (ptr); \
+ __unqual_scalar_typeof(*ptr) VAL; \
+ u32 __n = 0, __spin = SMP_TIMEWAIT_SPIN_COUNT; \
+ \
+ for (;;) { \
+ VAL = READ_ONCE(*__PTR); \
+ if (cond_expr) \
+ break; \
+ cpu_relax(); \
+ if (++__n < __spin) \
+ continue; \
+ if (time_check_expr) \
+ break; \
+ __n = 0; \
+ } \
+ (typeof(*ptr))VAL; \
+})
+#endif
+
/*
* pmem_wmb() ensures that all stores for which the modification
* are written to persistent storage by preceding instructions have
--
2.31.1
* [PATCH v4 2/5] arm64: barrier: Add smp_cond_load_relaxed_timewait()
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Add smp_cond_load_relaxed_timewait(), a timed variant of
smp_cond_load_relaxed().
This uses __cmpwait_relaxed() to do the actual waiting, with the
event-stream guaranteeing that we wake up from WFE periodically and
do not block forever in case there are no stores to the cacheline
(when enabled, the event-stream generates wakeup events at a fixed
period, 100us by default). For cases where the event-stream is
unavailable, fall back to spin-waiting.
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/barrier.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index f5801b0ba9e9..9b29abc212db 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -219,6 +219,28 @@ do { \
(typeof(*ptr))VAL; \
})
+extern bool arch_timer_evtstrm_available(void);
+
+#define smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr) \
+({ \
+ typeof(ptr) __PTR = (ptr); \
+ __unqual_scalar_typeof(*ptr) VAL; \
+ bool __wfe = arch_timer_evtstrm_available(); \
+ \
+ for (;;) { \
+ VAL = READ_ONCE(*__PTR); \
+ if (cond_expr) \
+ break; \
+ if (time_check_expr) \
+ break; \
+ if (likely(__wfe)) \
+ __cmpwait_relaxed(__PTR, VAL); \
+ else \
+ cpu_relax(); \
+ } \
+ (typeof(*ptr)) VAL; \
+})
+
#include <asm-generic/barrier.h>
#endif /* __ASSEMBLY__ */
--
2.31.1
* [PATCH v4 3/5] arm64: rqspinlock: Remove private copy of smp_cond_load_acquire_timewait
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
In preparation for defining smp_cond_load_acquire_timewait() generically,
remove the private copy. Until the generic version lands, the rqspinlock
code falls back to using smp_cond_load_acquire().
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/rqspinlock.h | 85 -----------------------------
1 file changed, 85 deletions(-)
diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
index 9ea0a74e5892..a385603436e9 100644
--- a/arch/arm64/include/asm/rqspinlock.h
+++ b/arch/arm64/include/asm/rqspinlock.h
@@ -3,91 +3,6 @@
#define _ASM_RQSPINLOCK_H
#include <asm/barrier.h>
-
-/*
- * Hardcode res_smp_cond_load_acquire implementations for arm64 to a custom
- * version based on [0]. In rqspinlock code, our conditional expression involves
- * checking the value _and_ additionally a timeout. However, on arm64, the
- * WFE-based implementation may never spin again if no stores occur to the
- * locked byte in the lock word. As such, we may be stuck forever if
- * event-stream based unblocking is not available on the platform for WFE spin
- * loops (arch_timer_evtstrm_available).
- *
- * Once support for smp_cond_load_acquire_timewait [0] lands, we can drop this
- * copy-paste.
- *
- * While we rely on the implementation to amortize the cost of sampling
- * cond_expr for us, it will not happen when event stream support is
- * unavailable, time_expr check is amortized. This is not the common case, and
- * it would be difficult to fit our logic in the time_expr_ns >= time_limit_ns
- * comparison, hence just let it be. In case of event-stream, the loop is woken
- * up at microsecond granularity.
- *
- * [0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com
- */
-
-#ifndef smp_cond_load_acquire_timewait
-
-#define smp_cond_time_check_count 200
-
-#define __smp_cond_load_relaxed_spinwait(ptr, cond_expr, time_expr_ns, \
- time_limit_ns) ({ \
- typeof(ptr) __PTR = (ptr); \
- __unqual_scalar_typeof(*ptr) VAL; \
- unsigned int __count = 0; \
- for (;;) { \
- VAL = READ_ONCE(*__PTR); \
- if (cond_expr) \
- break; \
- cpu_relax(); \
- if (__count++ < smp_cond_time_check_count) \
- continue; \
- if ((time_expr_ns) >= (time_limit_ns)) \
- break; \
- __count = 0; \
- } \
- (typeof(*ptr))VAL; \
-})
-
-#define __smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, time_limit_ns) \
-({ \
- typeof(ptr) __PTR = (ptr); \
- __unqual_scalar_typeof(*ptr) VAL; \
- for (;;) { \
- VAL = smp_load_acquire(__PTR); \
- if (cond_expr) \
- break; \
- __cmpwait_relaxed(__PTR, VAL); \
- if ((time_expr_ns) >= (time_limit_ns)) \
- break; \
- } \
- (typeof(*ptr))VAL; \
-})
-
-#define smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, time_limit_ns) \
-({ \
- __unqual_scalar_typeof(*ptr) _val; \
- int __wfe = arch_timer_evtstrm_available(); \
- \
- if (likely(__wfe)) { \
- _val = __smp_cond_load_acquire_timewait(ptr, cond_expr, \
- time_expr_ns, \
- time_limit_ns); \
- } else { \
- _val = __smp_cond_load_relaxed_spinwait(ptr, cond_expr, \
- time_expr_ns, \
- time_limit_ns); \
- smp_acquire__after_ctrl_dep(); \
- } \
- (typeof(*ptr))_val; \
-})
-
-#endif
-
-#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire_timewait(v, c, 0, 1)
-
#include <asm-generic/rqspinlock.h>
#endif /* _ASM_RQSPINLOCK_H */
--
2.31.1
* [PATCH v4 4/5] asm-generic: barrier: Add smp_cond_load_acquire_timewait()
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Add the acquire variant of smp_cond_load_relaxed_timewait(). This
reuses the relaxed variant, with an additional LOAD->LOAD
ordering.
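To illustrate what the acquire semantics buy (an editor's sketch with
invented names, not code from this series): a consumer that observes
the flag set is guaranteed to also observe the producer's earlier
stores, even though the bulk of the waiting is done with relaxed loads:

    struct msg {
            int data;
            u8 ready;
    };

    /* Producer: publish the payload, then set the flag with RELEASE. */
    static void publish(struct msg *m, int v)
    {
            m->data = v;
            smp_store_release(&m->ready, 1);
    }

    /* Consumer: returns true if the message arrived before @deadline. */
    static bool consume(struct msg *m, u64 deadline, int *out)
    {
            u8 ready = smp_cond_load_acquire_timewait(&m->ready, VAL,
                            local_clock_noinstr() >= deadline);

            if (!ready)
                    return false;   /* timed out */

            *out = m->data;         /* cannot be reordered before the flag load */
            return true;
    }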
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/asm-generic/barrier.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index c87d6fd8746f..58e510e37b87 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -308,6 +308,28 @@ do { \
})
#endif
+/**
+ * smp_cond_load_acquire_timewait() - (Spin) wait for cond with ACQUIRE ordering
+ * until a timeout expires.
+ *
+ * Arguments: same as smp_cond_load_relaxed_timewait().
+ *
+ * Equivalent to using smp_cond_load_acquire() on the condition variable with
+ * a timeout.
+ */
+#ifndef smp_cond_load_acquire_timewait
+#define smp_cond_load_acquire_timewait(ptr, cond_expr, time_check_expr) \
+({ \
+ __unqual_scalar_typeof(*ptr) _val; \
+ _val = smp_cond_load_relaxed_timewait(ptr, cond_expr, \
+ time_check_expr); \
+ \
+ /* Depends on the control dependency of the wait above. */ \
+ smp_acquire__after_ctrl_dep(); \
+ (typeof(*ptr))_val; \
+})
+#endif
+
/*
* pmem_wmb() ensures that all stores for which the modification
* are written to persistent storage by preceding instructions have
--
2.31.1
* [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Ankur Arora @ 2025-08-29 8:07 UTC (permalink / raw)
To: linux-kernel, linux-arch, linux-arm-kernel, bpf
Cc: arnd, catalin.marinas, will, peterz, akpm, mark.rutland, harisokn,
cl, ast, memxor, zhenglifeng1, xueshuai, joao.m.martins,
boris.ostrovsky, konrad.wilk
Use smp_cond_load_acquire_timewait() to define
res_atomic_cond_read_acquire() and res_smp_cond_load_acquire_timewait().
The timeout check for both is done via RES_CHECK_TIMEOUT(). Define
res_smp_cond_load_acquire_waiting() so that RES_CHECK_TIMEOUT() can
limit the amortization to spin-wait implementations.
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/rqspinlock.h | 3 +++
include/asm-generic/rqspinlock.h | 4 ++++
kernel/bpf/rqspinlock.c | 25 +++++++++----------------
3 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
index a385603436e9..ce8feadeb9a9 100644
--- a/arch/arm64/include/asm/rqspinlock.h
+++ b/arch/arm64/include/asm/rqspinlock.h
@@ -3,6 +3,9 @@
#define _ASM_RQSPINLOCK_H
#include <asm/barrier.h>
+
+#define res_smp_cond_load_acquire_waiting() arch_timer_evtstrm_available()
+
#include <asm-generic/rqspinlock.h>
#endif /* _ASM_RQSPINLOCK_H */
diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 6d4244d643df..4b49c0ddf89a 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -247,4 +247,8 @@ static __always_inline void res_spin_unlock(rqspinlock_t *lock)
#define raw_res_spin_unlock_irqrestore(lock, flags) ({ raw_res_spin_unlock(lock); local_irq_restore(flags); })
+#ifndef res_smp_cond_load_acquire_waiting
+#define res_smp_cond_load_acquire_waiting() 0
+#endif
+
#endif /* __ASM_GENERIC_RQSPINLOCK_H */
diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
index 5ab354d55d82..8de1395422e8 100644
--- a/kernel/bpf/rqspinlock.c
+++ b/kernel/bpf/rqspinlock.c
@@ -82,6 +82,7 @@ struct rqspinlock_timeout {
u64 duration;
u64 cur;
u16 spin;
+ u8 wait;
};
#define RES_TIMEOUT_VAL 2
@@ -241,26 +242,20 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
}
/*
- * Do not amortize with spins when res_smp_cond_load_acquire is defined,
- * as the macro does internal amortization for us.
+ * Only amortize with spins when we don't have a waiting implementation.
*/
-#ifndef res_smp_cond_load_acquire
#define RES_CHECK_TIMEOUT(ts, ret, mask) \
({ \
- if (!(ts).spin++) \
+ if ((ts).wait || !(ts).spin++) \
(ret) = check_timeout((lock), (mask), &(ts)); \
(ret); \
})
-#else
-#define RES_CHECK_TIMEOUT(ts, ret, mask) \
- ({ (ret) = check_timeout((lock), (mask), &(ts)); })
-#endif
/*
* Initialize the 'spin' member.
* Set spin member to 0 to trigger AA/ABBA checks immediately.
*/
-#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
+#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
/*
* We only need to reset 'timeout_end', 'spin' will just wrap around as necessary.
@@ -313,11 +308,8 @@ EXPORT_SYMBOL_GPL(resilient_tas_spin_lock);
*/
static DEFINE_PER_CPU_ALIGNED(struct qnode, rqnodes[_Q_MAX_NODES]);
-#ifndef res_smp_cond_load_acquire
-#define res_smp_cond_load_acquire(v, c) smp_cond_load_acquire(v, c)
-#endif
-
-#define res_atomic_cond_read_acquire(v, c) res_smp_cond_load_acquire(&(v)->counter, (c))
+#define res_atomic_cond_read_acquire(v, c, t) smp_cond_load_acquire_timewait(&(v)->counter, (c), (t))
+#define res_smp_cond_load_acquire_timewait(v, c, t) smp_cond_load_acquire_timewait(v, (c), (t))
/**
* resilient_queued_spin_lock_slowpath - acquire the queued spinlock
@@ -418,7 +410,8 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
*/
if (val & _Q_LOCKED_MASK) {
RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
- res_smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
+ res_smp_cond_load_acquire_timewait(&lock->locked, !VAL,
+ RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
}
if (ret) {
@@ -572,7 +565,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
* us.
*/
RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT * 2);
- val = res_atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK) ||
+ val = res_atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK),
RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_PENDING_MASK));
waitq_timeout:
--
2.31.1
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
From: Okanovic, Haris @ 2025-08-29 18:54 UTC (permalink / raw)
To: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
ankur.a.arora@oracle.com, bpf@vger.kernel.org
Cc: Okanovic, Haris, cl@gentwo.org, joao.m.martins@oracle.com,
akpm@linux-foundation.org, peterz@infradead.org,
mark.rutland@arm.com, memxor@gmail.com, catalin.marinas@arm.com,
arnd@arndb.de, will@kernel.org, zhenglifeng1@huawei.com,
ast@kernel.org, xueshuai@linux.alibaba.com,
konrad.wilk@oracle.com, boris.ostrovsky@oracle.com
On Fri, 2025-08-29 at 01:07 -0700, Ankur Arora wrote:
> [...]
Tested on AWS Graviton 2, 3, and 4 (ARM64 Neoverse N1, V1, and V2) with
your V10 haltpoll changes, atop 6.17.0-rc3 (commit 07d9df8008).
Still seeing between 1.3x and 2.5x speedups in `perf bench sched pipe`
and `seccomp-notify`; no change in `messaging`.
Reviewed-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Haris Okanovic <harisokn@amazon.com>
Regards,
Haris Okanovic
AWS Graviton Software
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
From: Ankur Arora @ 2025-08-29 22:38 UTC (permalink / raw)
To: Okanovic, Haris
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
ankur.a.arora@oracle.com, bpf@vger.kernel.org, cl@gentwo.org,
joao.m.martins@oracle.com, akpm@linux-foundation.org,
peterz@infradead.org, mark.rutland@arm.com, memxor@gmail.com,
catalin.marinas@arm.com, arnd@arndb.de, will@kernel.org,
zhenglifeng1@huawei.com, ast@kernel.org,
xueshuai@linux.alibaba.com, konrad.wilk@oracle.com,
boris.ostrovsky@oracle.com
Okanovic, Haris <harisokn@amazon.com> writes:
> On Fri, 2025-08-29 at 01:07 -0700, Ankur Arora wrote:
>> [...]
>
> Tested on AWS Graviton 2, 3, and 4 (ARM64 Neoverse N1, V1, and V2) with
> your V10 haltpoll changes, atop 6.17.0-rc3 (commit 07d9df8008).
> Still seeing between 1.3x and 2.5x speedups in `perf bench sched pipe`
> and `seccomp-notify`; no change in `messaging`.
Great.
> Reviewed-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Haris Okanovic <harisokn@amazon.com>
Thank you.
--
ankur
* Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Catalin Marinas @ 2025-09-01 11:28 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
> diff --git a/arch/arm64/include/asm/rqspinlock.h b/arch/arm64/include/asm/rqspinlock.h
> index a385603436e9..ce8feadeb9a9 100644
> --- a/arch/arm64/include/asm/rqspinlock.h
> +++ b/arch/arm64/include/asm/rqspinlock.h
> @@ -3,6 +3,9 @@
> #define _ASM_RQSPINLOCK_H
>
> #include <asm/barrier.h>
> +
> +#define res_smp_cond_load_acquire_waiting() arch_timer_evtstrm_available()
More on this below, I don't think we should define it.
> diff --git a/kernel/bpf/rqspinlock.c b/kernel/bpf/rqspinlock.c
> index 5ab354d55d82..8de1395422e8 100644
> --- a/kernel/bpf/rqspinlock.c
> +++ b/kernel/bpf/rqspinlock.c
> @@ -82,6 +82,7 @@ struct rqspinlock_timeout {
> u64 duration;
> u64 cur;
> u16 spin;
> + u8 wait;
> };
>
> #define RES_TIMEOUT_VAL 2
> @@ -241,26 +242,20 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
> }
>
> /*
> - * Do not amortize with spins when res_smp_cond_load_acquire is defined,
> - * as the macro does internal amortization for us.
> + * Only amortize with spins when we don't have a waiting implementation.
> */
> -#ifndef res_smp_cond_load_acquire
> #define RES_CHECK_TIMEOUT(ts, ret, mask) \
> ({ \
> - if (!(ts).spin++) \
> + if ((ts).wait || !(ts).spin++) \
> (ret) = check_timeout((lock), (mask), &(ts)); \
> (ret); \
> })
> -#else
> -#define RES_CHECK_TIMEOUT(ts, ret, mask) \
> - ({ (ret) = check_timeout((lock), (mask), &(ts)); })
> -#endif
IIUC, RES_CHECK_TIMEOUT in the current res_smp_cond_load_acquire() usage
doesn't amortise the spins, as the comment suggests, but rather the
calls to check_timeout(). This is fine, it matches the behaviour of
smp_cond_load_relaxed_timewait() you introduced in the first patch. The
only difference is the number of spins - 200 (matching poll_idle) vs 64K
above. Does 200 work for the above?
> /*
> * Initialize the 'spin' member.
> * Set spin member to 0 to trigger AA/ABBA checks immediately.
> */
> -#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
> +#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
First of all, I don't really like the smp_cond_load_acquire_waiting(),
that's an implementation detail of smp_cond_load_*_timewait() that
shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
also used outside the smp_cond_load_acquire_timewait() condition. The
(ts).wait check only makes sense when used together with the WFE
waiting.
I would leave RES_CHECK_TIMEOUT() as is for the stand-alone cases and
just use check_timeout() in the smp_cond_load_acquire_timewait()
scenarios. I would also drop the res_smp_cond_load_acquire() macro since
you now define smp_cond_load_acquire_timewait() generically and it can
be used directly.
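[Editor's note: a minimal sketch of that suggestion, assuming
check_timeout() keeps its current signature and that lock, ret and ts
are in scope as in the slowpath; not code from the thread:

    smp_cond_load_acquire_timewait(&lock->locked, !VAL,
                    (ret = check_timeout(lock, _Q_LOCKED_MASK, &ts)));

i.e. the time_check_expr invokes check_timeout() directly, with the
amortization left to the timewait implementation.]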
--
Catalin
* Re: [PATCH v4 1/5] asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
From: Catalin Marinas @ 2025-09-01 11:29 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:31AM -0700, Ankur Arora wrote:
> Add smp_cond_load_relaxed_timewait(), which extends
> smp_cond_load_relaxed() to allow waiting for a finite duration.
>
> The additional parameter allows for the timeout check.
>
> The waiting is done via the usual cpu_relax() spin-wait around the
> condition variable with periodic evaluation of the time-check.
>
> The number of times we spin is defined by SMP_TIMEWAIT_SPIN_COUNT
> (chosen to be 200 by default) which, assuming each cpu_relax()
> iteration takes around 20-30 cycles (measured on a variety of x86
> platforms), amounts to around 4000-6000 cycles.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: linux-arch@vger.kernel.org
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Apart from the name, this looks fine (I'd have preferred the "timeout"
suffix).
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 2/5] arm64: barrier: Add smp_cond_load_relaxed_timewait()
From: Catalin Marinas @ 2025-09-01 11:47 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:32AM -0700, Ankur Arora wrote:
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index f5801b0ba9e9..9b29abc212db 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -219,6 +219,28 @@ do { \
> (typeof(*ptr))VAL; \
> })
>
> +extern bool arch_timer_evtstrm_available(void);
In theory, this doesn't work if CONFIG_ARM_ARCH_TIMER is disabled,
though that's not the case for arm64. Maybe add a short comment that
it's re-declared here to avoid include dependencies.
> +
> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr, time_check_expr) \
> +({ \
> + typeof(ptr) __PTR = (ptr); \
> + __unqual_scalar_typeof(*ptr) VAL; \
> + bool __wfe = arch_timer_evtstrm_available(); \
> + \
> + for (;;) { \
> + VAL = READ_ONCE(*__PTR); \
> + if (cond_expr) \
> + break; \
> + if (time_check_expr) \
> + break; \
> + if (likely(__wfe)) \
> + __cmpwait_relaxed(__PTR, VAL); \
> + else \
> + cpu_relax(); \
> + } \
> + (typeof(*ptr)) VAL; \
> +})
> +
> #include <asm-generic/barrier.h>
>
> #endif /* __ASSEMBLY__ */
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 3/5] arm64: rqspinlock: Remove private copy of smp_cond_load_acquire_timewait
From: Catalin Marinas @ 2025-09-01 11:47 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:33AM -0700, Ankur Arora wrote:
> In preparation for defining smp_cond_load_acquire_timewait(), remove
> the private copy. Lacking this, the rqspinlock code falls back to using
> smp_cond_load_acquire().
>
> Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 4/5] asm-generic: barrier: Add smp_cond_load_acquire_timewait()
From: Catalin Marinas @ 2025-09-01 11:47 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:34AM -0700, Ankur Arora wrote:
> Add the acquire variant of smp_cond_load_relaxed_timewait(). This
> reuses the relaxed variant, with an additional LOAD->LOAD
> ordering.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: linux-arch@vger.kernel.org
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
From: Catalin Marinas @ 2025-09-01 11:49 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 29, 2025 at 01:07:30AM -0700, Ankur Arora wrote:
> Ankur Arora (5):
> asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
> arm64: barrier: Add smp_cond_load_relaxed_timewait()
> arm64: rqspinlock: Remove private copy of
> smp_cond_load_acquire_timewait
> asm-generic: barrier: Add smp_cond_load_acquire_timewait()
> rqspinlock: use smp_cond_load_acquire_timewait()
Can you have a go at poll_idle() to see how it would look like using
this API? It doesn't necessarily mean we have to merge them all at once
but it gives us a better idea of the suitability of the interface.
--
Catalin
* Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Alexei Starovoitov @ 2025-09-02 17:43 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, LKML, linux-arch, linux-arm-kernel, bpf,
Arnd Bergmann, Will Deacon, Peter Zijlstra, Andrew Morton,
Mark Rutland, harisokn, cl, Alexei Starovoitov,
Kumar Kartikeya Dwivedi, zhenglifeng1, xueshuai, joao.m.martins,
Boris Ostrovsky, konrad.wilk
On Mon, Sep 1, 2025 at 4:28 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> [...]
>
> First of all, I don't really like the smp_cond_load_acquire_waiting(),
> that's an implementation detail of smp_cond_load_*_timewait() that
> shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
> also used outside the smp_cond_load_acquire_timewait() condition. The
> (ts).wait check only makes sense when used together with the WFE
> waiting.
+1 to the above.
Penalizing all other architectures with a pointless runtime check:
> - if (!(ts).spin++) \
> + if ((ts).wait || !(ts).spin++) \
is not acceptable.
* Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Ankur Arora @ 2025-09-02 21:30 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Catalin Marinas, Ankur Arora, LKML, linux-arch, linux-arm-kernel,
bpf, Arnd Bergmann, Will Deacon, Peter Zijlstra, Andrew Morton,
Mark Rutland, harisokn, cl, Alexei Starovoitov,
Kumar Kartikeya Dwivedi, zhenglifeng1, xueshuai, joao.m.martins,
Boris Ostrovsky, konrad.wilk
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> On Mon, Sep 1, 2025 at 4:28 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>>
>> On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
>> [...]
>> First of all, I don't really like the smp_cond_load_acquire_waiting(),
>> that's an implementation detail of smp_cond_load_*_timewait() that
>> shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
>> also used outside the smp_cond_load_acquire_timewait() condition. The
>> (ts).wait check only makes sense when used together with the WFE
>> waiting.
>
> +1 to the above.
Ack.
> Penalizing all other architectures with a pointless runtime check:
>
>> - if (!(ts).spin++) \
>> + if ((ts).wait || !(ts).spin++) \
>
> is not acceptable.
Is it still a penalty if the context is a busy-wait loop?
Oddly enough, the compiler does not eliminate this check on x86 (given
that it is statically defined to be 0 and ts does not escape the
function).
--
ankur
* Re: [PATCH v4 5/5] rqspinlock: use smp_cond_load_acquire_timewait()
From: Ankur Arora @ 2025-09-02 21:31 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Aug 29, 2025 at 01:07:35AM -0700, Ankur Arora wrote:
>> [...]
>
> IIUC, RES_CHECK_TIMEOUT in the current res_smp_cond_load_acquire() usage
> doesn't amortise the spins, as the comment suggests, but rather the
> calls to check_timeout(). This is fine, it matches the behaviour of
> smp_cond_load_relaxed_timewait() you introduced in the first patch. The
> only difference is the number of spins - 200 (matching poll_idle) vs 64K
> above. Does 200 work for the above?
Works for me. I had added this because there seemed to be a vast gulf
between 64K and 200. Happy to drop this.
>> /*
>> * Initialize the 'spin' member.
>> * Set spin member to 0 to trigger AA/ABBA checks immediately.
>> */
>> -#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; })
>> +#define RES_INIT_TIMEOUT(ts) ({ (ts).spin = 0; (ts).wait = res_smp_cond_load_acquire_waiting(); })
>
> First of all, I don't really like the smp_cond_load_acquire_waiting(),
> that's an implementation detail of smp_cond_load_*_timewait() that
> shouldn't leak outside. But more importantly, RES_CHECK_TIMEOUT() is
> also used outside the smp_cond_load_acquire_timewait() condition. The
> (ts).wait check only makes sense when used together with the WFE
> waiting.
>
> I would leave RES_CHECK_TIMEOUT() as is for the stand-alone cases and
> just use check_timeout() in the smp_cond_load_acquire_timewait()
> scenarios. I would also drop the res_smp_cond_load_acquire() macro since
> you now defined smp_cond_load_acquire_timewait() generically and can be
> used directly.
Sounds good.
--
ankur
* Re: [PATCH v4 1/5] asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
From: Ankur Arora @ 2025-09-02 21:34 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Aug 29, 2025 at 01:07:31AM -0700, Ankur Arora wrote:
>> [...]
>
> Apart from the name, this looks fine (I'd have preferred the "timeout"
> suffix).
So, the suffix "timewait" made sense to me because I was trying to
differentiate it from spinwait etc.
Given that the distinction is no longer meaningful, "timeout" makes more
sense. Will change.
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Thanks for this and all the reviews.
--
ankur
* Re: [PATCH v4 2/5] arm64: barrier: Add smp_cond_load_relaxed_timewait()
From: Ankur Arora @ 2025-09-02 22:40 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Aug 29, 2025 at 01:07:32AM -0700, Ankur Arora wrote:
>> [...]
>> +extern bool arch_timer_evtstrm_available(void);
>
> In theory, this doesn't work if CONFIG_ARM_ARCH_TIMER is disabled,
> though that's not the case for arm64. Maybe add a short comment that
> it's re-declared here to avoid include dependencies.
Will add.
Thanks
Ankur
>> [...]
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
--
ankur
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
From: Ankur Arora @ 2025-09-02 22:46 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Fri, Aug 29, 2025 at 01:07:30AM -0700, Ankur Arora wrote:
>> Ankur Arora (5):
>> asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
>> arm64: barrier: Add smp_cond_load_relaxed_timewait()
>> arm64: rqspinlock: Remove private copy of
>> smp_cond_load_acquire_timewait
>> asm-generic: barrier: Add smp_cond_load_acquire_timewait()
>> rqspinlock: use smp_cond_load_acquire_timewait()
>
> Can you have a go at poll_idle() to see how it would look like using
> this API? It doesn't necessarily mean we have to merge them all at once
> but it gives us a better idea of the suitability of the interface.
So, I've been testing with some version of the following:
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..361879396d0c 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -8,35 +8,25 @@
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>
-#define POLL_IDLE_RELAX_COUNT 200
-
static int __cpuidle poll_idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
- u64 time_start;
-
- time_start = local_clock_noinstr();
+ unsigned long flags;
dev->poll_time_limit = false;
raw_local_irq_enable();
if (!current_set_polling_and_test()) {
- unsigned int loop_count = 0;
- u64 limit;
+ u64 limit, time_end;
limit = cpuidle_poll_time(drv, dev);
+ time_end = local_clock_noinstr() + limit;
- while (!need_resched()) {
- cpu_relax();
- if (loop_count++ < POLL_IDLE_RELAX_COUNT)
- continue;
+ flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
+ VAL & _TIF_NEED_RESCHED,
+ (local_clock_noinstr() >= time_end));
- loop_count = 0;
- if (local_clock_noinstr() - time_start > limit) {
- dev->poll_time_limit = true;
- break;
- }
- }
+ dev->poll_time_limit = (local_clock_noinstr() >= time_end);
}
raw_local_irq_disable();
With that, poll_idle() is:
static int __cpuidle poll_idle(struct cpuidle_device *dev,
			       struct cpuidle_driver *drv, int index)
{
	unsigned long flags;

	dev->poll_time_limit = false;

	raw_local_irq_enable();
	if (!current_set_polling_and_test()) {
		u64 limit, time_end;

		limit = cpuidle_poll_time(drv, dev);
		time_end = local_clock_noinstr() + limit;

		flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
						       VAL & _TIF_NEED_RESCHED,
						       (local_clock_noinstr() >= time_end));

		dev->poll_time_limit = (local_clock_noinstr() >= time_end);
	}
	raw_local_irq_disable();

	current_clr_polling();

	return index;
}
--
ankur
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
2025-09-02 22:46 ` Ankur Arora
@ 2025-09-03 9:27 ` Catalin Marinas
2025-09-03 18:34 ` Ankur Arora
0 siblings, 1 reply; 23+ messages in thread
From: Catalin Marinas @ 2025-09-03 9:27 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, linux-arch, linux-arm-kernel, bpf, arnd, will,
peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Tue, Sep 02, 2025 at 03:46:52PM -0700, Ankur Arora wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
> > Can you have a go at poll_idle() to see what it would look like using
> > this API? It doesn't necessarily mean we have to merge them all at once
> > but it gives us a better idea of the suitability of the interface.
>
> So, I've been testing with some version of the following:
>
> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
> index 9b6d90a72601..361879396d0c 100644
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -8,35 +8,25 @@
>  #include <linux/sched/clock.h>
>  #include <linux/sched/idle.h>
> 
> -#define POLL_IDLE_RELAX_COUNT	200
> -
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  			       struct cpuidle_driver *drv, int index)
>  {
> -	u64 time_start;
> -
> -	time_start = local_clock_noinstr();
> +	unsigned long flags;
> 
>  	dev->poll_time_limit = false;
> 
>  	raw_local_irq_enable();
>  	if (!current_set_polling_and_test()) {
> -		unsigned int loop_count = 0;
> -		u64 limit;
> +		u64 limit, time_end;
> 
>  		limit = cpuidle_poll_time(drv, dev);
> +		time_end = local_clock_noinstr() + limit;
> 
> -		while (!need_resched()) {
> -			cpu_relax();
> -			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
> -				continue;
> +		flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
> +						       VAL & _TIF_NEED_RESCHED,
> +						       (local_clock_noinstr() >= time_end));
It makes sense to have the non-strict comparison, though it changes the
original behaviour slightly. Just mention it in the commit log.
>
> -			loop_count = 0;
> -			if (local_clock_noinstr() - time_start > limit) {
> -				dev->poll_time_limit = true;
> -				break;
> -			}
> -		}
> +		dev->poll_time_limit = (local_clock_noinstr() >= time_end);
Could we do this instead and avoid another clock read:
dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
In the original code, it made sense since it had to check the clock
anyway and break the loop.
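
Applied to the hunk above, the tail of the polling branch would then
read (a sketch, assuming flags holds the value returned by the
primitive):

	flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
					       VAL & _TIF_NEED_RESCHED,
					       (local_clock_noinstr() >= time_end));

	/* Timed out iff the wait ended without the condition becoming true. */
	dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);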
When you repost, please include the rqspinlock and poll_idle changes as
well to show how the interface is used.
--
Catalin
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
2025-09-01 11:49 ` Catalin Marinas
2025-09-02 22:46 ` Ankur Arora
@ 2025-09-03 15:56 ` Okanovic, Haris
1 sibling, 0 replies; 23+ messages in thread
From: Okanovic, Haris @ 2025-09-03 15:56 UTC (permalink / raw)
To: catalin.marinas@arm.com, ankur.a.arora@oracle.com
Cc: joao.m.martins@oracle.com, xueshuai@linux.alibaba.com,
boris.ostrovsky@oracle.com, memxor@gmail.com,
zhenglifeng1@huawei.com, konrad.wilk@oracle.com, cl@gentwo.org,
akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
ast@kernel.org, linux-arch@vger.kernel.org, arnd@arndb.de,
will@kernel.org, mark.rutland@arm.com, peterz@infradead.org,
bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
Okanovic, Haris
On Mon, 2025-09-01 at 12:49 +0100, Catalin Marinas wrote:
> On Fri, Aug 29, 2025 at 01:07:30AM -0700, Ankur Arora wrote:
> > Ankur Arora (5):
> > asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
> > arm64: barrier: Add smp_cond_load_relaxed_timewait()
> > arm64: rqspinlock: Remove private copy of
> > smp_cond_load_acquire_timewait
> > asm-generic: barrier: Add smp_cond_load_acquire_timewait()
> > rqspinlock: use smp_cond_load_acquire_timewait()
>
> Can you have a go at poll_idle() to see what it would look like using
> this API? It doesn't necessarily mean we have to merge them all at once
> but it gives us a better idea of the suitability of the interface.
This is what I tested on ARM on Fri:
https://github.com/harisokanovic/linux/blob/37e02b950c99370466e7385e5e754bbd6232ef95/drivers/cpuidle/poll_state.c#L24
Regards,
Haris Okanovic
AWS Graviton Software
>
> --
> Catalin
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v4 0/5] barrier: Add smp_cond_load_*_timewait()
2025-09-03 9:27 ` Catalin Marinas
@ 2025-09-03 18:34 ` Ankur Arora
0 siblings, 0 replies; 23+ messages in thread
From: Ankur Arora @ 2025-09-03 18:34 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ankur Arora, linux-kernel, linux-arch, linux-arm-kernel, bpf,
arnd, will, peterz, akpm, mark.rutland, harisokn, cl, ast, memxor,
zhenglifeng1, xueshuai, joao.m.martins, boris.ostrovsky,
konrad.wilk
Catalin Marinas <catalin.marinas@arm.com> writes:
> On Tue, Sep 02, 2025 at 03:46:52PM -0700, Ankur Arora wrote:
>> Catalin Marinas <catalin.marinas@arm.com> writes:
>> > Can you have a go at poll_idle() to see what it would look like using
>> > this API? It doesn't necessarily mean we have to merge them all at once
>> > but it gives us a better idea of the suitability of the interface.
>>
>> So, I've been testing with some version of the following:
>>
>> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
>> index 9b6d90a72601..361879396d0c 100644
>> --- a/drivers/cpuidle/poll_state.c
>> +++ b/drivers/cpuidle/poll_state.c
>> @@ -8,35 +8,25 @@
>>  #include <linux/sched/clock.h>
>>  #include <linux/sched/idle.h>
>> 
>> -#define POLL_IDLE_RELAX_COUNT	200
>> -
>>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>>  			       struct cpuidle_driver *drv, int index)
>>  {
>> -	u64 time_start;
>> -
>> -	time_start = local_clock_noinstr();
>> +	unsigned long flags;
>> 
>>  	dev->poll_time_limit = false;
>> 
>>  	raw_local_irq_enable();
>>  	if (!current_set_polling_and_test()) {
>> -		unsigned int loop_count = 0;
>> -		u64 limit;
>> +		u64 limit, time_end;
>> 
>>  		limit = cpuidle_poll_time(drv, dev);
>> +		time_end = local_clock_noinstr() + limit;
>> 
>> -		while (!need_resched()) {
>> -			cpu_relax();
>> -			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
>> -				continue;
>> +		flags = smp_cond_load_relaxed_timewait(&current_thread_info()->flags,
>> +						       VAL & _TIF_NEED_RESCHED,
>> +						       (local_clock_noinstr() >= time_end));
>
> It makes sense to have the non-strict comparison, though it changes the
> original behaviour slightly. Just mention it in the commit log.
>
>>
>> -			loop_count = 0;
>> -			if (local_clock_noinstr() - time_start > limit) {
>> -				dev->poll_time_limit = true;
>> -				break;
>> -			}
>> -		}
>> +		dev->poll_time_limit = (local_clock_noinstr() >= time_end);
>
> Could we do this instead and avoid another clock read:
>
> dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
>
> In the original code, it made sense since it had to check the clock
> anyway and break the loop.
>
> When you repost, please include the rqspinlock and poll_idle changes as
> well to show how the interface is used.
Sure.
--
ankur
^ permalink raw reply [flat|nested] 23+ messages in thread