From: sashiko-bot@kernel.org
To: "Ankur Arora" <ankur.a.arora@oracle.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH v12 04/15] arm64: support WFET in smp_cond_load_relaxed_timeout()
Date: Mon, 08 Jun 2026 08:27:35 +0000
Message-ID: <20260608082735.830451F00899@smtp.kernel.org>
In-Reply-To: <20260608080440.127491-5-ankur.a.arora@oracle.com>

Thank you for your contribution! Sashiko AI review found 4 potential issues to consider:
- [High] smp_cond_load_relaxed_timeout() doubles the requested wait time when WFET is available.
- [High] Passing a negative timeout to smp_cond_load_relaxed_timeout() causes a massive ~85-second stall.
- [Medium] Missing EXPORT_SYMBOL for __delay_cycles causes module build failures.
- [Medium] Integer overflow in NSECS_TO_CYCLES() degrades WFET low-power sleep into a 100% CPU busy-spin for large timeouts.
--

commit 4ce1d0efe15be472dcb8f8d8e4e2200fdd9f078f
Author: Ankur Arora <ankur.a.arora@oracle.com>
Date:   Mon Jun 8 08:04:29 2026 +0000

    arm64: support WFET in smp_cond_load_relaxed_timeout()
    
    To handle WFET use __cmpwait_timeout() similarly to __cmpwait(). These
    call out to the respective __cmpwait_case_timeout_##sz(),
    __cmpwait_case_##sz() functions.

    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>

> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 6190e178db51f..fbd71cd4ef4e5 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h

[ ... ]

> @@ -234,7 +234,9 @@ extern bool arch_timer_evtstrm_available(void);
>  #define SMP_TIMEOUT_POLL_COUNT	1
>  
>  #define cpu_poll_relax(ptr, val, timeout_ns) do {			\
> -	if (arch_timer_evtstrm_available())				\
> +	if (alternative_has_cap_unlikely(ARM64_HAS_WFXT))		\
> +		__cmpwait_relaxed_timeout(ptr, val, timeout_ns);	\

[Severity: High]
Does this double the requested wait time?

In include/asm-generic/barrier.h, smp_cond_load_relaxed_timeout() initializes
the deadline __time_end lazily after the first execution of cpu_poll_relax(),
assuming the poll is brief:

cpu_poll_relax(__PTR, VAL, (u64)__timeout);
if (++__count < __spin)
        continue;
__time_now = (s64)(time_expr_ns);
if (unlikely(__time_end == 0))
        __time_end = __time_now + __timeout;

Because the overridden cpu_poll_relax() uses WFET to sleep for the entire
timeout_ns duration on the first iteration, __time_now advances by timeout_ns
before __time_end is initialized. The new deadline effectively becomes
start + 2 * timeout_ns, causing a second full sleep.
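
The effect can be reproduced outside the kernel with a small model. This is
a sketch only: sim_now_ns, model_poll_relax_wfet() and
model_wait_lazy_deadline() are hypothetical names, the spin count is taken
as 1, the condition is assumed never to become true, and the exit check
(time_now >= time_end) is an assumption about the part of the loop that is
not quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated clock in nanoseconds; stands in for time_expr_ns. */
static int64_t sim_now_ns;

/*
 * Hypothetical model of a WFET-backed cpu_poll_relax(): the watched value
 * never changes, so every call sleeps for the full timeout_ns.
 */
static void model_poll_relax_wfet(uint64_t timeout_ns)
{
	sim_now_ns += (int64_t)timeout_ns;
}

/*
 * Model of the lazy-deadline loop with a spin count of 1.
 * Returns the total simulated time spent waiting.
 */
static int64_t model_wait_lazy_deadline(int64_t timeout_ns)
{
	int64_t start = sim_now_ns;
	int64_t time_end = 0;

	assert(timeout_ns > 0);
	for (;;) {
		int64_t time_now;

		model_poll_relax_wfet((uint64_t)timeout_ns);
		time_now = sim_now_ns;
		if (time_end == 0)
			time_end = time_now + timeout_ns; /* set after the first sleep */
		if (time_now >= time_end) /* assumed exit check */
			break;
	}
	return sim_now_ns - start;
}
```

Under these assumptions, model_wait_lazy_deadline(1000) returns 2000: the
first sleep consumes the whole budget before the deadline is recorded, and
the second sleep is needed to reach it.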


[Severity: High]
What happens if a negative timeout is passed to
smp_cond_load_relaxed_timeout()?

The generic loop unconditionally invokes cpu_poll_relax() before checking if
the timeout is negative:

cpu_poll_relax(__PTR, VAL, (u64)__timeout);

A negative timeout casts to a huge unsigned value; -1 becomes U64_MAX. In
__cmpwait_case_timeout_##sz(), NSECS_TO_CYCLES(U64_MAX) overflows and yields
an ecycles offset of 4,294,967,295 cycles. For a 50MHz timer, this causes the
WFET instruction to sleep for ~85.89 seconds, completely blocking the CPU
instead of returning immediately.
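
The arithmetic can be checked in userspace. The sketch below assumes
NSECS_TO_CYCLES(ns) behaves as (ns * 5 * loops_per_jiffy * HZ) >> 32 in
wrapping 64-bit arithmetic, with timer_hz standing in for
loops_per_jiffy * HZ; model_nsecs_to_cycles() and
model_negative_timeout_cycles() are hypothetical names, not kernel
functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed model of NSECS_TO_CYCLES(). Unsigned 64-bit multiplication
 * wraps modulo 2^64, which is the overflow being illustrated.
 */
static uint64_t model_nsecs_to_cycles(uint64_t ns, uint64_t timer_hz)
{
	assert(timer_hz > 0);
	return (ns * 5u * timer_hz) >> 32;
}

/*
 * Mirrors the (u64)__timeout cast at the call site: a negative s64
 * becomes a very large unsigned value before the conversion.
 */
static uint64_t model_negative_timeout_cycles(int64_t timeout_ns,
					      uint64_t timer_hz)
{
	return model_nsecs_to_cycles((uint64_t)timeout_ns, timer_hz);
}
```

With timer_hz = 50000000, model_negative_timeout_cycles(-1, timer_hz)
evaluates to 4294967295 cycles, and 4294967295 / 50000000 is roughly 85.9
seconds.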

> +	else if (arch_timer_evtstrm_available())			\
>  		__cmpwait_relaxed(ptr, val);				\
>  	else								\
>  		cpu_relax();						\
> diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
> index 6cf3cd6873f5f..9e4cdc9e41d1b 100644
> --- a/arch/arm64/include/asm/cmpxchg.h
> +++ b/arch/arm64/include/asm/cmpxchg.h

[ ... ]

> @@ -235,20 +237,52 @@ __CMPWAIT_CASE( ,  , 64);
>  
>  #undef __CMPWAIT_CASE
>  
> -#define __CMPWAIT_GEN(sfx)						\
> -static __always_inline void __cmpwait##sfx(volatile void *ptr,		\
> -				  unsigned long val,			\
> -				  int size)				\
> +#define __CMPWAIT_TIMEOUT_CASE(w, sfx, sz)				\
> +static inline void __cmpwait_case_timeout_##sz(volatile void *ptr,	\
> +					       unsigned long val,	\
> +					       u64 timeout_ns)		\
> +{									\
> +	unsigned long tmp;						\
> +	u64 ecycles = __delay_cycles() +				\
> +			NSECS_TO_CYCLES(timeout_ns);			\

[Severity: Medium]
Does this cause build failures for loadable modules?

The inline function __cmpwait_case_timeout_##sz() explicitly calls
__delay_cycles(). However, __delay_cycles() is only exported via
EXPORT_SYMBOL_IF_KUNIT in arch/arm64/lib/delay.c.

Since smp_cond_load_relaxed_timeout() is an exported generic API, any
dynamically loadable kernel module that invokes it on ARM64 will fail to
link with an undefined symbol error for __delay_cycles.


[Severity: Medium]
Could an integer overflow in NSECS_TO_CYCLES() degrade the WFET low-power
sleep into a 100% CPU busy-spin for large timeouts?

The target wake cycle is computed via NSECS_TO_CYCLES(timeout_ns), which
evaluates to:

(timeout_ns * 5 * loops_per_jiffy * HZ) >> 32

For a 50MHz timer, the 64-bit multiplication overflows at timeouts greater
than ~73.78 seconds. When the product wraps around to a small value, ecycles
lands at or barely beyond the current counter value instead of at the
intended deadline. The WFET instruction then returns almost immediately,
so the loop invokes WFET repeatedly without sleeping until the remaining
timeout drops below the threshold.
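
The threshold and the wrapped result can be computed directly. This is a
sketch under the same reading of the formula above, with timer_hz standing
in for loops_per_jiffy * HZ; model_overflow_threshold_ns() and
model_wrapped_cycles() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Largest timeout in ns for which ns * 5 * timer_hz still fits in 64 bits. */
static uint64_t model_overflow_threshold_ns(uint64_t timer_hz)
{
	assert(timer_hz > 0);
	return UINT64_MAX / (5u * timer_hz);
}

/* Cycle offset produced by the formula, including any modulo-2^64 wrap. */
static uint64_t model_wrapped_cycles(uint64_t ns, uint64_t timer_hz)
{
	return (ns * 5u * timer_hz) >> 32;
}
```

For timer_hz = 50000000 the threshold is 73786976294 ns (about 73.79
seconds). At that value the offset is 4294967295 cycles; one nanosecond
later the product wraps and the offset collapses to 0 cycles, so the
computed wake time is effectively "now".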

> +	asm volatile(							\
> +	"	sevl\n"							\
> +	"	wfe\n"							\

[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260608080440.127491-1-ankur.a.arora@oracle.com?part=4

