From: Ankur Arora <ankur.a.arora@oracle.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org,
	Linux-Arch <linux-arch@vger.kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org,
	bpf@vger.kernel.org, Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Haris Okanovic <harisokn@amazon.com>,
	"Christoph Lameter (Ampere)" <cl@gentwo.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	zhenglifeng1@huawei.com, xueshuai@linux.alibaba.com,
	Joao Martins <joao.m.martins@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [RESEND PATCH v7 1/7] asm-generic: barrier: Add smp_cond_load_relaxed_timeout()
Date: Tue, 28 Oct 2025 20:17:14 -0700
Message-ID: <874irimm6d.fsf@oracle.com>
In-Reply-To: <4c87bbf8-00a3-4666-b844-916edd678305@app.fastmail.com>


Arnd Bergmann <arnd@arndb.de> writes:

> On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
>
>> + */
>> +#ifndef smp_cond_load_relaxed_timeout
>> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
>> +({									\
>> +	typeof(ptr) __PTR = (ptr);					\
>> +	__unqual_scalar_typeof(*ptr) VAL;				\
>> +	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
>> +									\
>> +	for (;;) {							\
>> +		VAL = READ_ONCE(*__PTR);				\
>> +		if (cond_expr)						\
>> +			break;						\
>> +		cpu_poll_relax(__PTR, VAL);				\
>> +		if (++__n < __spin)					\
>> +			continue;					\
>> +		if (time_check_expr) {					\
>> +			VAL = READ_ONCE(*__PTR);			\
>> +			break;						\
>> +		}							\
>> +		__n = 0;						\
>> +	}								\
>> +	(typeof(*ptr))VAL;						\
>> +})
>> +#endif
>
> I'm trying to think of ideas for how this would be done on arm64
> with FEAT_WFXT in a way that doesn't hurt other architectures.
>
> The best idea I've come up with is to change that inner loop
> to combine the cpu_poll_relax() with the timecheck and then
> define the 'time_check_expr' so it has to return an approximate
> (ceiling) number of nanoseconds of remaining time or zero if
> expired.

Agreed that it's a pretty good idea :). I came up with something pretty
similar, though it took a bunch of iterations to get there.
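
Something along these lines (a hypothetical sketch, not what the
posted patch does): have time_check_expr return the remaining budget
in nanoseconds (0 == expired) and hand it to a per-arch
cpu_poll_relax_timeout() hook, so a WFET-capable arch can consume the
whole budget in one wait while the generic fallback just spins:

	/*
	 * Hypothetical sketch only: cpu_poll_relax_timeout() is a
	 * made-up hook name, and the rate-limiting spin counter is
	 * omitted for clarity.
	 */
	#ifndef cpu_poll_relax_timeout
	#define cpu_poll_relax_timeout(ptr, val, ns)	cpu_relax()
	#endif

	#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
	({									\
		typeof(ptr) __PTR = (ptr);					\
		__unqual_scalar_typeof(*ptr) VAL;				\
		u64 __rem;							\
										\
		for (;;) {							\
			VAL = READ_ONCE(*__PTR);				\
			if (cond_expr)						\
				break;						\
			__rem = (time_check_expr);				\
			if (!__rem)						\
				break;						\
			cpu_poll_relax_timeout(__PTR, VAL, __rem);		\
		}								\
		(typeof(*ptr))VAL;						\
	})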

> The FEAT_WFXT version would then look something like
>
> static inline void __cmpwait_u64_timeout(volatile u64 *ptr, unsigned long val, __u64 ns)
> {
>    unsigned long tmp;
>    asm volatile ("sev; wfe; ldxr; eor; cbnz; wfet; 1:"
>         : "=&r" (tmp), "+Q" (*ptr)
>         : "r" (val), "r" (ns));
> }
> #define cpu_poll_relax_timeout_wfet(__PTR, VAL, TIMECHECK)	\
> ({								\
>        u64 __t = TIMECHECK;					\
>        if (__t)							\
>             __cmpwait_u64_timeout(__PTR, VAL, __t);		\
> })
>
> while the 'wfe' version would continue to do the timecheck after the
> wait.

I think this is a good way to do it if we need the precision
at some point in the future.
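
Filling in the operands, roughly what I'd expect that to look like,
modeled on the existing arm64 __cmpwait() pattern (a sketch under
assumptions: WFET's register operand is an absolute CNTVCT_EL0
deadline in counter ticks, so a real version would first convert the
ns budget, and older assemblers may need the instruction hand-encoded):

	static inline void __cmpwait_u64_timeout(volatile u64 *ptr,
						 unsigned long val,
						 u64 deadline_ticks)
	{
		unsigned long tmp;

		asm volatile(
		/* Arm the event register so the first wfe falls through. */
		"	sevl\n"
		"	wfe\n"
		/* ldxr sets the exclusive monitor: a store to *ptr posts an event. */
		"	ldxr	%[tmp], %[v]\n"
		"	eor	%[tmp], %[tmp], %[val]\n"
		"	cbnz	%[tmp], 1f\n"
		/* Wait for an event or until CNTVCT_EL0 passes the deadline. */
		"	wfet	%[dl]\n"
		"1:"
		: [tmp] "=&r" (tmp), [v] "+Q" (*ptr)
		: [val] "r" (val), [dl] "r" (deadline_ticks));
	}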

> I have two lesser concerns with the generic definition here:
>
> - having both a timeout and a spin counter in the same loop
>   feels redundant and error-prone, as the behavior in practice
>   would likely depend a lot on the platform. What is the reason
>   for keeping the counter if we already have a fixed timeout
>   condition?

The main reason was that the time check is expensive in power terms.
That's fine for platforms with a WFE-like primitive, but others want
to do the time check only infrequently. That's why poll_idle()
introduced a rate limit on polling (which the generic definition here
reuses):

    commit 4dc2375c1a4e88ed2701f6961e0e4f9a7696ad3c
    Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Date:   Tue Mar 27 23:58:45 2018 +0200

    cpuidle: poll_state: Avoid invoking local_clock() too often

    Rik reports that he sees an increase in CPU use in one benchmark
    due to commit 612f1a22f067 "cpuidle: poll_state: Add time limit to
    poll_idle()" that caused poll_idle() to call local_clock() in every
    iteration of the loop.  Utilization increase generally means more
    non-idle time with respect to total CPU time (on the average) which
    implies reduced CPU frequency.

    Doug reports that limiting the rate of local_clock() invocations
    in there causes much less power to be drawn during a CPU-intensive
    parallel workload (with idle states 1 and 2 disabled to enforce more
    state 0 residency).

    These two reports together suggest that executing local_clock() on
    multiple CPUs in parallel at a high rate may cause chips to get hot
    and trigger thermal/power limits on them to kick in, so reduce the
    rate of local_clock() invocations in poll_idle() to avoid that issue.
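
After that commit the poll_idle() inner loop looks roughly like this
(paraphrased from drivers/cpuidle/poll_state.c of that era; the clock
is read only once every POLL_IDLE_RELAX_COUNT iterations):

	static int __cpuidle poll_idle(struct cpuidle_device *dev,
				       struct cpuidle_driver *drv, int index)
	{
		u64 time_start = local_clock();

		local_irq_enable();
		if (!current_set_polling_and_test()) {
			unsigned int loop_count = 0;

			while (!need_resched()) {
				cpu_relax();
				if (loop_count++ < POLL_IDLE_RELAX_COUNT)
					continue;

				/* Amortize the local_clock() cost. */
				loop_count = 0;
				if (local_clock() - time_start > POLL_IDLE_TIME_LIMIT)
					break;
			}
		}
		current_clr_polling();

		return index;
	}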

> - I generally dislike the type-agnostic macros like this one,
>   it adds a lot of extra complexity here that I feel can be
>   completely avoided if we make explicitly 32-bit and 64-bit
>   wide versions of these macros. We probably won't be able
>   to resolve this as part of your series, but ideally I'd like
>   have explicitly-typed versions of cmpxchg(), smp_load_acquire()
>   and all the related ones, the same way we do for atomic_*()
>   and atomic64_*().

Ah. So the caller uses, say, smp_load_acquire_long() or whatever, and
that resolves to whatever makes sense for the arch.
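
Something like this, I imagine (hypothetical naming; a fixed-width
variant can pin the type and drop the type gymnastics):

	/* Hypothetical sketch: u64-only, so no __unqual_scalar_typeof(). */
	#define smp_cond_load_relaxed_u64(ptr, cond_expr)	\
	({							\
		volatile u64 *__PTR = (ptr);			\
		u64 VAL;					\
								\
		for (;;) {					\
			VAL = READ_ONCE(*__PTR);		\
			if (cond_expr)				\
				break;				\
			cpu_relax();				\
		}						\
		VAL;						\
	})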

The __unqual_scalar_typeof() does look pretty ugly in the
preprocessed output, but other than that smp_cond_load() etc. look
pretty straightforward. Just out of curiosity, could you elaborate on
the complexity?

--
ankur
