From: Ankur Arora <ankur.a.arora@oracle.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org,
	Linux-Arch <linux-arch@vger.kernel.org>,
	linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org,
	bpf@vger.kernel.org, Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Haris Okanovic <harisokn@amazon.com>,
	"Christoph Lameter (Ampere)" <cl@gentwo.org>,
	Alexei Starovoitov <ast@kernel.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	zhenglifeng1@huawei.com, xueshuai@linux.alibaba.com,
	Joao Martins <joao.m.martins@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Date: Mon, 03 Nov 2025 13:00:33 -0800	[thread overview]
Message-ID: <87ikfqesr2.fsf@oracle.com> (raw)
In-Reply-To: <746c2de4-7613-4f13-911c-c2c4e071ed73@app.fastmail.com>


Arnd Bergmann <arnd@arndb.de> writes:

> On Tue, Oct 28, 2025, at 22:17, Catalin Marinas wrote:
>> On Tue, Oct 28, 2025 at 11:01:22AM -0700, Ankur Arora wrote:
>>> Arnd Bergmann <arnd@arndb.de> writes:
>>> > On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
>>> >> +
>>> >
>>> > Since the caller knows exactly how long it wants to wait for,
>>> > we should be able to fit a 'wfet' based primitive in here and
>>> > pass the timeout as another argument.
>>>
>>> Per se, I don't disagree with this when it comes to WFET.
>>>
>>> Handling a timeout, however, is messier when we use other mechanisms.
>>>
>>> Some problems that came up in my earlier discussions with Catalin:
>>>
>>>   - when using WFE, we also need some notion of slack
>>>     - and if a caller specifies only a small or no slack, then we need
>>>       to combine WFE+cpu_relax()
>
> I don't see the difference to what you have: with the event stream,
> you implicitly define a slack to be the programmed event stream rate
> of ~100µs.

True. The thinking was that adding an explicit timeout just begs the
question of how closely the interface adheres to it, and I guess the
final interface tried to sidestep all of that.

> I'm not asking for anything better in this case, only for machines
> with WFET but no event stream to also avoid the spin loop.

That makes sense. It's a good point that the WFET-capable but
event-stream-off case would just end up in the spin loop, which is quite
suboptimal.

>>>   - for platforms that only use a polling primitive, we want to check
>>>     the clock only intermittently for power reasons.
>
> Right, I missed that bit.
>
>>>     Now, this could be done with an architecture-specific spin-count.
>>>     However, if the caller specifies a small slack, then we might need
>>>     to check the clock more often as we get closer to the deadline, etc.
>
> Again, I think this is solved by defining the slack as architecture
> specific as well rather than an explicit argument, which is essentially
> what we already have.

Great. I think that means I can keep more or less the same interface,
with an explicit time_end, which allows WFET to do the right thing.
And WFE can use an architecture-specific slack (the event-stream period).
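To make the implicit-slack point concrete (plain arithmetic, not kernel
code): if the waiter can only observe the deadline at wakeups spaced
period_ns apart, as with the ~100us event stream, the timeout overshoot is
bounded by strictly less than one period:

```c
#include <stdint.h>

/* Wakeups happen at t = k * period_ns; return the first one at or
 * after the deadline. The overshoot is at most period_ns - 1, which
 * is exactly the "slack" the event-stream period implicitly defines. */
static uint64_t first_wakeup_at_or_after(uint64_t deadline_ns,
					 uint64_t period_ns)
{
	return ((deadline_ns + period_ns - 1) / period_ns) * period_ns;
}
```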

>>> A smaller problem was that different users want different clocks and so
>>> folding the timeout in a 'timeout_cond_expr' lets us do away with the
>>> interface having to handle any of that.
>>>
>>> I had earlier versions [v2] [v3] which had rather elaborate policies for
>>> handling timeout, slack etc. But, given that the current users of the
>>> interface don't actually care about precision, all of that seemed
>>> a little overengineered.
>>
>> Indeed, we've been through all these options and without a concrete user
>> that needs a more precise timeout, we decided it's not worth it. It can,
>> however, be improved later if such users appear.
>
> The main worry I have is that we get too many users of cpu_poll_relax()
> hardcoding the use of the event stream without a timeout argument, it
> becomes too hard to change later without introducing regressions
> from the behavior change.

True.

> As far as I can tell, the only place that currently uses the
> event stream on a functional level is the delay() loop, and that
> has a working wfet based version.

Will send out the next version with an interface along the following lines:

    /**
     * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
     * guarantees until a timeout expires.
     * @ptr: pointer to the variable to wait on
     * @cond_expr: boolean expression to wait for
     * @time_expr: time expression in the caller's preferred clock
     * @time_end_ns: end time in nanoseconds (compared against @time_expr;
     * may also be used for setting up a future event)
     *
     * Equivalent to using READ_ONCE() on the condition variable.
     *
     * Note that the expiration of the timeout might have an
     * architecture-specific delay.
     */
    #ifndef smp_cond_load_relaxed_timeout
    #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, time_end_ns)	\
    ({									\
            typeof(ptr) __PTR = (ptr);					\
            __unqual_scalar_typeof(*ptr) VAL;				\
            u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;		\
            u64 __time_end_ns = (time_end_ns);				\
                                                                        \
            for (;;) {							\
                    VAL = READ_ONCE(*__PTR);				\
                    if (cond_expr)					\
                            break;					\
                    cpu_poll_relax(__PTR, VAL, __time_end_ns);		\
                    if (++__n < __spin)				\
                            continue;					\
                    if ((time_expr) >= __time_end_ns) {		\
                            VAL = READ_ONCE(*__PTR);			\
                            break;					\
                    }							\
                    __n = 0;						\
            }								\
            (typeof(*ptr))VAL;						\
    })
    #endif

That allows for a __cmpwait_timeout() as you had outlined, similar to
these two patches:

 https://lore.kernel.org/lkml/20241107190818.522639-15-ankur.a.arora@oracle.com/
 https://lore.kernel.org/lkml/20241107190818.522639-16-ankur.a.arora@oracle.com/
 (this one incorporating some changes that Catalin had suggested:
  https://lore.kernel.org/lkml/aKRTRyQAaWFtRvDv@arm.com/)

--
ankur

