From: Ankur Arora <ankur.a.arora@oracle.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, bpf@vger.kernel.org,
arnd@arndb.de, will@kernel.org, peterz@infradead.org,
akpm@linux-foundation.org, mark.rutland@arm.com,
harisokn@amazon.com, cl@gentwo.org, ast@kernel.org,
memxor@gmail.com, zhenglifeng1@huawei.com,
xueshuai@linux.alibaba.com, joao.m.martins@oracle.com,
boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
Subject: Re: [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait()
Date: Fri, 23 May 2025 20:22:12 -0700 [thread overview]
Message-ID: <87v7pqzo9n.fsf@oracle.com> (raw)
In-Reply-To: <aC4dcZ2veeavM2dR@arm.com>
Catalin Marinas <catalin.marinas@arm.com> writes:
> Hi Ankur,
>
> Sorry, it took me some time to get back to this series (well, I tried
> once and got stuck on what wait_policy is supposed to mean, so decided
> to wait until I had more coffee ;)).
I suppose that's as good a sign as any that the wait_policy stuff needs
to change ;).
> On Fri, May 02, 2025 at 01:52:17AM -0700, Ankur Arora wrote:
>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>> index d4f581c1e21d..a7be98e906f4 100644
>> --- a/include/asm-generic/barrier.h
>> +++ b/include/asm-generic/barrier.h
>> @@ -273,6 +273,64 @@ do { \
>> })
>> #endif
>>
>> +/*
>> + * Non-spin primitive that allows waiting for stores to an address,
>> + * with support for a timeout. This works in conjunction with an
>> + * architecturally defined wait_policy.
>> + */
>> +#ifndef __smp_timewait_store
>> +#define __smp_timewait_store(ptr, val) do { } while (0)
>> +#endif
>> +
>> +#ifndef __smp_cond_load_relaxed_timewait
>> +#define __smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
>> + time_expr, time_end) ({ \
>> + typeof(ptr) __PTR = (ptr); \
>> + __unqual_scalar_typeof(*ptr) VAL; \
>> + u32 __n = 0, __spin = 0; \
>> + u64 __prev = 0, __end = (time_end); \
>> + bool __wait = false; \
>> + \
>> + for (;;) { \
>> + VAL = READ_ONCE(*__PTR); \
>> + if (cond_expr) \
>> + break; \
>> + cpu_relax(); \
>> + if (++__n < __spin) \
>> + continue; \
>> + if (!(__prev = wait_policy((time_expr), __prev, __end, \
>> + &__spin, &__wait))) \
>> + break; \
>> + if (__wait) \
>> + __smp_timewait_store(__PTR, VAL); \
>> + __n = 0; \
>> + } \
>> + (typeof(*ptr))VAL; \
>> +})
>> +#endif
>> +
>> +/**
>> + * smp_cond_load_relaxed_timewait() - (Spin) wait for cond with no ordering
>> + * guarantees until a timeout expires.
>> + * @ptr: pointer to the variable to wait on
>> + * @cond: boolean expression to wait for
>> + * @wait_policy: policy handler that adjusts the number of times we spin or
>> + * wait for cacheline to change (depends on architecture, not supported in
>> + * generic code.) before evaluating the time-expr.
>> + * @time_expr: monotonic expression that evaluates to the current time
>> + * @time_end: compared against time_expr
>> + *
>> + * Equivalent to using READ_ONCE() on the condition variable.
>> + */
>> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr, wait_policy, \
>> + time_expr, time_end) ({ \
>> + __unqual_scalar_typeof(*ptr) _val;; \
>> + _val = __smp_cond_load_relaxed_timewait(ptr, cond_expr, \
>> + wait_policy, time_expr, \
>> + time_end); \
>> + (typeof(*ptr))_val; \
>> +})
>
> IIUC, a generic user of this interface would need a wait_policy() that
> is aware of the arch details (event stream, WFET etc.), given the
> __smp_timewait_store() implementation in patch 3. This becomes clearer
> in patch 7 where one needs to create rqspinlock_cond_timewait().
Yes, if a caller can't work with the __smp_cond_timewait_coarse() etc,
they would need to know the mechanics of how to do that on each arch.
I meant the two policies to be somewhat generic, but having to know
the internals is a problem.
> The __spin count can be arch specific, not part of some wait_policy,
> even if such policy is most likely implemented in the arch code (as the
> generic caller has no clue what it means). The __wait decision, again, I
> don't think it should be the caller of this API to decide how to handle,
> it's something internal to the API implementation based on whether the
> event stream (or later WFET) is available.
>
> The ___cond_timewait() implementation in patch 4 sets __wait if either
> the event stream of WFET is available. However, __smp_timewait_store()
> only uses WFE as per the __cmpwait_relaxed() implementation. So you
> can't really decouple wait_policy() from how the spinning is done, in an
> arch-specific way.
Agreed.
> In this implementation, wait_policy() would need to
> say how to wait - WFE, WFET. That's not captured (and I don't think it
> should, we can't expand the API every time we have a new method of
> waiting).
The idea was both the wait_policy and the arch specific interface would
evolve together and so once __cmpwait_relaxed() supports WFET, the
wait_policy would also change alongside.
However, as you say, for users that define their own wait_policy, the
interface becomes a mess to maintain.
> I still think this interface can be simpler and fairly generic, not with
> wait_policy specific to rqspinlock or poll_idle. Maybe you can keep a
> policy argument for an internal __smp_cond_load_relaxed_timewait() if
> it's easier to structure the code this way but definitely not for
> smp_cond_*().
Yeah. I think that's probably the way to do this. The main reason I felt
that we need an explicit wait_policy was to address the rqspinlock case
but as you point out, that makes the interface unmaintainable.
So, this should work (see below for one proviso), for most users:
#define smp_cond_load_relaxed_timewait(ptr, cond_expr,
time_expr, time_end, slack_us)
(Though, I would use slack_us instead of slack_ns and also keep time_expr
and time_end denominated in us.)
And users like rqspinlock could use __smp_cond_load_relaxed_timewait()
with a policy argument where they can combine rqspinock policy plus
with the common wait policy so wouldn't need to know the internals of
the waiting mechanisms.
> Another aspect I'm not keen on is the arbitrary fine/coarse constants.
> Can we not have the caller pass a slack value (in ns or 0 if it doesn't
> care) to smp_cond_load_relaxed_timewait() and let the arch code decide
> which policy to use?
Yeah, as you probably noticed, that's pretty much how what they are
implemented internally already.
> In summary, I see the API something like:
>
> #define smp_cond_load_relaxed_timewait(ptr, cond_expr,
> time_expr, time_end, slack_ns)
Ack.
> We can even drop time_end if we capture it in time_expr returning a bool
> (like we do with cond_expr).
I'm not sure we can combine time_expr, time_end. Given that we have two
ways to wait: spin and wait, both with different granularity, just a
binary check won't suffice.
For switching between wait and spin, we would also need to compare the
granularity of the mechanism, derive the time-remaining, check against
slack etc.
Thanks for the comments. Most helpful.
--
ankur
next prev parent reply other threads:[~2025-05-24 3:22 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-02 8:52 [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 1/7] asm-generic: barrier: add smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-21 18:37 ` Catalin Marinas
2025-05-24 3:22 ` Ankur Arora [this message]
2025-05-02 8:52 ` [PATCH v2 2/7] asm-generic: barrier: add wait_policy handlers Ankur Arora
2025-05-02 8:52 ` [PATCH v2 3/7] arm64: barrier: enable waiting in smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 4/7] arm64: barrier: add coarse wait for smp_cond_load_relaxed_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 5/7] arm64: barrier: add fine " Ankur Arora
2025-05-02 8:52 ` [PATCH v2 6/7] asm-generic: barrier: add smp_cond_load_acquire_timewait() Ankur Arora
2025-05-02 8:52 ` [PATCH v2 7/7] bpf: rqspinlock: add rqspinlock policy handler for arm64 Ankur Arora
2025-05-02 16:42 ` [PATCH v2 0/7] barrier: introduce smp_cond_load_*_timewait() Christoph Lameter (Ampere)
2025-05-02 20:05 ` Ankur Arora
2025-05-05 16:13 ` Christoph Lameter (Ampere)
2025-05-05 17:08 ` Ankur Arora
2025-05-16 22:50 ` Okanovic, Haris
2025-05-17 1:16 ` Ankur Arora
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87v7pqzo9n.fsf@oracle.com \
--to=ankur.a.arora@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=ast@kernel.org \
--cc=boris.ostrovsky@oracle.com \
--cc=bpf@vger.kernel.org \
--cc=catalin.marinas@arm.com \
--cc=cl@gentwo.org \
--cc=harisokn@amazon.com \
--cc=joao.m.martins@oracle.com \
--cc=konrad.wilk@oracle.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=memxor@gmail.com \
--cc=peterz@infradead.org \
--cc=will@kernel.org \
--cc=xueshuai@linux.alibaba.com \
--cc=zhenglifeng1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.