public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ankur Arora <ankur.a.arora@oracle.com>
To: David Laight <david.laight.linux@gmail.com>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org,
	bpf@vger.kernel.org, arnd@arndb.de, catalin.marinas@arm.com,
	will@kernel.org, peterz@infradead.org, akpm@linux-foundation.org,
	mark.rutland@arm.com, harisokn@amazon.com, cl@gentwo.org,
	ast@kernel.org, rafael@kernel.org, daniel.lezcano@linaro.org,
	memxor@gmail.com, zhenglifeng1@huawei.com,
	xueshuai@linux.alibaba.com, rdunlap@infradead.org,
	joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com, ashok.bhat@arm.com
Subject: Re: [PATCH v11 01/14] asm-generic: barrier: Add smp_cond_load_relaxed_timeout()
Date: Wed, 06 May 2026 13:54:06 -0700	[thread overview]
Message-ID: <87o6isl0nl.fsf@oracle.com> (raw)
In-Reply-To: <20260506095836.216d9cc5@pumpkin>


David Laight <david.laight.linux@gmail.com> writes:

> On Wed, 06 May 2026 00:30:29 -0700
> Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
>> Ankur Arora <ankur.a.arora@oracle.com> writes:
>>
>> > Add smp_cond_load_relaxed_timeout(), which extends
>> > smp_cond_load_relaxed() to allow waiting for a duration.
>> >
>> > We loop around waiting for the condition variable to change while
>> > peridically doing a time-check. The loop uses cpu_poll_relax() to slow
>> > down the busy-wait, which, unless overridden by the architecture
>> > code, amounts to a cpu_relax().
>> >
>> > Note that there are two ways for the time-check to fail: the timeout
>> > case or, @time_expr_ns returning an invalid value (negative or zero).
>> > The second failure mode allows for clocks attached to the clock-domain
>> > of @cond_expr --  which might cease to operate meaningfully once some
>> > state internal to @cond_expr has changed -- to fail.
>> >
>> > Evaluation of @time_expr_ns: in the fastpath we want to keep the
>> > performance close to smp_cond_load_relaxed(). So defer evaluation
>> > of the potentially costly @time_expr_ns to the slowpath.
>> >
>> > This also means that there will always be some hardware dependent
>> > duration that has passed in cpu_poll_relax() iterations at the time
>> > of first evaluation. Additionally cpu_poll_relax() is not guaranteed
>> > to return at timeout boundary. In sum, expect timeout overshoot when
>> > we exit due to expiration of the timeout.
>> >
>> > The number of spin iterations before time-check, SMP_TIMEOUT_POLL_COUNT
>> > is chosen to be 200 by default. With a cpu_poll_relax() iteration
>> > taking ~20-30 cycles (measured on a variety of x86 platforms), we
>> > expect a time-check every ~4000-6000 cycles.
>> >
>> > The outer limit of the overshoot is double that when working with the
>> > parameters above. This might be higher or lower depending on the
>> > implementation of cpu_poll_relax() across architectures.
>> >
>> > Lastly, config option ARCH_HAS_CPU_RELAX indicates availability of a
>> > cpu_poll_relax() that is cheaper than polling. This might be relevant
>> > for cases with a long timeout.
>> >
>> > Cc: Arnd Bergmann <arnd@arndb.de>
>> > Cc: Will Deacon <will@kernel.org>
>> > Cc: Catalin Marinas <catalin.marinas@arm.com>
>> > Cc: Peter Zijlstra <peterz@infradead.org>
>> > Cc: linux-arch@vger.kernel.org
>> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>> > Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> > ---
>> > Notes:
>> >    - add a comment mentioning that smp_cond_load_relaxed_timeout() might
>> >      be using architectural primitives that don't support MMIO.
>> >      (David Laight, Catalin Marinas)
>> >
>> >  include/asm-generic/barrier.h | 69 +++++++++++++++++++++++++++++++++++
>> >  1 file changed, 69 insertions(+)
>> >
>> > diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>> > index d4f581c1e21d..e5a6a1c04649 100644
>> > --- a/include/asm-generic/barrier.h
>> > +++ b/include/asm-generic/barrier.h
>> > @@ -273,6 +273,75 @@ do {									\
>> >  })
>> >  #endif
>> >
>> > +/*
>> > + * Number of times we iterate in the loop before doing the time check.
>> > + * Note that the iteration count assumes that the loop condition is
>> > + * relatively cheap.
>> > + */
>> > +#ifndef SMP_TIMEOUT_POLL_COUNT
>> > +#define SMP_TIMEOUT_POLL_COUNT		200
>> > +#endif
>> > +
>> > +/*
>> > + * Platforms with ARCH_HAS_CPU_RELAX have a cpu_poll_relax() implementation
>> > + * that is expected to be cheaper (lower power) than pure polling.
>> > + */
>> > +#ifndef cpu_poll_relax
>> > +#define cpu_poll_relax(ptr, val, timeout_ns)	cpu_relax()
>> > +#endif
>> > +
>> > +/**
>> > + * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
>> > + * guarantees until a timeout expires.
>> > + * @ptr: pointer to the variable to wait on.
>> > + * @cond_expr: boolean expression to wait for.
>> > + * @time_expr_ns: expression that evaluates to monotonic time (in ns) or,
>> > + *  on failure, returns a negative value.
>> > + * @timeout_ns: timeout value in ns
>> > + * Both of the above are assumed to be compatible with s64; the signed
>> > + * value is used to handle the failure case in @time_expr_ns.
>> > + *
>> > + * Equivalent to using READ_ONCE() on the condition variable.
>> > + *
>> > + * Callers that expect to wait for prolonged durations might want
>> > + * to take into account the availability of ARCH_HAS_CPU_RELAX.
>> > + *
>> > + * Note that @ptr is expected to point to a memory address. Using this
>> > + * interface with MMIO will be slower (since SMP_TIMEOUT_POLL_COUNT is
>> > + * tuned for memory) and might also break in interesting architecture
>> > + * dependent ways.
>> > + */
>> > +#ifndef smp_cond_load_relaxed_timeout
>> > +#define smp_cond_load_relaxed_timeout(ptr, cond_expr,			\
>> > +				      time_expr_ns, timeout_ns)		\
>> > +({									\
>> > +	typeof(ptr) __PTR = (ptr);					\
>> > +	__unqual_scalar_typeof(*ptr) VAL;				\
>> > +	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
>> > +	s64 __timeout = (s64)timeout_ns;				\
>> > +	s64 __time_now, __time_end = 0;					\
>> > +									\
>> > +	for (;;) {							\
>> > +		VAL = READ_ONCE(*__PTR);				\
>> > +		if (cond_expr)						\
>> > +			break;						\
>> > +		cpu_poll_relax(__PTR, VAL, (u64)__timeout);		\
>> > +		if (++__n < __spin)					\
>> > +			continue;					\
>> > +		__time_now = (s64)(time_expr_ns);			\
>> > +		if (unlikely(__time_end == 0))				\
>> > +			__time_end = __time_now + __timeout;		\
>> > +		__timeout = __time_end - __time_now;			\
>> > +		if (__time_now <= 0 || __timeout <= 0) {		\
>> > +			VAL = READ_ONCE(*__PTR);			\
>> > +			break;						\
>> > +		}							\
>> > +		__n = 0;						\
>> > +	}								\
>> > +	(typeof(*ptr))VAL;						\
>> > +})
>> > +#endif
>> > +
>>
>> A cluster of issues that got flagged by sashiko was around timeout_ns
>> being specified as s64 and a bunch of potential edge cases around
>> that.
>>
>> These were mostly caused by an implicit assumption in the code that
>> the timeout specified by the caller is generally reasonable. So, way
>> below S64_MAX, not 0 etc.
>
> There are plenty of ways kernel code can break things.
> Provided this code doesn't itself overwrite anywhere (rather than
> just loop forever or return immediately etc) I'd be tempted to
> just document the valid range rather than slow everything down
> with the extra tests.

I don't disagree. In this case, however, it's somewhat borderline.

On the pro side, we get rid of some of the implicit type conversions
and assumptions around those.

On the negative, it adds an extra modulo operation in the slow path.
And, the for loop is structured a little differently from the usual
version.

On balance, I think this is a good change if only because it makes
the types a little more explicit.

Ankur

> 	David
>
>>
>> I think this is worth cleaning up a bit. The change is mostly around
>> introducing a u32 __itertime and explicitly computing the waiting time.
>> And adding a check to ensure that we start with a valid value.
>>
>> This does make the implementation a little more involved. So just wanted
>> to see if people have any opinions on this?
>>
>> +#ifndef smp_cond_load_relaxed_timeout
>> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr,          \
>> +                                     time_expr_ns, timeout_ns) \
>> +({                                                             \
>> +       typeof(ptr) __PTR = (ptr);                              \
>> +       __unqual_scalar_typeof(*(ptr)) VAL;                     \
>> +       u32 __count = 0, __spin = SMP_TIMEOUT_POLL_COUNT;       \
>> +       s64 __timeout = (s64)(timeout_ns);                      \
>> +       s64 __time_now, __time_end = 0;                         \
>> +       u32 __maybe_unused __itertime;                          \
>> +                                                               \
>> +       for (__itertime = NSEC_PER_USEC;                        \
>> +               VAL = READ_ONCE(*__PTR), __timeout > 0; ) {     \
>> +               if (cond_expr)                                  \
>> +                       break;                                  \
>> +               cpu_poll_relax(__PTR, VAL, __itertime);         \
>> +               if (++__count < __spin)                         \
>> +                       continue;                               \
>> +               __time_now = (s64)(time_expr_ns);               \
>> +               if (unlikely(__time_end == 0))                  \
>> +                       __time_end = __time_now + __timeout;    \
>> +               __timeout = __time_end - __time_now;            \
>> +               if (__time_now <= 0 || __timeout <= 0) {        \
>> +                       VAL = READ_ONCE(*__PTR);                \
>> +                       break;                                  \
>> +               }                                               \
>> +               __itertime = __timeout % NSEC_PER_MSEC +        \
>> +                               NSEC_PER_USEC;                  \
>> +               __count = 0;                                    \
>> +       }                                                       \
>> +       (typeof(*(ptr)))VAL;                                    \
>> +})
>> +#endif
>>
>> Thanks
>>
>> --
>> ankur
>>


--
ankur

  reply	other threads:[~2026-05-06 20:54 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-08 12:25 [PATCH v11 00/14] barrier: Add smp_cond_load_{relaxed,acquire}_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 01/14] asm-generic: barrier: Add smp_cond_load_relaxed_timeout() Ankur Arora
2026-05-06  7:30   ` Ankur Arora
2026-05-06  8:58     ` David Laight
2026-05-06 20:54       ` Ankur Arora [this message]
2026-04-08 12:25 ` [PATCH v11 02/14] arm64: barrier: Support smp_cond_load_relaxed_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 03/14] arm64/delay: move some constants out to a separate header Ankur Arora
2026-04-08 12:25 ` [PATCH v11 04/14] arm64: support WFET in smp_cond_load_relaxed_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 05/14] arm64: rqspinlock: Remove private copy of smp_cond_load_acquire_timewait() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 06/14] asm-generic: barrier: Add smp_cond_load_acquire_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 07/14] atomic: Add atomic_cond_read_*_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 08/14] locking/atomic: scripts: build atomic_long_cond_read_*_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 09/14] bpf/rqspinlock: switch check_timeout() to a clock interface Ankur Arora
2026-04-08 12:25 ` [PATCH v11 10/14] bpf/rqspinlock: Use smp_cond_load_acquire_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 11/14] sched: add need-resched timed wait interface Ankur Arora
2026-04-08 12:25 ` [PATCH v11 12/14] cpuidle/poll_state: Wait for need-resched via tif_need_resched_relaxed_wait() Ankur Arora
2026-04-20 16:57   ` Okanovic, Haris
2026-04-20 17:50     ` Ankur Arora
2026-04-21  7:15       ` Catalin Marinas
2026-04-20 22:12     ` Christoph Lameter (Ampere)
2026-04-08 12:25 ` [PATCH v11 13/14] kunit: enable testing smp_cond_load_relaxed_timeout() Ankur Arora
2026-04-08 12:25 ` [PATCH v11 14/14] kunit: add tests for smp_cond_load_relaxed_timeout() Ankur Arora
2026-04-23 17:16 ` [PATCH v11 00/14] barrier: Add smp_cond_load_{relaxed,acquire}_timeout() Andrew Morton
2026-04-23 19:29   ` Ankur Arora
2026-04-24 14:16     ` [PATCH v11 00/14] barrier: Add smp_cond_load_{relaxed, acquire}_timeout() Okanovic, Haris
2026-04-24 14:10   ` Okanovic, Haris
2026-04-24 14:28 ` [PATCH v11 00/14] barrier: Add smp_cond_load_{relaxed,acquire}_timeout() Andrew Morton
2026-04-24 18:10   ` Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o6isl0nl.fsf@oracle.com \
    --to=ankur.a.arora@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=ashok.bhat@arm.com \
    --cc=ast@kernel.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bpf@vger.kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=cl@gentwo.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=david.laight.linux@gmail.com \
    --cc=harisokn@amazon.com \
    --cc=joao.m.martins@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=memxor@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=rdunlap@infradead.org \
    --cc=will@kernel.org \
    --cc=xueshuai@linux.alibaba.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox