All of lore.kernel.org
 help / color / mirror / Atom feed
From: Catalin Marinas <catalin.marinas@arm.com>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, bpf@vger.kernel.org,
	arnd@arndb.de, will@kernel.org, peterz@infradead.org,
	akpm@linux-foundation.org, mark.rutland@arm.com,
	harisokn@amazon.com, cl@gentwo.org, ast@kernel.org,
	memxor@gmail.com, zhenglifeng1@huawei.com,
	xueshuai@linux.alibaba.com, joao.m.martins@oracle.com,
	boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
	rafael@kernel.org, daniel.lezcano@linaro.org
Subject: Re: [PATCH v3 1/5] asm-generic: barrier: Add smp_cond_load_relaxed_timewait()
Date: Mon, 18 Aug 2025 18:55:02 +0100	[thread overview]
Message-ID: <aKNo9pxx2w9sjJjc@arm.com> (raw)
In-Reply-To: <87plctwq7x.fsf@oracle.com>

On Sun, Aug 17, 2025 at 03:14:26PM -0700, Ankur Arora wrote:
> So, I tried to pare back the code and the following (untested) is
> what I came up with. Given the straight-forward rate-limiting, and the
> current users not needing accurate timekeeping, this uses a
> bool time_check_expr. Figured I'd keep it simple until someone actually
> needs greater complexity as you suggested.
> 
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index d4f581c1e21d..e8793347a395 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -273,6 +273,34 @@ do {                                                                       \
>  })
>  #endif
> 
> +
> +#ifndef SMP_TIMEWAIT_SPIN_COUNT
> +#define SMP_TIMEWAIT_SPIN_COUNT                200
> +#endif
> +
> +#ifndef smp_cond_load_relaxed_timewait
> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr,                 \
> +                                       time_check_expr)                \
> +({                                                                     \
> +       typeof(ptr) __PTR = (ptr);                                      \
> +       __unqual_scalar_typeof(*ptr) VAL;                               \
> +       u32 __n = 0, __spin = SMP_TIMEWAIT_SPIN_COUNT;                  \
> +                                                                       \
> +       for (;;) {                                                      \
> +               VAL = READ_ONCE(*__PTR);                                \
> +               if (cond_expr)                                          \
> +                       break;                                          \
> +               cpu_relax();                                            \
> +               if (++__n < __spin)                                     \
> +                       continue;                                       \
> +               if ((time_check_expr))                                  \
> +                       break;                                          \
> +               __n = 0;                                                \
> +       }                                                               \
> +       (typeof(*ptr))VAL;                                              \
> +})
> +#endif

This looks fine, at least as it would be used by poll_idle(). The only
reason for not folding time_check_expr into cond_expr is the poll_idle()
requirement to avoid calling time_check_expr too often.

> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index f5801b0ba9e9..c9934ab68da2 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -219,6 +219,43 @@ do {                                                                       \
>         (typeof(*ptr))VAL;                                              \
>  })
> 
> +extern bool arch_timer_evtstrm_available(void);
> +
> +#ifndef SMP_TIMEWAIT_SPIN_COUNT
> +#define SMP_TIMEWAIT_SPIN_COUNT                200
> +#endif
> +
> +#define smp_cond_load_relaxed_timewait(ptr, cond_expr,                 \
> +                                         time_check_expr)              \
> +({                                                                     \
> +       typeof(ptr) __PTR = (ptr);                                      \
> +       __unqual_scalar_typeof(*ptr) VAL;                               \
> +       u32 __n = 0, __spin = 0;                                        \
> +       bool __wfet = alternative_has_cap_unlikely(ARM64_HAS_WFXT);     \
> +       bool __wfe = arch_timer_evtstrm_available();                    \
> +       bool __wait = false;                                            \
> +                                                                       \
> +       if (__wfet || __wfe)                                            \
> +               __wait = true;                                          \
> +       else                                                            \
> +               __spin = SMP_TIMEWAIT_SPIN_COUNT;                       \
> +                                                                       \
> +       for (;;) {                                                      \
> +               VAL = READ_ONCE(*__PTR);                                \
> +               if (cond_expr)                                          \
> +                       break;                                          \
> +               cpu_relax();                                            \
> +               if (++__n < __spin)                                     \
> +                       continue;                                       \
> +               if ((time_check_expr))                                  \
> +                       break;                                          \
> +               if (__wait)                                             \
> +                       __cmpwait_relaxed(__PTR, VAL);                  \
> +               __n = 0;                                                \
> +       }                                                               \
> +       (typeof(*ptr))VAL;                                              \
> +})

For arm64, I wouldn't bother with the spin count. Since cpu_relax()
doesn't do anything, I doubt it makes any difference, especially as we
are likely to use WFE anyway. If we do add one, I'd like it backed by
some numbers to show it makes a difference in practice.

The question is whether 100us granularity is good enough for poll_idle()
(I came to the conclusion it's fine for rqspinlock, given their 1ms
deadlock check).

>  #include <asm-generic/barrier.h>
> 
> __cmpwait_relaxed() will need adjustment to set a deadline for WFET.

Yeah, __cmpwait_relaxed() doesn't use WFET as it doesn't need a timeout
(it just happens to have one with the event stream).

We could extend this or create a new one that uses WFET and takes an
argument. If extending this one, for example a timeout argument of 0
means WFE, non-zero means WFET cycles. This adds a couple of more
instructions.

What I had in mind of time_expr was a ktime_t would be something like:

	for (;;) {
		VAL = READ_ONCE(*__PTR);
		if (cond_expr)
			break;

		cycles = some_func_of(time_expr);	// see __udelay()
		if (cycles <= 0)
			break;

		if (__wfet) {
			__cmpwait_relaxed(__PTR, VAL, get_cycles() + cycles);
		} else if (__wfe && cycles >= timer_evt_period) {
			__cmpwait_relaxed(__PTR, VAL, 0);
		} else {
			cpu_relax();
		}
	}

Now, if we don't care about the time check granularity (for now) and
time_check_expr is a bool (this seems to work better for rqspinlock), I
think we could do something like:

	for (;;) {
		VAL = READ_ONCE(*__PTR);
		if (cond_expr)
			break;
		if (time_check_expr)
			break;

		if (__wfe) {
			__cmpwait_relaxed(__PTR, VAL, 0);
		} else if (__wfet) {
			__cmpwait_relaxed(__PTR, VAL, get_cycles() + timer_evt_period);
		} else {
			cpu_relax();
		}
	}

We go with WFE first in this case to avoid get_cycles() unnecessarily.

I'd suggest we add the WFET support in __cmpwait_relaxed() (or a
different function) as a separate patch, doesn't even need to be part of
this series. WFE is good enough to get things moving. WFET will only
make a difference if (1) we disable the event stream or (2) we need
better accuracy of the timeout.

> AFAICT the rqspinlock code should be able to work by specifying something
> like:
>   ((ktime_get_mono_fast_ns() > tval)) || (deadlock_check(&lock_context)))
> as the time_check_expr.

Why not the whole RES_CHECK_TIMEOUT(...) as in rqspinlock.c? It does the
deadlock check only after a timeout over a millisecond. Just follow the
res_atomic_cond_read_acquire() calls but replace '||' with a comma.

> I think they also want to rate limit how often deadlock_check() is
> called, so they can redefine SMP_TIMEWAIT_SPIN_COUNT to some large
> value for arm64.

Everyone would want a different rate of checking other stuff, so I think
this needs to go in their time_check_expr.

-- 
Catalin

  reply	other threads:[~2025-08-18 17:55 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-27  4:48 [PATCH v3 0/5] barrier: Add smp_cond_load_*_timewait() Ankur Arora
2025-06-27  4:48 ` [PATCH v3 1/5] asm-generic: barrier: Add smp_cond_load_relaxed_timewait() Ankur Arora
2025-08-08 10:51   ` Catalin Marinas
2025-08-11 21:15     ` Ankur Arora
2025-08-13 16:09       ` Catalin Marinas
2025-08-13 16:29         ` Arnd Bergmann
2025-08-13 16:54           ` Christoph Lameter (Ampere)
2025-08-14 13:00           ` Catalin Marinas
2025-08-18 11:51             ` Arnd Bergmann
2025-08-18 18:28               ` Catalin Marinas
2025-08-14  7:30         ` Ankur Arora
2025-08-14 11:39           ` Catalin Marinas
2025-08-17 22:14             ` Ankur Arora
2025-08-18 17:55               ` Catalin Marinas [this message]
2025-08-18 19:15                 ` Ankur Arora
2025-08-19 10:34                   ` Catalin Marinas
2025-06-27  4:48 ` [PATCH v3 2/5] asm-generic: barrier: Handle spin-wait in smp_cond_load_relaxed_timewait() Ankur Arora
2025-06-27  4:48 ` [PATCH v3 3/5] asm-generic: barrier: Add smp_cond_load_acquire_timewait() Ankur Arora
2025-08-08  9:38   ` Catalin Marinas
2025-08-12  5:18     ` Ankur Arora
2025-06-27  4:48 ` [PATCH v3 4/5] arm64: barrier: Support waiting in smp_cond_load_relaxed_timewait() Ankur Arora
2025-06-27  4:48 ` [PATCH v3 5/5] arm64: barrier: Handle " Ankur Arora
2025-06-30 16:33   ` Christoph Lameter (Ampere)
2025-06-30 21:05     ` Ankur Arora
2025-07-01  5:55       ` Ankur Arora
2025-07-28 19:03 ` [PATCH v3 0/5] barrier: Add smp_cond_load_*_timewait() Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aKNo9pxx2w9sjJjc@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=ankur.a.arora@oracle.com \
    --cc=arnd@arndb.de \
    --cc=ast@kernel.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bpf@vger.kernel.org \
    --cc=cl@gentwo.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=harisokn@amazon.com \
    --cc=joao.m.martins@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=memxor@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=will@kernel.org \
    --cc=xueshuai@linux.alibaba.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.