From: Ankur Arora <ankur.a.arora@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org,
	bpf@vger.kernel.org, arnd@arndb.de, catalin.marinas@arm.com,
	will@kernel.org, peterz@infradead.org, mark.rutland@arm.com,
	harisokn@amazon.com, cl@gentwo.org, ast@kernel.org,
	rafael@kernel.org, daniel.lezcano@linaro.org, memxor@gmail.com,
	zhenglifeng1@huawei.com, xueshuai@linux.alibaba.com,
	rdunlap@infradead.org, david.laight.linux@gmail.com,
	joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()
Date: Mon, 16 Mar 2026 15:08:07 -0700
Message-ID: <874imftol4.fsf@oracle.com>
In-Reply-To: <20260315184925.b6f93386e918ca79614843e3@linux-foundation.org>


Andrew Morton <akpm@linux-foundation.org> writes:

> On Sun, 15 Mar 2026 18:36:39 -0700 Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
>> Hi,
>>
>> This series adds waited variants of the smp_cond_load() primitives:
>> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
>>
>> ...
>>
>
> How are we to determine that this change is successful, useful, etc?

Good point. This series was split off from the one here:
  https://lore.kernel.org/lkml/20250218213337.377987-1-ankur.a.arora@oracle.com/

The series enables ARCH_HAS_CPU_RELAX on arm64 which should allow
relatively cheap polling in idle on arm64.
However, it does need a few more patches from the series above to do that.

> Reduced CPU consumption?  Reduced energy usage?  Improved latencies?

With the additional patches this should improve wakeup latency:

  I ran the sched-pipe test with processes on VCPUs 4 and 5 with
  kvm-arm.wfi_trap_policy=notrap.

  # perf stat -r 5 --cpu 4,5 -e task-clock,cycles,instructions,sched:sched_wake_idle_without_ipi \
  perf bench sched pipe -l 1000000 -c 4

  # No haltpoll (and, no TIF_POLLING_NRFLAG):

  Performance counter stats for 'CPU(s) 4,5' (5 runs):

         25,229.57 msec task-clock                       #    2.000 CPUs utilized               ( +-  7.75% )
    45,821,250,284      cycles                           #    1.816 GHz                         ( +- 10.07% )
    26,557,496,665      instructions                     #    0.58  insn per cycle              ( +-  0.21% )
                 0      sched:sched_wake_idle_without_ipi #    0.000 /sec

       12.615 +- 0.977 seconds time elapsed  ( +-  7.75% )


  # Haltpoll:

  Performance counter stats for 'CPU(s) 4,5' (5 runs):

         15,131.58 msec task-clock                       #    2.000 CPUs utilized               ( +- 10.00% )
    34,158,188,839      cycles                           #    2.257 GHz                         ( +-  6.91% )
    20,824,950,916      instructions                     #    0.61  insn per cycle              ( +-  0.09% )
         1,983,822      sched:sched_wake_idle_without_ipi #  131.105 K/sec                       ( +-  0.78% )

        7.566 +- 0.756 seconds time elapsed  ( +- 10.00% )

  We get a decent boost just because we are executing ~20% fewer
  instructions. Not sure how the cpu frequency scaling works in a VM but
  we also run at a higher frequency.

(That specifically applies to guests, but that series also enables this
with acpi-idle for baremetal.)

(From: https://lore.kernel.org/lkml/877c9zhk68.fsf@oracle.com/)
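
As a quick sanity check on the numbers quoted above, the relative
reductions can be worked out from the two perf stat runs. This is only an
illustrative sketch: pct_reduction() is a hypothetical helper written for
this note, and the inputs are the mean values copied from the output above
(the run-to-run variance of 7-10% is ignored).

```c
#include <stdio.h>

/*
 * Hypothetical helper: percentage by which 'after' is lower than
 * 'before'. Inputs are taken from the perf stat means quoted above.
 */
static double pct_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Print the reductions for instructions, cycles and elapsed time. */
static void print_haltpoll_summary(void)
{
	printf("instructions: %.1f%% fewer\n",
	       pct_reduction(26557496665.0, 20824950916.0));
	printf("cycles:       %.1f%% fewer\n",
	       pct_reduction(45821250284.0, 34158188839.0));
	printf("elapsed time: %.1f%% lower\n",
	       pct_reduction(12.615, 7.566));
}
```

With these inputs the instruction count drops by a little over 21%, which
matches the "~20% fewer instructions" remark, and the elapsed time drops by
about 40%.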

>> Finally update poll_idle() and resilient queued spinlocks to use them.
>
> Have you identified other suitable sites for conversion?

Haven't found other places in the core kernel where this could be used.
I think one reason is that the typical kernel wait is unbounded.

There are some in drivers/ that have this pattern. For instance, I think
this one in drivers/iommu/arm/arm-smmu-v3 could be converted:
__arm_smmu_cmdq_poll_until_msi().

However, as David Laight pointed out in this thread
(https://lore.kernel.org/lkml/20260214113122.70627a8b@pumpkin/),
this would be fine so long as the polling is on memory, but would
need some work to handle MMIO.
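
To make the pattern concrete, here is a rough userspace sketch of a
bounded poll on a memory location, written with C11 atomics. It is not the
kernel interface from this series: poll_flag_timeout(), now_ns() and the
spin count of 200 are all made up for illustration, and a real
implementation would wait more cheaply than a plain spin where the
architecture allows it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (illustrative helper). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper, not the kernel API: poll *ptr with relaxed loads
 * until (*ptr & mask) is non-zero or timeout_ns has elapsed. Returns the
 * last value loaded; the caller re-checks the condition on that value to
 * tell success from timeout.
 */
static uint32_t poll_flag_timeout(_Atomic uint32_t *ptr, uint32_t mask,
				  uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;
	unsigned int spins = 0;
	uint32_t val;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val & mask)
			break;
		/* Reading the clock is costly, so do it only now and then. */
		if (++spins == 200) {
			spins = 0;
			if (now_ns() >= deadline)
				break;
		}
	}
	return val;
}
```

A caller would do something like
"v = poll_flag_timeout(&flag, DONE, 1000000); if (!(v & DONE)) /* timed out */",
where DONE is whatever bit the producer sets. This only works as written
because the polled location is ordinary memory; polling a device register
would need a different load primitive.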

>>  Documentation/atomic_t.txt           | 14 +++--
>>  arch/arm64/Kconfig                   |  3 +
>>  arch/arm64/include/asm/barrier.h     | 23 +++++++
>>  arch/arm64/include/asm/cmpxchg.h     | 62 +++++++++++++++----
>>  arch/arm64/include/asm/delay-const.h | 27 +++++++++
>>  arch/arm64/include/asm/rqspinlock.h  | 85 --------------------------
>>  arch/arm64/lib/delay.c               | 15 ++---
>>  drivers/cpuidle/poll_state.c         | 21 +------
>>  drivers/soc/qcom/rpmh-rsc.c          |  8 +--
>>  include/asm-generic/barrier.h        | 90 ++++++++++++++++++++++++++++
>>  include/linux/atomic.h               | 10 ++++
>>  include/linux/atomic/atomic-long.h   | 18 +++---
>>  include/linux/sched/idle.h           | 29 +++++++++
>>  kernel/bpf/rqspinlock.c              | 77 +++++++++++++++---------
>>  scripts/atomic/gen-atomic-long.sh    | 16 +++--
>>  15 files changed, 320 insertions(+), 178 deletions(-)
>>  create mode 100644 arch/arm64/include/asm/delay-const.h
>
> Some sort of testing in lib/tests/ would be appropriate and useful.

Makes sense. Will add.

Thanks
--
ankur

Thread overview: 30+ messages
2026-03-16  1:36 [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 01/12] asm-generic: barrier: Add smp_cond_load_relaxed_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 02/12] arm64: barrier: Support smp_cond_load_relaxed_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 03/12] arm64/delay: move some constants out to a separate header Ankur Arora
2026-03-16  1:36 ` [PATCH v10 04/12] arm64: support WFET in smp_cond_load_relaxed_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 05/12] arm64: rqspinlock: Remove private copy of smp_cond_load_acquire_timewait() Ankur Arora
2026-03-24  1:41   ` Kumar Kartikeya Dwivedi
2026-03-25  5:58     ` Ankur Arora
2026-03-16  1:36 ` [PATCH v10 06/12] asm-generic: barrier: Add smp_cond_load_acquire_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 07/12] atomic: Add atomic_cond_read_*_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 08/12] locking/atomic: scripts: build atomic_long_cond_read_*_timeout() Ankur Arora
2026-03-16  1:36 ` [PATCH v10 09/12] bpf/rqspinlock: switch check_timeout() to a clock interface Ankur Arora
2026-03-24  1:43   ` Kumar Kartikeya Dwivedi
2026-03-25  5:57     ` Ankur Arora
2026-03-16  1:36 ` [PATCH v10 10/12] bpf/rqspinlock: Use smp_cond_load_acquire_timeout() Ankur Arora
2026-03-24  1:46   ` Kumar Kartikeya Dwivedi
2026-03-16  1:36 ` [PATCH v10 11/12] sched: add need-resched timed wait interface Ankur Arora
2026-03-16  1:36 ` [PATCH v10 12/12] cpuidle/poll_state: Wait for need-resched via tif_need_resched_relaxed_wait() Ankur Arora
2026-03-16  1:49 ` [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout() Andrew Morton
2026-03-16 22:08   ` Ankur Arora [this message]
2026-03-16 23:37     ` David Laight
2026-03-17  6:53       ` Ankur Arora
2026-03-17  9:17         ` David Laight
2026-03-25 13:53           ` Catalin Marinas
2026-03-25 15:42             ` David Laight
2026-03-25 16:32               ` Catalin Marinas
2026-03-25 20:23                 ` David Laight
2026-03-26 15:39                   ` Catalin Marinas
2026-03-25 15:55       ` Catalin Marinas
2026-03-25 19:36         ` David Laight
