From: Ankur Arora <ankur.a.arora@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, linux-pm@vger.kernel.org,
bpf@vger.kernel.org, arnd@arndb.de, catalin.marinas@arm.com,
will@kernel.org, peterz@infradead.org, mark.rutland@arm.com,
harisokn@amazon.com, cl@gentwo.org, ast@kernel.org,
rafael@kernel.org, daniel.lezcano@linaro.org, memxor@gmail.com,
zhenglifeng1@huawei.com, xueshuai@linux.alibaba.com,
rdunlap@infradead.org, david.laight.linux@gmail.com,
joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
konrad.wilk@oracle.com
Subject: Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()
Date: Mon, 16 Mar 2026 15:08:07 -0700
Message-ID: <874imftol4.fsf@oracle.com>
In-Reply-To: <20260315184925.b6f93386e918ca79614843e3@linux-foundation.org>
Andrew Morton <akpm@linux-foundation.org> writes:
> On Sun, 15 Mar 2026 18:36:39 -0700 Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
>> Hi,
>>
>> This series adds waited variants of the smp_cond_load() primitives:
>> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
>>
>> ...
>>
>
> How are we to determine that this change is successful, useful, etc?

Good point. So this series was split off from this one here:
  https://lore.kernel.org/lkml/20250218213337.377987-1-ankur.a.arora@oracle.com/

The series enables ARCH_HAS_CPU_RELAX on arm64, which should allow
relatively cheap polling in idle on arm64.

However, it does need a few more patches from the series above to do that.

> Reduced CPU consumption? Reduced energy usage? Improved latencies?
With the additional patches this should improve wakeup latency:
I ran the sched-pipe test with processes on VCPUs 4 and 5 with
kvm-arm.wfi_trap_policy=notrap.

  # perf stat -r 5 --cpu 4,5 -e task-clock,cycles,instructions,sched:sched_wake_idle_without_ipi \
        perf bench sched pipe -l 1000000 -c 4

  # No haltpoll (and, no TIF_POLLING_NRFLAG):

  Performance counter stats for 'CPU(s) 4,5' (5 runs):

       25,229.57 msec task-clock                          #   2.000 CPUs utilized   ( +-  7.75% )
  45,821,250,284      cycles                              #   1.816 GHz             ( +- 10.07% )
  26,557,496,665      instructions                        #    0.58 insn per cycle  ( +-  0.21% )
               0      sched:sched_wake_idle_without_ipi   #   0.000 /sec

          12.615 +- 0.977 seconds time elapsed  ( +-  7.75% )

  # Haltpoll:

  Performance counter stats for 'CPU(s) 4,5' (5 runs):

       15,131.58 msec task-clock                          #   2.000 CPUs utilized   ( +- 10.00% )
  34,158,188,839      cycles                              #   2.257 GHz             ( +-  6.91% )
  20,824,950,916      instructions                        #    0.61 insn per cycle  ( +-  0.09% )
       1,983,822      sched:sched_wake_idle_without_ipi   # 131.105 K/sec           ( +-  0.78% )

           7.566 +- 0.756 seconds time elapsed  ( +- 10.00% )

We get a decent boost just because we are executing ~20% fewer
instructions. Not sure how the cpu frequency scaling works in a VM but
we also run at a higher frequency.
(That specifically applies to guests, but that series also enables this
with acpi-idle for baremetal.)
(From: https://lore.kernel.org/lkml/877c9zhk68.fsf@oracle.com/)
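
As a rough cross-check of the figures quoted above, here is a minimal C
sketch that computes the relative reductions between the two runs. It is
not part of the series; pct_reduction() and print_haltpoll_deltas() are
names made up for illustration, and the inputs are just the perf counters
copied from the output above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: percentage by which `after` is lower than `before`. */
static double pct_reduction(double before, double after)
{
	assert(before > 0.0);
	return 100.0 * (before - after) / before;
}

/* Print the reductions for the counters quoted in the two perf runs. */
static void print_haltpoll_deltas(void)
{
	/* No haltpoll vs. haltpoll, values copied from the perf output. */
	printf("instructions: %.1f%% fewer\n",
	       pct_reduction(26557496665.0, 20824950916.0));
	printf("cycles:       %.1f%% fewer\n",
	       pct_reduction(45821250284.0, 34158188839.0));
	printf("elapsed:      %.1f%% shorter\n",
	       pct_reduction(12.615, 7.566));
}
```

Calling print_haltpoll_deltas() from a main() prints the three ratios. The
instruction count comes out a little over 21% lower, consistent with the
~20% estimate, and the elapsed time is roughly 40% shorter (with the run
to run variance shown in the perf output).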
>> Finally update poll_idle() and resilient queued spinlocks to use them.
>
> Have you identified other suitable sites for conversion?
Haven't found other places in the core kernel where this could be used.
I think one reason is that the typical kernel wait is unbounded.
There are some in drivers/ that have this pattern. For instance I think
this in drivers/iommu/arm/arm-smmu-v3 could be converted:
__arm_smmu_cmdq_poll_until_msi().
However, as David Laight pointed out in this thread
(https://lore.kernel.org/lkml/20260214113122.70627a8b@pumpkin/),
this would be fine so long as the polling is on memory, but would
need some work to handle MMIO.
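
To illustrate the bounded-poll pattern in question, here is a minimal
userspace sketch in C11. It is not the kernel implementation: the sketch_*
names, the spin count, and the use of clock_gettime() are assumptions for
illustration only. It polls ordinary memory with relaxed loads, which is
the case that converts cleanly; an MMIO register would need a device
accessor in place of the plain load.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical: number of loads to issue between clock reads. */
#define SKETCH_SPINS_PER_CLOCK_CHECK 200

/* Monotonic time in nanoseconds. */
static uint64_t sketch_now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Poll *ptr with relaxed loads until (value & mask) is non-zero or
 * timeout_ns has elapsed. Returns the last value loaded; the caller
 * re-checks the condition to tell success from timeout.
 */
static uint32_t sketch_poll_mask_timeout(_Atomic uint32_t *ptr, uint32_t mask,
					 uint64_t timeout_ns)
{
	uint64_t deadline;
	unsigned int spins = 0;
	uint32_t val;

	assert(ptr != NULL);
	deadline = sketch_now_ns() + timeout_ns;

	for (;;) {
		val = atomic_load_explicit(ptr, memory_order_relaxed);
		if (val & mask)
			break;
		/* Amortize the clock read over many cheap loads. */
		if (++spins < SKETCH_SPINS_PER_CLOCK_CHECK)
			continue;
		spins = 0;
		if (sketch_now_ns() >= deadline)
			break;
	}
	return val;
}
```

A caller would do something like
val = sketch_poll_mask_timeout(&flag, DONE_BIT, 1000000) and then test
(val & DONE_BIT) to distinguish completion from timeout, where DONE_BIT is
again a placeholder. Reading the clock only every few hundred loads keeps
the common path to a single load and a compare.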
>> Documentation/atomic_t.txt | 14 +++--
>> arch/arm64/Kconfig | 3 +
>> arch/arm64/include/asm/barrier.h | 23 +++++++
>> arch/arm64/include/asm/cmpxchg.h | 62 +++++++++++++++----
>> arch/arm64/include/asm/delay-const.h | 27 +++++++++
>> arch/arm64/include/asm/rqspinlock.h | 85 --------------------------
>> arch/arm64/lib/delay.c | 15 ++---
>> drivers/cpuidle/poll_state.c | 21 +------
>> drivers/soc/qcom/rpmh-rsc.c | 8 +--
>> include/asm-generic/barrier.h | 90 ++++++++++++++++++++++++++++
>> include/linux/atomic.h | 10 ++++
>> include/linux/atomic/atomic-long.h | 18 +++---
>> include/linux/sched/idle.h | 29 +++++++++
>> kernel/bpf/rqspinlock.c | 77 +++++++++++++++---------
>> scripts/atomic/gen-atomic-long.sh | 16 +++--
>> 15 files changed, 320 insertions(+), 178 deletions(-)
>> create mode 100644 arch/arm64/include/asm/delay-const.h
>
> Some sort of testing in lib/tests/ would be appropriate and useful.
Makes sense. Will add.
Thanks
--
ankur