All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ankur Arora <ankur.a.arora@oracle.com>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-pm@vger.kernel.org, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	will@kernel.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, x86@kernel.org, hpa@zytor.com, pbonzini@redhat.com,
	wanpengli@tencent.com, vkuznets@redhat.com, rafael@kernel.org,
	daniel.lezcano@linaro.org, peterz@infradead.org, arnd@arndb.de,
	lenb@kernel.org, mark.rutland@arm.com, harisokn@amazon.com,
	joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH 0/9] Enable haltpoll for arm64
Date: Tue, 30 Apr 2024 11:56:48 -0700	[thread overview]
Message-ID: <87y18ubrwv.fsf@oracle.com> (raw)
In-Reply-To: <20240430183730.561960-1-ankur.a.arora@oracle.com>


> Subject: Re: [PATCH 0/9] Enable haltpoll for arm64

A correction: please read the subject for the series as [PATCH v5] ...

Missed the version number it while sending out.

Thanks
Ankur

Ankur Arora <ankur.a.arora@oracle.com> writes:

> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is specifically interesting for KVM guests by
> reducing the IPC latencies.
>
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
>
>                                      usecs/op       %stdev
>
>   no haltpoll (baseline)               13.48       +-  5.19%
>   with haltpoll                         6.84       +- 22.07%
>
>
> No change in performance for a similar test on x86:
>
>                                      usecs/op        %stdev
>
>   haltpoll w/ cpu_relax() (baseline)     4.75      +-  1.76%
>   haltpoll w/ smp_cond_load_relaxed()    4.78      +-  2.31%
>
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
>
>
> The patch series is organized in four parts:
>  - patches 1, 2 mangle the config option ARCH_HAS_CPU_RELAX, renaming
>    and moving it from x86 to common architectural code.
>  - next, patches 3-5, reorganize the haltpoll selection and init logic
>    to allow architecture code to select it.
>  - patch 6, reorganizes the poll_idle() loop, switching from using
>    cpu_relax() directly to smp_cond_load_relaxed().
>  - and finally, patches 7-9, add the bits for arm64 support.
>
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be threshed out:
>
>  - WFET support: WFE on arm64 does not guarantee that poll_idle()
>    would terminate in halt_poll_ns. Using WFET would address this.
>  - KVM_NO_POLL support on arm64
>  - KVM TWED support on arm64: allow the host to limit time spent in
>    WFE.
>
>
> Changelog:
>
> v5:
>  - rework the poll_idle() loop around smp_cond_load_relaxed() (review
>    comment from Tomohiro Misono.)
>  - also rework selection of cpuidle-haltpoll. Now selected based
>    on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
>  - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
>    arm64 now depends on the event-stream being enabled.
>  - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
>  - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
>
> v4 changes from v3:
>  - change 7/8 per Rafael input: drop the parens and use ret for the final check
>  - add 8/8 which renames the guard for building poll_state
>
> v3 changes from v2:
>  - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
>  - add Ack-by from Rafael Wysocki on 2/7
>
> v2 changes from v1:
>  - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
>    (this improves by 50% at least the CPU cycles consumed in the tests above:
>    10,716,881,137 now vs 14,503,014,257 before)
>  - removed the ifdef from patch 1 per RafaelW
>
> Ankur Arora (4):
>   cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
>   cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
>   arm64: support cpuidle-haltpoll
>   cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
>
> Joao Martins (4):
>   Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
>   cpuidle-haltpoll: define arch_haltpoll_supported()
>   governors/haltpoll: drop kvm_para_available() check
>   arm64: define TIF_POLLING_NRFLAG
>
> Mihai Carabas (1):
>   cpuidle/poll_state: poll via smp_cond_load_relaxed()
>
>  arch/Kconfig                              |  3 +++
>  arch/arm64/Kconfig                        | 10 ++++++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>  arch/arm64/include/asm/thread_info.h      |  2 ++
>  arch/x86/Kconfig                          |  4 +---
>  arch/x86/include/asm/cpuidle_haltpoll.h   |  1 +
>  arch/x86/kernel/kvm.c                     | 10 ++++++++++
>  drivers/acpi/processor_idle.c             |  4 ++--
>  drivers/cpuidle/Kconfig                   |  5 ++---
>  drivers/cpuidle/Makefile                  |  2 +-
>  drivers/cpuidle/cpuidle-haltpoll.c        |  9 ++-------
>  drivers/cpuidle/governors/haltpoll.c      |  6 +-----
>  drivers/cpuidle/poll_state.c              | 21 ++++++++++++++++-----
>  drivers/idle/Kconfig                      |  1 +
>  include/linux/cpuidle.h                   |  2 +-
>  include/linux/cpuidle_haltpoll.h          |  5 +++++
>  16 files changed, 79 insertions(+), 27 deletions(-)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h


--
ankur

WARNING: multiple messages have this Message-ID (diff)
From: Ankur Arora <ankur.a.arora@oracle.com>
To: Ankur Arora <ankur.a.arora@oracle.com>
Cc: linux-pm@vger.kernel.org, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, catalin.marinas@arm.com,
	will@kernel.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, x86@kernel.org, hpa@zytor.com, pbonzini@redhat.com,
	wanpengli@tencent.com, vkuznets@redhat.com, rafael@kernel.org,
	daniel.lezcano@linaro.org, peterz@infradead.org, arnd@arndb.de,
	lenb@kernel.org, mark.rutland@arm.com, harisokn@amazon.com,
	joao.m.martins@oracle.com, boris.ostrovsky@oracle.com,
	konrad.wilk@oracle.com
Subject: Re: [PATCH 0/9] Enable haltpoll for arm64
Date: Tue, 30 Apr 2024 11:56:48 -0700	[thread overview]
Message-ID: <87y18ubrwv.fsf@oracle.com> (raw)
In-Reply-To: <20240430183730.561960-1-ankur.a.arora@oracle.com>


> Subject: Re: [PATCH 0/9] Enable haltpoll for arm64

A correction: please read the subject for the series as [PATCH v5] ...

Missed the version number it while sending out.

Thanks
Ankur

Ankur Arora <ankur.a.arora@oracle.com> writes:

> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is specifically interesting for KVM guests by
> reducing the IPC latencies.
>
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
>
>                                      usecs/op       %stdev
>
>   no haltpoll (baseline)               13.48       +-  5.19%
>   with haltpoll                         6.84       +- 22.07%
>
>
> No change in performance for a similar test on x86:
>
>                                      usecs/op        %stdev
>
>   haltpoll w/ cpu_relax() (baseline)     4.75      +-  1.76%
>   haltpoll w/ smp_cond_load_relaxed()    4.78      +-  2.31%
>
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
>
>
> The patch series is organized in four parts:
>  - patches 1, 2 mangle the config option ARCH_HAS_CPU_RELAX, renaming
>    and moving it from x86 to common architectural code.
>  - next, patches 3-5, reorganize the haltpoll selection and init logic
>    to allow architecture code to select it.
>  - patch 6, reorganizes the poll_idle() loop, switching from using
>    cpu_relax() directly to smp_cond_load_relaxed().
>  - and finally, patches 7-9, add the bits for arm64 support.
>
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be threshed out:
>
>  - WFET support: WFE on arm64 does not guarantee that poll_idle()
>    would terminate in halt_poll_ns. Using WFET would address this.
>  - KVM_NO_POLL support on arm64
>  - KVM TWED support on arm64: allow the host to limit time spent in
>    WFE.
>
>
> Changelog:
>
> v5:
>  - rework the poll_idle() loop around smp_cond_load_relaxed() (review
>    comment from Tomohiro Misono.)
>  - also rework selection of cpuidle-haltpoll. Now selected based
>    on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
>  - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
>    arm64 now depends on the event-stream being enabled.
>  - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
>  - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
>
> v4 changes from v3:
>  - change 7/8 per Rafael input: drop the parens and use ret for the final check
>  - add 8/8 which renames the guard for building poll_state
>
> v3 changes from v2:
>  - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
>  - add Ack-by from Rafael Wysocki on 2/7
>
> v2 changes from v1:
>  - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
>    (this improves by 50% at least the CPU cycles consumed in the tests above:
>    10,716,881,137 now vs 14,503,014,257 before)
>  - removed the ifdef from patch 1 per RafaelW
>
> Ankur Arora (4):
>   cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
>   cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
>   arm64: support cpuidle-haltpoll
>   cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
>
> Joao Martins (4):
>   Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
>   cpuidle-haltpoll: define arch_haltpoll_supported()
>   governors/haltpoll: drop kvm_para_available() check
>   arm64: define TIF_POLLING_NRFLAG
>
> Mihai Carabas (1):
>   cpuidle/poll_state: poll via smp_cond_load_relaxed()
>
>  arch/Kconfig                              |  3 +++
>  arch/arm64/Kconfig                        | 10 ++++++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>  arch/arm64/include/asm/thread_info.h      |  2 ++
>  arch/x86/Kconfig                          |  4 +---
>  arch/x86/include/asm/cpuidle_haltpoll.h   |  1 +
>  arch/x86/kernel/kvm.c                     | 10 ++++++++++
>  drivers/acpi/processor_idle.c             |  4 ++--
>  drivers/cpuidle/Kconfig                   |  5 ++---
>  drivers/cpuidle/Makefile                  |  2 +-
>  drivers/cpuidle/cpuidle-haltpoll.c        |  9 ++-------
>  drivers/cpuidle/governors/haltpoll.c      |  6 +-----
>  drivers/cpuidle/poll_state.c              | 21 ++++++++++++++++-----
>  drivers/idle/Kconfig                      |  1 +
>  include/linux/cpuidle.h                   |  2 +-
>  include/linux/cpuidle_haltpoll.h          |  5 +++++
>  16 files changed, 79 insertions(+), 27 deletions(-)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h


--
ankur

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2024-04-30 18:57 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-30 18:37 [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora
2024-04-30 18:37 ` Ankur Arora
2024-04-30 18:37 ` [PATCH 1/9] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-05-02  1:33   ` Christoph Lameter (Ampere)
2024-05-02  1:33     ` Christoph Lameter (Ampere)
2024-05-03  4:13     ` Ankur Arora
2024-05-03  4:13       ` Ankur Arora
2024-05-03 17:07       ` Christoph Lameter (Ampere)
2024-05-03 17:07         ` Christoph Lameter (Ampere)
2024-05-06 21:27         ` Ankur Arora
2024-05-06 21:27           ` Ankur Arora
2024-04-30 18:37 ` [PATCH 2/9] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-04-30 18:37 ` [PATCH 3/9] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-05-22 16:09   ` Joao Martins
2024-05-22 16:09     ` Joao Martins
2024-04-30 18:37 ` [PATCH 4/9] cpuidle-haltpoll: define arch_haltpoll_supported() Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-05-01 11:48   ` kernel test robot
2024-05-01 11:48     ` kernel test robot
2024-05-22 16:09   ` Joao Martins
2024-05-22 16:09     ` Joao Martins
2024-06-05  5:47     ` Ankur Arora
2024-06-05  5:47       ` Ankur Arora
2024-04-30 18:37 ` [PATCH 5/9] governors/haltpoll: drop kvm_para_available() check Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-04-30 18:37 ` [PATCH 6/9] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-04-30 18:37 ` [PATCH 7/9] arm64: define TIF_POLLING_NRFLAG Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-04-30 18:37 ` [PATCH 8/9] arm64: support cpuidle-haltpoll Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-05-30 23:07   ` Okanovic, Haris
2024-05-30 23:07     ` Okanovic, Haris
2024-06-04 23:09     ` Ankur Arora
2024-06-04 23:09       ` Ankur Arora
2024-06-19 12:17   ` Sudeep Holla
2024-06-21 23:59     ` Ankur Arora
2024-06-24 10:54       ` Sudeep Holla
2024-06-25  1:17         ` Ankur Arora
2024-04-30 18:37 ` [PATCH 9/9] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
2024-04-30 18:37   ` Ankur Arora
2024-04-30 18:56 ` Ankur Arora [this message]
2024-04-30 18:56   ` [PATCH 0/9] Enable haltpoll for arm64 Ankur Arora

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y18ubrwv.fsf@oracle.com \
    --to=ankur.a.arora@oracle.com \
    --cc=arnd@arndb.de \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=harisokn@amazon.com \
    --cc=hpa@zytor.com \
    --cc=joao.m.martins@oracle.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kvm@vger.kernel.org \
    --cc=lenb@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.