* [PATCH v7 00/10] Enable haltpoll on arm64
@ 2024-08-30 22:28 Ankur Arora
2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
` (9 more replies)
0 siblings, 10 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
This patchset enables the cpuidle-haltpoll driver and its namesake
governor on arm64. This is particularly useful for KVM guests, where it
reduces IPC latencies.
Comparing idle switching latencies on an arm64 KVM guest with
perf bench sched pipe:
usecs/op %stdev
no haltpoll (baseline) 13.48 +- 5.19%
with haltpoll 6.84 +- 22.07%
No change in performance for a similar test on x86:
usecs/op %stdev
haltpoll w/ cpu_relax() (baseline) 4.75 +- 1.76%
haltpoll w/ smp_cond_load_relaxed() 4.78 +- 2.31%
Both sets of tests were on otherwise idle systems with guest VCPUs
pinned to specific PCPUs. One reason for the higher stdev on arm64
is that trapping of the WFE instruction by the host KVM is contingent
on the number of tasks on the runqueue.
Tomohiro Misono and Haris Okanovic also report similar latency
improvements on Grace and Graviton systems [1] [2].
The patch series is organized in three parts:
- patch 1 reorganizes the poll_idle() loop, switching to
smp_cond_load_relaxed() in the polling loop.
Relatedly, patches 2 and 3 rework the config option ARCH_HAS_CPU_RELAX,
renaming it to ARCH_HAS_OPTIMIZED_POLL.
- patches 4-6 reorganize the haltpoll selection and init logic
to allow architecture code to select it.
- and finally, patches 7-10 add the bits for arm64 support.
What is still missing: this series largely completes the haltpoll side
of functionality for arm64. There are, however, a few related areas
that still need to be thrashed out:
- WFET support: WFE on arm64 does not guarantee that poll_idle()
would terminate in halt_poll_ns. Using WFET would address this.
- KVM_NO_POLL support on arm64
- KVM TWED support on arm64: allow the host to limit time spent in
WFE.
Changelog:
v7: No significant logic changes. Mostly a respin of v6.
- minor cleanup in poll_idle() (Christoph Lameter)
- fixes conflicts due to code movement in arch/arm64/kernel/cpuidle.c
(Tomohiro Misono)
v6:
- reordered the patches to keep poll_idle() and ARCH_HAS_OPTIMIZED_POLL
changes together (comment from Christoph Lameter)
- fleshes out the commit messages a bit more (comments from Christoph
Lameter, Sudeep Holla)
- also rework selection of cpuidle-haltpoll. Now selected based
on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
- moved back to arch_haltpoll_want() (comment from Joao Martins)
Also, arch_haltpoll_want() now takes the force parameter and is
now responsible for the complete selection (or not) of haltpoll.
- fixes the build breakage on i386
- fixes the cpuidle-haltpoll module breakage on arm64 (comment from
Tomohiro Misono, Haris Okanovic)
v5:
- rework the poll_idle() loop around smp_cond_load_relaxed() (review
comment from Tomohiro Misono.)
- also rework selection of cpuidle-haltpoll. Now selected based
on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
- arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
arm64 now depends on the event-stream being enabled.
- limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
- ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
v4 changes from v3:
- change 7/8 per Rafael input: drop the parens and use ret for the final check
- add 8/8 which renames the guard for building poll_state
v3 changes from v2:
- fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
- add Acked-by from Rafael Wysocki on 2/7
v2 changes from v1:
- added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
(this improves by 50% at least the CPU cycles consumed in the tests above:
10,716,881,137 now vs 14,503,014,257 before)
- removed the ifdef from patch 1 per RafaelW
Please review.
[1] https://lore.kernel.org/lkml/TY3PR01MB111481E9B0AF263ACC8EA5D4AE5BA2@TY3PR01MB11148.jpnprd01.prod.outlook.com/
[2] https://lore.kernel.org/lkml/104d0ec31cb45477e27273e089402d4205ee4042.camel@amazon.com/
Ankur Arora (5):
cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
arm64: idle: export arch_cpu_idle
arm64: support cpuidle-haltpoll
cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
Joao Martins (4):
Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
cpuidle-haltpoll: define arch_haltpoll_want()
governors/haltpoll: drop kvm_para_available() check
arm64: define TIF_POLLING_NRFLAG
Mihai Carabas (1):
cpuidle/poll_state: poll via smp_cond_load_relaxed()
arch/Kconfig | 3 +++
arch/arm64/Kconfig | 10 ++++++++++
arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
arch/arm64/include/asm/thread_info.h | 2 ++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/cpuidle_haltpoll.c | 22 ++++++++++++++++++++++
arch/arm64/kernel/idle.c | 1 +
arch/x86/Kconfig | 5 ++---
arch/x86/include/asm/cpuidle_haltpoll.h | 1 +
arch/x86/kernel/kvm.c | 13 +++++++++++++
drivers/acpi/processor_idle.c | 4 ++--
drivers/cpuidle/Kconfig | 5 ++---
drivers/cpuidle/Makefile | 2 +-
drivers/cpuidle/cpuidle-haltpoll.c | 12 +-----------
drivers/cpuidle/governors/haltpoll.c | 6 +-----
drivers/cpuidle/poll_state.c | 22 ++++++++++++++++------
drivers/idle/Kconfig | 1 +
include/linux/cpuidle.h | 2 +-
include/linux/cpuidle_haltpoll.h | 5 +++++
19 files changed, 95 insertions(+), 32 deletions(-)
create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
--
2.43.5
* [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed()
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
` (8 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
From: Mihai Carabas <mihai.carabas@oracle.com>
The inner loop in poll_idle() polls up to POLL_IDLE_RELAX_COUNT times,
checking to see if the thread has the TIF_NEED_RESCHED bit set. The
loop exits once the condition is met, or if the poll time limit has
been exceeded.
To minimize the number of instructions executed each iteration, the
time check is done only infrequently (once every POLL_IDLE_RELAX_COUNT
iterations). In addition, each loop iteration executes cpu_relax()
which on certain platforms provides a hint to the pipeline that the
loop is busy-waiting, thus allowing the processor to reduce power
consumption.
However, cpu_relax() is defined optimally only on x86. On arm64, for
instance, it is implemented as a YIELD which only serves as a hint to the
CPU that it should prioritize a different hardware thread if one is
available.
arm64, however, does expose a more optimal polling mechanism via
smp_cond_load_relaxed() which uses LDXR, WFE to wait until a store
to a specified region.
So restructure the loop, folding both checks into smp_cond_load_relaxed().
Also, move the time check to the head of the loop, allowing it to exit
straight away once TIF_NEED_RESCHED is set.
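The restructured loop can be modelled in user space as below. This is only a sketch under stated assumptions: C11 atomics stand in for the kernel primitives, and NEED_RESCHED, RELAX_COUNT, cond_load_sketch(), poll_idle_sketch() and fake_clock() are hypothetical names that do not exist in the kernel. It shows the time check sitting at the head of the outer loop while the inner wait folds the flag test and the iteration budget into one condition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NEED_RESCHED  (1u << 1)  /* hypothetical stand-in for _TIF_NEED_RESCHED */
#define RELAX_COUNT   200u       /* hypothetical stand-in for POLL_IDLE_RELAX_COUNT */

/*
 * Shape of a generic conditional load: reload *flags until either the
 * flag is set or the iteration budget is used up.
 */
static unsigned int cond_load_sketch(_Atomic unsigned int *flags)
{
	unsigned int loop_count = 0;
	unsigned int val;

	for (;;) {
		val = atomic_load_explicit(flags, memory_order_relaxed);
		if ((val & NEED_RESCHED) || loop_count++ >= RELAX_COUNT)
			break;
		/* cpu_relax() would go here; arm64 waits in WFE instead */
	}
	return val;
}

/* Returns true if polling stopped because the time limit was exceeded. */
static bool poll_idle_sketch(_Atomic unsigned int *flags,
			     uint64_t (*now)(void), uint64_t limit)
{
	uint64_t start;

	assert(flags != NULL && now != NULL);
	start = now();

	while (!(atomic_load_explicit(flags, memory_order_relaxed) & NEED_RESCHED)) {
		/* Time check first, once per batch of polls. */
		if (now() - start > limit)
			return true;
		cond_load_sketch(flags);
	}
	return false;
}

/* Deterministic clock for experimentation: advances one unit per call. */
static uint64_t fake_clock(void)
{
	static uint64_t t;

	return t++;
}
```

With a flags word that never gets NEED_RESCHED set, poll_idle_sketch(&flags, fake_clock, 5) returns true once the fake clock has moved past the limit; with the bit already set it returns false without polling at all.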
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/poll_state.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..fc1204426158 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
raw_local_irq_enable();
if (!current_set_polling_and_test()) {
- unsigned int loop_count = 0;
u64 limit;
limit = cpuidle_poll_time(drv, dev);
while (!need_resched()) {
- cpu_relax();
- if (loop_count++ < POLL_IDLE_RELAX_COUNT)
- continue;
-
- loop_count = 0;
+ unsigned int loop_count = 0;
if (local_clock_noinstr() - time_start > limit) {
dev->poll_time_limit = true;
break;
}
+
+ smp_cond_load_relaxed(&current_thread_info()->flags,
+ VAL & _TIF_NEED_RESCHED ||
+ loop_count++ >= POLL_IDLE_RELAX_COUNT);
}
}
raw_local_irq_disable();
--
2.43.5
* [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
` (7 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
ARCH_HAS_CPU_RELAX is defined on architectures that provide a
primitive (via cpu_relax()) that can be used as part of a polling
mechanism -- one that would be cheaper than spinning in a tight
loop.
However, recent changes in poll_idle() mean that a higher level
primitive -- smp_cond_load_relaxed() -- is used for polling. This would
in turn use cpu_relax() or an architecture-specific implementation.
On ARM64 in particular this turns into a WFE which waits on a store
to a cacheline instead of a busy poll.
Accordingly, condition the polling drivers on ARCH_HAS_OPTIMIZED_POLL
instead of ARCH_HAS_CPU_RELAX. While at it, make both intel-idle
and cpuidle-haltpoll explicitly depend on ARCH_HAS_OPTIMIZED_POLL.
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/x86/Kconfig | 2 +-
drivers/acpi/processor_idle.c | 4 ++--
drivers/cpuidle/Kconfig | 2 +-
drivers/cpuidle/Makefile | 2 +-
drivers/idle/Kconfig | 1 +
include/linux/cpuidle.h | 2 +-
6 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 007bab9f2a0e..c1b49d535eb8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -373,7 +373,7 @@ config ARCH_MAY_HAVE_PC_FDC
config GENERIC_CALIBRATE_DELAY
def_bool y
-config ARCH_HAS_CPU_RELAX
+config ARCH_HAS_OPTIMIZED_POLL
def_bool y
config ARCH_HIBERNATION_POSSIBLE
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 831fa4a12159..44096406d65d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -35,7 +35,7 @@
#include <asm/cpu.h>
#endif
-#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX) ? 1 : 0)
+#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL) ? 1 : 0)
static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
module_param(max_cstate, uint, 0400);
@@ -782,7 +782,7 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
if (max_cstate == 0)
max_cstate = 1;
- if (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX)) {
+ if (IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL)) {
cpuidle_poll_state_init(drv);
count = 1;
} else {
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index cac5997dca50..75f6e176bbc8 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -73,7 +73,7 @@ endmenu
config HALTPOLL_CPUIDLE
tristate "Halt poll cpuidle driver"
- depends on X86 && KVM_GUEST
+ depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
select CPU_IDLE_GOV_HALTPOLL
default y
help
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d103342b7cfc..f29dfd1525b0 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -7,7 +7,7 @@ obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
obj-$(CONFIG_DT_IDLE_STATES) += dt_idle_states.o
obj-$(CONFIG_DT_IDLE_GENPD) += dt_idle_genpd.o
-obj-$(CONFIG_ARCH_HAS_CPU_RELAX) += poll_state.o
+obj-$(CONFIG_ARCH_HAS_OPTIMIZED_POLL) += poll_state.o
obj-$(CONFIG_HALTPOLL_CPUIDLE) += cpuidle-haltpoll.o
##################################################################################
diff --git a/drivers/idle/Kconfig b/drivers/idle/Kconfig
index 6707d2539fc4..6f9b1d48fede 100644
--- a/drivers/idle/Kconfig
+++ b/drivers/idle/Kconfig
@@ -4,6 +4,7 @@ config INTEL_IDLE
depends on CPU_IDLE
depends on X86
depends on CPU_SUP_INTEL
+ depends on ARCH_HAS_OPTIMIZED_POLL
help
Enable intel_idle, a cpuidle driver that includes knowledge of
native Intel hardware idle features. The acpi_idle driver
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3183aeb7f5b4..7e7e58a17b07 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -275,7 +275,7 @@ static inline void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev,
}
#endif
-#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_CPU_RELAX)
+#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_OPTIMIZED_POLL)
void cpuidle_poll_state_init(struct cpuidle_driver *drv);
#else
static inline void cpuidle_poll_state_init(struct cpuidle_driver *drv) {}
--
2.43.5
* [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
2024-08-30 22:28 ` [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want() Ankur Arora
` (6 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
From: Joao Martins <joao.m.martins@oracle.com>
ARCH_HAS_OPTIMIZED_POLL gates selection of polling while idle in
poll_idle(). Move the configuration option to arch/Kconfig to allow
non-x86 architectures to select it.
Note that ARCH_HAS_OPTIMIZED_POLL should probably be exclusive with
GENERIC_IDLE_POLL_SETUP (which controls the generic polling logic in
cpu_idle_poll()). However, that would remove boot options
(hlt=, nohlt=). So, leave it untouched for now.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/Kconfig | 3 +++
arch/x86/Kconfig | 4 +---
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 975dd22a2dbd..d43894369015 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -264,6 +264,9 @@ config HAVE_ARCH_TRACEHOOK
config HAVE_DMA_CONTIGUOUS
bool
+config ARCH_HAS_OPTIMIZED_POLL
+ bool
+
config GENERIC_SMP_IDLE_THREAD
bool
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c1b49d535eb8..0d95170ea0f3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -134,6 +134,7 @@ config X86
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_HAS_OPTIMIZED_POLL
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
@@ -373,9 +374,6 @@ config ARCH_MAY_HAVE_PC_FDC
config GENERIC_CALIBRATE_DELAY
def_bool y
-config ARCH_HAS_OPTIMIZED_POLL
- def_bool y
-
config ARCH_HIBERNATION_POSSIBLE
def_bool y
--
2.43.5
* [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want()
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (2 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check Ankur Arora
` (5 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
From: Joao Martins <joao.m.martins@oracle.com>
kvm_para_has_hint(KVM_HINTS_REALTIME) is defined only on x86. In
pursuit of making cpuidle-haltpoll architecture independent, define
arch_haltpoll_want() which handles the architectural checks for
enabling haltpoll.
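The decision that the x86 arch_haltpoll_want() makes can be modelled in plain C as below. This is a sketch, not the kernel code: struct haltpoll_env, haltpoll_want_sketch() and haltpoll_want_demo() are hypothetical names, and the three booleans stand in for boot_option_idle_override, kvm_para_available() and kvm_para_has_hint(KVM_HINTS_REALTIME).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the platform state the x86 check consults. */
struct haltpoll_env {
	bool idle_override;  /* idle= was passed on the command line */
	bool kvm_guest;      /* stand-in for kvm_para_available() */
	bool realtime_hint;  /* stand-in for kvm_para_has_hint(KVM_HINTS_REALTIME) */
};

/* Mirrors the order of checks: idle= override, then guest, then hint or force. */
static bool haltpoll_want_sketch(const struct haltpoll_env *env, bool force)
{
	assert(env != NULL);

	if (env->idle_override)
		return false;
	if (!env->kvm_guest)
		return false;
	return env->realtime_hint || force;
}

/* Small usage example covering the interesting combinations. */
static void haltpoll_want_demo(void)
{
	struct haltpoll_env guest = { .kvm_guest = true };
	struct haltpoll_env bare = { .realtime_hint = true };

	assert(!haltpoll_want_sketch(&guest, false)); /* no hint, no force */
	assert(haltpoll_want_sketch(&guest, true));   /* force overrides the hint */
	assert(!haltpoll_want_sketch(&bare, true));   /* force cannot override "not a guest" */
}
```

Folding the force parameter into the arch hook means the generic driver no longer needs to know which checks force is allowed to bypass.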
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/x86/include/asm/cpuidle_haltpoll.h | 1 +
arch/x86/kernel/kvm.c | 13 +++++++++++++
drivers/cpuidle/cpuidle-haltpoll.c | 12 +-----------
include/linux/cpuidle_haltpoll.h | 5 +++++
4 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/cpuidle_haltpoll.h b/arch/x86/include/asm/cpuidle_haltpoll.h
index c8b39c6716ff..8a0a12769c2e 100644
--- a/arch/x86/include/asm/cpuidle_haltpoll.h
+++ b/arch/x86/include/asm/cpuidle_haltpoll.h
@@ -4,5 +4,6 @@
void arch_haltpoll_enable(unsigned int cpu);
void arch_haltpoll_disable(unsigned int cpu);
+bool arch_haltpoll_want(bool force);
#endif
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 263f8aed4e2c..63710cb1aa63 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1151,4 +1151,17 @@ void arch_haltpoll_disable(unsigned int cpu)
smp_call_function_single(cpu, kvm_enable_host_haltpoll, NULL, 1);
}
EXPORT_SYMBOL_GPL(arch_haltpoll_disable);
+
+bool arch_haltpoll_want(bool force)
+{
+ /* Do not load haltpoll if idle= is passed */
+ if (boot_option_idle_override != IDLE_NO_OVERRIDE)
+ return false;
+
+ if (!kvm_para_available())
+ return false;
+
+ return kvm_para_has_hint(KVM_HINTS_REALTIME) || force;
+}
+EXPORT_SYMBOL_GPL(arch_haltpoll_want);
#endif
diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c
index bcd03e893a0a..e532aa2bf608 100644
--- a/drivers/cpuidle/cpuidle-haltpoll.c
+++ b/drivers/cpuidle/cpuidle-haltpoll.c
@@ -15,7 +15,6 @@
#include <linux/cpuidle.h>
#include <linux/module.h>
#include <linux/sched/idle.h>
-#include <linux/kvm_para.h>
#include <linux/cpuidle_haltpoll.h>
static bool force __read_mostly;
@@ -93,21 +92,12 @@ static void haltpoll_uninit(void)
haltpoll_cpuidle_devices = NULL;
}
-static bool haltpoll_want(void)
-{
- return kvm_para_has_hint(KVM_HINTS_REALTIME) || force;
-}
-
static int __init haltpoll_init(void)
{
int ret;
struct cpuidle_driver *drv = &haltpoll_driver;
- /* Do not load haltpoll if idle= is passed */
- if (boot_option_idle_override != IDLE_NO_OVERRIDE)
- return -ENODEV;
-
- if (!kvm_para_available() || !haltpoll_want())
+ if (!arch_haltpoll_want(force))
return -ENODEV;
cpuidle_poll_state_init(drv);
diff --git a/include/linux/cpuidle_haltpoll.h b/include/linux/cpuidle_haltpoll.h
index d50c1e0411a2..68eb7a757120 100644
--- a/include/linux/cpuidle_haltpoll.h
+++ b/include/linux/cpuidle_haltpoll.h
@@ -12,5 +12,10 @@ static inline void arch_haltpoll_enable(unsigned int cpu)
static inline void arch_haltpoll_disable(unsigned int cpu)
{
}
+
+static inline bool arch_haltpoll_want(bool force)
+{
+ return false;
+}
#endif
#endif
--
2.43.5
* [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (3 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want() Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
` (4 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
From: Joao Martins <joao.m.martins@oracle.com>
The haltpoll governor is selected either by the cpuidle-haltpoll
driver, or explicitly by the user.
In particular, it is never selected by default since it has the lowest
rating of all governors (menu=20, teo=19, ladder=10/25, haltpoll=9).
So, we can safely forgo the kvm_para_available() check. This also
allows cpuidle-haltpoll to be tested on baremetal.
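The rating argument can be illustrated with a small model. This is a sketch that assumes selection simply prefers an explicitly requested governor and otherwise the highest rating; struct gov_sketch, govs[], pick_governor() and pick_governor_demo() are hypothetical names, and ladder is given its lower quoted rating of 10.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct gov_sketch {
	const char *name;
	unsigned int rating;
};

/* Ratings as quoted in the commit message; ladder uses its lower value here. */
static const struct gov_sketch govs[] = {
	{ "menu", 20 }, { "teo", 19 }, { "ladder", 10 }, { "haltpoll", 9 },
};

/*
 * Return the governor named by `requested` if it is registered;
 * otherwise fall back to the highest-rated one.
 */
static const char *pick_governor(const char *requested)
{
	const struct gov_sketch *best = &govs[0];
	size_t i;

	for (i = 0; i < sizeof(govs) / sizeof(govs[0]); i++) {
		if (requested && strcmp(govs[i].name, requested) == 0)
			return govs[i].name;
		if (govs[i].rating > best->rating)
			best = &govs[i];
	}
	return best->name;
}

/* Usage example: haltpoll only wins when asked for by name. */
static void pick_governor_demo(void)
{
	assert(strcmp(pick_governor(NULL), "menu") == 0);
	assert(strcmp(pick_governor("haltpoll"), "haltpoll") == 0);
}
```

Under this model, registering the haltpoll governor unconditionally cannot change the default choice, which is the property the commit message relies on.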
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/governors/haltpoll.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
index 663b7f164d20..c8752f793e61 100644
--- a/drivers/cpuidle/governors/haltpoll.c
+++ b/drivers/cpuidle/governors/haltpoll.c
@@ -18,7 +18,6 @@
#include <linux/tick.h>
#include <linux/sched.h>
#include <linux/module.h>
-#include <linux/kvm_para.h>
#include <trace/events/power.h>
static unsigned int guest_halt_poll_ns __read_mostly = 200000;
@@ -148,10 +147,7 @@ static struct cpuidle_governor haltpoll_governor = {
static int __init init_haltpoll(void)
{
- if (kvm_para_available())
- return cpuidle_register_governor(&haltpoll_governor);
-
- return 0;
+ return cpuidle_register_governor(&haltpoll_governor);
}
postcore_initcall(init_haltpoll);
--
2.43.5
* [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (4 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
` (3 subsequent siblings)
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
The cpuidle-haltpoll driver and its namesake governor are selected
under KVM_GUEST on X86. KVM_GUEST in turn selects ARCH_CPUIDLE_HALTPOLL
and defines the requisite arch_haltpoll_{enable,disable}() functions.
So remove the explicit dependence of HALTPOLL_CPUIDLE on KVM_GUEST,
and instead use ARCH_CPUIDLE_HALTPOLL as proxy for architectural
support for haltpoll.
Also change "halt poll" to "haltpoll" in one of the summary clauses,
since the second form is used everywhere else.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/x86/Kconfig | 1 +
drivers/cpuidle/Kconfig | 5 ++---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0d95170ea0f3..6d15e7e07459 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -839,6 +839,7 @@ config KVM_GUEST
config ARCH_CPUIDLE_HALTPOLL
def_bool n
+ depends on KVM_GUEST
prompt "Disable host haltpoll when loading haltpoll driver"
help
If virtualized under KVM, disable host haltpoll.
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 75f6e176bbc8..c1bebadf22bc 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -35,7 +35,6 @@ config CPU_IDLE_GOV_TEO
config CPU_IDLE_GOV_HALTPOLL
bool "Haltpoll governor (for virtualized systems)"
- depends on KVM_GUEST
help
This governor implements haltpoll idle state selection, to be
used in conjunction with the haltpoll cpuidle driver, allowing
@@ -72,8 +71,8 @@ source "drivers/cpuidle/Kconfig.riscv"
endmenu
config HALTPOLL_CPUIDLE
- tristate "Halt poll cpuidle driver"
- depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
+ tristate "Haltpoll cpuidle driver"
+ depends on ARCH_CPUIDLE_HALTPOLL && ARCH_HAS_OPTIMIZED_POLL
select CPU_IDLE_GOV_HALTPOLL
default y
help
--
2.43.5
* [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (5 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-09-03 8:10 ` Will Deacon
2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
` (2 subsequent siblings)
9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
From: Joao Martins <joao.m.martins@oracle.com>
Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
cpu_do_idle().
To add support for polling via cpuidle-haltpoll, we want to use the
standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
polling.
Reuse the same bit to define TIF_POLLING_NRFLAG.
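For readers unfamiliar with the flag, the handshake it enables can be sketched with C11 atomics. This is a user-space model under stated assumptions, not kernel code: F_NEED_RESCHED, F_POLLING, set_polling_and_test(), set_need_resched_if_polling() and polling_demo() are hypothetical names, and the bit positions are illustrative. The idea is that a CPU advertising "I am polling" can be woken by a plain store to its flags word, so the waker can skip sending an interrupt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define F_NEED_RESCHED  (1u << 1)   /* illustrative bit positions */
#define F_POLLING       (1u << 16)

/* Idle side: advertise polling, then report whether work is already pending. */
static bool set_polling_and_test(_Atomic unsigned int *flags)
{
	atomic_fetch_or(flags, F_POLLING);
	return (atomic_load(flags) & F_NEED_RESCHED) != 0;
}

/*
 * Waker side: returns true if setting the flag is enough to wake the
 * target, false if the target is not polling and needs an interrupt.
 */
static bool set_need_resched_if_polling(_Atomic unsigned int *flags)
{
	unsigned int val = atomic_load(flags);

	do {
		if (!(val & F_POLLING))
			return false;  /* not polling: caller must send an IPI */
		if (val & F_NEED_RESCHED)
			return true;   /* already flagged by someone else */
	} while (!atomic_compare_exchange_weak(flags, &val, val | F_NEED_RESCHED));

	return true;
}

/* Usage example walking through the handshake on a single thread. */
static void polling_demo(void)
{
	_Atomic unsigned int flags = 0;

	assert(!set_need_resched_if_polling(&flags)); /* target not polling yet */
	assert(!set_polling_and_test(&flags));        /* starts polling, nothing pending */
	assert(set_need_resched_if_polling(&flags));  /* store alone is sufficient */
	assert(atomic_load(&flags) & F_NEED_RESCHED);
}
```

The store performed by the waker is the kind of write that a WFE-based wait on the flags word would observe.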
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/include/asm/thread_info.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e72a3bf9e563..23ff72168e48 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -69,6 +69,7 @@ void arch_setup_new_exec(void);
#define TIF_SYSCALL_TRACEPOINT 10 /* syscall tracepoint for ftrace */
#define TIF_SECCOMP 11 /* syscall secure computing */
#define TIF_SYSCALL_EMU 12 /* syscall emulation active */
+#define TIF_POLLING_NRFLAG 16 /* set while polling in poll_idle() */
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
#define TIF_FREEZE 19
#define TIF_RESTORE_SIGMASK 20
@@ -91,6 +92,7 @@ void arch_setup_new_exec(void);
#define _TIF_SYSCALL_TRACEPOINT (1 << TIF_SYSCALL_TRACEPOINT)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
+#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
--
2.43.5
* [PATCH v7 08/10] arm64: idle: export arch_cpu_idle
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (6 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-09-03 8:14 ` Will Deacon
2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
2024-08-30 22:28 ` [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
Needed for cpuidle-haltpoll.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/kernel/idle.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index 05cfb347ec26..b85ba0df9b02 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -43,3 +43,4 @@ void __cpuidle arch_cpu_idle(void)
*/
cpu_do_idle();
}
+EXPORT_SYMBOL_GPL(arch_cpu_idle);
--
2.43.5
* [PATCH v7 09/10] arm64: support cpuidle-haltpoll
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (7 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
2024-09-03 8:13 ` Will Deacon
2024-08-30 22:28 ` [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
Add architectural support for the cpuidle-haltpoll driver by defining
arch_haltpoll_*().
Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
selected, and given that we have an optimized polling mechanism
in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
a memory region in exclusive state and the WFE waiting for any
stores to it.
In the edge case -- no CPU stores to the waited region and there's no
interrupt -- the event-stream will provide the terminating condition
ensuring we don't wait forever, but because the event-stream runs at
a fixed frequency (configured at 10kHz) we might spend more time in
the polling stage than specified by cpuidle_poll_time().
This would only happen in the last iteration, since overshooting the
poll_limit means the governor moves out of the polling stage.
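To put rough numbers on the overshoot, the arithmetic can be written out as below. This is a back-of-the-envelope sketch: evstream_period_ns() and worst_case_poll_ns() are hypothetical helpers, the bound assumes the time limit is re-checked after every event-stream wake-up and ignores wake-up latency, and the 200000 ns limit in the usage note is simply the haltpoll governor's default guest_halt_poll_ns taken as an example.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Interval between event-stream wake-ups for a given frequency. */
static uint64_t evstream_period_ns(uint64_t freq_hz)
{
	assert(freq_hz != 0);
	return NSEC_PER_SEC / freq_hz;
}

/*
 * Upper bound on time spent polling if nothing else wakes the CPU:
 * a WFE entered just before the limit expires returns at most one
 * event-stream period later. Assumes one time check per wake-up.
 */
static uint64_t worst_case_poll_ns(uint64_t limit_ns, uint64_t freq_hz)
{
	return limit_ns + evstream_period_ns(freq_hz);
}
```

At 10kHz, evstream_period_ns(10000) is 100000 ns, so with a 200000 ns limit the bound from worst_case_poll_ns(200000, 10000) is 300000 ns. If the time check runs only once per several wake-ups, the overshoot grows accordingly.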
Tested-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/arm64/Kconfig | 10 ++++++++++
arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/cpuidle_haltpoll.c | 22 ++++++++++++++++++++++
4 files changed, 43 insertions(+)
create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a2f8ff354ca6..9bd93ce2f9d9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -36,6 +36,7 @@ config ARM64
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ select ARCH_HAS_OPTIMIZED_POLL
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
@@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
config ARCH_SUSPEND_POSSIBLE
def_bool y
+config ARCH_CPUIDLE_HALTPOLL
+ bool "Enable selection of the cpuidle-haltpoll driver"
+ default n
+ help
+ cpuidle-haltpoll allows for adaptive polling based on
+ current load before entering the idle state.
+
+ Some virtualized workloads benefit from using it.
+
endmenu # "Power management options"
menu "CPU Power Management"
diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
new file mode 100644
index 000000000000..ed615a99803b
--- /dev/null
+++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ARCH_HALTPOLL_H
+#define _ARCH_HALTPOLL_H
+
+static inline void arch_haltpoll_enable(unsigned int cpu) { }
+static inline void arch_haltpoll_disable(unsigned int cpu) { }
+
+bool arch_haltpoll_want(bool force);
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 2b112f3b7510..bbfb57eda2f1 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-y += vdso-wrap.o
obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
+obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL) += cpuidle_haltpoll.o
# Force dependency (vdso*-wrap.S includes vdso.so through incbin)
$(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
new file mode 100644
index 000000000000..63fc5ebca79b
--- /dev/null
+++ b/arch/arm64/kernel/cpuidle_haltpoll.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/kernel.h>
+#include <clocksource/arm_arch_timer.h>
+#include <asm/cpuidle_haltpoll.h>
+
+bool arch_haltpoll_want(bool force)
+{
+ /*
+ * Enabling haltpoll requires two things:
+ *
+ * - Event stream support to provide a terminating condition to the
+ * WFE in the poll loop.
+ *
+ * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
+ *
+ * Given that the second is missing, allow haltpoll to only be force
+ * loaded.
+ */
+ return (arch_timer_evtstrm_available() && false) || force;
+}
+EXPORT_SYMBOL_GPL(arch_haltpoll_want);
--
2.43.5
* [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
` (8 preceding siblings ...)
2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
To: linux-pm, kvm, linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
smp_cond_load_relaxed(), in its generic polling variant, polls on
the loop condition waiting for it to change, eventually exiting the
loop if the time limit has been exceeded.
To limit the cost of the relatively expensive time check, it is
performed only once every POLL_IDLE_RELAX_COUNT iterations.
arm64, however, uses an event-based mechanism where, instead of
polling, we wait for a store to a region.
Limit POLL_IDLE_RELAX_COUNT to 1 for that case.
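The effect of the relax count can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: struct poll_sim, sim_clock() and poll_until_timeout() are hypothetical names, the clock is simulated, and the wakeup condition is assumed never to become true, so only the timeout ends the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulation state; not a kernel structure. */
struct poll_sim {
	uint64_t now_ns;          /* simulated clock */
	uint64_t step_ns;         /* time consumed by one relax or one WFE wait */
	unsigned int clock_reads; /* number of "expensive" clock reads */
};

/* Stand-in for the relatively expensive time check. */
static uint64_t sim_clock(struct poll_sim *s)
{
	s->clock_reads++;
	return s->now_ns;
}

/*
 * Poll until limit_ns has elapsed, reading the clock only once every
 * relax_count iterations. Returns the elapsed simulated time at exit.
 */
static uint64_t poll_until_timeout(struct poll_sim *s, uint64_t limit_ns,
				   unsigned int relax_count)
{
	uint64_t start = s->now_ns;
	unsigned int loop_count = 0;

	assert(relax_count > 0);

	for (;;) {
		/* One cpu_relax()-style spin or one event-based wait. */
		s->now_ns += s->step_ns;

		if (++loop_count < relax_count)
			continue;

		loop_count = 0;
		if (sim_clock(s) - start > limit_ns)
			return s->now_ns - start;
	}
}
```

With a short spin step the count of 200 keeps clock reads rare at little cost in accuracy. When each step is a wait of up to one event-stream period (100 us at 10kHz), a count of 200 would delay the first time check by about 20 ms, whereas a count of 1 checks after every wakeup.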
Suggested-by: Haris Okanovic <harisokn@amazon.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
drivers/cpuidle/poll_state.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index fc1204426158..61df2395585e 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -8,7 +8,18 @@
#include <linux/sched/clock.h>
#include <linux/sched/idle.h>
+#ifdef CONFIG_ARM64
+/*
+ * POLL_IDLE_RELAX_COUNT determines how often we check for timeout
+ * while polling for TIF_NEED_RESCHED in thread_info->flags.
+ *
+ * Set this to a low value since arm64, instead of polling, uses an
+ * event-based mechanism.
+ */
+#define POLL_IDLE_RELAX_COUNT 1
+#else
#define POLL_IDLE_RELAX_COUNT 200
+#endif
static int __cpuidle poll_idle(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
--
2.43.5
* Re: [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
@ 2024-09-03 8:10 ` Will Deacon
2024-09-03 21:13 ` Ankur Arora
0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2024-09-03 8:10 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 30, 2024 at 03:28:41PM -0700, Ankur Arora wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
>
> Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
> TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
> cpu_do_idle().
>
> To add support for polling via cpuidle-haltpoll, we want to use the
> standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
> polling.
>
> Reuse the same bit to define TIF_POLLING_NRFLAG.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
> Reviewed-by: Christoph Lameter <cl@linux.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/arm64/include/asm/thread_info.h | 2 ++
> 1 file changed, 2 insertions(+)
Acked-by: Will Deacon <will@kernel.org>
Will
* Re: [PATCH v7 09/10] arm64: support cpuidle-haltpoll
2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-09-03 8:13 ` Will Deacon
2024-09-03 21:12 ` Ankur Arora
0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2024-09-03 8:13 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 30, 2024 at 03:28:43PM -0700, Ankur Arora wrote:
> Add architectural support for cpuidle-haltpoll driver by defining
> arch_haltpoll_*().
>
> Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
> selected, and given that we have an optimized polling mechanism
> in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
>
> smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
> a memory region in exclusive state and the WFE waiting for any
> stores to it.
>
> In the edge case -- no CPU stores to the waited region and there's no
> interrupt -- the event-stream will provide the terminating condition
> ensuring we don't wait forever, but because the event-stream runs at
> a fixed frequency (configured at 10kHz) we might spend more time in
> the polling stage than specified by cpuidle_poll_time().
>
> This would only happen in the last iteration, since overshooting the
> poll_limit means the governor moves out of the polling stage.
>
> Tested-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/arm64/Kconfig | 10 ++++++++++
> arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/cpuidle_haltpoll.c | 22 ++++++++++++++++++++++
> 4 files changed, 43 insertions(+)
> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
> create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index a2f8ff354ca6..9bd93ce2f9d9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -36,6 +36,7 @@ config ARM64
> select ARCH_HAS_MEMBARRIER_SYNC_CORE
> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> + select ARCH_HAS_OPTIMIZED_POLL
> select ARCH_HAS_PTE_DEVMAP
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_HW_PTE_YOUNG
> @@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
> config ARCH_SUSPEND_POSSIBLE
> def_bool y
>
> +config ARCH_CPUIDLE_HALTPOLL
> + bool "Enable selection of the cpuidle-haltpoll driver"
> + default n
nit: this 'default n' line is redundant.
> + help
> + cpuidle-haltpoll allows for adaptive polling based on
> + current load before entering the idle state.
> +
> + Some virtualized workloads benefit from using it.
nit: This sentence is meaningless ^^.
> +
> endmenu # "Power management options"
>
> menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..ed615a99803b
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ARCH_HALTPOLL_H
> +#define _ARCH_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
> +
> +bool arch_haltpoll_want(bool force);
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 2b112f3b7510..bbfb57eda2f1 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
> obj-$(CONFIG_ARM64_MTE) += mte.o
> obj-y += vdso-wrap.o
> obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
> +obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL) += cpuidle_haltpoll.o
>
> # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
> $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
> diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
> new file mode 100644
> index 000000000000..63fc5ebca79b
> --- /dev/null
> +++ b/arch/arm64/kernel/cpuidle_haltpoll.c
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/kernel.h>
> +#include <clocksource/arm_arch_timer.h>
> +#include <asm/cpuidle_haltpoll.h>
> +
> +bool arch_haltpoll_want(bool force)
> +{
> + /*
> + * Enabling haltpoll requires two things:
> + *
> + * - Event stream support to provide a terminating condition to the
> + * WFE in the poll loop.
> + *
> + * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
> + *
> + * Given that the second is missing, allow haltpoll to only be force
> + * loaded.
> + */
> + return (arch_timer_evtstrm_available() && false) || force;
> +}
> +EXPORT_SYMBOL_GPL(arch_haltpoll_want);
This seems a bit over-the-top to justify a new C file. Just have a static
inline in the header which returns 'force'. The '&& false' is misleading
and unnecessary with the comment.
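For illustration, here is a user-space sketch of why the two forms are equivalent. It is not the code from any later revision: the stub for arch_timer_evtstrm_available() and the names haltpoll_want_v7(), haltpoll_want_simplified() and check_forms_agree() are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space stub standing in for the kernel's event-stream query. */
static bool evtstrm_available_stub = true;

static bool arch_timer_evtstrm_available(void)
{
	return evtstrm_available_stub;
}

/* The expression as posted in v7: the '&& false' term is always false. */
static inline bool haltpoll_want_v7(bool force)
{
	return (arch_timer_evtstrm_available() && false) || force;
}

/* The simplified form: a static inline that just returns 'force'. */
static inline bool haltpoll_want_simplified(bool force)
{
	return force;
}

/* Check that both forms agree for every combination of inputs. */
static void check_forms_agree(void)
{
	for (int evt = 0; evt <= 1; evt++) {
		evtstrm_available_stub = (evt != 0);
		for (int force = 0; force <= 1; force++)
			assert(haltpoll_want_v7(force != 0) ==
			       haltpoll_want_simplified(force != 0));
	}
}
```

Since the result never depends on the event-stream query, a header-only inline carries the same behaviour, and the dependency can be recorded in a comment instead.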
Will
* Re: [PATCH v7 08/10] arm64: idle: export arch_cpu_idle
2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
@ 2024-09-03 8:14 ` Will Deacon
0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2024-09-03 8:14 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
On Fri, Aug 30, 2024 at 03:28:42PM -0700, Ankur Arora wrote:
> Needed for cpuidle-haltpoll.
>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/arm64/kernel/idle.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
> index 05cfb347ec26..b85ba0df9b02 100644
> --- a/arch/arm64/kernel/idle.c
> +++ b/arch/arm64/kernel/idle.c
> @@ -43,3 +43,4 @@ void __cpuidle arch_cpu_idle(void)
> */
> cpu_do_idle();
> }
> +EXPORT_SYMBOL_GPL(arch_cpu_idle);
> --
> 2.43.5
Acked-by: Will Deacon <will@kernel.org>
Will
* Re: [PATCH v7 09/10] arm64: support cpuidle-haltpoll
2024-09-03 8:13 ` Will Deacon
@ 2024-09-03 21:12 ` Ankur Arora
0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-09-03 21:12 UTC (permalink / raw)
To: Will Deacon
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
Will Deacon <will@kernel.org> writes:
> On Fri, Aug 30, 2024 at 03:28:43PM -0700, Ankur Arora wrote:
>> Add architectural support for cpuidle-haltpoll driver by defining
>> arch_haltpoll_*().
>>
>> Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
>> selected, and given that we have an optimized polling mechanism
>> in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
>>
>> smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
>> a memory region in exclusive state and the WFE waiting for any
>> stores to it.
>>
>> In the edge case -- no CPU stores to the waited region and there's no
>> interrupt -- the event-stream will provide the terminating condition
>> ensuring we don't wait forever, but because the event-stream runs at
>> a fixed frequency (configured at 10kHz) we might spend more time in
>> the polling stage than specified by cpuidle_poll_time().
>>
>> This would only happen in the last iteration, since overshooting the
>> poll_limit means the governor moves out of the polling stage.
>>
>> Tested-by: Haris Okanovic <harisokn@amazon.com>
>> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> arch/arm64/Kconfig | 10 ++++++++++
>> arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
>> arch/arm64/kernel/Makefile | 1 +
>> arch/arm64/kernel/cpuidle_haltpoll.c | 22 ++++++++++++++++++++++
>> 4 files changed, 43 insertions(+)
>> create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>> create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index a2f8ff354ca6..9bd93ce2f9d9 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -36,6 +36,7 @@ config ARM64
>> select ARCH_HAS_MEMBARRIER_SYNC_CORE
>> select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> + select ARCH_HAS_OPTIMIZED_POLL
>> select ARCH_HAS_PTE_DEVMAP
>> select ARCH_HAS_PTE_SPECIAL
>> select ARCH_HAS_HW_PTE_YOUNG
>> @@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
>> config ARCH_SUSPEND_POSSIBLE
>> def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> + bool "Enable selection of the cpuidle-haltpoll driver"
>> + default n
>
> nit: this 'default n' line is redundant.
>
>> + help
>> + cpuidle-haltpoll allows for adaptive polling based on
>> + current load before entering the idle state.
>> +
>> + Some virtualized workloads benefit from using it.
>
> nit: This sentence is meaningless ^^.
Yeah. I think I added it to take care of a checkpatch warning.
But clearly it doesn't add anything useful. Will fix.
>> +
>> endmenu # "Power management options"
>>
>> menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..ed615a99803b
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,10 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _ARCH_HALTPOLL_H
>> +#define _ARCH_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
>> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
>> +
>> +bool arch_haltpoll_want(bool force);
>> +#endif
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 2b112f3b7510..bbfb57eda2f1 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
>> obj-$(CONFIG_ARM64_MTE) += mte.o
>> obj-y += vdso-wrap.o
>> obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
>> +obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL) += cpuidle_haltpoll.o
>>
>> # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
>> $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
>> diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
>> new file mode 100644
>> index 000000000000..63fc5ebca79b
>> --- /dev/null
>> +++ b/arch/arm64/kernel/cpuidle_haltpoll.c
>> @@ -0,0 +1,22 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include <linux/kernel.h>
>> +#include <clocksource/arm_arch_timer.h>
>> +#include <asm/cpuidle_haltpoll.h>
>> +
>> +bool arch_haltpoll_want(bool force)
>> +{
>> + /*
>> + * Enabling haltpoll requires two things:
>> + *
>> + * - Event stream support to provide a terminating condition to the
>> + * WFE in the poll loop.
>> + *
>> + * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
>> + *
>> + * Given that the second is missing, allow haltpoll to only be force
>> + * loaded.
>> + */
>> + return (arch_timer_evtstrm_available() && false) || force;
>> +}
>> +EXPORT_SYMBOL_GPL(arch_haltpoll_want);
>
> This seems a bit over-the-top to justify a new C file. Just have a static
> inline in the header which returns 'force'. The '&& false' is misleading
> and unnecessary with the comment.
So, the only reason for doing it this way was that I wanted to encode the
arch_timer_evtstrm_available() dependency. But you are right that the
comment suffices since the check itself is not operative.
Will fix.
Thanks for reviewing.
--
ankur
* Re: [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
2024-09-03 8:10 ` Will Deacon
@ 2024-09-03 21:13 ` Ankur Arora
0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-09-03 21:13 UTC (permalink / raw)
To: Will Deacon
Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
catalin.marinas, tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini,
wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
konrad.wilk
Will Deacon <will@kernel.org> writes:
> On Fri, Aug 30, 2024 at 03:28:41PM -0700, Ankur Arora wrote:
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
>> TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
>> cpu_do_idle().
>>
>> To add support for polling via cpuidle-haltpoll, we want to use the
>> standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
>> polling.
>>
>> Reuse the same bit to define TIF_POLLING_NRFLAG.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
>> Reviewed-by: Christoph Lameter <cl@linux.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> arch/arm64/include/asm/thread_info.h | 2 ++
>> 1 file changed, 2 insertions(+)
>
> Acked-by: Will Deacon <will@kernel.org>
Thanks!
--
ankur