linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v7 00/10] Enable haltpoll on arm64
@ 2024-08-30 22:28 Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
                   ` (9 more replies)
  0 siblings, 10 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

This patchset enables the cpuidle-haltpoll driver and its namesake
governor on arm64. This is of particular interest for KVM guests, where
it reduces IPC latencies.

Comparing idle switching latencies on an arm64 KVM guest with 
perf bench sched pipe:

                                     usecs/op       %stdev   

  no haltpoll (baseline)               13.48       +-  5.19%
  with haltpoll                         6.84       +- 22.07%


No change in performance for a similar test on x86:

                                     usecs/op        %stdev   

  haltpoll w/ cpu_relax() (baseline)     4.75      +-  1.76%
  haltpoll w/ smp_cond_load_relaxed()    4.78      +-  2.31%

Both sets of tests were on otherwise idle systems with guest VCPUs
pinned to specific PCPUs. One reason for the higher stdev on arm64
is that trapping of the WFE instruction by the host KVM is contingent
on the number of tasks on the runqueue.
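
As a quick cross-check of the tables above, the relative change can be
computed from the reported means. This is only a sketch: the helper names
are illustrative and not part of the series.

```c
#include <stdio.h>

/* Ratio of baseline latency to new latency (2.0 means "twice as fast"). */
static inline double speedup(double baseline_usecs, double new_usecs)
{
	return baseline_usecs / new_usecs;
}

/* Latency reduction relative to the baseline, as a percentage. */
static inline double reduction_pct(double baseline_usecs, double new_usecs)
{
	return 100.0 * (baseline_usecs - new_usecs) / baseline_usecs;
}

/* Print both figures for one pair of measurements. */
static inline void report(const char *label, double baseline, double now)
{
	printf("%s: %.2fx speedup, %.1f%% change in latency\n", label,
	       speedup(baseline, now), reduction_pct(baseline, now));
}
```

Calling report("arm64", 13.48, 6.84) and report("x86", 4.75, 4.78) prints
the two comparisons; the arm64 latency is roughly halved, while the x86
difference is well inside the reported stdev.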

Tomohiro Misono and Haris Okanovic also report similar latency
improvements on Grace and Graviton systems [1] [2].

The patch series is organized in three parts: 

 - patch 1 reorganizes the poll_idle() loop, switching it to poll via
   smp_cond_load_relaxed().
   Relatedly, patches 2 and 3 rework the config option
   ARCH_HAS_CPU_RELAX, renaming it to ARCH_HAS_OPTIMIZED_POLL.

 - patches 4-6 reorganize the haltpoll selection and init logic
   to allow architecture code to select it. 

 - and finally, patches 7-10 add the bits for arm64 support.

What is still missing: this series largely completes the haltpoll side
of functionality for arm64. There are, however, a few related areas
that still need to be thrashed out:

 - WFET support: WFE on arm64 does not guarantee that poll_idle()
   would terminate within halt_poll_ns. Using WFET would address this.
 - KVM_NO_POLL support on arm64
 - KVM TWED support on arm64: allow the host to limit time spent in
   WFE.


Changelog:

v7: No significant logic changes. Mostly a respin of v6.

 - minor cleanup in poll_idle() (Christoph Lameter)
 - fixes conflicts due to code movement in arch/arm64/kernel/cpuidle.c
   (Tomohiro Misono)

v6:

 - reordered the patches to keep poll_idle() and ARCH_HAS_OPTIMIZED_POLL
   changes together (comment from Christoph Lameter)
 - fleshes out the commit messages a bit more (comments from Christoph
   Lameter, Sudeep Holla)
 - also rework selection of cpuidle-haltpoll. Now selected based
   on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
 - moved back to arch_haltpoll_want() (comment from Joao Martins)
   Also, arch_haltpoll_want() now takes the force parameter and is
   now responsible for the complete selection (or not) of haltpoll.
 - fixes the build breakage on i386
 - fixes the cpuidle-haltpoll module breakage on arm64 (comment from
   Tomohiro Misono, Haris Okanovic)


v5:
 - rework the poll_idle() loop around smp_cond_load_relaxed() (review
   comment from Tomohiro Misono.)
 - also rework selection of cpuidle-haltpoll. Now selected based
   on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
 - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
   arm64 now depends on the event-stream being enabled.
 - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
 - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.

v4 changes from v3:
 - change 7/8 per Rafael input: drop the parens and use ret for the final check
 - add 8/8 which renames the guard for building poll_state

v3 changes from v2:
 - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
 - add Acked-by from Rafael Wysocki on 2/7

v2 changes from v1:
 - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
   (this improves by 50% at least the CPU cycles consumed in the tests above:
   10,716,881,137 now vs 14,503,014,257 before)
 - removed the ifdef from patch 1 per RafaelW

Please review.

[1] https://lore.kernel.org/lkml/TY3PR01MB111481E9B0AF263ACC8EA5D4AE5BA2@TY3PR01MB11148.jpnprd01.prod.outlook.com/
[2] https://lore.kernel.org/lkml/104d0ec31cb45477e27273e089402d4205ee4042.camel@amazon.com/

Ankur Arora (5):
  cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
  cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
  arm64: idle: export arch_cpu_idle
  arm64: support cpuidle-haltpoll
  cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64

Joao Martins (4):
  Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
  cpuidle-haltpoll: define arch_haltpoll_want()
  governors/haltpoll: drop kvm_para_available() check
  arm64: define TIF_POLLING_NRFLAG

Mihai Carabas (1):
  cpuidle/poll_state: poll via smp_cond_load_relaxed()

 arch/Kconfig                              |  3 +++
 arch/arm64/Kconfig                        | 10 ++++++++++
 arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
 arch/arm64/include/asm/thread_info.h      |  2 ++
 arch/arm64/kernel/Makefile                |  1 +
 arch/arm64/kernel/cpuidle_haltpoll.c      | 22 ++++++++++++++++++++++
 arch/arm64/kernel/idle.c                  |  1 +
 arch/x86/Kconfig                          |  5 ++---
 arch/x86/include/asm/cpuidle_haltpoll.h   |  1 +
 arch/x86/kernel/kvm.c                     | 13 +++++++++++++
 drivers/acpi/processor_idle.c             |  4 ++--
 drivers/cpuidle/Kconfig                   |  5 ++---
 drivers/cpuidle/Makefile                  |  2 +-
 drivers/cpuidle/cpuidle-haltpoll.c        | 12 +-----------
 drivers/cpuidle/governors/haltpoll.c      |  6 +-----
 drivers/cpuidle/poll_state.c              | 22 ++++++++++++++++------
 drivers/idle/Kconfig                      |  1 +
 include/linux/cpuidle.h                   |  2 +-
 include/linux/cpuidle_haltpoll.h          |  5 +++++
 19 files changed, 95 insertions(+), 32 deletions(-)
 create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
 create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c

-- 
2.43.5




* [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed()
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

From: Mihai Carabas <mihai.carabas@oracle.com>

The inner loop in poll_idle() polls up to POLL_IDLE_RELAX_COUNT times,
checking to see if the thread has the TIF_NEED_RESCHED bit set. The
loop exits once the condition is met, or if the poll time limit has
been exceeded.

To minimize the number of instructions executed each iteration, the
time check is done only infrequently (once every POLL_IDLE_RELAX_COUNT
iterations). In addition, each loop iteration executes cpu_relax()
which on certain platforms provides a hint to the pipeline that the
loop is busy-waiting, thus allowing the processor to reduce power
consumption.

However, cpu_relax() is defined optimally only on x86. On arm64, for
instance, it is implemented as a YIELD, which only serves as a hint to
the CPU that it should prioritize a different hardware thread if one is
available. arm64 does, however, expose a more efficient polling
mechanism via smp_cond_load_relaxed(), which uses LDXR and WFE to wait
for a store to a specified region.

So restructure the loop, folding both checks into
smp_cond_load_relaxed(). Also, move the time check to the head of the
loop, allowing it to exit straight away once TIF_NEED_RESCHED is set.
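
The restructured control flow can be modeled in userspace roughly as
below. This is a sketch under stated assumptions, not the kernel code:
the flag bit, RELAX_COUNT, the clock callback and fake_clock() are all
hypothetical stand-ins, and the inner do/while merely imitates what
smp_cond_load_relaxed() does with a plain reload loop.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NEED_RESCHED_BIT	(1UL << 1)	/* hypothetical flag bit */
#define RELAX_COUNT		200		/* stand-in for POLL_IDLE_RELAX_COUNT */

/* Hypothetical clock callback, standing in for local_clock_noinstr(). */
typedef uint64_t (*clock_fn)(void);

/* Fake clock for demonstration: advances by 1000 ns on every call. */
static uint64_t fake_now_ns;
static inline uint64_t fake_clock(void)
{
	return fake_now_ns += 1000;
}

/*
 * Returns true if polling stopped because the time limit expired,
 * false if the NEED_RESCHED bit was observed.
 */
static inline bool poll_sketch(_Atomic unsigned long *flags, clock_fn now,
			       uint64_t limit_ns)
{
	uint64_t start = now();

	while (!(atomic_load_explicit(flags, memory_order_relaxed) &
		 NEED_RESCHED_BIT)) {
		unsigned int loop_count = 0;
		unsigned long val;

		/* The time check sits at the head of the loop. */
		if (now() - start > limit_ns)
			return true;

		/*
		 * Imitates smp_cond_load_relaxed(): reload the flags until
		 * the bit is set or the iteration budget is used up.
		 */
		do {
			val = atomic_load_explicit(flags, memory_order_relaxed);
		} while (!(val & NEED_RESCHED_BIT) &&
			 loop_count++ < RELAX_COUNT);
	}
	return false;
}
```

With the flags word left at zero and a small limit, poll_sketch() returns
true once the fake clock passes the limit; with NEED_RESCHED_BIT already
set it returns false without consulting the limit at all.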

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 drivers/cpuidle/poll_state.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index 9b6d90a72601..fc1204426158 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -21,21 +21,20 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
 
 	raw_local_irq_enable();
 	if (!current_set_polling_and_test()) {
-		unsigned int loop_count = 0;
 		u64 limit;
 
 		limit = cpuidle_poll_time(drv, dev);
 
 		while (!need_resched()) {
-			cpu_relax();
-			if (loop_count++ < POLL_IDLE_RELAX_COUNT)
-				continue;
-
-			loop_count = 0;
+			unsigned int loop_count = 0;
 			if (local_clock_noinstr() - time_start > limit) {
 				dev->poll_time_limit = true;
 				break;
 			}
+
+			smp_cond_load_relaxed(&current_thread_info()->flags,
+					      VAL & _TIF_NEED_RESCHED ||
+					      loop_count++ >= POLL_IDLE_RELAX_COUNT);
 		}
 	}
 	raw_local_irq_disable();
-- 
2.43.5




* [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

ARCH_HAS_CPU_RELAX is defined on architectures that provide a
primitive (via cpu_relax()) that can be used as part of a polling
mechanism -- one that would be cheaper than spinning in a tight
loop.

However, recent changes in poll_idle() mean that a higher level
primitive -- smp_cond_load_relaxed() -- is used for polling. This would
in turn use cpu_relax() or an architecture specific implementation.
On arm64 in particular this turns into a WFE which waits on a store
to a cacheline instead of a busy poll.

Accordingly, condition the polling drivers on ARCH_HAS_OPTIMIZED_POLL
instead of ARCH_HAS_CPU_RELAX. While at it, make both intel-idle
and cpuidle-haltpoll explicitly depend on ARCH_HAS_OPTIMIZED_POLL.
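
For readers unfamiliar with how a Kconfig symbol turns into a
compile-time constant, the following is a simplified userspace imitation
of the IS_ENABLED() idiom used by ACPI_IDLE_STATE_START. The SK_* macros
are hypothetical stand-ins written for this sketch (they ignore the
=m case), and CONFIG_SKETCH_UNSET is an invented, deliberately undefined
option.

```c
/* If the option expands to 1, this pastes to "0," and shifts the args. */
#define SK_ARG_PLACEHOLDER_1 0,
#define SK_TAKE_SECOND_ARG(ignored, val, ...) val

/* Extra level of indirection so the option is expanded before pasting. */
#define SK_IS_DEFINED(x)		SK_IS_DEFINED2(x)
#define SK_IS_DEFINED2(val)		SK_IS_DEFINED3(SK_ARG_PLACEHOLDER_##val)
#define SK_IS_DEFINED3(arg1_or_junk)	SK_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

/* Evaluates to 1 if the option is defined to 1, and to 0 otherwise. */
#define SK_IS_ENABLED(option)		SK_IS_DEFINED(option)

/* Pretend the architecture selected the option. */
#define CONFIG_ARCH_HAS_OPTIMIZED_POLL 1

/* Same shape as the definition in drivers/acpi/processor_idle.c. */
#define ACPI_IDLE_STATE_START \
	(SK_IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL) ? 1 : 0)

/* Index of the first ACPI idle state: 1 when a polling state occupies 0. */
static inline int first_acpi_state_index(void)
{
	return ACPI_IDLE_STATE_START;
}
```

Because the result is an ordinary integer constant, both branches of an
"if (IS_ENABLED(...))" are compiled and type-checked, and the dead one
is then discarded by the compiler.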

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/x86/Kconfig              | 2 +-
 drivers/acpi/processor_idle.c | 4 ++--
 drivers/cpuidle/Kconfig       | 2 +-
 drivers/cpuidle/Makefile      | 2 +-
 drivers/idle/Kconfig          | 1 +
 include/linux/cpuidle.h       | 2 +-
 6 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 007bab9f2a0e..c1b49d535eb8 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -373,7 +373,7 @@ config ARCH_MAY_HAVE_PC_FDC
 config GENERIC_CALIBRATE_DELAY
 	def_bool y
 
-config ARCH_HAS_CPU_RELAX
+config ARCH_HAS_OPTIMIZED_POLL
 	def_bool y
 
 config ARCH_HIBERNATION_POSSIBLE
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 831fa4a12159..44096406d65d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -35,7 +35,7 @@
 #include <asm/cpu.h>
 #endif
 
-#define ACPI_IDLE_STATE_START	(IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX) ? 1 : 0)
+#define ACPI_IDLE_STATE_START	(IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL) ? 1 : 0)
 
 static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER;
 module_param(max_cstate, uint, 0400);
@@ -782,7 +782,7 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
 	if (max_cstate == 0)
 		max_cstate = 1;
 
-	if (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX)) {
+	if (IS_ENABLED(CONFIG_ARCH_HAS_OPTIMIZED_POLL)) {
 		cpuidle_poll_state_init(drv);
 		count = 1;
 	} else {
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index cac5997dca50..75f6e176bbc8 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -73,7 +73,7 @@ endmenu
 
 config HALTPOLL_CPUIDLE
 	tristate "Halt poll cpuidle driver"
-	depends on X86 && KVM_GUEST
+	depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
 	select CPU_IDLE_GOV_HALTPOLL
 	default y
 	help
diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
index d103342b7cfc..f29dfd1525b0 100644
--- a/drivers/cpuidle/Makefile
+++ b/drivers/cpuidle/Makefile
@@ -7,7 +7,7 @@ obj-y += cpuidle.o driver.o governor.o sysfs.o governors/
 obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
 obj-$(CONFIG_DT_IDLE_STATES)		  += dt_idle_states.o
 obj-$(CONFIG_DT_IDLE_GENPD)		  += dt_idle_genpd.o
-obj-$(CONFIG_ARCH_HAS_CPU_RELAX)	  += poll_state.o
+obj-$(CONFIG_ARCH_HAS_OPTIMIZED_POLL)	  += poll_state.o
 obj-$(CONFIG_HALTPOLL_CPUIDLE)		  += cpuidle-haltpoll.o
 
 ##################################################################################
diff --git a/drivers/idle/Kconfig b/drivers/idle/Kconfig
index 6707d2539fc4..6f9b1d48fede 100644
--- a/drivers/idle/Kconfig
+++ b/drivers/idle/Kconfig
@@ -4,6 +4,7 @@ config INTEL_IDLE
 	depends on CPU_IDLE
 	depends on X86
 	depends on CPU_SUP_INTEL
+	depends on ARCH_HAS_OPTIMIZED_POLL
 	help
 	  Enable intel_idle, a cpuidle driver that includes knowledge of
 	  native Intel hardware idle features.  The acpi_idle driver
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 3183aeb7f5b4..7e7e58a17b07 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -275,7 +275,7 @@ static inline void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev,
 }
 #endif
 
-#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_CPU_RELAX)
+#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_OPTIMIZED_POLL)
 void cpuidle_poll_state_init(struct cpuidle_driver *drv);
 #else
 static inline void cpuidle_poll_state_init(struct cpuidle_driver *drv) {}
-- 
2.43.5




* [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want() Ankur Arora
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

From: Joao Martins <joao.m.martins@oracle.com>

ARCH_HAS_OPTIMIZED_POLL gates selection of polling while idle in
poll_idle(). Move the configuration option to arch/Kconfig to allow
non-x86 architectures to select it.

Note that ARCH_HAS_OPTIMIZED_POLL should probably be exclusive with
GENERIC_IDLE_POLL_SETUP (which controls the generic polling logic in
cpu_idle_poll()). However, that would remove boot options
(hlt=, nohlt=). So, leave it untouched for now.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/Kconfig     | 3 +++
 arch/x86/Kconfig | 4 +---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 975dd22a2dbd..d43894369015 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -264,6 +264,9 @@ config HAVE_ARCH_TRACEHOOK
 config HAVE_DMA_CONTIGUOUS
 	bool
 
+config ARCH_HAS_OPTIMIZED_POLL
+	bool
+
 config GENERIC_SMP_IDLE_THREAD
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c1b49d535eb8..0d95170ea0f3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -134,6 +134,7 @@ config X86
 	select ARCH_WANTS_NO_INSTR
 	select ARCH_WANT_GENERAL_HUGETLB
 	select ARCH_WANT_HUGE_PMD_SHARE
+	select ARCH_HAS_OPTIMIZED_POLL
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP	if X86_64
@@ -373,9 +374,6 @@ config ARCH_MAY_HAVE_PC_FDC
 config GENERIC_CALIBRATE_DELAY
 	def_bool y
 
-config ARCH_HAS_OPTIMIZED_POLL
-	def_bool y
-
 config ARCH_HIBERNATION_POSSIBLE
 	def_bool y
 
-- 
2.43.5




* [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want()
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (2 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check Ankur Arora
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

From: Joao Martins <joao.m.martins@oracle.com>

kvm_para_has_hint(KVM_HINTS_REALTIME) is defined only on x86. In
pursuit of making cpuidle-haltpoll architecture independent, define
arch_haltpoll_want() which handles the architectural checks for
enabling haltpoll.
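
The decision that the x86 arch_haltpoll_want() makes can be modeled
outside the kernel as below. This is only an illustration: struct
haltpoll_env and its fields are hypothetical stand-ins for
boot_option_idle_override, kvm_para_available() and
kvm_para_has_hint(KVM_HINTS_REALTIME).

```c
#include <stdbool.h>

/* Hypothetical snapshot of the inputs the x86 check consults. */
struct haltpoll_env {
	bool idle_override;	/* an idle= boot option was passed */
	bool kvm_guest;		/* stand-in for kvm_para_available() */
	bool realtime_hint;	/* stand-in for the KVM_HINTS_REALTIME hint */
};

/* Mirrors the order of checks in the x86 arch_haltpoll_want(). */
static inline bool haltpoll_want_sketch(const struct haltpoll_env *env,
					bool force)
{
	/* Do not load haltpoll if idle= is passed. */
	if (env->idle_override)
		return false;

	/* Only meaningful when running as a KVM guest. */
	if (!env->kvm_guest)
		return false;

	return env->realtime_hint || force;
}
```

Note that force cannot override the first two checks: it only replaces
the realtime hint once the guest and idle= conditions already hold.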

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/x86/include/asm/cpuidle_haltpoll.h |  1 +
 arch/x86/kernel/kvm.c                   | 13 +++++++++++++
 drivers/cpuidle/cpuidle-haltpoll.c      | 12 +-----------
 include/linux/cpuidle_haltpoll.h        |  5 +++++
 4 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/cpuidle_haltpoll.h b/arch/x86/include/asm/cpuidle_haltpoll.h
index c8b39c6716ff..8a0a12769c2e 100644
--- a/arch/x86/include/asm/cpuidle_haltpoll.h
+++ b/arch/x86/include/asm/cpuidle_haltpoll.h
@@ -4,5 +4,6 @@
 
 void arch_haltpoll_enable(unsigned int cpu);
 void arch_haltpoll_disable(unsigned int cpu);
+bool arch_haltpoll_want(bool force);
 
 #endif
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 263f8aed4e2c..63710cb1aa63 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1151,4 +1151,17 @@ void arch_haltpoll_disable(unsigned int cpu)
 	smp_call_function_single(cpu, kvm_enable_host_haltpoll, NULL, 1);
 }
 EXPORT_SYMBOL_GPL(arch_haltpoll_disable);
+
+bool arch_haltpoll_want(bool force)
+{
+	/* Do not load haltpoll if idle= is passed */
+	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
+		return false;
+
+	if (!kvm_para_available())
+		return false;
+
+	return kvm_para_has_hint(KVM_HINTS_REALTIME) || force;
+}
+EXPORT_SYMBOL_GPL(arch_haltpoll_want);
 #endif
diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c
index bcd03e893a0a..e532aa2bf608 100644
--- a/drivers/cpuidle/cpuidle-haltpoll.c
+++ b/drivers/cpuidle/cpuidle-haltpoll.c
@@ -15,7 +15,6 @@
 #include <linux/cpuidle.h>
 #include <linux/module.h>
 #include <linux/sched/idle.h>
-#include <linux/kvm_para.h>
 #include <linux/cpuidle_haltpoll.h>
 
 static bool force __read_mostly;
@@ -93,21 +92,12 @@ static void haltpoll_uninit(void)
 	haltpoll_cpuidle_devices = NULL;
 }
 
-static bool haltpoll_want(void)
-{
-	return kvm_para_has_hint(KVM_HINTS_REALTIME) || force;
-}
-
 static int __init haltpoll_init(void)
 {
 	int ret;
 	struct cpuidle_driver *drv = &haltpoll_driver;
 
-	/* Do not load haltpoll if idle= is passed */
-	if (boot_option_idle_override != IDLE_NO_OVERRIDE)
-		return -ENODEV;
-
-	if (!kvm_para_available() || !haltpoll_want())
+	if (!arch_haltpoll_want(force))
 		return -ENODEV;
 
 	cpuidle_poll_state_init(drv);
diff --git a/include/linux/cpuidle_haltpoll.h b/include/linux/cpuidle_haltpoll.h
index d50c1e0411a2..68eb7a757120 100644
--- a/include/linux/cpuidle_haltpoll.h
+++ b/include/linux/cpuidle_haltpoll.h
@@ -12,5 +12,10 @@ static inline void arch_haltpoll_enable(unsigned int cpu)
 static inline void arch_haltpoll_disable(unsigned int cpu)
 {
 }
+
+static inline bool arch_haltpoll_want(bool force)
+{
+	return false;
+}
 #endif
 #endif
-- 
2.43.5




* [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (3 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want() Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

From: Joao Martins <joao.m.martins@oracle.com>

The haltpoll governor is selected either by the cpuidle-haltpoll
driver, or explicitly by the user.
In particular, it is never selected by default since it has the lowest
rating of all governors (menu=20, teo=19, ladder=10/25, haltpoll=9).

So, we can safely forgo the kvm_para_available() check. This also
allows cpuidle-haltpoll to be tested on bare metal.
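
To illustrate why the rating keeps haltpoll from becoming the default,
here is a small sketch that assumes the default governor is simply the
highest-rated registered one. That is a simplification of the real
registration logic; struct gov and pick_default() are hypothetical
names, and the ratings are the ones quoted above (using 10 for ladder).

```c
#include <stddef.h>

/* Hypothetical, pared-down view of a governor: just a name and a rating. */
struct gov {
	const char *name;
	int rating;
};

/* Return the highest-rated governor, or NULL if none are registered. */
static inline const struct gov *pick_default(const struct gov *govs, size_t n)
{
	const struct gov *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!best || govs[i].rating > best->rating)
			best = &govs[i];
	}
	return best;
}

/* Ratings as quoted in the commit message. */
static const struct gov sketch_govs[] = {
	{ "haltpoll", 9 },
	{ "ladder", 10 },
	{ "teo", 19 },
	{ "menu", 20 },
};
```

Under this model haltpoll wins only when it is the sole entry, which is
consistent with it being chosen by the driver or explicitly by the user
rather than by default.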

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 drivers/cpuidle/governors/haltpoll.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/cpuidle/governors/haltpoll.c b/drivers/cpuidle/governors/haltpoll.c
index 663b7f164d20..c8752f793e61 100644
--- a/drivers/cpuidle/governors/haltpoll.c
+++ b/drivers/cpuidle/governors/haltpoll.c
@@ -18,7 +18,6 @@
 #include <linux/tick.h>
 #include <linux/sched.h>
 #include <linux/module.h>
-#include <linux/kvm_para.h>
 #include <trace/events/power.h>
 
 static unsigned int guest_halt_poll_ns __read_mostly = 200000;
@@ -148,10 +147,7 @@ static struct cpuidle_governor haltpoll_governor = {
 
 static int __init init_haltpoll(void)
 {
-	if (kvm_para_available())
-		return cpuidle_register_governor(&haltpoll_governor);
-
-	return 0;
+	return cpuidle_register_governor(&haltpoll_governor);
 }
 
 postcore_initcall(init_haltpoll);
-- 
2.43.5




* [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (4 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

The cpuidle-haltpoll driver and its namesake governor are selected
under KVM_GUEST on X86. KVM_GUEST in turn selects ARCH_CPUIDLE_HALTPOLL
and defines the requisite arch_haltpoll_{enable,disable}() functions.

So remove the explicit dependence of HALTPOLL_CPUIDLE on KVM_GUEST,
and instead use ARCH_CPUIDLE_HALTPOLL as proxy for architectural
support for haltpoll.

Also change "halt poll" to "haltpoll" in one of the summary clauses,
since the second form is used everywhere else.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/x86/Kconfig        | 1 +
 drivers/cpuidle/Kconfig | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0d95170ea0f3..6d15e7e07459 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -839,6 +839,7 @@ config KVM_GUEST
 
 config ARCH_CPUIDLE_HALTPOLL
 	def_bool n
+	depends on KVM_GUEST
 	prompt "Disable host haltpoll when loading haltpoll driver"
 	help
 	  If virtualized under KVM, disable host haltpoll.
diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
index 75f6e176bbc8..c1bebadf22bc 100644
--- a/drivers/cpuidle/Kconfig
+++ b/drivers/cpuidle/Kconfig
@@ -35,7 +35,6 @@ config CPU_IDLE_GOV_TEO
 
 config CPU_IDLE_GOV_HALTPOLL
 	bool "Haltpoll governor (for virtualized systems)"
-	depends on KVM_GUEST
 	help
 	  This governor implements haltpoll idle state selection, to be
 	  used in conjunction with the haltpoll cpuidle driver, allowing
@@ -72,8 +71,8 @@ source "drivers/cpuidle/Kconfig.riscv"
 endmenu
 
 config HALTPOLL_CPUIDLE
-	tristate "Halt poll cpuidle driver"
-	depends on X86 && KVM_GUEST && ARCH_HAS_OPTIMIZED_POLL
+	tristate "Haltpoll cpuidle driver"
+	depends on ARCH_CPUIDLE_HALTPOLL && ARCH_HAS_OPTIMIZED_POLL
 	select CPU_IDLE_GOV_HALTPOLL
 	default y
 	help
-- 
2.43.5




* [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (5 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-09-03  8:10   ` Will Deacon
  2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

From: Joao Martins <joao.m.martins@oracle.com>

Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") removed
TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
cpu_do_idle().

To add support for polling via cpuidle-haltpoll, we want to use the
standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
polling.

Reuse the same bit to define TIF_POLLING_NRFLAG.
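
As an illustration of how such a flag bit is used, the sketch below sets
and tests bit 16 in a plain flags word. The SK_* names and helpers are
hypothetical; the last helper reflects the general idea that a waker
need not send an IPI to a CPU that is already polling on its flags,
stated here in a deliberately simplified form.

```c
#include <stdbool.h>

#define SK_TIF_POLLING_NRFLAG		16	/* bit number from this patch */
#define SK_TIF_POLLING_NRFLAG_MASK	(1UL << SK_TIF_POLLING_NRFLAG)

/* Mark the (modeled) thread as polling. */
static inline unsigned long sk_set_polling(unsigned long flags)
{
	return flags | SK_TIF_POLLING_NRFLAG_MASK;
}

/* Clear the polling mark when leaving the polling loop. */
static inline unsigned long sk_clear_polling(unsigned long flags)
{
	return flags & ~SK_TIF_POLLING_NRFLAG_MASK;
}

static inline bool sk_is_polling(unsigned long flags)
{
	return flags & SK_TIF_POLLING_NRFLAG_MASK;
}

/* Simplified: a polling target notices a flag write by itself. */
static inline bool sk_needs_ipi(unsigned long flags)
{
	return !sk_is_polling(flags);
}
```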

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/arm64/include/asm/thread_info.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e72a3bf9e563..23ff72168e48 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -69,6 +69,7 @@ void arch_setup_new_exec(void);
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
 #define TIF_SECCOMP		11	/* syscall secure computing */
 #define TIF_SYSCALL_EMU		12	/* syscall emulation active */
+#define TIF_POLLING_NRFLAG	16	/* set while polling in poll_idle() */
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
 #define TIF_FREEZE		19
 #define TIF_RESTORE_SIGMASK	20
@@ -91,6 +92,7 @@ void arch_setup_new_exec(void);
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_SYSCALL_EMU	(1 << TIF_SYSCALL_EMU)
+#define _TIF_POLLING_NRFLAG	(1 << TIF_POLLING_NRFLAG)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_SINGLESTEP		(1 << TIF_SINGLESTEP)
 #define _TIF_32BIT		(1 << TIF_32BIT)
-- 
2.43.5




* [PATCH v7 08/10] arm64: idle: export arch_cpu_idle
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (6 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-09-03  8:14   ` Will Deacon
  2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
  2024-08-30 22:28 ` [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
  9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

Needed for cpuidle-haltpoll.

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/arm64/kernel/idle.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index 05cfb347ec26..b85ba0df9b02 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -43,3 +43,4 @@ void __cpuidle arch_cpu_idle(void)
 	 */
 	cpu_do_idle();
 }
+EXPORT_SYMBOL_GPL(arch_cpu_idle);
-- 
2.43.5




* [PATCH v7 09/10] arm64: support cpuidle-haltpoll
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (7 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  2024-09-03  8:13   ` Will Deacon
  2024-08-30 22:28 ` [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
  9 siblings, 1 reply; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

Add architectural support for the cpuidle-haltpoll driver by defining
arch_haltpoll_*().

Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
selected, and given that we have an optimized polling mechanism
in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.

smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
a memory region in exclusive state and the WFE waiting for any
stores to it.

In the edge case -- no CPU stores to the waited region and there's no
interrupt -- the event-stream will provide the terminating condition
ensuring we don't wait forever, but because the event-stream runs at
a fixed frequency (configured at 10kHz) we might spend more time in
the polling stage than specified by cpuidle_poll_time().

This would only happen in the last iteration, since overshooting the
poll_limit means the governor moves out of the polling stage.
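
As an illustration of the bound described above, here is a minimal
userspace sketch, not kernel code. It assumes that, in the worst case,
the waiter is woken only by event-stream ticks and checks the time limit
on each wakeup. The helper names evtstrm_period_ns() and
poll_exit_time_ns(), and the phase parameter, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Period of an event stream running at freq_hz, in nanoseconds. */
static uint64_t evtstrm_period_ns(uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return NSEC_PER_SEC / freq_hz;
}

/*
 * Model of the edge case: no store and no interrupt arrives, so the
 * waiter wakes only at phase_ns, phase_ns + period_ns, ... (times are
 * relative to the start of the poll). It leaves the loop at the first
 * wakeup at or after limit_ns, and that wakeup time is returned.
 */
static uint64_t poll_exit_time_ns(uint64_t limit_ns, uint64_t period_ns,
				  uint64_t phase_ns)
{
	uint64_t t;

	assert(period_ns > 0);
	t = phase_ns % period_ns;
	while (t < limit_ns)
		t += period_ns;
	return t;
}
```

With a 10kHz event stream the period is 100us, so under these
assumptions a poll limit of 200us is exceeded by anything from 0 to just
under 100us, depending on where the ticks fall relative to the start of
the poll.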

Tested-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/arm64/Kconfig                        | 10 ++++++++++
 arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
 arch/arm64/kernel/Makefile                |  1 +
 arch/arm64/kernel/cpuidle_haltpoll.c      | 22 ++++++++++++++++++++++
 4 files changed, 43 insertions(+)
 create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
 create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a2f8ff354ca6..9bd93ce2f9d9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -36,6 +36,7 @@ config ARM64
 	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+	select ARCH_HAS_OPTIMIZED_POLL
 	select ARCH_HAS_PTE_DEVMAP
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_HW_PTE_YOUNG
@@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
+config ARCH_CPUIDLE_HALTPOLL
+	bool "Enable selection of the cpuidle-haltpoll driver"
+	default n
+	help
+	  cpuidle-haltpoll allows for adaptive polling based on
+	  current load before entering the idle state.
+
+	  Some virtualized workloads benefit from using it.
+
 endmenu # "Power management options"
 
 menu "CPU Power Management"
diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
new file mode 100644
index 000000000000..ed615a99803b
--- /dev/null
+++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ARCH_HALTPOLL_H
+#define _ARCH_HALTPOLL_H
+
+static inline void arch_haltpoll_enable(unsigned int cpu) { }
+static inline void arch_haltpoll_disable(unsigned int cpu) { }
+
+bool arch_haltpoll_want(bool force);
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 2b112f3b7510..bbfb57eda2f1 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
 obj-$(CONFIG_ARM64_MTE)			+= mte.o
 obj-y					+= vdso-wrap.o
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
+obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL)	+= cpuidle_haltpoll.o
 
 # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
 $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
new file mode 100644
index 000000000000..63fc5ebca79b
--- /dev/null
+++ b/arch/arm64/kernel/cpuidle_haltpoll.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/kernel.h>
+#include <clocksource/arm_arch_timer.h>
+#include <asm/cpuidle_haltpoll.h>
+
+bool arch_haltpoll_want(bool force)
+{
+	/*
+	 * Enabling haltpoll requires two things:
+	 *
+	 * - Event stream support to provide a terminating condition to the
+	 *   WFE in the poll loop.
+	 *
+	 * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
+	 *
+	 * Given that the second is missing, allow haltpoll to only be force
+	 * loaded.
+	 */
+	return (arch_timer_evtstrm_available() && false) || force;
+}
+EXPORT_SYMBOL_GPL(arch_haltpoll_want);
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
  2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
                   ` (8 preceding siblings ...)
  2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-08-30 22:28 ` Ankur Arora
  9 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-08-30 22:28 UTC (permalink / raw)
  To: linux-pm, kvm, linux-arm-kernel, linux-kernel
  Cc: catalin.marinas, will, tglx, mingo, bp, dave.hansen, x86, hpa,
	pbonzini, wanpengli, vkuznets, rafael, daniel.lezcano, peterz,
	arnd, lenb, mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

smp_cond_load_relaxed(), in its generic polling variant, polls on
the loop condition waiting for it to change, eventually exiting the
loop if the time limit has been exceeded.

Because the time check is relatively expensive, it is performed only
once every POLL_IDLE_RELAX_COUNT iterations.

arm64, however, uses an event-based mechanism where, instead of
polling, we wait for a store to a region.

Limit POLL_IDLE_RELAX_COUNT to 1 for that case.
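
To make the effect of the count concrete, here is a simplified userspace
model of a rate-limited time check. It is a sketch only and not the
kernel's poll_idle() loop; count_time_checks() and
max_check_interval_ns() are hypothetical helpers, and the per-pass wait
time is an assumed input.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many time checks a poll loop performs over 'iterations'
 * passes when the check runs once every 'relax_count' passes.
 */
static uint64_t count_time_checks(uint64_t iterations,
				  unsigned int relax_count)
{
	uint64_t checks = 0;
	unsigned int loop_count = 0;

	assert(relax_count > 0);
	for (uint64_t i = 0; i < iterations; i++) {
		/* a cpu_relax() or an event-based wait would go here */
		if (++loop_count < relax_count)
			continue;
		loop_count = 0;
		checks++;	/* stands in for the clock read and compare */
	}
	return checks;
}

/*
 * Upper bound on the time between two checks if a single pass can
 * wait for as long as wait_ns.
 */
static uint64_t max_check_interval_ns(unsigned int relax_count,
				      uint64_t wait_ns)
{
	return (uint64_t)relax_count * wait_ns;
}
```

If each pass can block for up to one 100us event-stream period (an
assumption carried over from the 10kHz figure in patch 9), a count of
200 would allow up to 20ms between time checks, against 100us with a
count of 1.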

Suggested-by: Haris Okanovic <harisokn@amazon.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 drivers/cpuidle/poll_state.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index fc1204426158..61df2395585e 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -8,7 +8,18 @@
 #include <linux/sched/clock.h>
 #include <linux/sched/idle.h>
 
+#ifdef CONFIG_ARM64
+/*
+ * POLL_IDLE_RELAX_COUNT determines how often we check for timeout
+ * while polling for TIF_NEED_RESCHED in thread_info->flags.
+ *
+ * Set this to a low value since arm64, instead of polling, uses an
+ * event-based mechanism.
+ */
+#define POLL_IDLE_RELAX_COUNT	1
+#else
 #define POLL_IDLE_RELAX_COUNT	200
+#endif
 
 static int __cpuidle poll_idle(struct cpuidle_device *dev,
 			       struct cpuidle_driver *drv, int index)
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
  2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
@ 2024-09-03  8:10   ` Will Deacon
  2024-09-03 21:13     ` Ankur Arora
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2024-09-03  8:10 UTC (permalink / raw)
  To: Ankur Arora
  Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
	tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
	vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
	mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

On Fri, Aug 30, 2024 at 03:28:41PM -0700, Ankur Arora wrote:
> From: Joao Martins <joao.m.martins@oracle.com>
> 
> Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
> TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
> cpu_do_idle().
> 
> To add support for polling via cpuidle-haltpoll, we want to use the
> standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
> polling.
> 
> Reuse the same bit to define TIF_POLLING_NRFLAG.
> 
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
> Reviewed-by: Christoph Lameter <cl@linux.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/include/asm/thread_info.h | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Will Deacon <will@kernel.org>

Will


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v7 09/10] arm64: support cpuidle-haltpoll
  2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
@ 2024-09-03  8:13   ` Will Deacon
  2024-09-03 21:12     ` Ankur Arora
  0 siblings, 1 reply; 16+ messages in thread
From: Will Deacon @ 2024-09-03  8:13 UTC (permalink / raw)
  To: Ankur Arora
  Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
	tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
	vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
	mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

On Fri, Aug 30, 2024 at 03:28:43PM -0700, Ankur Arora wrote:
> Add architectural support for cpuidle-haltpoll driver by defining
> arch_haltpoll_*().
> 
> Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
> selected, and given that we have an optimized polling mechanism
> in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
> 
> smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
> a memory region in exclusive state and the WFE waiting for any
> stores to it.
> 
> In the edge case -- no CPU stores to the waited region and there's no
> interrupt -- the event-stream will provide the terminating condition
> ensuring we don't wait forever, but because the event-stream runs at
> a fixed frequency (configured at 10kHz) we might spend more time in
> the polling stage than specified by cpuidle_poll_time().
> 
> This would only happen in the last iteration, since overshooting the
> poll_limit means the governor moves out of the polling stage.
> 
> Tested-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/Kconfig                        | 10 ++++++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
>  arch/arm64/kernel/Makefile                |  1 +
>  arch/arm64/kernel/cpuidle_haltpoll.c      | 22 ++++++++++++++++++++++
>  4 files changed, 43 insertions(+)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>  create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index a2f8ff354ca6..9bd93ce2f9d9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -36,6 +36,7 @@ config ARM64
>  	select ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>  	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> +	select ARCH_HAS_OPTIMIZED_POLL
>  	select ARCH_HAS_PTE_DEVMAP
>  	select ARCH_HAS_PTE_SPECIAL
>  	select ARCH_HAS_HW_PTE_YOUNG
> @@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
>  config ARCH_SUSPEND_POSSIBLE
>  	def_bool y
>  
> +config ARCH_CPUIDLE_HALTPOLL
> +	bool "Enable selection of the cpuidle-haltpoll driver"
> +	default n

nit: this 'default n' line is redundant.

> +	help
> +	  cpuidle-haltpoll allows for adaptive polling based on
> +	  current load before entering the idle state.
> +
> +	  Some virtualized workloads benefit from using it.

nit: This sentence is meaningless ^^.

> +
>  endmenu # "Power management options"
>  
>  menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..ed615a99803b
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,10 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ARCH_HALTPOLL_H
> +#define _ARCH_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
> +
> +bool arch_haltpoll_want(bool force);
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 2b112f3b7510..bbfb57eda2f1 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
>  obj-$(CONFIG_ARM64_MTE)			+= mte.o
>  obj-y					+= vdso-wrap.o
>  obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
> +obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL)	+= cpuidle_haltpoll.o
>  
>  # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
>  $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
> diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
> new file mode 100644
> index 000000000000..63fc5ebca79b
> --- /dev/null
> +++ b/arch/arm64/kernel/cpuidle_haltpoll.c
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/kernel.h>
> +#include <clocksource/arm_arch_timer.h>
> +#include <asm/cpuidle_haltpoll.h>
> +
> +bool arch_haltpoll_want(bool force)
> +{
> +	/*
> +	 * Enabling haltpoll requires two things:
> +	 *
> +	 * - Event stream support to provide a terminating condition to the
> +	 *   WFE in the poll loop.
> +	 *
> +	 * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
> +	 *
> +	 * Given that the second is missing, allow haltpoll to only be force
> +	 * loaded.
> +	 */
> +	return (arch_timer_evtstrm_available() && false) || force;
> +}
> +EXPORT_SYMBOL_GPL(arch_haltpoll_want);

This seems a bit over-the-top to justify a new C file. Just have a static
inline in the header which returns 'force'. The '&& false' is misleading
and unnecessary with the comment.
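
For illustration only, the header-only form suggested above might look
roughly like the sketch below. It is not taken from any posted revision
of the series. The <stdbool.h> include is a userspace stand-in so the
fragment compiles on its own; kernel code would get bool from
<linux/types.h>.

```c
#include <stdbool.h>	/* userspace stand-in for <linux/types.h> */

/*
 * Hypothetical header-only arch_haltpoll_want().
 *
 * Enabling haltpoll requires event stream support (to terminate the
 * WFE in the poll loop) and KVM support for arch_haltpoll_enable()
 * and arch_haltpoll_disable(). Since the latter is missing, only
 * allow haltpoll to be force loaded.
 */
static inline bool arch_haltpoll_want(bool force)
{
	return force;
}
```

In this form the separate cpuidle_haltpoll.c, its Makefile rule and the
EXPORT_SYMBOL_GPL() would presumably no longer be needed.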

Will


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v7 08/10] arm64: idle: export arch_cpu_idle
  2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
@ 2024-09-03  8:14   ` Will Deacon
  0 siblings, 0 replies; 16+ messages in thread
From: Will Deacon @ 2024-09-03  8:14 UTC (permalink / raw)
  To: Ankur Arora
  Cc: linux-pm, kvm, linux-arm-kernel, linux-kernel, catalin.marinas,
	tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini, wanpengli,
	vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
	mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk

On Fri, Aug 30, 2024 at 03:28:42PM -0700, Ankur Arora wrote:
> Needed for cpuidle-haltpoll.
> 
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/kernel/idle.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
> index 05cfb347ec26..b85ba0df9b02 100644
> --- a/arch/arm64/kernel/idle.c
> +++ b/arch/arm64/kernel/idle.c
> @@ -43,3 +43,4 @@ void __cpuidle arch_cpu_idle(void)
>  	 */
>  	cpu_do_idle();
>  }
> +EXPORT_SYMBOL_GPL(arch_cpu_idle);
> -- 
> 2.43.5

Acked-by: Will Deacon <will@kernel.org>

Will


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v7 09/10] arm64: support cpuidle-haltpoll
  2024-09-03  8:13   ` Will Deacon
@ 2024-09-03 21:12     ` Ankur Arora
  0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-09-03 21:12 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
	catalin.marinas, tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini,
	wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
	mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk


Will Deacon <will@kernel.org> writes:

> On Fri, Aug 30, 2024 at 03:28:43PM -0700, Ankur Arora wrote:
>> Add architectural support for cpuidle-haltpoll driver by defining
>> arch_haltpoll_*().
>>
>> Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
>> selected, and given that we have an optimized polling mechanism
>> in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
>>
>> smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
>> a memory region in exclusive state and the WFE waiting for any
>> stores to it.
>>
>> In the edge case -- no CPU stores to the waited region and there's no
>> interrupt -- the event-stream will provide the terminating condition
>> ensuring we don't wait forever, but because the event-stream runs at
>> a fixed frequency (configured at 10kHz) we might spend more time in
>> the polling stage than specified by cpuidle_poll_time().
>>
>> This would only happen in the last iteration, since overshooting the
>> poll_limit means the governor moves out of the polling stage.
>>
>> Tested-by: Haris Okanovic <harisokn@amazon.com>
>> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>  arch/arm64/Kconfig                        | 10 ++++++++++
>>  arch/arm64/include/asm/cpuidle_haltpoll.h | 10 ++++++++++
>>  arch/arm64/kernel/Makefile                |  1 +
>>  arch/arm64/kernel/cpuidle_haltpoll.c      | 22 ++++++++++++++++++++++
>>  4 files changed, 43 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>>  create mode 100644 arch/arm64/kernel/cpuidle_haltpoll.c
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index a2f8ff354ca6..9bd93ce2f9d9 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -36,6 +36,7 @@ config ARM64
>>  	select ARCH_HAS_MEMBARRIER_SYNC_CORE
>>  	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
>>  	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> +	select ARCH_HAS_OPTIMIZED_POLL
>>  	select ARCH_HAS_PTE_DEVMAP
>>  	select ARCH_HAS_PTE_SPECIAL
>>  	select ARCH_HAS_HW_PTE_YOUNG
>> @@ -2385,6 +2386,15 @@ config ARCH_HIBERNATION_HEADER
>>  config ARCH_SUSPEND_POSSIBLE
>>  	def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> +	bool "Enable selection of the cpuidle-haltpoll driver"
>> +	default n
>
> nit: this 'default n' line is redundant.
>
>> +	help
>> +	  cpuidle-haltpoll allows for adaptive polling based on
>> +	  current load before entering the idle state.
>> +
>> +	  Some virtualized workloads benefit from using it.
>
> nit: This sentence is meaningless ^^.

Yeah. I think I added it to take care of a checkpatch warning.
But clearly it doesn't add anything useful. Will fix.

>> +
>>  endmenu # "Power management options"
>>
>>  menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..ed615a99803b
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,10 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _ARCH_HALTPOLL_H
>> +#define _ARCH_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
>> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
>> +
>> +bool arch_haltpoll_want(bool force);
>> +#endif
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 2b112f3b7510..bbfb57eda2f1 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -70,6 +70,7 @@ obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
>>  obj-$(CONFIG_ARM64_MTE)			+= mte.o
>>  obj-y					+= vdso-wrap.o
>>  obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
>> +obj-$(CONFIG_ARCH_CPUIDLE_HALTPOLL)	+= cpuidle_haltpoll.o
>>
>>  # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
>>  $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
>> diff --git a/arch/arm64/kernel/cpuidle_haltpoll.c b/arch/arm64/kernel/cpuidle_haltpoll.c
>> new file mode 100644
>> index 000000000000..63fc5ebca79b
>> --- /dev/null
>> +++ b/arch/arm64/kernel/cpuidle_haltpoll.c
>> @@ -0,0 +1,22 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +#include <linux/kernel.h>
>> +#include <clocksource/arm_arch_timer.h>
>> +#include <asm/cpuidle_haltpoll.h>
>> +
>> +bool arch_haltpoll_want(bool force)
>> +{
>> +	/*
>> +	 * Enabling haltpoll requires two things:
>> +	 *
>> +	 * - Event stream support to provide a terminating condition to the
>> +	 *   WFE in the poll loop.
>> +	 *
>> +	 * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
>> +	 *
>> +	 * Given that the second is missing, allow haltpoll to only be force
>> +	 * loaded.
>> +	 */
>> +	return (arch_timer_evtstrm_available() && false) || force;
>> +}
>> +EXPORT_SYMBOL_GPL(arch_haltpoll_want);
>
> This seems a bit over-the-top to justify a new C file. Just have a static
> inline in the header which returns 'force'. The '&& false' is misleading
> and unnecessary with the comment.

So, the only reason for doing it this way was that I wanted to encode the
arch_timer_evtstrm_available() dependency. But you are right that the
comment suffices since the check itself is not operative.

Will fix.

Thanks for reviewing.

--
ankur


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG
  2024-09-03  8:10   ` Will Deacon
@ 2024-09-03 21:13     ` Ankur Arora
  0 siblings, 0 replies; 16+ messages in thread
From: Ankur Arora @ 2024-09-03 21:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ankur Arora, linux-pm, kvm, linux-arm-kernel, linux-kernel,
	catalin.marinas, tglx, mingo, bp, dave.hansen, x86, hpa, pbonzini,
	wanpengli, vkuznets, rafael, daniel.lezcano, peterz, arnd, lenb,
	mark.rutland, harisokn, mtosatti, sudeep.holla, cl,
	misono.tomohiro, maobibo, joao.m.martins, boris.ostrovsky,
	konrad.wilk


Will Deacon <will@kernel.org> writes:

> On Fri, Aug 30, 2024 at 03:28:41PM -0700, Ankur Arora wrote:
>> From: Joao Martins <joao.m.martins@oracle.com>
>>
>> Commit 842514849a61 ("arm64: Remove TIF_POLLING_NRFLAG") had removed
>> TIF_POLLING_NRFLAG because arm64 only supported non-polled idling via
>> cpu_do_idle().
>>
>> To add support for polling via cpuidle-haltpoll, we want to use the
>> standard poll_idle() interface, which sets TIF_POLLING_NRFLAG while
>> polling.
>>
>> Reuse the same bit to define TIF_POLLING_NRFLAG.
>>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Mihai Carabas <mihai.carabas@oracle.com>
>> Reviewed-by: Christoph Lameter <cl@linux.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>  arch/arm64/include/asm/thread_info.h | 2 ++
>>  1 file changed, 2 insertions(+)
>
> Acked-by: Will Deacon <will@kernel.org>

Thanks!

--
ankur


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-09-03 21:15 UTC | newest]

Thread overview: 16+ messages
2024-08-30 22:28 [PATCH v7 00/10] Enable haltpoll on arm64 Ankur Arora
2024-08-30 22:28 ` [PATCH v7 01/10] cpuidle/poll_state: poll via smp_cond_load_relaxed() Ankur Arora
2024-08-30 22:28 ` [PATCH v7 02/10] cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL Ankur Arora
2024-08-30 22:28 ` [PATCH v7 03/10] Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig Ankur Arora
2024-08-30 22:28 ` [PATCH v7 04/10] cpuidle-haltpoll: define arch_haltpoll_want() Ankur Arora
2024-08-30 22:28 ` [PATCH v7 05/10] governors/haltpoll: drop kvm_para_available() check Ankur Arora
2024-08-30 22:28 ` [PATCH v7 06/10] cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL Ankur Arora
2024-08-30 22:28 ` [PATCH v7 07/10] arm64: define TIF_POLLING_NRFLAG Ankur Arora
2024-09-03  8:10   ` Will Deacon
2024-09-03 21:13     ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 08/10] arm64: idle: export arch_cpu_idle Ankur Arora
2024-09-03  8:14   ` Will Deacon
2024-08-30 22:28 ` [PATCH v7 09/10] arm64: support cpuidle-haltpoll Ankur Arora
2024-09-03  8:13   ` Will Deacon
2024-09-03 21:12     ` Ankur Arora
2024-08-30 22:28 ` [PATCH v7 10/10] cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64 Ankur Arora
