linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint()
@ 2025-06-05 15:03 Rafael J. Wysocki
  2025-06-05 15:04 ` [PATCH v1 1/5] intel_idle: Use subsys_initcall_sync() for initialization Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:03 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

Hi Everyone,

The purpose of this series is to reapply the code changes from commit
96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()") that
has been reverted because of an issue introduced by it.  This takes
place in the last patch ([5/5]) and the previous patches make
preparatory changes needed to avoid breaking systems in the field
once again.

The problem with commit 96040f7273e2 was that on SMT-capable systems
booting with "nosmt" in the kernel command line, the "dead" SMT siblings
were stuck in idle state C1 after initialization because they were
initialized before a proper cpuidle driver for the given platform got
ready.  That prevented the whole processor from entering deep package
C-states later on and pretty much ruined idle power (including power
in suspend-to-idle).

To prevent that from happening, patches [1-4/5] use the approach that
has been used for some time to address an analogous issue during resume
from hibernation, in which case the "dead" SMT siblings are also in C1
when the image kernel gets control back and they need to be put into
sufficiently deep C-states.  Namely, they are taken online and then
back offline immediately to make that happen.

The general idea is to take the "dead" SMT siblings online and then
back offline immediately when a proper cpuidle driver gets ready, but
some changes are made to avoid doing that twice in a row in vain.
For this purpose, the intel_idle driver initialization is pushed
to an earlier initialization phase (patch [1/5]) and the ACPI
processor driver only "rescans" the "dead" SMT siblings when
the ACPI idle driver is the current cpuidle driver.  It also
avoids doing this on architectures other than x86 (patch [4/5]).

Thanks!




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/5] intel_idle: Use subsys_initcall_sync() for initialization
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
@ 2025-06-05 15:04 ` Rafael J. Wysocki
  2025-06-05 15:05 ` [PATCH v1 2/5] x86/smp: PM/hibernate: Split arch_resume_nosmt() Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:04 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It is not necessary to wait until the device_initcall() stage with
intel_idle initialization.  All of its dependencies are met after
all subsys_initcall()s have run, so subsys_initcall_sync() can be
used for initializing it.

It is also better to ensure that intel_idle will always initialize
before the ACPI processor driver that uses module_init() for its
initialization.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/idle/intel_idle.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -2517,7 +2517,7 @@
 	return retval;
 
 }
-device_initcall(intel_idle_init);
+subsys_initcall_sync(intel_idle_init);
 
 /*
  * We are not really modular, but we used to support that.  Meaning we also




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 2/5] x86/smp: PM/hibernate: Split arch_resume_nosmt()
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
  2025-06-05 15:04 ` [PATCH v1 1/5] intel_idle: Use subsys_initcall_sync() for initialization Rafael J. Wysocki
@ 2025-06-05 15:05 ` Rafael J. Wysocki
  2025-06-05 15:06 ` [PATCH v1 3/5] intel_idle: Rescan "dead" SMT siblings during initialization Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:05 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Move the inner part of the arch_resume_nosmt() code into a separate
function called arch_cpu_rescan_dead_smt_siblings(), so it can be
used in other places where "dead" SMT siblings may need to be taken
online and offline again in order to get into deep idle states.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/kernel/smp.c      |   23 +++++++++++++++++++++++
 arch/x86/power/hibernate.c |   17 +++++------------
 include/linux/cpu.h        |    1 +
 3 files changed, 29 insertions(+), 12 deletions(-)

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -299,3 +299,26 @@
 	.send_call_func_single_ipi = native_send_call_func_single_ipi,
 };
 EXPORT_SYMBOL_GPL(smp_ops);
+
+int arch_cpu_rescan_dead_smt_siblings(void)
+{
+	enum cpuhp_smt_control old = cpu_smt_control;
+	int ret;
+
+	/*
+	 * If SMT has been disabled and SMT siblings are in HLT, bring them back
+	 * online and offline them again so that they end up in MWAIT proper.
+	 *
+	 * Called with hotplug enabled.
+	 */
+	if (old != CPU_SMT_DISABLED && old != CPU_SMT_FORCE_DISABLED)
+		return 0;
+
+	ret = cpuhp_smt_enable();
+	if (ret)
+		return ret;
+
+	ret = cpuhp_smt_disable(old);
+
+	return ret;
+}
--- a/arch/x86/power/hibernate.c
+++ b/arch/x86/power/hibernate.c
@@ -188,7 +188,8 @@
 
 int arch_resume_nosmt(void)
 {
-	int ret = 0;
+	int ret;
+
 	/*
 	 * We reached this while coming out of hibernation. This means
 	 * that SMT siblings are sleeping in hlt, as mwait is not safe
@@ -202,18 +203,10 @@
 	 * Called with hotplug disabled.
 	 */
 	cpu_hotplug_enable();
-	if (cpu_smt_control == CPU_SMT_DISABLED ||
-			cpu_smt_control == CPU_SMT_FORCE_DISABLED) {
-		enum cpuhp_smt_control old = cpu_smt_control;
 
-		ret = cpuhp_smt_enable();
-		if (ret)
-			goto out;
-		ret = cpuhp_smt_disable(old);
-		if (ret)
-			goto out;
-	}
-out:
+	ret = arch_cpu_rescan_dead_smt_siblings();
+
 	cpu_hotplug_disable();
+
 	return ret;
 }
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -172,6 +172,7 @@
 void arch_tick_broadcast_enter(void);
 void arch_tick_broadcast_exit(void);
 void __noreturn arch_cpu_idle_dead(void);
+int arch_cpu_rescan_dead_smt_siblings(void);
 
 #ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT
 void arch_cpu_finalize_init(void);




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 3/5] intel_idle: Rescan "dead" SMT siblings during initialization
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
  2025-06-05 15:04 ` [PATCH v1 1/5] intel_idle: Use subsys_initcall_sync() for initialization Rafael J. Wysocki
  2025-06-05 15:05 ` [PATCH v1 2/5] x86/smp: PM/hibernate: Split arch_resume_nosmt() Rafael J. Wysocki
@ 2025-06-05 15:06 ` Rafael J. Wysocki
  2025-06-05 15:07 ` [PATCH v1 4/5] ACPI: processor: " Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:06 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make intel_idle_init() call arch_cpu_rescan_dead_smt_siblings() after
successfully registering intel_idle as the cpuidle driver so as to
allow the "dead" SMT siblings (if any) to go into deep idle states.

This is necessary for the processor to be able to reach deep package
C-states (like PC10) going forward which is requisite for reducing
power sufficiently in suspend-to-idle, among other things.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/idle/intel_idle.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -2506,6 +2506,8 @@
 	pr_debug("Local APIC timer is reliable in %s\n",
 		 boot_cpu_has(X86_FEATURE_ARAT) ? "all C-states" : "C1");
 
+	arch_cpu_rescan_dead_smt_siblings();
+
 	return 0;
 
 hp_setup_fail:




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 4/5] ACPI: processor: Rescan "dead" SMT siblings during initialization
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2025-06-05 15:06 ` [PATCH v1 3/5] intel_idle: Rescan "dead" SMT siblings during initialization Rafael J. Wysocki
@ 2025-06-05 15:07 ` Rafael J. Wysocki
  2025-06-05 16:14   ` Dave Hansen
  2025-06-05 15:09 ` [PATCH v1 5/5] Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" Rafael J. Wysocki
  2025-06-06 10:51 ` [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Artem Bityutskiy
  5 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:07 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make acpi_processor_driver_init() call arch_cpu_rescan_dead_smt_siblings(),
via a new wrapper function called acpi_idle_rescan_dead_smt_siblings(),
after successfully initializing the driver, to allow the "dead" SMT
siblings to go into deep idle states, which is necessary for the
processor to be able to reach deep package C-states (like PC10) going
forward, so that power can be reduced sufficiently in suspend-to-idle,
among other things.

However, do it only if the ACPI idle driver is the current cpuidle
driver (otherwise it is assumed that another cpuidle driver will take
care of this) and avoid doing it on architectures other than x86.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/internal.h         |    6 ++++++
 drivers/acpi/processor_driver.c |    3 +++
 drivers/acpi/processor_idle.c   |    8 ++++++++
 3 files changed, 17 insertions(+)

--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -175,6 +175,12 @@
 static inline void acpi_early_processor_control_setup(void) {}
 #endif
 
+#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
+void acpi_idle_rescan_dead_smt_siblings(void);
+#else
+static inline void acpi_idle_rescan_dead_smt_siblings(void) {}
+#endif
+
 /* --------------------------------------------------------------------------
                                   Embedded Controller
    -------------------------------------------------------------------------- */
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -279,6 +279,9 @@
 	 * after acpi_cppc_processor_probe() has been called for all online CPUs
 	 */
 	acpi_processor_init_invariance_cppc();
+
+	acpi_idle_rescan_dead_smt_siblings();
+
 	return 0;
 err:
 	driver_unregister(&acpi_processor_driver);
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -24,6 +24,8 @@
 #include <acpi/processor.h>
 #include <linux/context_tracking.h>
 
+#include "internal.h"
+
 /*
  * Include the apic definitions for x86 to have the APIC timer related defines
  * available also for UP (on SMP it gets magically included via linux/smp.h).
@@ -55,6 +57,12 @@
 };
 
 #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
+void acpi_idle_rescan_dead_smt_siblings(void)
+{
+	if (cpuidle_get_driver() == &acpi_idle_driver)
+		arch_cpu_rescan_dead_smt_siblings();
+}
+
 static
 DEFINE_PER_CPU(struct acpi_processor_cx * [CPUIDLE_STATE_MAX], acpi_cstate);
 




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 5/5] Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2025-06-05 15:07 ` [PATCH v1 4/5] ACPI: processor: " Rafael J. Wysocki
@ 2025-06-05 15:09 ` Rafael J. Wysocki
  2025-06-06 10:51 ` [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Artem Bityutskiy
  5 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 15:09 UTC (permalink / raw)
  To: x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 70523f335734 ("Revert "x86/smp: Eliminate
mwait_play_dead_cpuid_hint()"") to reapply the changes from commit
96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()")
reverted by it.

Previously, these changes caused idle power to rise on systems booting
with "nosmt" in the kernel command line because they effectively caused
"dead" SMT siblings to remain in idle state C1 after executing the HLT
instruction, which prevented the processor from reaching package idle
states deeper than PC2 going forward.

Now, the "dead" SMT siblings are rescanned after initializing a proper
cpuidle driver for the processor (either intel_idle or ACPI idle), at
which point they are able to enter a sufficiently deep idle state
in native_play_dead() via cpuidle, so the code changes in question can
be reapplied.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/kernel/smpboot.c |   54 +++++-----------------------------------------
 1 file changed, 7 insertions(+), 47 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1238,6 +1238,10 @@
 	local_irq_disable();
 }
 
+/*
+ * We need to flush the caches before going to sleep, lest we have
+ * dirty data in our caches when we come back up.
+ */
 void __noreturn mwait_play_dead(unsigned int eax_hint)
 {
 	struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
@@ -1284,50 +1288,6 @@
 }
 
 /*
- * We need to flush the caches before going to sleep, lest we have
- * dirty data in our caches when we come back up.
- */
-static inline void mwait_play_dead_cpuid_hint(void)
-{
-	unsigned int eax, ebx, ecx, edx;
-	unsigned int highest_cstate = 0;
-	unsigned int highest_subcstate = 0;
-	int i;
-
-	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
-	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
-		return;
-	if (!this_cpu_has(X86_FEATURE_MWAIT))
-		return;
-	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
-		return;
-
-	eax = CPUID_LEAF_MWAIT;
-	ecx = 0;
-	native_cpuid(&eax, &ebx, &ecx, &edx);
-
-	/*
-	 * eax will be 0 if EDX enumeration is not valid.
-	 * Initialized below to cstate, sub_cstate value when EDX is valid.
-	 */
-	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED)) {
-		eax = 0;
-	} else {
-		edx >>= MWAIT_SUBSTATE_SIZE;
-		for (i = 0; i < 7 && edx; i++, edx >>= MWAIT_SUBSTATE_SIZE) {
-			if (edx & MWAIT_SUBSTATE_MASK) {
-				highest_cstate = i;
-				highest_subcstate = edx & MWAIT_SUBSTATE_MASK;
-			}
-		}
-		eax = (highest_cstate << MWAIT_SUBSTATE_SIZE) |
-			(highest_subcstate - 1);
-	}
-
-	mwait_play_dead(eax);
-}
-
-/*
  * Kick all "offline" CPUs out of mwait on kexec(). See comment in
  * mwait_play_dead().
  */
@@ -1377,9 +1337,9 @@
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 
-	mwait_play_dead_cpuid_hint();
-	if (cpuidle_play_dead())
-		hlt_play_dead();
+	/* Below returns only on error. */
+	cpuidle_play_dead();
+	hlt_play_dead();
 }
 
 #else /* ... !CONFIG_HOTPLUG_CPU */




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 4/5] ACPI: processor: Rescan "dead" SMT siblings during initialization
  2025-06-05 15:07 ` [PATCH v1 4/5] ACPI: processor: " Rafael J. Wysocki
@ 2025-06-05 16:14   ` Dave Hansen
  2025-06-05 18:20     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2025-06-05 16:14 UTC (permalink / raw)
  To: Rafael J. Wysocki, x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Artem Bityutskiy, Gautham R. Shenoy, Ingo Molnar, Borislav Petkov,
	Linux ACPI

On 6/5/25 08:07, Rafael J. Wysocki wrote:
>  #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
> +void acpi_idle_rescan_dead_smt_siblings(void)
> +{
> +	if (cpuidle_get_driver() == &acpi_idle_driver)
> +		arch_cpu_rescan_dead_smt_siblings();
> +}

My only thought in reading this is that maybe cpuidle_register_driver()
would be a better spot to force the arch_cpu_rescan_dead_smt_siblings().
That way, each driver would not have to do the rescan.

But that's just a little nit at worst, otherwise the series looks good
to me. Thanks for chasing this down.

For the x86 bits:

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 4/5] ACPI: processor: Rescan "dead" SMT siblings during initialization
  2025-06-05 16:14   ` Dave Hansen
@ 2025-06-05 18:20     ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-05 18:20 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Rafael J. Wysocki, x86 Maintainers, Linux PM, LKML, Len Brown,
	Peter Zijlstra, Thomas Gleixner, Dave Hansen, Artem Bityutskiy,
	Gautham R. Shenoy, Ingo Molnar, Borislav Petkov, Linux ACPI

On Thu, Jun 5, 2025 at 6:14 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 6/5/25 08:07, Rafael J. Wysocki wrote:
> >  #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
> > +void acpi_idle_rescan_dead_smt_siblings(void)
> > +{
> > +     if (cpuidle_get_driver() == &acpi_idle_driver)
> > +             arch_cpu_rescan_dead_smt_siblings();
> > +}
>
> My only thought in reading this is that maybe cpuidle_register_driver()
> would be a better spot to force the arch_cpu_rescan_dead_smt_siblings().
> That way, each driver would not have to do the rescan.

Unfortunately, this wouldn't work in the current arrangement of things
because cpuidle_register_driver() can be called in a CPU online path.

It should be possible to make this work in the future, but first things first.

> But that's just a little nit at worst, otherwise the series looks good
> to me. Thanks for chasing this down.
>
> For the x86 bits:
>
> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

Thank you!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint()
  2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2025-06-05 15:09 ` [PATCH v1 5/5] Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" Rafael J. Wysocki
@ 2025-06-06 10:51 ` Artem Bityutskiy
  2025-06-06 10:53   ` Rafael J. Wysocki
  5 siblings, 1 reply; 10+ messages in thread
From: Artem Bityutskiy @ 2025-06-06 10:51 UTC (permalink / raw)
  To: Rafael J. Wysocki, x86 Maintainers, Linux PM
  Cc: LKML, Len Brown, Peter Zijlstra, Thomas Gleixner, Dave Hansen,
	Gautham R. Shenoy, Ingo Molnar, Borislav Petkov, Linux ACPI

On Thu, 2025-06-05 at 17:03 +0200, Rafael J. Wysocki wrote:
> Hi Everyone,
> 
> The purpose of this series is to reapply the code changes from commit
> 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()") that
> has been reverted because of an issue introduced by it.  This takes
> place in the last patch ([5/5]) and the previous patches make
> preparatory changes needed to avoid breaking systems in the field
> once again.

Hello,

thanks Rafael for the patches, and Peter/Dave for helping handling the
issue.

Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

I measured a Granite Rapids Xeon with v6.15 vanilla [setup 1], then
with v6.15 + revert of 96040f7273e2 + these 5 patches [setup 2].

I can see that in setup 2 the average idle package power is 54% lower
than in setup 1. This is simply because in setup 1 there is no CC6 /
PC6. And this is because every core's sibling is in CC1, which blocks
CC6 on every core.

== Reference ==

1. Small article explaining then naming I use (CC1, CC6, PC6):
https://github.com/intel/pepc/blob/main/docs/misc-cstate-namespaces.md

== Non-essential details, just for reference ==

The measurement could have been done in a simpler way, but since I have
developed tools and have a good setup to measure workloads with my open
source tools, I did the following.

Used stats-collect: https://github.com/intel/stats-collect/

# Boot the vanilla kernel on system under test (SUT) named 'gnr2' (this
# is the host name). Then run the following command on my system, that
# has root SSH access to 'gnr2' configured.

stats-collect start --stats all -H gnr2 --reportid gnr2-idle-nosmt-
v6.15-vanilla -o raw/gnr2-idle-nosmt-v6.15-vanilla -- sleep 600

# The above SSHed to gnr2, ran workload "sleep 600", and collected a
# bunch of statistics, which were saved to
# raw/gnr2-idle-nosmt-v6.15-vanilla on my host.

# Boot the "setup 2" kernel, which I referred to as "6.15-fixed". Run
# the following command.

stats-collect start --stats all -H gnr2 --reportid gnr2-idle-nosmt-
v6.15-fixed -o raw/gnr2-idle-nosmt-v6.15-fixed -- sleep 600

# Then generate an HTML diff between those.
stats-collect report raw/gnr2-idle-nosmt-v6.15-vanilla/ raw/gnr2-idle-
nosmt-v6.15-fixed/ -o diff

# Open "diff" with a browser (I have my HTTP server, so just copied it
# to ~/public_html). Find lots of stats, in my case
# turbostat, interrupts, IPMI.

In the diff, I checked turbostat's PkgPwr (CPU chip power in Watts) and
turbostat's CPU%c6 (CC6 residency, %).

HTH,
Artem.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint()
  2025-06-06 10:51 ` [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Artem Bityutskiy
@ 2025-06-06 10:53   ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2025-06-06 10:53 UTC (permalink / raw)
  To: Artem Bityutskiy
  Cc: Rafael J. Wysocki, x86 Maintainers, Linux PM, LKML, Len Brown,
	Peter Zijlstra, Thomas Gleixner, Dave Hansen, Gautham R. Shenoy,
	Ingo Molnar, Borislav Petkov, Linux ACPI

On Fri, Jun 6, 2025 at 12:51 PM Artem Bityutskiy
<artem.bityutskiy@linux.intel.com> wrote:
>
> On Thu, 2025-06-05 at 17:03 +0200, Rafael J. Wysocki wrote:
> > Hi Everyone,
> >
> > The purpose of this series is to reapply the code changes from commit
> > 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()") that
> > has been reverted because of an issue introduced by it.  This takes
> > place in the last patch ([5/5]) and the previous patches make
> > preparatory changes needed to avoid breaking systems in the field
> > once again.
>
> Hello,
>
> thanks Rafael for the patches, and Peter/Dave for helping handling the
> issue.
>
> Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> I measured a Granite Rapids Xeon with v6.15 vanilla [setup 1], then
> with v6.15 + revert of 96040f7273e2 + these 5 patches [setup 2].
>
> I can see that in setup 2 the average idle package power is 54% lower
> than in setup 1. This is simply because in setup 1 there is no CC6 /
> PC6. And this is because every core's sibling is in CC1, which blocks
> CC6 on every core.
>
> == Reference ==
>
> 1. Small article explaining then naming I use (CC1, CC6, PC6):
> https://github.com/intel/pepc/blob/main/docs/misc-cstate-namespaces.md
>
> == Non-essential details, just for reference ==
>
> The measurement could have been done in a simpler way, but since I have
> developed tools and have a good setup to measure workloads with my open
> source tools, I did the following.
>
> Used stats-collect: https://github.com/intel/stats-collect/
>
> # Boot the vanilla kernel on system under test (SUT) named 'gnr2' (this
> # is the host name). Then run the following command on my system, that
> # has root SSH access to 'gnr2' configured.
>
> stats-collect start --stats all -H gnr2 --reportid gnr2-idle-nosmt-
> v6.15-vanilla -o raw/gnr2-idle-nosmt-v6.15-vanilla -- sleep 600
>
> # The above SSHed to gnr2, ran workload "sleep 600", and collected a
> # bunch of statistics, which were saved to
> # raw/gnr2-idle-nosmt-v6.15-vanilla on my host.
>
> # Boot the "setup 2" kernel, which I referred to as "6.15-fixed". Run
> # the following command.
>
> stats-collect start --stats all -H gnr2 --reportid gnr2-idle-nosmt-
> v6.15-fixed -o raw/gnr2-idle-nosmt-v6.15-fixed -- sleep 600
>
> # Then generate an HTML diff between those.
> stats-collect report raw/gnr2-idle-nosmt-v6.15-vanilla/ raw/gnr2-idle-
> nosmt-v6.15-fixed/ -o diff
>
> # Open "diff" with a browser (I have my HTTP server, so just copied it
> # to ~/public_html). Find lots of stats, in my case
> # turbostat, interrupts, IPMI.
>
> In the diff, I checked turbostat's PkgPwr (CPU chip power in Watts) and
> turbostat's CPU%c6 (CC6 residency, %).
>
> HTH,

Yes, it does, thanks a lot!

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-06-06 10:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-05 15:03 [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Rafael J. Wysocki
2025-06-05 15:04 ` [PATCH v1 1/5] intel_idle: Use subsys_initcall_sync() for initialization Rafael J. Wysocki
2025-06-05 15:05 ` [PATCH v1 2/5] x86/smp: PM/hibernate: Split arch_resume_nosmt() Rafael J. Wysocki
2025-06-05 15:06 ` [PATCH v1 3/5] intel_idle: Rescan "dead" SMT siblings during initialization Rafael J. Wysocki
2025-06-05 15:07 ` [PATCH v1 4/5] ACPI: processor: " Rafael J. Wysocki
2025-06-05 16:14   ` Dave Hansen
2025-06-05 18:20     ` Rafael J. Wysocki
2025-06-05 15:09 ` [PATCH v1 5/5] Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" Rafael J. Wysocki
2025-06-06 10:51 ` [PATCH v1 0/5] x86/smp: Restore the elimination of mwait_play_dead_cpuid_hint() Artem Bityutskiy
2025-06-06 10:53   ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).