From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: x86 Maintainers <x86@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Linux PM <linux-pm@vger.kernel.org>, Len Brown <lenb@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@suse.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Artem Bityutskiy <artem.bityutskiy@linux.intel.com>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Ingo Molnar <mingo@redhat.com>,
Todd Brandt <todd.e.brandt@linux.intel.com>
Subject: [PATCH v1 1/2] Revert "x86/smp: Eliminate mwait_play_dead_cpuid_hint()"
Date: Wed, 28 May 2025 14:53:50 +0200
Message-ID: <7811828.EvYhyI6sBW@rjwysocki.net>
In-Reply-To: <2006806.PYKUYFuaPT@rjwysocki.net>

From: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>

Revert commit 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()")
because it introduced a significant power regression on systems that start
with "nosmt" in the kernel command line.

Namely, on such systems, SMT siblings are permanently taken offline early,
before cpuidle has been initialized, so after the above commit,
hlt_play_dead() is called for them.  Later on, when the processor
attempts to enter a deep package C-state, including PC10, which is
required for reaching minimum power in suspend-to-idle, it cannot do
so because the SMT siblings remain in C1 (which they have been put
into by HLT).

Fixes: 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()")
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Cc: 6.15+ <stable@vger.kernel.org> # 6.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
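
Not for the changelog: a minimal userspace sketch of the ordering problem
described above, under a deliberately simplified model. The names
struct cpu_model, enum dead_state, play_dead_before_revert() and
play_dead_after_revert() are hypothetical and exist only for this
illustration. The real kernel functions do not return on success; here
each path is reduced to "which mechanism parks the offline CPU".

```c
#include <stdbool.h>

/* Hypothetical labels for the mechanism that parks an offline CPU. */
enum dead_state {
	DEAD_MWAIT_HINT,	/* mwait_play_dead_cpuid_hint() */
	DEAD_CPUIDLE,		/* cpuidle_play_dead() */
	DEAD_HLT,		/* hlt_play_dead(), leaves the CPU in C1 */
};

/* Hypothetical, simplified view of the state that matters here. */
struct cpu_model {
	bool mwait_usable;	/* not AMD/Hygon, has MWAIT and CLFLUSH */
	bool cpuidle_ready;	/* cpuidle is initialized and can park the CPU */
};

/* With commit 96040f7273e2 applied: cpuidle first, HLT if that fails. */
static enum dead_state play_dead_before_revert(const struct cpu_model *m)
{
	if (m->cpuidle_ready)
		return DEAD_CPUIDLE;
	return DEAD_HLT;
}

/* With this revert applied: the CPUID-derived MWAIT hint is tried first. */
static enum dead_state play_dead_after_revert(const struct cpu_model *m)
{
	if (m->mwait_usable)
		return DEAD_MWAIT_HINT;
	if (m->cpuidle_ready)
		return DEAD_CPUIDLE;
	return DEAD_HLT;
}
```

In this model, an SMT sibling offlined under "nosmt" has cpuidle_ready
false. Before the revert it ends up in DEAD_HLT; after the revert, an
MWAIT-capable CPU ends up in DEAD_MWAIT_HINT instead.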
arch/x86/kernel/smpboot.c | 54 ++++++++++++++++++++++++++++++++++++++++------
1 file changed, 47 insertions(+), 7 deletions(-)
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1238,10 +1238,6 @@
 	local_irq_disable();
 }
 
-/*
- * We need to flush the caches before going to sleep, lest we have
- * dirty data in our caches when we come back up.
- */
 void __noreturn mwait_play_dead(unsigned int eax_hint)
 {
 	struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
@@ -1288,6 +1284,50 @@
 }
 
 /*
+ * We need to flush the caches before going to sleep, lest we have
+ * dirty data in our caches when we come back up.
+ */
+static inline void mwait_play_dead_cpuid_hint(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+	unsigned int highest_cstate = 0;
+	unsigned int highest_subcstate = 0;
+	int i;
+
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
+		return;
+	if (!this_cpu_has(X86_FEATURE_MWAIT))
+		return;
+	if (!this_cpu_has(X86_FEATURE_CLFLUSH))
+		return;
+
+	eax = CPUID_LEAF_MWAIT;
+	ecx = 0;
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+
+	/*
+	 * eax will be 0 if EDX enumeration is not valid.
+	 * Initialized below to cstate, sub_cstate value when EDX is valid.
+	 */
+	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED)) {
+		eax = 0;
+	} else {
+		edx >>= MWAIT_SUBSTATE_SIZE;
+		for (i = 0; i < 7 && edx; i++, edx >>= MWAIT_SUBSTATE_SIZE) {
+			if (edx & MWAIT_SUBSTATE_MASK) {
+				highest_cstate = i;
+				highest_subcstate = edx & MWAIT_SUBSTATE_MASK;
+			}
+		}
+		eax = (highest_cstate << MWAIT_SUBSTATE_SIZE) |
+			(highest_subcstate - 1);
+	}
+
+	mwait_play_dead(eax);
+}
+
+/*
  * Kick all "offline" CPUs out of mwait on kexec(). See comment in
  * mwait_play_dead().
  */
@@ -1337,9 +1377,9 @@
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 
-	/* Below returns only on error. */
-	cpuidle_play_dead();
-	hlt_play_dead();
+	mwait_play_dead_cpuid_hint();
+	if (cpuidle_play_dead())
+		hlt_play_dead();
 }
 
 #else /* ... !CONFIG_HOTPLUG_CPU */
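
The hint computation in mwait_play_dead_cpuid_hint() above can be tried
outside the kernel with the following sketch. It is an illustration under
stated assumptions, not kernel code: the function name
mwait_hint_from_cpuid5() is hypothetical, the CPUID leaf 5 ECX and EDX
values are passed in as plain arguments instead of being read with
native_cpuid(), and the three constants are assumed to match the
definitions in arch/x86/include/asm/mwait.h.

```c
#include <stdint.h>

/* Assumed to mirror arch/x86/include/asm/mwait.h. */
#define MWAIT_SUBSTATE_MASK		0xf
#define MWAIT_SUBSTATE_SIZE		4
#define CPUID5_ECX_EXTENSIONS_SUPPORTED	0x1

/*
 * Hypothetical helper: derive the MWAIT hint from CPUID leaf 5 ECX/EDX
 * the same way the loop in the patch does.
 */
static uint32_t mwait_hint_from_cpuid5(uint32_t ecx, uint32_t edx)
{
	uint32_t highest_cstate = 0;
	uint32_t highest_subcstate = 0;
	int i;

	/* EDX enumeration is not valid: fall back to hint 0. */
	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED))
		return 0;

	/* Drop the lowest 4-bit field, then scan the remaining ones. */
	edx >>= MWAIT_SUBSTATE_SIZE;
	for (i = 0; i < 7 && edx; i++, edx >>= MWAIT_SUBSTATE_SIZE) {
		if (edx & MWAIT_SUBSTATE_MASK) {
			/* Remember the last field with a nonzero sub-state count. */
			highest_cstate = i;
			highest_subcstate = edx & MWAIT_SUBSTATE_MASK;
		}
	}

	/*
	 * As in the patch, the sub-state count is turned into an index by
	 * subtracting 1; if no field was nonzero, the subtraction wraps.
	 */
	return (highest_cstate << MWAIT_SUBSTATE_SIZE) |
		(highest_subcstate - 1);
}
```

For example, with ECX bit 0 set and EDX equal to 0x120, the loop sees the
fields 2 and 1 after the initial shift, so the helper returns 0x10. With
ECX bit 0 clear it returns 0 regardless of EDX.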