linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon
@ 2025-08-25  7:57 K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  7:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, K Prateek Nayak, Naveen N Rao

When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
was noticed with recent kernels:

    [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200

Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
common topology code for AMD") where, after the rework, the initial
APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
the value from CPUID leaf 0xb EDX[31:0] previously.

This led us down a rabbit hole of XTOPOLOGY vs TOPOEXT support, preferred
order of their parsing, and QEMU nuances like [1] where QEMU 0's out the
CPUID leaf 0x8000001e on CPUs where Core ID crosses 255 fearing a
Core ID collision in the 8 bit field which leads to the reported FW_BUG.

Following were major observations during the debug which the two
patches address respectively:

1. The support for CPUID leaf 0xb is independent of the TOPOEXT feature
   and is rather linked to the x2APIC enablement. In an effort to keep
   all the topology bits together during x2APIC enablement on AMD, the
   parsing ox the extended topology leaf 0xb was incorrectly put behind
   a X86_FEATURE_TOPOEXT check.

   On baremetal, this has not been a problem since TOPOEXT support
   (Fam 0x15 and above) predates the support for CPUID leaf 0xb
   (Fam 0x17[Zen2] and above) however, in virtualized environment, the
   support for x2APIC can be enabled independent of topoext where QEMU
   expects the guest to parse the topology and the APICID from CPUID
   leaf 0xb.

   Boris asked why QEMU doesn't force enable TOPOEXT feature with x2APIC
   [2] and Naveen discovered there were historic reasons to not enable
   TOPOEXT by default when using "-cpu host" on AMD systems [3]. The
   same behavior continues unless an EPYC cpu model is explicitly passed
   to QEMU.

2. Since CPUID leaf 0x8000001e cannot represent Core ID without
   collision for guests with > 255 cores, and QEMU 0's out the entire
   leaf when Core ID crosses 255.

   Prefer initial APIC read from the XTOPOLOGY leaf (0x80000026 / 0xb)
   which can represent up to 2^16 cores, before falling back to the APIC
   ID from 0x8000001e which is still better than 8-bit APICID from leaf
   0x1 EBX[31:24].

More details are enclosed in the commit logs.

Ideally, these changes should not affect baremetal AMD/Hygon platforms
as they have supported TOPOEXT long before the support for CPUID leaf
0xb and the extended CPUID leaf 0x80000026 (famous last words).

Patch 3 and 4 is yak shaving to explicitly define a raw MSR value used
in the topology parsing bits and simplify the flow around "has_topoext"
when the same can be discovered using X86_FEATURE_XTOPOLOGY.

Previous version of this series has been tested on baremetal Zen1
(contains topoext but not 0xb leaf), Zen3 (contains both topoext and 0xb
leaf), and Zen4 (contains topoext, 0xb leaf, and 0x80000026 leaf)
servers with no changes observed in "/sys/kernel/debug/x86/topo/"
directory.

The series was also tested on 255 and 512 vCPU (each vCPU is an
individual core from QEMU topology being passed) EPYC-Genoa guest with
and without x2apic and topoext enabled and this series solves the FW_BUG
seen on guest with > 255 VCPUs. No changes observed in
"/sys/kernel/debug/x86/topo/" for all other cases without warning.
0xb leaf is provided unconditionally on these guests (with or without
topoext, even with x2apic disabled on guests with <= 255 vCPU).

In all the cases initial_apicid matched the apicid in
"/sys/kernel/debug/x86/topo/" after applying this series.

Relevant bits of QEMU cmdline used during testing are as follows:

    qemu-system-x86_64 \
    -enable-kvm -m 32G -smp cpus=512,cores=512 \
    -cpu EPYC-Genoa,x2apic=on,kvm-msi-ext-dest-id=on,+kvm-pv-unhalt,kvm-pv-tlb-flush,kvm-pv-ipi,kvm-pv-sched-yield,[-topoext]  \
    -machine q35,kernel_irqchip=split \
    -global kvm-pit.lost_tick_policy=discard
    ...

References:

[1] https://github.com/qemu/qemu/commit/35ac5dfbcaa4b
[2] https://lore.kernel.org/lkml/20250819113447.GJaKRhVx6lBPUc6NMz@fat_crate.local/
[3] https://lore.kernel.org/qemu-devel/20180809221852.15285-1-ehabkost@redhat.com/

Series is based on tip:master at commit 7182bf4176f9 ("Merge branch into
tip/master: 'x86/tdx'") and applies cleanly on top of tip:x86/cpu at
commit f3285344a5a3 ("x86/cpu/cacheinfo: Simplify
cacheinfo_amd_init_llc_id() using _cpuid4_info")

---
Changelog v3..v4:

o Renamed the series title to better capture the purpose. Based on the
  readout of the APM and PPR, this problem was only exposed by QEMU
  and QEMU is not doing anything wrong considering the spec.

o Fixed references to X86_FEATURE_XTOPOLOGY (XTOPOLOGY) which was
  mistakenly referred to as XTOPOEXT. (Boris)

o Reordered the patches to have the fixes before cleanups. (Thomas)

o Refreshed the diff of Patch 1 with the one Thomas suggested in
  https://lore.kernel.org/lkml/87ms7o3kn6.ffs@tglx/. (Thomas)

o Quoted the relevant sections of the APM and the PPR to support the
  changes. (Mentioned on v3 by Naveen and Boris)

Note: The debate on "CoreId" from CPUID 0x8000001e EBX has not been
addressed yet. I'll check internally and follow up on the QEMU bits once
H/W folks confirm what their strategy is with the 8-bit field in future
processors.

The updates in this series ensures the usage of the topology information
from the XTOPOLOGY leaves (0x80000026 / 0xb)  when they are present and
systems that support more than 256 CPUs need x2APIC enabled to address
all the CPUs present thus removing the dependency on CPUID leaf
0x8000001e for Core ID.

v3: https://lore.kernel.org/lkml/20250818060435.2452-1-kprateek.nayak@amd.com/

Changelog v2..v3:

o Patch 1 was added to the series.
o Use cpu_feature_enabled() in Patch 3.
o Rebased on top of tip:x86/cpu.

v2: https://lore.kernel.org/lkml/20250725110622.59743-1-kprateek.nayak@amd.com/

Changelog v1..v2:

o Collected tags from Naveen. (Thank you for testing!)
o Rebased the series on tip:x86/cpu.
o Swapped Patch 1 and Patch 2 from v1.
o Merged the body of two if blocks in Patch 1 to allow for cleanup in
  Patch 3.

v1: https://lore.kernel.org/lkml/20250612072921.15107-1-kprateek.nayak@amd.com/

---
K Prateek Nayak (4):
  x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON
  x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing
    has_topoext
  x86/msr-index: Define AMD64_CPUID_FN_EXT MSR

 arch/x86/include/asm/msr-index.h   |  5 ++++
 arch/x86/kernel/cpu/topology_amd.c | 48 +++++++++++++++---------------
 2 files changed, 29 insertions(+), 24 deletions(-)


base-commit: 7182bf4176f93be42225d2ef983894febfa4a1b1
-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
@ 2025-08-25  7:57 ` K Prateek Nayak
  2025-08-25  8:17   ` K Prateek Nayak
  2025-08-27 10:34   ` [tip: x86/urgent] " tip-bot2 for K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  7:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, K Prateek Nayak, Naveen N Rao,
	stable

Prior to the topology parsing rewrite and the switchover to the new
parsing logic for AMD processors in commit c749ce393b8f ("x86/cpu: Use
common topology code for AMD"), the "initial_apicid" on these platforms
was:

- First initialized to the LocalApicId from CPUID leaf 0x1 EBX[31:24].

- Then overwritten by the ExtendedLocalApicId in CPUID leaf 0xb
  EDX[31:0] on processors that supported topoext.

With the new parsing flow introduced in commit f7fb3b2dd92c ("x86/cpu:
Provide an AMD/HYGON specific topology parser"), parse_8000_001e() now
unconditionally overwrites the "initial_apicid" already parsed during
cpu_parse_topology_ext().

Although this has not been a problem on baremetal platforms, on
virtualized AMD guests that feature more than 255 cores, QEMU 0's out
the CPUID leaf 0x8000001e on CPUs with "CoreID" > 255 to prevent
collision of these IDs in EBX[7:0] which can only represent a maximum of
255 cores [1].

This results in the following FW_BUG being logged when booting a guest
with more than 255 cores:

    [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200

AMD64 Architecture Programmer's Manual Volume 2: System Programming Pub.
24593 Rev. 3.42 [2] Section 16.12 "x2APIC_ID" mentions the Extended
Enumeration leaf 0x8000001e (which was later superseded by the extended
leaf 0x80000026) provides the full x2APIC ID under all circumstances
unlike the one reported by CPUID leaf 0x8000001e EAX which depends on
the mode in which APIC is configured.

Rely on the APIC ID parsed during cpu_parse_topology_ext() from CPUID
leaf 0x80000026 or 0xb and only use the APIC ID from leaf 0x8000001e if
cpu_parse_topology_ext() failed (has_topoext is false).

On platforms that support the 0xb leaf (Zen2 or later, AMD guests on
QEMU) or the extended leaf 0x80000026 (Zen4 or later), the
"initial_apicid" is now set to the value parsed from EDX[31:0].

On older AMD/Hygon platforms that does not support the 0xb leaf but
supports the TOPOEXT extension (Fam 0x15, 0x16, 0x17[Zen1], and Hygon),
the current behavior is retained where "initial_apicid" is set using
the 0x8000001e leaf.

Cc: stable@vger.kernel.org
Link: https://github.com/qemu/qemu/commit/35ac5dfbcaa4b [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [2]
Debugged-by: Naveen N Rao (AMD) <naveen@kernel.org>
Debugged-by: Sairaj Kodilkar <sarunkod@amd.com>
Fixes: c749ce393b8f ("x86/cpu: Use common topology code for AMD")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v3..v4:

o Refreshed the diff based on Thomas' suggestion. The tags have been
  retained since there are no functional changes - only comments around
  the code has changed.

o Quoted relevant section of APM justifying the changes.

o Moved this patch up ahead.

o Cc'd stable.
---
 arch/x86/kernel/cpu/topology_amd.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 843b1655ab45..827dd0dbb6e9 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -81,20 +81,25 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
 
 	cpuid_leaf(0x8000001e, &leaf);
 
-	tscan->c->topo.initial_apicid = leaf.ext_apic_id;
-
 	/*
-	 * If leaf 0xb is available, then the domain shifts are set
-	 * already and nothing to do here. Only valid for family >= 0x17.
+	 * If leaf 0xb/0x26 is available, then the APIC ID and the domain
+	 * shifts are set already.
 	 */
-	if (!has_topoext && tscan->c->x86 >= 0x17) {
+	if (!has_topoext) {
+		tscan->c->topo.initial_apicid = leaf.ext_apic_id;
+
 		/*
-		 * Leaf 0x80000008 set the CORE domain shift already.
-		 * Update the SMT domain, but do not propagate it.
+		 * Leaf 0x8000008 sets the CORE domain shift but not the
+		 * SMT domain shift. On CPUs with family >= 0x17, there
+		 * might be hyperthreads.
 		 */
-		unsigned int nthreads = leaf.core_nthreads + 1;
+		if (tscan->c->x86 >= 0x17) {
+			/* Update the SMT domain, but do not propagate it. */
+			unsigned int nthreads = leaf.core_nthreads + 1;
 
-		topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(nthreads), nthreads);
+			topology_update_dom(tscan, TOPO_SMT_DOMAIN,
+					    get_count_order(nthreads), nthreads);
+		}
 	}
 
 	store_node(tscan, leaf.nnodes_per_socket + 1, leaf.node_id);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
@ 2025-08-25  7:57 ` K Prateek Nayak
  2025-08-30 17:19   ` Borislav Petkov
  2025-08-25  7:57 ` [PATCH v4 3/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  7:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, K Prateek Nayak, Naveen N Rao,
	stable

Support for parsing the topology on AMD/Hygon processors using CPUID
leaf 0xb was added in commit 3986a0a805e6 ("x86/CPU/AMD: Derive CPU
topology from CPUID function 0xB when available"). In an effort to keep
all the topology parsing bits in one place, this commit also introduced
a pseudo dependency on the TOPOEXT feature to parse the CPUID leaf 0xb.

TOPOEXT feature (CPUID 0x80000001 ECX[22]) advertises the support for
Cache Properties leaf 0x8000001d and the CPUID leaf 0x8000001e EAX for
"Extended APIC ID" however support for 0xb was introduced alongside the
x2APIC support not only on AMD [1], but also historically on x86 [2].

Similar to 0xb, the support for extended CPU topology leaf 0x80000026
too does not depend on the TOPOEXT feature.

The support for these leaves is expected to be confirmed by ensuring
"leaf <= {extended_}cpuid_level" and then parsing the level 0 of the
respective leaf to confirm EBX[15:0] (LogProcAtThisLevel) is non-zero as
stated in the definition of "CPUID_Fn0000000B_EAX_x00 [Extended Topology
Enumeration] (Core::X86::Cpuid::ExtTopEnumEax0)" in Processor
Programming Reference (PPR) for AMD Family 19h Model 01h Rev B1 Vol1 [3]
Sec. 2.1.15.1 "CPUID Instruction Functions".

This has not been a problem on baremetal platforms since support for
TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
(Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
feature can be enabled independent of the "topoext" feature where QEMU
expects topology and the initial APICID to be parsed using the CPUID
leaf 0xb (especially when number of cores > 255) which is populated
independent of the "topoext" feature flag.

Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
processors to first parse the topology using the XTOPOLOGY leaves
(0x80000026 / 0xb) before using the TOPOEXT leaf (0x8000001e).

Cc: stable@vger.kernel.org # Only v6.9 and above
Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthikulpanit@amd.com/ [1]
Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
Suggested-by: Naveen N Rao (AMD) <naveen@kernel.org>
Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v3..v4:

o Quoted relevant section of the PPR justifying the changes.

o Moved this patch up ahead.

o Cc'd stable and made a note that backports should target v6.9 and
  above since this depends on the x86 topology rewrite.
---
 arch/x86/kernel/cpu/topology_amd.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 827dd0dbb6e9..4e3134a5550c 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -175,18 +175,14 @@ static void topoext_fixup(struct topo_scan *tscan)
 
 static void parse_topology_amd(struct topo_scan *tscan)
 {
-	bool has_topoext = false;
-
 	/*
-	 * If the extended topology leaf 0x8000_001e is available
-	 * try to get SMT, CORE, TILE, and DIE shifts from extended
+	 * Try to get SMT, CORE, TILE, and DIE shifts from extended
 	 * CPUID leaf 0x8000_0026 on supported processors first. If
 	 * extended CPUID leaf 0x8000_0026 is not supported, try to
 	 * get SMT and CORE shift from leaf 0xb first, then try to
 	 * get the CORE shift from leaf 0x8000_0008.
 	 */
-	if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
-		has_topoext = cpu_parse_topology_ext(tscan);
+	bool has_topoext = cpu_parse_topology_ext(tscan);
 
 	if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
 		tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 3/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
@ 2025-08-25  7:57 ` K Prateek Nayak
  2025-08-25  7:57 ` [PATCH v4 4/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  7:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, K Prateek Nayak, Naveen N Rao

cpu_parse_topology_ext() sets X86_FEATURE_XTOPOLOGY before returning
true if any of the XTOPOLOGY leaf (0x80000026 / 0xb) could be parsed
successfully.

Instead of storing and passing around this return value using
"has_topoext" in parse_topology_amd(), check for X86_FEATURE_XTOPOLOGY
instead in parse_8000_001e() to simplify the flow.

No functional changes intended.

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v3..v4:

o Moved this to Patch 3. No changes to diff.
---
 arch/x86/kernel/cpu/topology_amd.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 4e3134a5550c..12ece07b407b 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -59,7 +59,7 @@ static void store_node(struct topo_scan *tscan, u16 nr_nodes, u16 node_id)
 	tscan->amd_node_id = node_id;
 }
 
-static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
+static bool parse_8000_001e(struct topo_scan *tscan)
 {
 	struct {
 		// eax
@@ -85,7 +85,7 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
 	 * If leaf 0xb/0x26 is available, then the APIC ID and the domain
 	 * shifts are set already.
 	 */
-	if (!has_topoext) {
+	if (!cpu_feature_enabled(X86_FEATURE_XTOPOLOGY)) {
 		tscan->c->topo.initial_apicid = leaf.ext_apic_id;
 
 		/*
@@ -175,6 +175,9 @@ static void topoext_fixup(struct topo_scan *tscan)
 
 static void parse_topology_amd(struct topo_scan *tscan)
 {
+	if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
+		tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
+
 	/*
 	 * Try to get SMT, CORE, TILE, and DIE shifts from extended
 	 * CPUID leaf 0x8000_0026 on supported processors first. If
@@ -182,16 +185,11 @@ static void parse_topology_amd(struct topo_scan *tscan)
 	 * get SMT and CORE shift from leaf 0xb first, then try to
 	 * get the CORE shift from leaf 0x8000_0008.
 	 */
-	bool has_topoext = cpu_parse_topology_ext(tscan);
-
-	if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
-		tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
-
-	if (!has_topoext && !parse_8000_0008(tscan))
+	if (!cpu_parse_topology_ext(tscan) && !parse_8000_0008(tscan))
 		return;
 
 	/* Prefer leaf 0x8000001e if available */
-	if (parse_8000_001e(tscan, has_topoext))
+	if (parse_8000_001e(tscan))
 		return;
 
 	/* Try the NODEID MSR */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 4/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
                   ` (2 preceding siblings ...)
  2025-08-25  7:57 ` [PATCH v4 3/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
@ 2025-08-25  7:57 ` K Prateek Nayak
  2025-08-25  8:49 ` [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon Borislav Petkov
  2025-09-01 10:11 ` [RFC PATCH v4 5/4] Documentation/x86/topology: Detail CPUID leaves used for topology enumeration K Prateek Nayak
  5 siblings, 0 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  7:57 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, K Prateek Nayak, Naveen N Rao

Explicitly define the AMD64_CPUID_FN_EXT MSR used to toggle the extended
features. Also define and use the bits necessary for an old TOPOEXT
fixup on AMD Family 0x15 processors.

No functional changes intended.

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v3..v4:

o Moved this to Patch 4. No changes to diff.
---
 arch/x86/include/asm/msr-index.h   | 5 +++++
 arch/x86/kernel/cpu/topology_amd.c | 7 ++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b65c3ba5fa14..e194287177db 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -631,6 +631,11 @@
 #define MSR_AMD_PPIN			0xc00102f1
 #define MSR_AMD64_CPUID_FN_7		0xc0011002
 #define MSR_AMD64_CPUID_FN_1		0xc0011004
+
+#define MSR_AMD64_CPUID_FN_EXT				0xc0011005
+#define MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT	54
+#define MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED		BIT_ULL(MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT)
+
 #define MSR_AMD64_LS_CFG		0xc0011020
 #define MSR_AMD64_DC_CFG		0xc0011022
 #define MSR_AMD64_TW_CFG		0xc0011023
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 12ece07b407b..6e8186f05cde 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -163,11 +163,12 @@ static void topoext_fixup(struct topo_scan *tscan)
 	    c->x86 != 0x15 || c->x86_model < 0x10 || c->x86_model > 0x6f)
 		return;
 
-	if (msr_set_bit(0xc0011005, 54) <= 0)
+	if (msr_set_bit(MSR_AMD64_CPUID_FN_EXT,
+			MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT) <= 0)
 		return;
 
-	rdmsrq(0xc0011005, msrval);
-	if (msrval & BIT_64(54)) {
+	rdmsrq(MSR_AMD64_CPUID_FN_EXT, msrval);
+	if (msrval & MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED) {
 		set_cpu_cap(c, X86_FEATURE_TOPOEXT);
 		pr_info_once(FW_INFO "CPU: Re-enabling disabled Topology Extensions Support.\n");
 	}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON
  2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
@ 2025-08-25  8:17   ` K Prateek Nayak
  2025-08-27 10:34   ` [tip: x86/urgent] " tip-bot2 for K Prateek Nayak
  1 sibling, 0 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  8:17 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao, stable

On 8/25/2025 1:27 PM, K Prateek Nayak wrote:
> AMD64 Architecture Programmer's Manual Volume 2: System Programming Pub.
> 24593 Rev. 3.42 [2] Section 16.12 "x2APIC_ID" mentions the Extended
> Enumeration leaf 0x8000001e (which was later superseded by the extended

The above should have been CPUID leaf 0xb (Fn0000_000B_EDX[31:0]) and
not 0x8000001e. Sorry for the oversight.

> leaf 0x80000026) provides the full x2APIC ID under all circumstances
> unlike the one reported by CPUID leaf 0x8000001e EAX which depends on
> the mode in which APIC is configured.
-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
                   ` (3 preceding siblings ...)
  2025-08-25  7:57 ` [PATCH v4 4/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
@ 2025-08-25  8:49 ` Borislav Petkov
  2025-08-25  9:03   ` K Prateek Nayak
  2025-09-01 10:11 ` [RFC PATCH v4 5/4] Documentation/x86/topology: Detail CPUID leaves used for topology enumeration K Prateek Nayak
  5 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2025-08-25  8:49 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
	Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao

On Mon, Aug 25, 2025 at 07:57:28AM +0000, K Prateek Nayak wrote:
> This led us down a rabbit hole of XTOPOLOGY vs TOPOEXT support, preferred

So in order to save people the rabbit hole wandering each time they (or we)
have to undertake, I think we should document what the whole logic and
precedences are wrt CPUID leafs and topology. What should be done where and so
on.

And those commit messages have a lot of text which explains that and I think
it would be worth the effort to start holding it down here
Documentation/arch/x86/topology.rst

No long texts, no big explanations - just the plain facts and what the current
strategy is wrt to which CPUID leafs we parse for what in what order and so
on.

You could start the AMD side, it doesn't have to be exhaustive - just the
facts from this rabbit hole trip.

And then we'll keep extending it and filling out the details so that it is
right there written down in one place.

Makes sense?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon
  2025-08-25  8:49 ` [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon Borislav Petkov
@ 2025-08-25  9:03   ` K Prateek Nayak
  0 siblings, 0 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-08-25  9:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
	Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao

Hello Boris,

On 8/25/2025 2:19 PM, Borislav Petkov wrote:
> On Mon, Aug 25, 2025 at 07:57:28AM +0000, K Prateek Nayak wrote:
>> This led us down a rabbit hole of XTOPOLOGY vs TOPOEXT support, preferred
> 
> So in order to save people the rabbit hole wandering each time they (or we)
> have to undertake, I think we should document what the whole logic and
> precedences are wrt CPUID leafs and topology. What should be done where and so
> on.
> 
> And those commit messages have a lot of text which explains that and I think
> it would be worth the effort to start holding it down here
> Documentation/arch/x86/topology.rst
> 
> No long texts, no big explanations - just the plain facts and what the current
> strategy is wrt to which CPUID leafs we parse for what in what order and so
> on.
> 
> You could start the AMD side, it doesn't have to be exhaustive - just the
> facts from this rabbit hole trip.
> 
> And then we'll keep extending it and filling out the details so that it is
> right there written down in one place.
> 
> Makes sense?

Ack. I'll start working on it.

-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip: x86/urgent] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON
  2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
  2025-08-25  8:17   ` K Prateek Nayak
@ 2025-08-27 10:34   ` tip-bot2 for K Prateek Nayak
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot2 for K Prateek Nayak @ 2025-08-27 10:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, K Prateek Nayak, Borislav Petkov (AMD),
	Naveen N Rao (AMD), stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     c2415c407a2cde01290d52ce2a1f81b0616379a3
Gitweb:        https://git.kernel.org/tip/c2415c407a2cde01290d52ce2a1f81b0616379a3
Author:        K Prateek Nayak <kprateek.nayak@amd.com>
AuthorDate:    Mon, 25 Aug 2025 07:57:29 
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 27 Aug 2025 11:31:11 +02:00

x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON

Prior to the topology parsing rewrite and the switchover to the new parsing
logic for AMD processors in

  c749ce393b8f ("x86/cpu: Use common topology code for AMD"),

the initial_apicid on these platforms was:

- First initialized to the LocalApicId from CPUID leaf 0x1 EBX[31:24].

- Then overwritten by the ExtendedLocalApicId in CPUID leaf 0xb
  EDX[31:0] on processors that supported topoext.

With the new parsing flow introduced in

  f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology parser"),

parse_8000_001e() now unconditionally overwrites the initial_apicid already
parsed during cpu_parse_topology_ext().

Although this has not been a problem on baremetal platforms, on virtualized AMD
guests that feature more than 255 cores, QEMU zeros out the CPUID leaf
0x8000001e on CPUs with CoreID > 255 to prevent collision of these IDs in
EBX[7:0] which can only represent a maximum of 255 cores [1].

This results in the following FW_BUG being logged when booting a guest
with more than 255 cores:

    [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200

AMD64 Architecture Programmer's Manual Volume 2: System Programming Pub.
24593 Rev. 3.42 [2] Section 16.12 "x2APIC_ID" mentions the Extended
Enumeration leaf 0xb (Fn0000_000B_EDX[31:0])(which was later superseded by the
extended leaf 0x80000026) provides the full x2APIC ID under all circumstances
unlike the one reported by CPUID leaf 0x8000001e EAX which depends on the mode
in which APIC is configured.

Rely on the APIC ID parsed during cpu_parse_topology_ext() from CPUID leaf
0x80000026 or 0xb and only use the APIC ID from leaf 0x8000001e if
cpu_parse_topology_ext() failed (has_topoext is false).

On platforms that support the 0xb leaf (Zen2 or later, AMD guests on
QEMU) or the extended leaf 0x80000026 (Zen4 or later), the
initial_apicid is now set to the value parsed from EDX[31:0].

On older AMD/Hygon platforms that do not support the 0xb leaf but support the
TOPOEXT extension (families 0x15, 0x16, 0x17[Zen1], and Hygon), retain current
behavior where the initial_apicid is set using the 0x8000001e leaf.

Issue debugged by Naveen N Rao (AMD) <naveen@kernel.org> and Sairaj Kodilkar
<sarunkod@amd.com>.

  [ bp: Massage commit message. ]

Fixes: c749ce393b8f ("x86/cpu: Use common topology code for AMD")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Naveen N Rao (AMD) <naveen@kernel.org>
Cc: stable@vger.kernel.org
Link: https://github.com/qemu/qemu/commit/35ac5dfbcaa4b [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [2]
Link: https://lore.kernel.org/20250825075732.10694-2-kprateek.nayak@amd.com
---
 arch/x86/kernel/cpu/topology_amd.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 843b165..827dd0d 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -81,20 +81,25 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
 
 	cpuid_leaf(0x8000001e, &leaf);
 
-	tscan->c->topo.initial_apicid = leaf.ext_apic_id;
-
 	/*
-	 * If leaf 0xb is available, then the domain shifts are set
-	 * already and nothing to do here. Only valid for family >= 0x17.
+	 * If leaf 0xb/0x26 is available, then the APIC ID and the domain
+	 * shifts are set already.
 	 */
-	if (!has_topoext && tscan->c->x86 >= 0x17) {
+	if (!has_topoext) {
+		tscan->c->topo.initial_apicid = leaf.ext_apic_id;
+
 		/*
-		 * Leaf 0x80000008 set the CORE domain shift already.
-		 * Update the SMT domain, but do not propagate it.
+		 * Leaf 0x8000008 sets the CORE domain shift but not the
+		 * SMT domain shift. On CPUs with family >= 0x17, there
+		 * might be hyperthreads.
 		 */
-		unsigned int nthreads = leaf.core_nthreads + 1;
+		if (tscan->c->x86 >= 0x17) {
+			/* Update the SMT domain, but do not propagate it. */
+			unsigned int nthreads = leaf.core_nthreads + 1;
 
-		topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(nthreads), nthreads);
+			topology_update_dom(tscan, TOPO_SMT_DOMAIN,
+					    get_count_order(nthreads), nthreads);
+		}
 	}
 
 	store_node(tscan, leaf.nnodes_per_socket + 1, leaf.node_id);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  2025-08-25  7:57 ` [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
@ 2025-08-30 17:19   ` Borislav Petkov
  2025-09-01  4:21     ` K Prateek Nayak
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2025-08-30 17:19 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
	Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao, stable

On Mon, Aug 25, 2025 at 07:57:30AM +0000, K Prateek Nayak wrote:
> This has not been a problem on baremetal platforms since support for
> TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
> (Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
> feature can be enabled independent of the "topoext" feature where QEMU
> expects topology and the initial APICID to be parsed using the CPUID
> leaf 0xb (especially when number of cores > 255) which is populated
> independent of the "topoext" feature flag.

This sounds like we're fixing the kernel because someone *might* supply
a funky configuration to qemu. You need to understand that we're not wagging
the dog and fixing the kernel because of that.

If someone manages to enable some weird concoction of feature flags in qemu,
we certainly won't "fix" that in the kernel.

So let's concentrate that text around fixing the issue of parsing CPUID
topology leafs which we should parse and looking at CPUID flags only for those
feature leafs, for which those flags are existing.

If qemu wants stuff to work, then it better emulate the feature flag
configuration like the hw does.

> Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
> processors to first parse the topology using the XTOPOLOGY leaves
> (0x80000026 / 0xb) before using the TOPOEXT leaf (0x8000001e).
> 
> Cc: stable@vger.kernel.org # Only v6.9 and above
> Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthikulpanit@amd.com/ [1]
> Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
> Suggested-by: Naveen N Rao (AMD) <naveen@kernel.org>
> Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
> Changelog v3..v4:
> 
> o Quoted relevant section of the PPR justifying the changes.
> 
> o Moved this patch up ahead.
> 
> o Cc'd stable and made a note that backports should target v6.9 and
>   above since this depends on the x86 topology rewrite.

Put that explanation in the CC:stable comment.

> ---
>  arch/x86/kernel/cpu/topology_amd.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
> index 827dd0dbb6e9..4e3134a5550c 100644
> --- a/arch/x86/kernel/cpu/topology_amd.c
> +++ b/arch/x86/kernel/cpu/topology_amd.c
> @@ -175,18 +175,14 @@ static void topoext_fixup(struct topo_scan *tscan)
>  
>  static void parse_topology_amd(struct topo_scan *tscan)
>  {
> -	bool has_topoext = false;
> -
>  	/*
> -	 * If the extended topology leaf 0x8000_001e is available
> -	 * try to get SMT, CORE, TILE, and DIE shifts from extended
> +	 * Try to get SMT, CORE, TILE, and DIE shifts from extended
>  	 * CPUID leaf 0x8000_0026 on supported processors first. If
>  	 * extended CPUID leaf 0x8000_0026 is not supported, try to
>  	 * get SMT and CORE shift from leaf 0xb first, then try to
>  	 * get the CORE shift from leaf 0x8000_0008.
>  	 */
> -	if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
> -		has_topoext = cpu_parse_topology_ext(tscan);
> +	bool has_topoext = cpu_parse_topology_ext(tscan);

Ok, I see what the point here is - you want to parse topology regardless of
X86_FEATURE_TOPOEXT.

Which is true - latter "indicates support for Core::X86::Cpuid::CachePropEax0
and Core::X86::Cpuid::ExtApicId" only and the leafs cpu_parse_topology_ext()
attempts to parse are different ones.

So, "has_topoext" doesn't have anything to do with X86_FEATURE_TOPOEXT - it
simply means that the topology extensions parser found some extensions and
parsed them. So let's rename that variable differently so that there is no
confusion.

You can do the renaming in parse_8000_001e() in a later patch as that is only
a cosmetic thing and we don't want to send that to stable.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  2025-08-30 17:19   ` Borislav Petkov
@ 2025-09-01  4:21     ` K Prateek Nayak
  2025-09-01 14:04       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: K Prateek Nayak @ 2025-09-01  4:21 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
	Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao, stable

Hello Boris,

On 8/30/2025 10:49 PM, Borislav Petkov wrote:
> On Mon, Aug 25, 2025 at 07:57:30AM +0000, K Prateek Nayak wrote:
>> This has not been a problem on baremetal platforms since support for
>> TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
>> (Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
>> feature can be enabled independent of the "topoext" feature where QEMU
>> expects topology and the initial APICID to be parsed using the CPUID
>> leaf 0xb (especially when number of cores > 255) which is populated
>> independent of the "topoext" feature flag.
> 
> This sounds like we're fixing the kernel because someone *might* supply
> a funky configuration to qemu. You need to understand that we're not wagging
> the dog and fixing the kernel because of that.
> 
> If someone manages to enable some weird concoction of feature flags in qemu,
> we certainly won't "fix" that in the kernel.
> 
> So let's concentrate that text around fixing the issue of parsing CPUID
> topology leafs which we should parse and looking at CPUID flags only for those
> feature leafs, for which those flags are existing.
> 
> If qemu wants stuff to work, then it better emulate the feature flag
> configuration like the hw does.

Ack. I'll add the relevant details in
Documentation/arch/x86/topology.rst in the next version as discussed but
I believe you discovered the intentions for this fix in the kernel
below.

> 
>> Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
>> processors to first parse the topology using the XTOPOLOGY leaves
>> (0x80000026 / 0xb) before using the TOPOEXT leaf (0x8000001e).
>>
>> Cc: stable@vger.kernel.org # Only v6.9 and above
>> Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthikulpanit@amd.com/ [1]
>> Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [3]
>> Suggested-by: Naveen N Rao (AMD) <naveen@kernel.org>
>> Fixes: 3986a0a805e6 ("x86/CPU/AMD: Derive CPU topology from CPUID function 0xB when available")
>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
>> ---
>> Changelog v3..v4:
>>
>> o Quoted relevant section of the PPR justifying the changes.
>>
>> o Moved this patch up ahead.
>>
>> o Cc'd stable and made a note that backports should target v6.9 and
>>   above since this depends on the x86 topology rewrite.
> 
> Put that explanation in the CC:stable comment.

Ack.

> 
>> ---
>>  arch/x86/kernel/cpu/topology_amd.c | 8 ++------
>>  1 file changed, 2 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
>> index 827dd0dbb6e9..4e3134a5550c 100644
>> --- a/arch/x86/kernel/cpu/topology_amd.c
>> +++ b/arch/x86/kernel/cpu/topology_amd.c
>> @@ -175,18 +175,14 @@ static void topoext_fixup(struct topo_scan *tscan)
>>  
>>  static void parse_topology_amd(struct topo_scan *tscan)
>>  {
>> -	bool has_topoext = false;
>> -
>>  	/*
>> -	 * If the extended topology leaf 0x8000_001e is available
>> -	 * try to get SMT, CORE, TILE, and DIE shifts from extended
>> +	 * Try to get SMT, CORE, TILE, and DIE shifts from extended
>>  	 * CPUID leaf 0x8000_0026 on supported processors first. If
>>  	 * extended CPUID leaf 0x8000_0026 is not supported, try to
>>  	 * get SMT and CORE shift from leaf 0xb first, then try to
>>  	 * get the CORE shift from leaf 0x8000_0008.
>>  	 */
>> -	if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
>> -		has_topoext = cpu_parse_topology_ext(tscan);
>> +	bool has_topoext = cpu_parse_topology_ext(tscan);
> 
> Ok, I see what the point here is - you want to parse topology regardless of
> X86_FEATURE_TOPOEXT.
> 
> Which is true - latter "indicates support for Core::X86::Cpuid::CachePropEax0
> and Core::X86::Cpuid::ExtApicId" only and the leafs cpu_parse_topology_ext()
> attempts to parse are different ones.
> 
> So, "has_topoext" doesn't have anything to do with X86_FEATURE_TOPOEXT - it
> simply means that the topology extensions parser found some extensions and
> parsed them. So let's rename that variable differently so that there is no
> confusion.
> 
> You can do the renaming in parse_8000_001e() in a later patch as that is only
> a cosmetic thing and we don't want to send that to stable.

Ack! Does "has_xtopology" sound good or should we go for something more
explicit like "has_xtopology_0x26_0xb"?

Patch 3 will get rid of that "has_topoext" argument in parse_8000_001e()
entirely so I'll rename the local variable here and use the subsequent
cleanup for parse_8000_001e().

-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v4 5/4] Documentation/x86/topology: Detail CPUID leaves used for topology enumeration
  2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
                   ` (4 preceding siblings ...)
  2025-08-25  8:49 ` [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon Borislav Petkov
@ 2025-09-01 10:11 ` K Prateek Nayak
  5 siblings, 0 replies; 13+ messages in thread
From: K Prateek Nayak @ 2025-09-01 10:11 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Sean Christopherson, Paolo Bonzini, x86
  Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
	kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
	Suravee Suthikulpanit, Naveen N Rao, K Prateek Nayak

Add a new section describing the different CPUID leaves and fields used
to parse topology on x86 systems.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Sending this as an RFC patch first to squash out the amount of details
required in topology.rst. Once clarified, I'll include this formally in
v5.
---
 Documentation/arch/x86/topology.rst | 194 ++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/Documentation/arch/x86/topology.rst b/Documentation/arch/x86/topology.rst
index c12837e61bda..fd9903aab6b5 100644
--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -141,6 +141,200 @@ Thread-related topology information in the kernel:
 
 
 
+System topology enumeration
+===========================
+The topology on x86 systems can be discovered using a combination of vendor
+specific CPUID leaves introduced specifically to enumerate the processor
+topology and the cache hierarchy.
+
+The CPUID leaves in their preferred order of parsing for each x86 vendor is as
+follows:
+
+1) AMD and Hygon
+
+   On AMD and Hygon platforms, the CPUID leaves that enumerate the processor
+   topology are as follows:
+
+   1) CPUID leaf 0x80000026 [Extended CPU Topology] (Core::X86::Cpuid::ExCpuTopology)
+
+      The extended CPUID leaf 0x80000026 is the extension of the CPUID leaf 0xB
+      and provides the topology information of Core, Complex, CCD(Die), and
+      Socket in each level.
+
+      The support for the leaf is expected to be discovered by checking if the
+      supported extended CPUID level is >= 0x80000026 and then checking if
+      `LogProcAtThisLevel` in `EBX[15:0]` at a particular level (starting from
+      0) is non-zero.
+
+      The `LevelType` in `ECX[15:8]` at the level provides the detail of the
+      topology domain that the level describes - Core, Complex, CCD(Die), or
+      the Socket.
+
+      The kernel uses the `CoreMaskWidth` from `EAX[4:0]` to discover the
+      number of bits that need to be right shifted from the
+      `ExtendedLocalApicId` in `EDX[31:0]` to get a unique Topology ID for
+      the topology level. CPUs with the same Topology ID share the resources
+      at that level.
+
+      CPUID leaf 0x80000026 also provides more information regarding the
+      power and efficiency rankings, and about the core type on AMD
+      processors with heterogeneous characteristics.
+
+      If CPUID leaf 0x80000026 is supported, further parsing is not required.
+
+
+   2) CPUID leaf 0x0000000B [Extended Topology Enumeration] (Core::X86::Cpuid::ExtTopEnum)
+
+      The extended CPUID leaf 0x0000000B is the predecessor on the extended
+      CPUID leaf 0x80000026 and only describes the core, and the socket domains
+      of the processor topology.
+
+      The support for the leaf is expected to be discovered by checking if the
+      supported CPUID level is >= 0xB and then checking if `EBX[31:0]` at a
+      particular level (starting from 0) is non-zero.
+
+      The `LevelType` in `ECX[15:8]` at the level provides the detail of the
+      topology domain that the level describes - Thread, or Processor (Socket).
+
+      The kernel uses the `CoreMaskWidth` from `EAX[4:0]` to discover the
+      number of bits that need to be right shifted from the
+      `ExtendedLocalApicId` in `EDX[31:0]` to get a unique Topology ID for
+      that topology level. CPUs sharing the Topology ID share the resources
+      at that level.
+
+      If CPUID leaf 0xB is supported, further parsing is not required.
+
+
+   3) CPUID leaf 0x80000008 ECX [Size Identifiers] (Core::X86::Cpuid::SizeId)
+
+      If neither the CPUID leaf 0x80000026 or CPUID leaf 0xB is supported, the
+      number of CPUs on the package is detected using the Size Identifier leaf
+      0x80000008 ECX.
+
+      The support for the leaf is expected to be discovered by checking if the
+      supported extended CPUID level is >= 0x80000008.
+
+      The shifts from the APIC ID for the Socket ID is calculated from the
+      `ApicIdSize` field in `ECX[15:12]` if it is non-zero.
+
+      If `ApicIdSize` is reported to be zero, the shift is calculated as the
+      order of the `number of threads` calculated from `NC` field in
+      `ECX[7:0]` which describes the `number of threads - 1` on the package.
+
+      Unless Extended APIC ID is supported, the APIC ID used to find the
+      Socket ID is from the `LocalApicId` field of CPUID leaf 0x00000001
+      `EBX[31:24]`.
+
+      The topology parsing continues to detect if Extended APIC ID is
+      supported or not.
+
+
+   4) CPUID leaf 0x8000001E [Extended APIC ID, Core Identifiers, Node Identifiers]
+      (Core::X86::Cpuid::{ExtApicId,CoreId,NodeId})
+
+      The support for Extended APIC ID can be detected by checking for the
+      presence of `TopologyExtensions` in `EXC[22]` of CPUID leaf 0x80000001
+      [Feature Identifiers] (Core::X86::Cpuid::FeatureExtIdEcx).
+
+      If Topology Extensions is supported, the APIC ID from `ExtendedApicId`
+      from CPUID leaf 0x8000001E `EAX[31:0]` should be preferred over that from
+      `LocalApicId` field of CPUID leaf 0x00000001 `EBX[31:24]` for topology
+      enumeration.
+
+      On processors of Family 0x17 and above that do not support CPUID leaf
+      0x80000026 or CPUID leaf 0xB, the shifts from the APIC ID for the Core
+      ID is calculated using the order of `number of threads per core`
+      calculated using the `ThreadsPerCore` field in `EBX[15:8]` which
+      describes `number of threads per core - 1`.
+
+      On Processors of Family 0x15, the Core ID from `EBX[7:0]` is used as the
+      `cu_id` (Compute Unit ID) to detect CPUs that share the compute units.
+
+
+   All AMD and Hygon processors that support the `TopologyExtensions` feature
+   stores the `NodeId` from the `ECX[7:0]` of CPUID leaf 0x8000001E
+   (Core::X86::Cpuid::NodeId) as the per-CPU `node_id`.
+
+
+2) Intel
+
+   On Intel platforms, the CPUID leaves that enumerate the processor
+   topology are as follows:
+
+   1) CPUID leaf 0x1F (V2 Extended Topology Enumeration Leaf)
+
+      The CPUID leaf 0x1F is the extension of the CPUID leaf 0xB and provides
+      the topology information of Core, Module, Tile, Die, DieGrp, and Socket
+      in each level.
+
+      The support for the leaf is expected to be discovered by checking if
+      the supported CPUID level is >= 0x1F and then `EBX[31:0]` at a
+      particular level (starting from 0) is non-zero.
+
+      The `Domain Type` in `ECX[15:8]` of the sub-leaf provides the detail of
+      the topology domain that the level describes - Core, Module, Tile, Die,
+      DieGrp, and Socket.
+
+      The kernel uses the value from `EAX[4:0]` to discover the number of
+      bits that need to be right shifted from the `x2APIC ID` in `EDX[31:0]`
+      to get a unique Topology ID for the topology level. CPUs with the same
+      Topology ID share the resources at that level.
+
+      If CPUID leaf 0x1F is supported, further parsing is not required.
+
+
+   2) CPUID leaf 0x0000000B (Extended Topology Enumeration Leaf)
+
+      The extended CPUID leaf 0x0000000B is the predecessor of the V2 Extended
+      Topology Enumeration Leaf 0x1F and only describes the core, and the
+      socket domains of the processor topology.
+
+      The support for the leaf is expected to be discovered by checking if the
+      supported CPUID level is >= 0xB and then checking if `EBX[31:0]` at a
+      particular level (starting from 0) is non-zero.
+
+      CPUID leaf 0x0000000B shares the same layout as CPUID leaf 0x1F and
+      should be enumerated in a similar manner.
+
+      If CPUID leaf 0xB is supported, further parsing is not required.
+
+
+   3) CPUID leaf 0x00000004 (Deterministic Cache Parameters Leaf)
+
+      On Intel processors that support neither CPUID leaf 0x1F, nor CPUID leaf
+      0xB, the shifts for the SMT domains is calculated using the number of
+      CPUs sharing the L1 cache.
+
+      Processors that feature Hyper-Threading is detected using `EDX[28]` of
+      CPUID leaf 0x1 (Basic CPUID Information).
+
+      The order of `Maximum number of addressable IDs for logical processors
+      sharing this cache` from `EAX[25:14]` of level-0 of CPUID 0x4 provides
+      the shifts from the APIC ID required to compute the Core ID.
+
+      The APIC ID and Package information is computed using the data from
+      CPUID leaf 0x1.
+
+
+   4) CPUID leaf 0x00000001 (Basic CPUID Information)
+
+      The mask and shifts to derive the Physical Package (socket) ID is
+      computed using the `Maximum number of addressable IDs for logical
+      processors in this physical package` from `EBX[23:16]` of CPUID leaf
+      0x1.
+
+     The APIC ID on the legacy platforms is derived from the `Initial APIC
+     ID` field from `EBX[31:24]` of CPUID leaf 0x1.
+
+
+3) Centaur and Zhaoxin
+
+   Similar to Intel, Centaur and Zhaoxin use a combination of CPUID leaf
+   0x00000004 (Deterministic Cache Parameters Leaf) and CPUID leaf 0x00000001
+   (Basic CPUID Information) to derive the topology information.
+
+
+
 System topology examples
 ========================
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
  2025-09-01  4:21     ` K Prateek Nayak
@ 2025-09-01 14:04       ` Borislav Petkov
  0 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2025-09-01 14:04 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
	Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
	Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, Tom Lendacky,
	linux-kernel, kvm, Mario Limonciello, Gautham R. Shenoy,
	Babu Moger, Suravee Suthikulpanit, Naveen N Rao, stable

On Mon, Sep 01, 2025 at 09:51:53AM +0530, K Prateek Nayak wrote:
> Ack! Does "has_xtopology" sound good or should we go for something more
> explicit like "has_xtopology_0x26_0xb"?

has_xtopology with a comment above it to explicitly state what it means,
sounds good.

> Patch 3 will get rid of that "has_topoext" argument in parse_8000_001e()
> entirely so I'll rename the local variable here and use the subsequent
> cleanup for parse_8000_001e().

Ok.

So pls send this one now so that I can queue it as an urgent fix and then the
cleanups can go ontop later.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-09-01 14:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-25  7:57 [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon K Prateek Nayak
2025-08-25  7:57 ` [PATCH v4 1/4] x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGON K Prateek Nayak
2025-08-25  8:17   ` K Prateek Nayak
2025-08-27 10:34   ` [tip: x86/urgent] " tip-bot2 for K Prateek Nayak
2025-08-25  7:57 ` [PATCH v4 2/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
2025-08-30 17:19   ` Borislav Petkov
2025-09-01  4:21     ` K Prateek Nayak
2025-09-01 14:04       ` Borislav Petkov
2025-08-25  7:57 ` [PATCH v4 3/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
2025-08-25  7:57 ` [PATCH v4 4/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
2025-08-25  8:49 ` [PATCH v4 0/4] x86/cpu/topology: Fix the preferred order of initial APIC ID parsing on AMD/Hygon Borislav Petkov
2025-08-25  9:03   ` K Prateek Nayak
2025-09-01 10:11 ` [RFC PATCH v4 5/4] Documentation/x86/topology: Detail CPUID leaves used for topology enumeration K Prateek Nayak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).