* [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
@ 2025-08-18 6:04 K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 1/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-18 6:04 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
was noticed with recent kernels:
[Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
common topology code for AMD") where, after the rework, the initial
APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
the value from CPUID leaf 0xb EDX[31:0] previously.
This led us down a rabbit hole of XTOPOEXT vs TOPOEXT support, preferred
order of their parsing, and QEMU nuances like [1] where QEMU 0's out the
CPUID leaf 0x8000001e on CPUs where Core ID crosses 255 fearing a
Core ID collision in the 8 bit field which leads to the reported FW_BUG.
Following were major observations during the debug which the two
patches address respectively:
1. The support for CPUID leaf 0xb is independent of the TOPOEXT feature
and is rather linked to the x2APIC enablement. On baremetal, this has
not been a problem since TOPOEXT support (Fam 0x15 and above)
predates the support for CPUID leaf 0xb (Fam 0x17[Zen2] and above)
however, in virtualized environment, the support for x2APIC can be
enabled independent of topoext where QEMU expects the guest to parse
the topology and the APICID from CPUID leaf 0xb.
2. Since CPUID leaf 0x8000001e cannot represent Core ID without
collision for guests with > 255 cores, and QEMU 0's out the entire
leaf when Core ID crosses 255. Prefer initial APIC read from the
XTOPOEXT leaf before falling back to the APICID from 0x8000001e
which is still better than 8-bit APICID from leaf 0x1 EBX[31:24].
More details are enclosed in the commit logs.
Ideally, these changes should not affect baremetal AMD/Hygon platforms
as they have supported TOPOEXT long before the support for CPUID leaf
0xb and the extended CPUID leaf 0x80000026 (famous last words).
Patch 1 and 4 is yak shaving to explicitly define a raw MSR value used
in the topology parsing bits and simplify the flow around "has_topoext"
when the same can be discovered using X86_FEATURE_XTOPOLOGY.
This series has been tested on baremetal Zen1 (contains topoext but not
0xb leaf), Zen3 (contains both topoext and 0xb leaf), and Zen4 (contains
topoext, 0xb leaf, and 0x80000026 leaf) servers with no changes
observed in "/sys/kernel/debug/x86/topo/" directory.
The series was also tested on 255 and 512 vCPU (each vCPU is an
individual core from QEMU topology being passed) EPYC-Genoa guest with
and without x2apic and topoext enabled and this series solves the FW_BUG
seen on guest with > 255 VCPUs. No changes observed in
"/sys/kernel/debug/x86/topo/" for all other cases without warning.
0xb leaf is provided unconditionally on these guests (with or without
topoext, even with x2apic disabled on guests with <= 255 vCPU).
In all the cases initial_apicid matched the apicid in
"/sys/kernel/debug/x86/topo/" after applying this series.
Relevant bits of QEMU cmdline used during testing are as follows:
qemu-system-x86_64 \
-enable-kvm -m 32G -smp cpus=512,cores=512 \
-cpu EPYC-Genoa,x2apic=on,kvm-msi-ext-dest-id=on,+kvm-pv-unhalt,kvm-pv-tlb-flush,kvm-pv-ipi,kvm-pv-sched-yield,[-topoext] \
-machine q35,kernel_irqchip=split \
-global kvm-pit.lost_tick_policy=discard
...
References:
[1] https://github.com/qemu/qemu/commit/35ac5dfbcaa4b
Series is based on tip:x86/cpu at commit 65f55a301766 ("x86/CPU/AMD: Add
CPUID faulting support")
---
Changelog v2..v3:
o Patch 1 was added to the series.
o Use cpu_feature_enabled() in Patch 3.
o Rebased on top of tip:x86/cpu.
v2: https://lore.kernel.org/lkml/20250725110622.59743-1-kprateek.nayak@amd.com/
Changelog v1..v2:
o Collected tags from Naveen. (Thank you for testing!)
o Rebased the series on tip:x86/cpu.
o Swapped Patch 1 and Patch 2 from v1.
o Merged the body of two if blocks in Patch 1 to allow for cleanup in
Patch 3.
v1: https://lore.kernel.org/lkml/20250612072921.15107-1-kprateek.nayak@amd.com/
---
K Prateek Nayak (4):
x86/msr-index: Define AMD64_CPUID_FN_EXT MSR
x86/cpu/topology: Use initial APICID from XTOPOEXT on AMD/HYGON
x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing
has_topoext
arch/x86/include/asm/msr-index.h | 5 +++
arch/x86/kernel/cpu/topology_amd.c | 55 ++++++++++++++++--------------
2 files changed, 34 insertions(+), 26 deletions(-)
base-commit: 5bf2f5119b9e957f773a22f226974166b58cff32
--
2.34.1
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v3 1/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
@ 2025-08-18 6:04 ` K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 2/4] x86/cpu/topology: Use initial APICID from XTOPOEXT on AMD/HYGON K Prateek Nayak
` (3 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-18 6:04 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
Explicitly define the AMD64_CPUID_FN_EXT MSR used to toggle the extended
features. Also define and use the bits necessary an old TOPOEXT fixup on
AMD Family 0x15 processors.
No functional changes intended.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v2..v3:
o New patch.
Note: The definition of MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED crosses
the 100 column limit and results in a checkpatch warning.
msr-index.h contains a couple more examples where the definitions with
very explicit naming crosses 100 columns and I've decided to retain the
same despite the warning.
If this needs to be changed, please let me know and I'll address it in
the next version.
---
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/kernel/cpu/topology_amd.c | 7 ++++---
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2d5595bdfa27..7931f9b3250b 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -630,6 +630,11 @@
#define MSR_AMD_PPIN 0xc00102f1
#define MSR_AMD64_CPUID_FN_7 0xc0011002
#define MSR_AMD64_CPUID_FN_1 0xc0011004
+
+#define MSR_AMD64_CPUID_FN_EXT 0xc0011005
+#define MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT 54
+#define MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED BIT_ULL(MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT)
+
#define MSR_AMD64_LS_CFG 0xc0011020
#define MSR_AMD64_DC_CFG 0xc0011022
#define MSR_AMD64_TW_CFG 0xc0011023
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 843b1655ab45..bb00dc6433eb 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -158,11 +158,12 @@ static void topoext_fixup(struct topo_scan *tscan)
c->x86 != 0x15 || c->x86_model < 0x10 || c->x86_model > 0x6f)
return;
- if (msr_set_bit(0xc0011005, 54) <= 0)
+ if (msr_set_bit(MSR_AMD64_CPUID_FN_EXT,
+ MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED_BIT) <= 0)
return;
- rdmsrq(0xc0011005, msrval);
- if (msrval & BIT_64(54)) {
+ rdmsrq(MSR_AMD64_CPUID_FN_EXT, msrval);
+ if (msrval & MSR_AMD64_CPUID_FN_EXT_TOPOEXT_ENABLED) {
set_cpu_cap(c, X86_FEATURE_TOPOEXT);
pr_info_once(FW_INFO "CPU: Re-enabling disabled Topology Extensions Support.\n");
}
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 2/4] x86/cpu/topology: Use initial APICID from XTOPOEXT on AMD/HYGON
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 1/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
@ 2025-08-18 6:04 ` K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 3/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
` (2 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-18 6:04 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
Prior to the topology parsing rewrite and the switchover to the new
parsing logic for AMD processors in commit c749ce393b8f ("x86/cpu: Use
common topology code for AMD"), the "initial_apicid" on these platforms
was:
- First initialized to the LocalApicId from CPUID leaf 0x1 EBX[31:24].
- Then overwritten by the ExtendedLocalApicId in CPUID leaf 0xb
EDX[31:0] on processors that supported topoext.
With the new parsing flow introduced in commit f7fb3b2dd92c ("x86/cpu:
Provide an AMD/HYGON specific topology parser"), parse_8000_001e() now
unconditionally overwrites the "initial_apicid" already parsed during
cpu_parse_topology_ext().
Although this has not been a problem on baremetal platforms, on
virtualized AMD guests that feature more than 255 cores, QEMU 0's out
the CPUID leaf 0x8000001e on CPUs with "CoreID" > 255 to prevent
collision of these IDs in EBX[7:0] which can only represent a maximum of
255 cores [1].
This results in the following FW_BUG being logged when booting a guest
with more than 255 cores:
[Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
Rely on the APICID parsed during cpu_parse_topology_ext() from CPUID
leaf 0x80000026 or 0xb and only use the APICID from leaf 0x8000001e if
cpu_parse_topology_ext() failed (has_topoext is false).
On platforms that support the 0xb leaf (Zen2 or later, AMD guests on
QEMU) or the extended leaf 0x80000026 (Zen4 or later), the
"initial_apicid" is now set to the value parsed from EDX[31:0].
On older AMD/Hygon platforms that does not support the 0xb leaf but
supports the TOPOEXT extension (Fam 0x15, 0x16, 0x17[Zen1], and Hygon),
the current behavior is retained where "initial_apicid" is set using
the 0x8000001e leaf.
Link: https://github.com/qemu/qemu/commit/35ac5dfbcaa4b [1]
Debugged-by: Naveen N Rao (AMD) <naveen@kernel.org>
Debugged-by: Sairaj Kodilkar <sarunkod@amd.com>
Fixes: c749ce393b8f ("x86/cpu: Use common topology code for AMD")
Tested-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v2..v3:
o No changes.
---
arch/x86/kernel/cpu/topology_amd.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index bb00dc6433eb..ac0ba8495eec 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -81,20 +81,28 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
cpuid_leaf(0x8000001e, &leaf);
- tscan->c->topo.initial_apicid = leaf.ext_apic_id;
-
- /*
- * If leaf 0xb is available, then the domain shifts are set
- * already and nothing to do here. Only valid for family >= 0x17.
- */
- if (!has_topoext && tscan->c->x86 >= 0x17) {
+ if (!has_topoext) {
/*
- * Leaf 0x80000008 set the CORE domain shift already.
- * Update the SMT domain, but do not propagate it.
+ * Prefer initial_apicid parsed from XTOPOLOGY leaf
+ * 0x8000026 or 0xb if available. Otherwise prefer the
+ * one from leaf 0x8000001e over 0x1.
*/
- unsigned int nthreads = leaf.core_nthreads + 1;
+ tscan->c->topo.initial_apicid = leaf.ext_apic_id;
- topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(nthreads), nthreads);
+ /*
+ * If XTOPOLOGY leaf is available, then the domain shifts are set
+ * already and nothing to do here. Only valid for family >= 0x17.
+ */
+ if (tscan->c->x86 >= 0x17) {
+ /*
+ * Leaf 0x80000008 set the CORE domain shift already.
+ * Update the SMT domain, but do not propagate it.
+ */
+ unsigned int nthreads = leaf.core_nthreads + 1;
+
+ topology_update_dom(tscan, TOPO_SMT_DOMAIN,
+ get_count_order(nthreads), nthreads);
+ }
}
store_node(tscan, leaf.nnodes_per_socket + 1, leaf.node_id);
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 3/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 1/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 2/4] x86/cpu/topology: Use initial APICID from XTOPOEXT on AMD/HYGON K Prateek Nayak
@ 2025-08-18 6:04 ` K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
2025-08-19 11:34 ` [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon Borislav Petkov
4 siblings, 0 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-18 6:04 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
Support for parsing the topology on AMD/Hygon processors using CPUID
leaf 0xb was added in commit 3986a0a805e6 ("x86/CPU/AMD: Derive CPU
topology from CPUID function 0xB when available"). In an effort to keep
all the topology parsing bits in one place, this commit also introduced
a pseudo dependency on the TOPOEXT feature to parse the CPUID leaf 0xb.
TOPOEXT feature (CPUID 0x80000001 ECX[22]) advertises the support for
Cache Properties leaf 0x8000001d and the CPUID leaf 0x8000001e EAX for
"Extended APIC ID" however support for 0xb was introduced alongside the
x2APIC support not only on AMD [1], but also historically on x86 [2].
Similar to 0xb, the support for extended CPU topology leaf 0x80000026
too does not depend on the TOPOEXT feature. The support for these leaves
is expected to be confirmed by ensuring "leaf <= {extended_}cpuid_level"
and then parsing the level 0 of the respective leaf to confirm
ECX[15:8] (LevelType) is non-zero.
This has not been a problem on baremetal platforms since support for
TOPOEXT (Fam 0x15 and later) predates the support for CPUID leaf 0xb
(Fam 0x17[Zen2] and later), however, for AMD guests on QEMU, "x2apic"
feature can be enabled independent of the "topoext" feature where QEMU
expects topology and the initial APICID to be parsed using the CPUID
leaf 0xb (especially when number of cores > 255) which is populated
independent of the "topoext" feature flag.
Unconditionally call cpu_parse_topology_ext() on AMD and Hygon
processors to first parse the topology using the XTOPOEXT leaves before
using the TOPOEXT leaf.
Link: https://lore.kernel.org/lkml/1529686927-7665-1-git-send-email-suravee.suthikulpanit@amd.com/ [1]
Link: https://lore.kernel.org/lkml/20080818181435.523309000@linux-os.sc.intel.com/ [2]
Suggested-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v2..v3:
o No changes.
---
arch/x86/kernel/cpu/topology_amd.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index ac0ba8495eec..3d01675d94f5 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -179,18 +179,14 @@ static void topoext_fixup(struct topo_scan *tscan)
static void parse_topology_amd(struct topo_scan *tscan)
{
- bool has_topoext = false;
-
/*
- * If the extended topology leaf 0x8000_001e is available
- * try to get SMT, CORE, TILE, and DIE shifts from extended
+ * Try to get SMT, CORE, TILE, and DIE shifts from extended
* CPUID leaf 0x8000_0026 on supported processors first. If
* extended CPUID leaf 0x8000_0026 is not supported, try to
* get SMT and CORE shift from leaf 0xb first, then try to
* get the CORE shift from leaf 0x8000_0008.
*/
- if (cpu_feature_enabled(X86_FEATURE_TOPOEXT))
- has_topoext = cpu_parse_topology_ext(tscan);
+ bool has_topoext = cpu_parse_topology_ext(tscan);
if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
` (2 preceding siblings ...)
2025-08-18 6:04 ` [PATCH v3 3/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
@ 2025-08-18 6:04 ` K Prateek Nayak
2025-08-24 13:38 ` Thomas Gleixner
2025-08-19 11:34 ` [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon Borislav Petkov
4 siblings, 1 reply; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-18 6:04 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
cpu_parse_topology_ext() sets X86_FEATURE_XTOPOLOGY before returning
true if any of the XTOPOLOGY leaf could be parsed successfully.
Instead of storing and passing around this return value using
"has_topoext" in parse_topology_amd(), check for X86_FEATURE_XTOPOLOGY
instead in parse_8000_001e() to simplify the flow.
No functional changes intended.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog v2..v3:
o Use cpu_feature_enabled() when checking for X86_FEATURE_XTOPOLOGY.
---
arch/x86/kernel/cpu/topology_amd.c | 16 +++++++---------
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
index 3d01675d94f5..138a09528083 100644
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -59,7 +59,7 @@ static void store_node(struct topo_scan *tscan, u16 nr_nodes, u16 node_id)
tscan->amd_node_id = node_id;
}
-static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
+static bool parse_8000_001e(struct topo_scan *tscan)
{
struct {
// eax
@@ -81,7 +81,7 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
cpuid_leaf(0x8000001e, &leaf);
- if (!has_topoext) {
+ if (!cpu_feature_enabled(X86_FEATURE_XTOPOLOGY)) {
/*
* Prefer initial_apicid parsed from XTOPOLOGY leaf
* 0x8000026 or 0xb if available. Otherwise prefer the
@@ -179,6 +179,9 @@ static void topoext_fixup(struct topo_scan *tscan)
static void parse_topology_amd(struct topo_scan *tscan)
{
+ if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
+ tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
+
/*
* Try to get SMT, CORE, TILE, and DIE shifts from extended
* CPUID leaf 0x8000_0026 on supported processors first. If
@@ -186,16 +189,11 @@ static void parse_topology_amd(struct topo_scan *tscan)
* get SMT and CORE shift from leaf 0xb first, then try to
* get the CORE shift from leaf 0x8000_0008.
*/
- bool has_topoext = cpu_parse_topology_ext(tscan);
-
- if (cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES))
- tscan->c->topo.cpu_type = cpuid_ebx(0x80000026);
-
- if (!has_topoext && !parse_8000_0008(tscan))
+ if (!cpu_parse_topology_ext(tscan) && !parse_8000_0008(tscan))
return;
/* Prefer leaf 0x8000001e if available */
- if (parse_8000_001e(tscan, has_topoext))
+ if (parse_8000_001e(tscan))
return;
/* Try the NODEID MSR */
--
2.34.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
` (3 preceding siblings ...)
2025-08-18 6:04 ` [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
@ 2025-08-19 11:34 ` Borislav Petkov
2025-08-19 14:28 ` K Prateek Nayak
2025-08-20 8:11 ` Naveen N Rao
4 siblings, 2 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-08-19 11:34 UTC (permalink / raw)
To: K Prateek Nayak
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit
Lemme try to make some sense of this because the wild use of names and things
is making my head spin...
On Mon, Aug 18, 2025 at 06:04:31AM +0000, K Prateek Nayak wrote:
> When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
> was noticed with recent kernels:
>
> [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
>
> Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
> common topology code for AMD") where, after the rework, the initial
> APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
That's
CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)
> the value from CPUID leaf 0xb EDX[31:0] previously.
That's
CPUID_Fn0000000B_EDX [Extended Topology Enumeration]
(Core::X86::Cpuid::ExtTopEnumEdx)
> This led us down a rabbit hole of XTOPOEXT vs TOPOEXT support, preferred
What is XTOPOEXT?
CPUID_Fn0000000B_EDX?
Please define all your things properly so that we can have common base when
reading this text.
TOPOEXT is, I presume:
#define X86_FEATURE_TOPOEXT ( 6*32+22) /* "topoext" Topology extensions CPUID leafs */
Our PPR says:
CPUID_Fn80000001_ECX [Feature Identifiers] (Core::X86::Cpuid::FeatureExtIdEcx)
"22 TopologyExtensions: topology extensions support. Read-only. Reset:
Fixed,1. 1=Indicates support for Core::X86::Cpuid::CachePropEax0 and
Core::X86::Cpuid::ExtApicId."
Those leafs are:
CPUID_Fn8000001D_EAX_x00 [Cache Properties (DC)] (Core::X86::Cpuid::CachePropEax0)
DC topology info. Probably not important for this here.
and
CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
the extended APIC ID is there.
How is this APIC ID different from the extended APIC ID in
CPUID_Fn0000000B_EDX [Extended Topology Enumeration] (Core::X86::Cpuid::ExtTopEnumEdx)
?
> order of their parsing, and QEMU nuances like [1] where QEMU 0's out the
> CPUID leaf 0x8000001e on CPUs where Core ID crosses 255 fearing a
> Core ID collision in the 8 bit field which leads to the reported FW_BUG.
Is that what the hw does though?
Has this been verified instead of willy nilly clearing CPUID leafs in qemu?
> Following were major observations during the debug which the two
> patches address respectively:
>
> 1. The support for CPUID leaf 0xb is independent of the TOPOEXT feature
Yes, PPR says so.
> and is rather linked to the x2APIC enablement.
Because the SDM says:
"Bits 31-00: x2APIC ID of the current logical processor."
?
Is our version not containing the x2APIC ID?
> On baremetal, this has
> not been a problem since TOPOEXT support (Fam 0x15 and above)
> predates the support for CPUID leaf 0xb (Fam 0x17[Zen2] and above)
> however, in virtualized environment, the support for x2APIC can be
> enabled independent of topoext where QEMU expects the guest to parse
> the topology and the APICID from CPUID leaf 0xb.
So we're fixing a qemu bug?
Why isn't qemu force-enabling TOPOEXT support when one requests x2APIC?
My initial reaction: fix qemu.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-19 11:34 ` [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon Borislav Petkov
@ 2025-08-19 14:28 ` K Prateek Nayak
2025-08-19 21:24 ` Borislav Petkov
2025-08-20 8:11 ` Naveen N Rao
1 sibling, 1 reply; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-19 14:28 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit
Hello Boris,
On 8/19/2025 5:04 PM, Borislav Petkov wrote:
> Lemme try to make some sense of this because the wild use of names and things
> is making my head spin...
>
> On Mon, Aug 18, 2025 at 06:04:31AM +0000, K Prateek Nayak wrote:
>> When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
>> was noticed with recent kernels:
>>
>> [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
>>
>> Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
>> common topology code for AMD") where, after the rework, the initial
>> APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
>
> That's
>
> CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)
Small correction here, this is actually,
CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
>
>> the value from CPUID leaf 0xb EDX[31:0] previously.
>
> That's
>
> CPUID_Fn0000000B_EDX [Extended Topology Enumeration]
> (Core::X86::Cpuid::ExtTopEnumEdx)
>
>> This led us down a rabbit hole of XTOPOEXT vs TOPOEXT support, preferred
>
> What is XTOPOEXT?
>
> CPUID_Fn0000000B_EDX?
>
> Please define all your things properly so that we can have common base when
> reading this text.
Sorry about that! This should actually be "X86_FEATURE_XTOPOLOGY" which
is a synthetic feature set when topology parsing via one of the following
CPUID leaf is successful:
- 0x1f
V2 Extended Topology Enumeration Leaf
(Intel only)
- 0x80000026
CPUID_Fn80000026_E[A,B,C]X_x0[0...3] [Extended CPU Topology]
Core::X86::Cpuid::ExCpuTopologyE[a,b,c]x[0..3]
(AMD only)
- 0xb
CPUID_Fn0000000B_E[A,B,C]X_x0[0..2] [Extended Topology Enumeration]
Core::X86::Cpuid::ExtTopEnumE[a,b,c]x[0..2]
(Both Intel and AMD)
The parsing of the leaves is tried in the same order as above.
>
> TOPOEXT is, I presume:
>
> #define X86_FEATURE_TOPOEXT ( 6*32+22) /* "topoext" Topology extensions CPUID leafs */
>
> Our PPR says:
>
> CPUID_Fn80000001_ECX [Feature Identifiers] (Core::X86::Cpuid::FeatureExtIdEcx)
>
> "22 TopologyExtensions: topology extensions support. Read-only. Reset:
> Fixed,1. 1=Indicates support for Core::X86::Cpuid::CachePropEax0 and
> Core::X86::Cpuid::ExtApicId."
>
> Those leafs are:
>
> CPUID_Fn8000001D_EAX_x00 [Cache Properties (DC)] (Core::X86::Cpuid::CachePropEax0)
>
> DC topology info. Probably not important for this here.
>
> and
>
> CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
>
> the extended APIC ID is there.
>
> How is this APIC ID different from the extended APIC ID in
>
> CPUID_Fn0000000B_EDX [Extended Topology Enumeration] (Core::X86::Cpuid::ExtTopEnumEdx)
>
> ?
On baremetal, they are the same. On QEMU, when we launch a guest with
a topology that contains more than 256 cores on a single socket, QEMU
0s out all the bits in CPUID_Fn8000001E [1] since it fears a collision
in the "CoreId[7:0]" field of
"CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)"
Since
"CPUID_Fn0000000B_EBX_x01 [Extended Topology Enumeration]" and
"LogProcAtThisLevel[15:0]" can describe a domain with up to 2^16 cores,
the Core ID can always be derived correctly from this even when the
number of cores in the guest topology crosses 256.
>
>> order of their parsing, and QEMU nuances like [1] where QEMU 0's out the
>> CPUID leaf 0x8000001e on CPUs where Core ID crosses 255 fearing a
>> Core ID collision in the 8 bit field which leads to the reported FW_BUG.
>
> Is that what the hw does though?
We don't have baremetal systems with more than 256 cores per socket and
when that happens, I believe the expectation from H/W is to just use
CPUID_Fn80000026 leaf or the CPUID_Fn0000000B leaf.
>
> Has this been verified instead of willy nilly clearing CPUID leafs in qemu?
>
>> Following were major observations during the debug which the two
>> patches address respectively:
>>
>> 1. The support for CPUID leaf 0xb is independent of the TOPOEXT feature
>
> Yes, PPR says so.
>
>> and is rather linked to the x2APIC enablement.
>
> Because the SDM says:
>
> "Bits 31-00: x2APIC ID of the current logical processor."
>
> ?
SDM Vol. 3A Sec. 11.12.8 "CPUID Extensions And Topology Enumeration"
reads:
For Intel 64 and IA-32 processors that support x2APIC, a value of 1
reported by CPUID.01H:ECX[21] indicates that the processor supports
x2APIC and the extended topology enumeration leaf (CPUID.0BH).
The extended topology enumeration leaf can be accessed by executing
CPUID with EAX = 0BH. Processors that do not support x2APIC may
support CPUID leaf 0BH. Software can detect the availability of the
extended topology enumeration leaf (0BH) by performing two steps:
1. Check maximum input value for basic CPUID information by executing
CPUID with EAX= 0. If CPUID.0H:EAX is greater than or equal or 11
(0BH), then proceed to next step
2. Check CPUID.EAX=0BH, ECX=0H:EBX is non-zero.
If both of the above conditions are true, extended topology
enumeration leaf is available.
>
> Is our version not containing the x2APIC ID?
We too have the Extended APIC ID in both CPUID_Fn0000000B and
CPUID_Fn8000001E_EAX and they both match on baremetal. The problem is
only for virtualized guest whose topology contains more than 256
cores per socket because of [1]
>
>> On baremetal, this has
>> not been a problem since TOPOEXT support (Fam 0x15 and above)
>> predates the support for CPUID leaf 0xb (Fam 0x17[Zen2] and above)
>> however, in virtualized environment, the support for x2APIC can be
>> enabled independent of topoext where QEMU expects the guest to parse
>> the topology and the APICID from CPUID leaf 0xb.
>
> So we're fixing a qemu bug?
>
> Why isn't qemu force-enabling TOPOEXT support when one requests x2APIC?
>
> My initial reaction: fix qemu.
This is possible, however what should be the right thing for
CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)?
Should QEMU just wrap and start counting the Core Identifiers again
from 0?
Or Should QEMU go ahead and populate just the
CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
fields and continue to zero out EBX and ECX when CoreID > 255?
[1] https://github.com/qemu/qemu/commit/35ac5dfbcaa4b
--
Thanks and Regards,
Prateek
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-19 14:28 ` K Prateek Nayak
@ 2025-08-19 21:24 ` Borislav Petkov
0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-08-19 21:24 UTC (permalink / raw)
To: K Prateek Nayak
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
Paolo Bonzini, x86, Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit
On Tue, Aug 19, 2025 at 07:58:52PM +0530, K Prateek Nayak wrote:
> This is possible, however what should be the right thing for
> CPUID_Fn8000001E_EBX [Core Identifiers] (Core::X86::Cpuid::CoreId)?
>
> Should QEMU just wrap and start counting the Core Identifiers again
> from 0?
>
> Or Should QEMU go ahead and populate just the
> CPUID_Fn8000001E_EAX [Extended APIC ID] (Core::X86::Cpuid::ExtApicId)
> fields and continue to zero out EBX and ECX when CoreID > 255?
I think the right thing to do is what the HW does (or will do), when it gets
to more than 256 APIC IDs - "cores" is ambiguous.
Perhaps something to discuss with hw folks internally first and then stick to
that plan everywhere, qemu included.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-19 11:34 ` [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon Borislav Petkov
2025-08-19 14:28 ` K Prateek Nayak
@ 2025-08-20 8:11 ` Naveen N Rao
2025-08-20 8:59 ` Borislav Petkov
1 sibling, 1 reply; 16+ messages in thread
From: Naveen N Rao @ 2025-08-20 8:11 UTC (permalink / raw)
To: Borislav Petkov
Cc: K Prateek Nayak, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86, Sairaj Kodilkar,
H. Peter Anvin, Peter Zijlstra (Intel), Xin Li (Intel),
Pawan Gupta, linux-kernel, kvm, Mario Limonciello,
Gautham R. Shenoy, Babu Moger, Suravee Suthikulpanit
On Tue, Aug 19, 2025 at 01:34:47PM +0200, Borislav Petkov wrote:
> Lemme try to make some sense of this because the wild use of names and things
> is making my head spin...
>
> On Mon, Aug 18, 2025 at 06:04:31AM +0000, K Prateek Nayak wrote:
> > When running an AMD guest on QEMU with > 255 cores, the following FW_BUG
> > was noticed with recent kernels:
> >
> > [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200
> >
> > Naveen, Sairaj debugged the cause to commit c749ce393b8f ("x86/cpu: Use
> > common topology code for AMD") where, after the rework, the initial
> > APICID was set using the CPUID leaf 0x8000001e EAX[31:0] as opposed to
>
> That's
>
> CPUID_Fn8000001E_ECX [Node Identifiers] (Core::X86::Cpuid::NodeId)
>
> > the value from CPUID leaf 0xb EDX[31:0] previously.
>
> That's
>
> CPUID_Fn0000000B_EDX [Extended Topology Enumeration]
> (Core::X86::Cpuid::ExtTopEnumEdx)
Regardless of the qemu bug with leaf 0x8000001e (with >255 cores),
section '16.12 x2APIC_ID' of the APM says:
CPUID. The x2APIC ID is reported by CPUID functions Fn0000_000B
(Extended Topology Enumeration) and CPUID Fn8000_001E (Extended APIC
ID) as follows:
- Fn0000_000B_EDX[31:0]_x0 reports the full 32-bit ID, independent of
APIC mode (i.e. even with APIC disabled)
- Fn8000_001E_EAX[31:0] conditionally reports APIC ID. There are 3
cases:
- 32-bit x2APIC_ID, in x2APIC mode.
- 8-bit APIC ID (upper 24 bits are 0), in xAPIC mode.
- 0, if the APIC is disabled.
That suggests use of leaf 0xb for the initial x2APIC ID especially
during early init. I'm not sure why leaf 0x8000001e was preferred over
leaf 0xb in commit c749ce393b8f ("x86/cpu: Use common topology code for
AMD") though.
- Naveen
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-20 8:11 ` Naveen N Rao
@ 2025-08-20 8:59 ` Borislav Petkov
2025-08-20 9:12 ` K Prateek Nayak
0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2025-08-20 8:59 UTC (permalink / raw)
To: Naveen N Rao
Cc: K Prateek Nayak, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86, Sairaj Kodilkar,
H. Peter Anvin, Peter Zijlstra (Intel), Xin Li (Intel),
Pawan Gupta, linux-kernel, kvm, Mario Limonciello,
Gautham R. Shenoy, Babu Moger, Suravee Suthikulpanit
On Wed, Aug 20, 2025 at 01:41:28PM +0530, Naveen N Rao wrote:
> That suggests use of leaf 0xb for the initial x2APIC ID especially
> during early init. I'm not sure why leaf 0x8000001e was preferred over
> leaf 0xb in commit c749ce393b8f ("x86/cpu: Use common topology code for
> AMD") though.
Well, I see parse_topology_amd() calling cpu_parse_topology_ext() if you have
TOPOEXT - which all AMD hw does - which then does cpu_parse_topology_ext() and
that one tries 0x80000026 and then falls back to 0xb and *only* *then* to
0x8000001e.
So, it looks like it DTRT to me...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-20 8:59 ` Borislav Petkov
@ 2025-08-20 9:12 ` K Prateek Nayak
2025-08-20 9:28 ` Borislav Petkov
2025-08-20 9:59 ` Gautham R. Shenoy
0 siblings, 2 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-20 9:12 UTC (permalink / raw)
To: Borislav Petkov, Naveen N Rao
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, Sean Christopherson,
Paolo Bonzini, x86, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit
Hello Boris,
On 8/20/2025 2:29 PM, Borislav Petkov wrote:
> On Wed, Aug 20, 2025 at 01:41:28PM +0530, Naveen N Rao wrote:
>> That suggests use of leaf 0xb for the initial x2APIC ID especially
>> during early init. I'm not sure why leaf 0x8000001e was preferred over
>> leaf 0xb in commit c749ce393b8f ("x86/cpu: Use common topology code for
>> AMD") though.
>
> Well, I see parse_topology_amd() calling cpu_parse_topology_ext() if you have
> TOPOEXT - which all AMD hw does - which then does cpu_parse_topology_ext() and
> that one tries 0x80000026 and then falls back to 0xb and *only* *then* to
> 0x8000001e.
>
> So, it looks like it DTRT to me...
But parse_8000_001e() then unconditionally overwrites the
"initial_apicid" with the value in 0x8000001E EAX despite it being
populated from cpu_parse_topology_ext().
The flow is as follows:
parse_topology_amd()
if (X86_FEATURE_TOPOEXT) /* True */
has_topoext = cpu_parse_topology_ext(); /* Populates "initial_apicid", returns True */
/* parse_8000_0008() is never called since has_topoext is true */
parse_8000_001e()
if (!X86_FEATURE_TOPOEXT) /* False */
return;
/* Proceeds here */
cpuid_leaf(0x8000001e, &leaf);
tscan->c->topo.initial_apicid = leaf.ext_apic_id; /*** Overwritten here ***/
--
Thanks and Regards,
Prateek
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-20 9:12 ` K Prateek Nayak
@ 2025-08-20 9:28 ` Borislav Petkov
2025-08-20 9:59 ` Gautham R. Shenoy
1 sibling, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2025-08-20 9:28 UTC (permalink / raw)
To: K Prateek Nayak
Cc: Naveen N Rao, Thomas Gleixner, Ingo Molnar, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86, Sairaj Kodilkar,
H. Peter Anvin, Peter Zijlstra (Intel), Xin Li (Intel),
Pawan Gupta, linux-kernel, kvm, Mario Limonciello,
Gautham R. Shenoy, Babu Moger, Suravee Suthikulpanit
On Wed, Aug 20, 2025 at 02:42:26PM +0530, K Prateek Nayak wrote:
> tscan->c->topo.initial_apicid = leaf.ext_apic_id; /*** Overwritten here ***/
Looks like it shouldn't unconditionally overwrite it but I'll let tglx comment
here.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon
2025-08-20 9:12 ` K Prateek Nayak
2025-08-20 9:28 ` Borislav Petkov
@ 2025-08-20 9:59 ` Gautham R. Shenoy
1 sibling, 0 replies; 16+ messages in thread
From: Gautham R. Shenoy @ 2025-08-20 9:59 UTC (permalink / raw)
To: K Prateek Nayak
Cc: Borislav Petkov, Naveen N Rao, Thomas Gleixner, Ingo Molnar,
Dave Hansen, Sean Christopherson, Paolo Bonzini, x86,
Sairaj Kodilkar, H. Peter Anvin, Peter Zijlstra (Intel),
Xin Li (Intel), Pawan Gupta, linux-kernel, kvm, Mario Limonciello,
Babu Moger, Suravee Suthikulpanit
On Wed, Aug 20, 2025 at 02:42:26PM +0530, K Prateek Nayak wrote:
> Hello Boris,
>
> On 8/20/2025 2:29 PM, Borislav Petkov wrote:
> > On Wed, Aug 20, 2025 at 01:41:28PM +0530, Naveen N Rao wrote:
> >> That suggests use of leaf 0xb for the initial x2APIC ID especially
> >> during early init. I'm not sure why leaf 0x8000001e was preferred over
> >> leaf 0xb in commit c749ce393b8f ("x86/cpu: Use common topology code for
> >> AMD") though.
> >
> > Well, I see parse_topology_amd() calling cpu_parse_topology_ext() if you have
> > TOPOEXT - which all AMD hw does - which then does cpu_parse_topology_ext() and
> > that one tries 0x80000026 and then falls back to 0xb and *only* *then* to
> > 0x8000001e.
> >
> > So, it looks like it DTRT to me...
>
> But parse_8000_001e() then unconditionally overwrites the
> "initial_apicid" with the value in 0x8000001E EAX despite it being
> populated from cpu_parse_topology_ext().
>
> The flow is as follows:
>
> parse_topology_amd()
> if (X86_FEATURE_TOPOEXT) /* True */
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> cpu_parse_topology_ext();
Patch 2 from this patchset, which removes this "if" condition above
seems to be the right thing to do.
X86_FEATURE_TOPOEXT refers to CPUID 0x80000001.ECX[22] which advertises the support for 0x8000001D.EAX
and 0x8000001E.EAX.
OTOH, the function cpu_parse_topology_ext() parses the topology via the following CPUIDs in that order
* CPUID 0x1f (Intel Only)
* CPUID 0x80000026 (AMD only)
* CPUID 0xB (Both Intel and AMD)
None of these have anything to do with X86_FEATURE_TOPOEXT.
So the call to cpu_parse_topology_ext() in parse_topology_amd()
doesn't have to be gated by the presence or absence of
X86_FEATURE_TOPOEXT.
I agree that QEMU needs to sort out what needs to do something better
than clearing all the regs of CPUID 0x8000001E on encountering a
topology with more than 256 cores.
Or at the very least not clear the CPUID 0x8000001E.EAX which has the
provision to advertise a valid Extended APIC ID.
--
Thanks and Regards
gautham.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext
2025-08-18 6:04 ` [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
@ 2025-08-24 13:38 ` Thomas Gleixner
2025-08-24 14:55 ` Thomas Gleixner
0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2025-08-24 13:38 UTC (permalink / raw)
To: K Prateek Nayak, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
On Mon, Aug 18 2025 at 06:04, K. Prateek Nayak wrote:
> cpu_parse_topology_ext() sets X86_FEATURE_XTOPOLOGY before returning
> true if any of the XTOPOLOGY leaf could be parsed successfully.
>
> Instead of storing and passing around this return value using
> "has_topoext" in parse_topology_amd(), check for X86_FEATURE_XTOPOLOGY
> instead in parse_8000_001e() to simplify the flow.
>
> No functional changes intended.
>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
> Changelog v2..v3:
>
> o Use cpu_feature_enabled() when checking for X86_FEATURE_XTOPOLOGY.
> ---
> arch/x86/kernel/cpu/topology_amd.c | 16 +++++++---------
> 1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/topology_amd.c b/arch/x86/kernel/cpu/topology_amd.c
> index 3d01675d94f5..138a09528083 100644
> --- a/arch/x86/kernel/cpu/topology_amd.c
> +++ b/arch/x86/kernel/cpu/topology_amd.c
> @@ -59,7 +59,7 @@ static void store_node(struct topo_scan *tscan, u16 nr_nodes, u16 node_id)
> tscan->amd_node_id = node_id;
> }
>
> -static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
> +static bool parse_8000_001e(struct topo_scan *tscan)
> {
> struct {
> // eax
> @@ -81,7 +81,7 @@ static bool parse_8000_001e(struct topo_scan *tscan, bool has_topoext)
>
> cpuid_leaf(0x8000001e, &leaf);
>
> - if (!has_topoext) {
> + if (!cpu_feature_enabled(X86_FEATURE_XTOPOLOGY)) {
> /*
> * Prefer initial_apicid parsed from XTOPOLOGY leaf
> * 0x8000026 or 0xb if available. Otherwise prefer the
That's patently wrong.
The leaves might be "available", but are not guaranteed to be valid. So
FEATURE_XTOPOLOGY gives you the wrong answer.
The has_topoext logic is there for a reason.
https://github.com/InstLatx64/InstLatx64.git has a gazillion of CPUID
samples from various machines and there are systems which advertise 0xB,
but it's empty and you won't have an APIC ID at all...
So all what needs to be done is preventing 8..1e parsing to overwrite
the APIC ID, when a valid 0xb/0x26 was detected.
Something like the below.
Btw, your fixes tag is only half correct. The problem existed already
_before_ the topology rewrite and I remember debating exactly this
problem with Boris and Tom back then when I sanitized this nightmare.
Thanks,
tglx
---
arch/x86/kernel/cpu/topology_amd.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
--- a/arch/x86/kernel/cpu/topology_amd.c
+++ b/arch/x86/kernel/cpu/topology_amd.c
@@ -81,20 +81,25 @@ static bool parse_8000_001e(struct topo_
cpuid_leaf(0x8000001e, &leaf);
- tscan->c->topo.initial_apicid = leaf.ext_apic_id;
-
/*
- * If leaf 0xb is available, then the domain shifts are set
- * already and nothing to do here. Only valid for family >= 0x17.
+ * If leaf 0xb/0x26 is available, then the APIC ID and the domain
+ * shifts are set already.
*/
- if (!has_topoext && tscan->c->x86 >= 0x17) {
+ if (!has_topoext) {
+ tscan->c->topo.initial_apicid = leaf.ext_apic_id;
+
/*
- * Leaf 0x80000008 set the CORE domain shift already.
- * Update the SMT domain, but do not propagate it.
+ * Leaf 0x8000008 sets the CORE domain shift but not the
+ * SMT domain shift. On CPUs with family >= 0x17, there
+ * might be hyperthreads.
*/
- unsigned int nthreads = leaf.core_nthreads + 1;
+ if (tscan->c->x86 >= 0x17) {
+ /* Update the SMT domain, but do not propagate it. */
+ unsigned int nthreads = leaf.core_nthreads + 1;
- topology_update_dom(tscan, TOPO_SMT_DOMAIN, get_count_order(nthreads), nthreads);
+ topology_update_dom(tscan, TOPO_SMT_DOMAIN,
+ get_count_order(nthreads), nthreads);
+ }
}
store_node(tscan, leaf.nnodes_per_socket + 1, leaf.node_id);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext
2025-08-24 13:38 ` Thomas Gleixner
@ 2025-08-24 14:55 ` Thomas Gleixner
2025-08-25 4:05 ` K Prateek Nayak
0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2025-08-24 14:55 UTC (permalink / raw)
To: K Prateek Nayak, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit, K Prateek Nayak
On Sun, Aug 24 2025 at 15:38, Thomas Gleixner wrote:
>> - if (!has_topoext) {
>> + if (!cpu_feature_enabled(X86_FEATURE_XTOPOLOGY)) {
>> /*
>> * Prefer initial_apicid parsed from XTOPOLOGY leaf
>> * 0x8000026 or 0xb if available. Otherwise prefer the
>
> That's patently wrong.
>
> The leaves might be "available", but are not guaranteed to be valid. So
> FEATURE_XTOPOLOGY gives you the wrong answer.
>
> The has_topoext logic is there for a reason.
Hrm. I have to correct myself. It's set by the 0xb... parsing when that
finds a valid leaf. My memory tricked me on that.
So yes, it can be used for that, but that's a cleanup. The simple fix
should be applied first as that's trivial to backport.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext
2025-08-24 14:55 ` Thomas Gleixner
@ 2025-08-25 4:05 ` K Prateek Nayak
0 siblings, 0 replies; 16+ messages in thread
From: K Prateek Nayak @ 2025-08-25 4:05 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
Sean Christopherson, Paolo Bonzini, x86
Cc: Naveen rao, Sairaj Kodilkar, H. Peter Anvin,
Peter Zijlstra (Intel), Xin Li (Intel), Pawan Gupta, linux-kernel,
kvm, Mario Limonciello, Gautham R. Shenoy, Babu Moger,
Suravee Suthikulpanit
Hello Thomas,
On 8/24/2025 8:25 PM, Thomas Gleixner wrote:
> On Sun, Aug 24 2025 at 15:38, Thomas Gleixner wrote:
>>> - if (!has_topoext) {
>>> + if (!cpu_feature_enabled(X86_FEATURE_XTOPOLOGY)) {
>>> /*
>>> * Prefer initial_apicid parsed from XTOPOLOGY leaf
>>> * 0x8000026 or 0xb if available. Otherwise prefer the
>>
>> That's patently wrong.
>>
>> The leaves might be "available", but are not guaranteed to be valid. So
>> FEATURE_XTOPOLOGY gives you the wrong answer.
>>
>> The has_topoext logic is there for a reason.
>
> Hrm. I have to correct myself. It's set by the 0xb... parsing when that
> finds a valid leaf. My memory tricked me on that.
>
> So yes, it can be used for that, but that's a cleanup. The simple fix
> should be applied first as that's trivial to backport.
Thank you for taking a look at the series. I'll move the fixes ahead and
refresh the current Patch 2 with your suggested diff in the next
version.
--
Thanks and Regards,
Prateek
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-08-25 4:05 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-18 6:04 [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 1/4] x86/msr-index: Define AMD64_CPUID_FN_EXT MSR K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 2/4] x86/cpu/topology: Use initial APICID from XTOPOEXT on AMD/HYGON K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 3/4] x86/cpu/topology: Always try cpu_parse_topology_ext() on AMD/Hygon K Prateek Nayak
2025-08-18 6:04 ` [PATCH v3 4/4] x86/cpu/topology: Check for X86_FEATURE_XTOPOLOGY instead of passing has_topoext K Prateek Nayak
2025-08-24 13:38 ` Thomas Gleixner
2025-08-24 14:55 ` Thomas Gleixner
2025-08-25 4:05 ` K Prateek Nayak
2025-08-19 11:34 ` [PATCH v3 0/4] x86/cpu/topology: Work around the nuances of virtualization on AMD/Hygon Borislav Petkov
2025-08-19 14:28 ` K Prateek Nayak
2025-08-19 21:24 ` Borislav Petkov
2025-08-20 8:11 ` Naveen N Rao
2025-08-20 8:59 ` Borislav Petkov
2025-08-20 9:12 ` K Prateek Nayak
2025-08-20 9:28 ` Borislav Petkov
2025-08-20 9:59 ` Gautham R. Shenoy
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).