* unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore])
@ 2024-03-04 18:18 Borislav Petkov
2024-03-04 19:22 ` Liang, Kan
0 siblings, 1 reply; 8+ messages in thread
From: Borislav Petkov @ 2024-03-04 18:18 UTC (permalink / raw)
To: x86-ml
Cc: Alexander Shishkin, Adrian Hunter, Kan Liang, Alexander Antonov,
lkml
Hi all,
sending this to a bunch of people who have touched this function
recently and some more relevant Intel folks.
The machine is an old SNB:
smpboot: CPU0: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz (family: 0x6, model: 0x2d, stepping: 0x7)
and with latest linus/master + tip/master it gives the below.
It must be something new because 6.8-rc6 is fine.
...
i801_smbus 0000:00:1f.3: enabling device (0000 -> 0003)
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input5
i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
ACPI: button: Power Button [PWRF]
i2c i2c-14: 4/4 memory slots populated (from DMI)
unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore])
Call Trace:
<TASK>
? ex_handler_msr+0xcb/0x130
? fixup_exception+0x166/0x320
? exc_general_protection+0xd7/0x3f0
? asm_exc_general_protection+0x22/0x30
? snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]
uncore_box_ref.part.0+0x9c/0xc0 [intel_uncore]
? __pfx_uncore_event_cpu_online+0x10/0x10 [intel_uncore]
uncore_event_cpu_online+0x56/0x140 [intel_uncore]
? __pfx_uncore_event_cpu_online+0x10/0x10 [intel_uncore]
cpuhp_invoke_callback+0x174/0x5e0
? cpuhp_thread_fun+0x5a/0x200
cpuhp_thread_fun+0x17e/0x200
? smpboot_thread_fn+0x2b/0x250
smpboot_thread_fn+0x1ad/0x250
? __pfx_smpboot_thread_fn+0x10/0x10
kthread+0xed/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x30/0x50
? __pfx_kthread+0x10/0x10
iTCO_vendor_support: vendor-support=0
ret_from_fork_asm+0x1a/0x30
</TASK>
iTCO_wdt iTCO_wdt.1.auto: Found a Patsburg TCO device (Version=2, TCOBASE=0x0460)
iTCO_wdt iTCO_wdt.1.auto: initialized. heartbeat=30 sec (nowayout=0)
RAPL PMU: API unit is 2^-32 Joules, 2 fixed counters, 163840 ms ovfl timer
...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-04 18:18 unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) Borislav Petkov @ 2024-03-04 19:22 ` Liang, Kan 2024-03-04 20:12 ` Borislav Petkov 0 siblings, 1 reply; 8+ messages in thread From: Liang, Kan @ 2024-03-04 19:22 UTC (permalink / raw) To: Borislav Petkov, x86-ml Cc: Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Thomas Gleixner On 2024-03-04 1:18 p.m., Borislav Petkov wrote: > Hi all, > > sending this to a bunch of people who have touched this function > recently and some more relevant Intel folks. > > The machine is an old SNB: > > smpboot: CPU0: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz (family: 0x6, model: 0x2d, stepping: 0x7) > > and with latest linus/master + tip/master it gives the below. > > It must be something new because 6.8-rc6 is fine. > > ... > i801_smbus 0000:00:1f.3: enabling device (0000 -> 0003) > input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input5 > i801_smbus 0000:00:1f.3: SMBus using PCI interrupt > ACPI: button: Power Button [PWRF] > i2c i2c-14: 4/4 memory slots populated (from DMI) > unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) The 0xd84 is the box control MSR of the CBOX 4 (Please find the definition of the MSR from page 11 of https://www.intel.com/content/www/us/en/develop/download/intel-xeon-processor-e5-v2-and-e7-v2-product-families-uncore-performance-monitoring.html). It looks like the driver tries to access the CBOX 4, but it is not available on the machine. The number of available CBOXs on a SNBEP machine is determined at boot time. It should not be larger than the maximum number of cores. The recent commit 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores") change the boot_cpu_data.x86_max_cores to topology_num_cores_per_package(). I guess the new function probably returns a different maximum number of cores on the machine. But I don't have a SNBEP on my hands. Could you please help to check whether a different maximum number of cores is returned? Thanks, Kan > Call Trace: > <TASK> > ? ex_handler_msr+0xcb/0x130 > ? fixup_exception+0x166/0x320 > ? exc_general_protection+0xd7/0x3f0 > ? asm_exc_general_protection+0x22/0x30 > ? snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore] > uncore_box_ref.part.0+0x9c/0xc0 [intel_uncore] > ? __pfx_uncore_event_cpu_online+0x10/0x10 [intel_uncore] > uncore_event_cpu_online+0x56/0x140 [intel_uncore] > ? __pfx_uncore_event_cpu_online+0x10/0x10 [intel_uncore] > cpuhp_invoke_callback+0x174/0x5e0 > ? cpuhp_thread_fun+0x5a/0x200 > cpuhp_thread_fun+0x17e/0x200 > ? smpboot_thread_fn+0x2b/0x250 > smpboot_thread_fn+0x1ad/0x250 > ? __pfx_smpboot_thread_fn+0x10/0x10 > kthread+0xed/0x120 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x30/0x50 > ? __pfx_kthread+0x10/0x10 > iTCO_vendor_support: vendor-support=0 > ret_from_fork_asm+0x1a/0x30 > </TASK> > iTCO_wdt iTCO_wdt.1.auto: Found a Patsburg TCO device (Version=2, TCOBASE=0x0460) > iTCO_wdt iTCO_wdt.1.auto: initialized. heartbeat=30 sec (nowayout=0) > RAPL PMU: API unit is 2^-32 Joules, 2 fixed counters, 163840 ms ovfl timer > ... > > Thx. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-04 19:22 ` Liang, Kan @ 2024-03-04 20:12 ` Borislav Petkov 2024-03-05 10:14 ` Thomas Gleixner 0 siblings, 1 reply; 8+ messages in thread From: Borislav Petkov @ 2024-03-04 20:12 UTC (permalink / raw) To: Liang, Kan Cc: x86-ml, Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Thomas Gleixner On Mon, Mar 04, 2024 at 02:22:50PM -0500, Liang, Kan wrote: > The number of available CBOXs on a SNBEP machine is determined at boot > time. It should not be larger than the maximum number of cores. > The recent commit 89b0f15f408f ("x86/cpu/topology: Get rid of > cpuinfo::x86_max_cores") change the boot_cpu_data.x86_max_cores to > topology_num_cores_per_package(). > I guess the new function probably returns a different maximum number of > cores on the machine. But I don't have a SNBEP on my hands. Could you > please help to check whether a different maximum number of cores is > returned? Yeah, the topo rework looks at fault: before: online: 1 initial_apicid: 0 apicid: 0 pkg_id: 0 die_id: 0 cu_id: 255 core_id: 0 logical_pkg_id: 0 logical_die_id: 0 llc_id: 0 l2c_id: 0 max_cores: 4 max_die_per_pkg: 1 smp_num_siblings: 2 after: online: 1 initial_apicid: 0 apicid: 0 pkg_id: 0 die_id: 0 cu_id: 255 core_id: 0 logical_pkg_id: 0 logical_die_id: 0 llc_id: 0 l2c_id: 0 amd_node_id: 0 amd_nodes_per_pkg: 0 num_threads: 32 num_cores: 16 max_dies_per_pkg: 1 max_threads_per_core:2 I'll let tglx poke at this. Thx! -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-04 20:12 ` Borislav Petkov @ 2024-03-05 10:14 ` Thomas Gleixner 2024-03-05 12:10 ` Borislav Petkov 0 siblings, 1 reply; 8+ messages in thread From: Thomas Gleixner @ 2024-03-05 10:14 UTC (permalink / raw) To: Borislav Petkov, Liang, Kan Cc: x86-ml, Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Rafael J. Wysocki On Mon, Mar 04 2024 at 21:12, Borislav Petkov wrote: > On Mon, Mar 04, 2024 at 02:22:50PM -0500, Liang, Kan wrote: >> The number of available CBOXs on a SNBEP machine is determined at boot >> time. It should not be larger than the maximum number of cores. >> The recent commit 89b0f15f408f ("x86/cpu/topology: Get rid of >> cpuinfo::x86_max_cores") change the boot_cpu_data.x86_max_cores to >> topology_num_cores_per_package(). >> I guess the new function probably returns a different maximum number of >> cores on the machine. But I don't have a SNBEP on my hands. Could you >> please help to check whether a different maximum number of cores is >> returned? It's a MADT problem. MADT advertises 4 present cores and 12 hotpluggable cores on the same package. At least that's what the topology evaluation figures out based on MADT and presumably CPUID leaf 0xb. CPU topo: Allowing 8 present CPUs plus 24 hotplug CPUs /sys/kernel/debug/x86/topo/domains which is based on leaf 0xb has: domain: Thread shift: 1 dom_size: 2 max_threads: 2 domain: Core shift: 5 dom_size: 16 max_threads: 32 domain: Module shift: 5 dom_size: 1 max_threads: 32 domain: Tile shift: 5 dom_size: 1 max_threads: 32 domain: Die shift: 5 dom_size: 1 max_threads: 32 domain: DieGrp shift: 5 dom_size: 1 max_threads: 32 domain: Package shift: 5 dom_size: 1 max_threads: 32 So the algorithm is correct and works as designed :) How to handle that becomes interesting. The above situation decribed by leaf 0xb and MADT can happen on virtualization, i.e. the only actual user of "physical" CPU hotplug, because virtualization people do not necessarily care about package topology much. It works so it must be correct :) It seems that none of the consumers of topology_num_cores_per_package() can actually be used on virt, so a reasonable restriction is to reject non-present CPUs on bare metal. Something like the below. Thanks tglx --- --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -24,6 +24,7 @@ #include <linux/pgtable.h> #include <asm/e820/api.h> +#include <asm/hypervisor.h> #include <asm/irqdomain.h> #include <asm/pci_x86.h> #include <asm/io_apic.h> @@ -169,11 +170,19 @@ static bool __init acpi_is_processor_usa if (lapic_flags & ACPI_MADT_ENABLED) return true; - if (!acpi_support_online_capable || - (lapic_flags & ACPI_MADT_ONLINE_CAPABLE)) + if (lapic_flags & ACPI_MADT_ONLINE_CAPABLE) return true; - return false; + if (acpi_support_online_capable) + return false; + + /* Physical hotplug on bare metal is not supported */ + if (hypervisor_is_type(X86_HYPER_NATIVE)) { + pr_info_once("Ignoring non-present APIC ID on bare metal\n"); + return false; + } + + return true; } static int __init ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-05 10:14 ` Thomas Gleixner @ 2024-03-05 12:10 ` Borislav Petkov 2024-03-06 11:17 ` Thomas Gleixner 0 siblings, 1 reply; 8+ messages in thread From: Borislav Petkov @ 2024-03-05 12:10 UTC (permalink / raw) To: Thomas Gleixner Cc: Liang, Kan, x86-ml, Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Rafael J. Wysocki On Tue, Mar 05, 2024 at 11:14:04AM +0100, Thomas Gleixner wrote: > It seems that none of the consumers of topology_num_cores_per_package() > can actually be used on virt, so a reasonable restriction is to reject > non-present CPUs on bare metal. Something like the below. Yeah, workie. Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Some relevant diffs of dmesg before and after: +ACPI: Ignoring non-present APIC ID on bare metal -CPU topo: Num. cores per package: 16 -CPU topo: Num. threads per package: 32 -CPU topo: Allowing 8 present CPUs plus 24 hotplug CPUs +CPU topo: Num. cores per package: 4 +CPU topo: Num. threads per package: 8 +CPU topo: Allowing 8 present CPUs plus 0 hotplug CPUs -setup_percpu: NR_CPUS:256 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:1 +setup_percpu: NR_CPUS:256 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1 -pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 -pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 -pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 -pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 +pcpu-alloc: [0] 0 1 2 3 [0] 4 5 6 7 Those hotpluggable CPUs ended up wasting percpu mem too. As a result, APIC is not in physical flat mode anymore: -APIC: Switched APIC routing to: physical flat I guess ship it but we'll pay attention to what else ends up complaining. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-05 12:10 ` Borislav Petkov @ 2024-03-06 11:17 ` Thomas Gleixner 2024-03-06 12:32 ` Borislav Petkov 2024-03-06 13:42 ` [tip: x86/apic] x86/topology: Ignore non-present APIC IDs in a present package tip-bot2 for Thomas Gleixner 0 siblings, 2 replies; 8+ messages in thread From: Thomas Gleixner @ 2024-03-06 11:17 UTC (permalink / raw) To: Borislav Petkov Cc: Liang, Kan, x86-ml, Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Rafael J. Wysocki On Tue, Mar 05 2024 at 13:10, Borislav Petkov wrote: > I guess ship it but we'll pay attention to what else ends up > complaining. Here is an updated version which handles it in the topology core code so that MPPARSE is covered as well. Thanks, tglx --- Subject: x86/topology: Ignore non-present APIC IDs in a present package From: Thomas Gleixner <tglx@linutronix.de> Date: Tue, 05 Mar 2024 10:57:26 +0100 Borislav reported that one of his systems has a broken MADT table which advertises eight present APICs and 24 non-present APICs in the same package. The non-present ones are considered hot-pluggable by the topology evaluation code, which is obviously bogus as there is no way to hot-plug within the same package. As the topology evaluation code accounts for hot-pluggable CPUs in a package, the maximum number of cores per package is computed wrong, which in turn causes the uncore performance counter driver to access non-existing MSRs. It will probably confuse other entities which rely on the maximum number of cores and threads per package too. Cure this by ignoring hot-pluggable APIC IDs within a present package. In theory it would be reasonable to just do this unconditionally, but then there is this thing called reality^Wvirtualization which ruins everything. Virtualization is the only existing user of "physical" hotplug and the virtualization tools allow the above scenario. Whether that is actually in use or not is unknown. As it can be argued that the virtualization case is not affected by the issues which exposed the reported problem, allow the bogosity if the kernel determined that it is running in a VM for now. Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/kernel/cpu/topology.c | 38 +++++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-) --- a/arch/x86/kernel/cpu/topology.c +++ b/arch/x86/kernel/cpu/topology.c @@ -157,6 +157,20 @@ static __init bool check_for_real_bsp(u3 return true; } +static unsigned int topo_unit_count(u32 lvlid, enum x86_topology_domains at_level, + unsigned long *map) +{ + unsigned int id, end, cnt = 0; + + /* Calculate the exclusive end */ + end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]); + + /* Unfortunately there is no bitmap_weight_range() */ + for (id = find_next_bit(map, end, lvlid); id < end; id = find_next_bit(map, end, ++id)) + cnt++; + return cnt; +} + static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present) { int cpu, dom; @@ -178,6 +192,20 @@ static __init void topo_register_apic(u3 cpuid_to_apicid[cpu] = apic_id; topo_set_cpuids(cpu, apic_id, acpi_id); } else { + u32 pkgid = topo_apicid(apic_id, TOPO_PKG_DOMAIN); + + /* + * Check for present APICs in the same package when running + * on bare metal. Allow the bogosity in a guest. + */ + if (hypervisor_is_type(X86_HYPER_NATIVE) && + topo_unit_count(pkgid, TOPO_PKG_DOMAIN, phys_cpu_present_map)) { + pr_info_once("Ignoring hot-pluggable APIC ID %x in present package.\n", + apic_id); + topo_info.nr_rejected_cpus++; + return; + } + topo_info.nr_disabled_cpus++; } @@ -280,7 +308,6 @@ unsigned int topology_unit_count(u32 api { /* Remove the bits below @at_level to get the proper level ID of @apicid */ unsigned int lvlid = topo_apicid(apicid, at_level); - unsigned int id, end, cnt = 0; if (lvlid >= MAX_LOCAL_APIC) return 0; @@ -290,14 +317,7 @@ unsigned int topology_unit_count(u32 api return 0; if (which_units == at_level) return 1; - - /* Calculate the exclusive end */ - end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]); - /* Unfortunately there is no bitmap_weight_range() */ - for (id = find_next_bit(apic_maps[which_units].map, end, lvlid); - id < end; id = find_next_bit(apic_maps[which_units].map, end, ++id)) - cnt++; - return cnt; + return topo_unit_count(lvlid, at_level, apic_maps[which_units].map); } #ifdef CONFIG_ACPI_HOTPLUG_CPU ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) 2024-03-06 11:17 ` Thomas Gleixner @ 2024-03-06 12:32 ` Borislav Petkov 2024-03-06 13:42 ` [tip: x86/apic] x86/topology: Ignore non-present APIC IDs in a present package tip-bot2 for Thomas Gleixner 1 sibling, 0 replies; 8+ messages in thread From: Borislav Petkov @ 2024-03-06 12:32 UTC (permalink / raw) To: Thomas Gleixner Cc: Liang, Kan, x86-ml, Alexander Shishkin, Adrian Hunter, Alexander Antonov, lkml, Rafael J. Wysocki On Wed, Mar 06, 2024 at 12:17:02PM +0100, Thomas Gleixner wrote: > On Tue, Mar 05 2024 at 13:10, Borislav Petkov wrote: > > I guess ship it but we'll pay attention to what else ends up > > complaining. > > Here is an updated version which handles it in the topology core code so > that MPPARSE is covered as well. > > Thanks, > > tglx > --- > Subject: x86/topology: Ignore non-present APIC IDs in a present package > From: Thomas Gleixner <tglx@linutronix.de> > Date: Tue, 05 Mar 2024 10:57:26 +0100 > > Borislav reported that one of his systems has a broken MADT table which > advertises eight present APICs and 24 non-present APICs in the same > package. > > The non-present ones are considered hot-pluggable by the topology > evaluation code, which is obviously bogus as there is no way to hot-plug > within the same package. > > As the topology evaluation code accounts for hot-pluggable CPUs in a > package, the maximum number of cores per package is computed wrong, which > in turn causes the uncore performance counter driver to access non-existing > MSRs. It will probably confuse other entities which rely on the maximum > number of cores and threads per package too. > > Cure this by ignoring hot-pluggable APIC IDs within a present package. > > In theory it would be reasonable to just do this unconditionally, but then > there is this thing called reality^Wvirtualization which ruins > everything. Virtualization is the only existing user of "physical" hotplug > and the virtualization tools allow the above scenario. Whether that is > actually in use or not is unknown. > > As it can be argued that the virtualization case is not affected by the > issues which exposed the reported problem, allow the bogosity if the kernel > determined that it is running in a VM for now. > > Reported-by: Borislav Petkov (AMD) <bp@alien8.de> > Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores") > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > --- > arch/x86/kernel/cpu/topology.c | 38 +++++++++++++++++++++++++++++--------- > 1 file changed, 29 insertions(+), 9 deletions(-) > > --- a/arch/x86/kernel/cpu/topology.c > +++ b/arch/x86/kernel/cpu/topology.c #include <asm/hypervisor.h> at the top here. With that, relevant new lines from dmesg: +CPU topo: Ignoring hot-pluggable APIC ID 8 in present package. and @@ -129,9 +130,10 @@ CPU topo: Max. logical packages: 1 CPU topo: Max. logical dies: 1 CPU topo: Max. dies per package: 1 CPU topo: Max. threads per core: 2 -CPU topo: Num. cores per package: 16 -CPU topo: Num. threads per package: 32 -CPU topo: Allowing 8 present CPUs plus 24 hotplug CPUs +CPU topo: Num. cores per package: 4 +CPU topo: Num. threads per package: 8 +CPU topo: Allowing 8 present CPUs plus 0 hotplug CPUs +CPU topo: Rejected CPUs 24 AFAIC, ship it. Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip: x86/apic] x86/topology: Ignore non-present APIC IDs in a present package 2024-03-06 11:17 ` Thomas Gleixner 2024-03-06 12:32 ` Borislav Petkov @ 2024-03-06 13:42 ` tip-bot2 for Thomas Gleixner 1 sibling, 0 replies; 8+ messages in thread From: tip-bot2 for Thomas Gleixner @ 2024-03-06 13:42 UTC (permalink / raw) To: linux-tip-commits Cc: Borislav Petkov (AMD), Thomas Gleixner, x86, linux-kernel The following commit has been merged into the x86/apic branch of tip: Commit-ID: f0551af021308a2a1163dc63d1f1bba3594208bd Gitweb: https://git.kernel.org/tip/f0551af021308a2a1163dc63d1f1bba3594208bd Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Wed, 06 Mar 2024 12:17:02 +01:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Wed, 06 Mar 2024 14:35:30 +01:00 x86/topology: Ignore non-present APIC IDs in a present package Borislav reported that one of his systems has a broken MADT table which advertises eight present APICs and 24 non-present APICs in the same package. The non-present ones are considered hot-pluggable by the topology evaluation code, which is obviously bogus as there is no way to hot-plug within the same package. As the topology evaluation code accounts for hot-pluggable CPUs in a package, the maximum number of cores per package is computed wrong, which in turn causes the uncore performance counter driver to access non-existing MSRs. It will probably confuse other entities which rely on the maximum number of cores and threads per package too. Cure this by ignoring hot-pluggable APIC IDs within a present package. In theory it would be reasonable to just do this unconditionally, but then there is this thing called reality^Wvirtualization which ruins everything. Virtualization is the only existing user of "physical" hotplug and the virtualization tools allow the above scenario. Whether that is actually in use or not is unknown. As it can be argued that the virtualization case is not affected by the issues which exposed the reported problem, allow the bogosity if the kernel determined that it is running in a VM for now. Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores") Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/87a5nbvccx.ffs@tglx --- arch/x86/kernel/cpu/topology.c | 39 +++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c index 43650fe..3259b1d 100644 --- a/arch/x86/kernel/cpu/topology.c +++ b/arch/x86/kernel/cpu/topology.c @@ -27,6 +27,7 @@ #include <xen/xen.h> #include <asm/apic.h> +#include <asm/hypervisor.h> #include <asm/io_apic.h> #include <asm/mpspec.h> #include <asm/smp.h> @@ -157,6 +158,20 @@ static __init bool check_for_real_bsp(u32 apic_id) return true; } +static unsigned int topo_unit_count(u32 lvlid, enum x86_topology_domains at_level, + unsigned long *map) +{ + unsigned int id, end, cnt = 0; + + /* Calculate the exclusive end */ + end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]); + + /* Unfortunately there is no bitmap_weight_range() */ + for (id = find_next_bit(map, end, lvlid); id < end; id = find_next_bit(map, end, ++id)) + cnt++; + return cnt; +} + static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present) { int cpu, dom; @@ -178,6 +193,20 @@ static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present) cpuid_to_apicid[cpu] = apic_id; topo_set_cpuids(cpu, apic_id, acpi_id); } else { + u32 pkgid = topo_apicid(apic_id, TOPO_PKG_DOMAIN); + + /* + * Check for present APICs in the same package when running + * on bare metal. Allow the bogosity in a guest. + */ + if (hypervisor_is_type(X86_HYPER_NATIVE) && + topo_unit_count(pkgid, TOPO_PKG_DOMAIN, phys_cpu_present_map)) { + pr_info_once("Ignoring hot-pluggable APIC ID %x in present package.\n", + apic_id); + topo_info.nr_rejected_cpus++; + return; + } + topo_info.nr_disabled_cpus++; } @@ -280,7 +309,6 @@ unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_uni { /* Remove the bits below @at_level to get the proper level ID of @apicid */ unsigned int lvlid = topo_apicid(apicid, at_level); - unsigned int id, end, cnt = 0; if (lvlid >= MAX_LOCAL_APIC) return 0; @@ -290,14 +318,7 @@ unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_uni return 0; if (which_units == at_level) return 1; - - /* Calculate the exclusive end */ - end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]); - /* Unfortunately there is no bitmap_weight_range() */ - for (id = find_next_bit(apic_maps[which_units].map, end, lvlid); - id < end; id = find_next_bit(apic_maps[which_units].map, end, ++id)) - cnt++; - return cnt; + return topo_unit_count(lvlid, at_level, apic_maps[which_units].map); } #ifdef CONFIG_ACPI_HOTPLUG_CPU ^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2024-03-06 13:42 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-04 18:18 unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) Borislav Petkov 2024-03-04 19:22 ` Liang, Kan 2024-03-04 20:12 ` Borislav Petkov 2024-03-05 10:14 ` Thomas Gleixner 2024-03-05 12:10 ` Borislav Petkov 2024-03-06 11:17 ` Thomas Gleixner 2024-03-06 12:32 ` Borislav Petkov 2024-03-06 13:42 ` [tip: x86/apic] x86/topology: Ignore non-present APIC IDs in a present package tip-bot2 for Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox