public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: Borislav Petkov <bp@alien8.de>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>, x86-ml <x86@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Antonov <alexander.antonov@linux.intel.com>,
	lkml <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>
Subject: Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore])
Date: Wed, 06 Mar 2024 12:17:02 +0100
Message-ID: <87a5nbvccx.ffs@tglx>
In-Reply-To: <20240305121014.GCZecLppQTzWmpI_yR@fat_crate.local>

On Tue, Mar 05 2024 at 13:10, Borislav Petkov wrote:
> I guess ship it but we'll pay attention to what else ends up
> complaining.

Here is an updated version which handles it in the topology core code so
that MPPARSE is covered as well.

Thanks,

        tglx
---
Subject: x86/topology: Ignore non-present APIC IDs in a present package
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 05 Mar 2024 10:57:26 +0100

Borislav reported that one of his systems has a broken MADT table which
advertises eight present APICs and 24 non-present APICs in the same
package.

The non-present ones are considered hot-pluggable by the topology
evaluation code, which is obviously bogus as there is no way to hot-plug
within the same package.

As the topology evaluation code accounts for hot-pluggable CPUs in a
package, the maximum number of cores per package is computed incorrectly,
which in turn causes the uncore performance counter driver to access
non-existent MSRs. It will probably confuse other entities which rely on
the maximum number of cores and threads per package too.

Cure this by ignoring hot-pluggable APIC IDs within a present package.

In theory it would be reasonable to just do this unconditionally, but then
there is this thing called reality^Wvirtualization which ruins
everything. Virtualization is the only existing user of "physical" hotplug
and the virtualization tools allow the above scenario. Whether that is
actually in use or not is unknown.

As it can be argued that the virtualization case is not affected by the
issues which exposed the reported problem, allow the bogosity for now if
the kernel determined that it is running in a VM.

Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/topology.c |   38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -157,6 +157,20 @@ static __init bool check_for_real_bsp(u3
 	return true;
 }
 
+static unsigned int topo_unit_count(u32 lvlid, enum x86_topology_domains at_level,
+				    unsigned long *map)
+{
+	unsigned int id, end, cnt = 0;
+
+	/* Calculate the exclusive end */
+	end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]);
+
+	/* Unfortunately there is no bitmap_weight_range() */
+	for (id = find_next_bit(map, end, lvlid); id < end; id = find_next_bit(map, end, ++id))
+		cnt++;
+	return cnt;
+}
+
 static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
 {
 	int cpu, dom;
@@ -178,6 +192,20 @@ static __init void topo_register_apic(u3
 		cpuid_to_apicid[cpu] = apic_id;
 		topo_set_cpuids(cpu, apic_id, acpi_id);
 	} else {
+		u32 pkgid = topo_apicid(apic_id, TOPO_PKG_DOMAIN);
+
+		/*
+		 * Check for present APICs in the same package when running
+		 * on bare metal. Allow the bogosity in a guest.
+		 */
+		if (hypervisor_is_type(X86_HYPER_NATIVE) &&
+		    topo_unit_count(pkgid, TOPO_PKG_DOMAIN, phys_cpu_present_map)) {
+			pr_info_once("Ignoring hot-pluggable APIC ID %x in present package.\n",
+				     apic_id);
+			topo_info.nr_rejected_cpus++;
+			return;
+		}
+
 		topo_info.nr_disabled_cpus++;
 	}
 
@@ -280,7 +308,6 @@ unsigned int topology_unit_count(u32 api
 {
 	/* Remove the bits below @at_level to get the proper level ID of @apicid */
 	unsigned int lvlid = topo_apicid(apicid, at_level);
-	unsigned int id, end, cnt = 0;
 
 	if (lvlid >= MAX_LOCAL_APIC)
 		return 0;
@@ -290,14 +317,7 @@ unsigned int topology_unit_count(u32 api
 		return 0;
 	if (which_units == at_level)
 		return 1;
-
-	/* Calculate the exclusive end */
-	end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]);
-	/* Unfortunately there is no bitmap_weight_range() */
-	for (id = find_next_bit(apic_maps[which_units].map, end, lvlid);
-	     id < end; id = find_next_bit(apic_maps[which_units].map, end, ++id))
-		cnt++;
-	return cnt;
+	return topo_unit_count(lvlid, at_level, apic_maps[which_units].map);
 }
 
 #ifdef CONFIG_ACPI_HOTPLUG_CPU

Thread overview: 8+ messages
2024-03-04 18:18 unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore]) Borislav Petkov
2024-03-04 19:22 ` Liang, Kan
2024-03-04 20:12   ` Borislav Petkov
2024-03-05 10:14     ` Thomas Gleixner
2024-03-05 12:10       ` Borislav Petkov
2024-03-06 11:17         ` Thomas Gleixner [this message]
2024-03-06 12:32           ` Borislav Petkov
2024-03-06 13:42           ` [tip: x86/apic] x86/topology: Ignore non-present APIC IDs in a present package tip-bot2 for Thomas Gleixner
