public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Tom Lendacky <thomas.lendacky@amd.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Huang Rui <ray.huang@amd.com>, Juergen Gross <jgross@suse.com>,
	Dimitri Sivanich <dimitri.sivanich@hpe.com>,
	Michael Kelley <mikelley@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Pu Wen <puwen@hygon.cn>,
	Qiuxu Zhuo <qiuxu.zhuo@intel.com>,
	Sohil Mehta <sohil.mehta@intel.com>
Subject: [patch V4 28/41] x86/cpu: Provide a sane leaf 0xb/0x1f parser
Date: Mon, 14 Aug 2023 10:54:17 +0200 (CEST)	[thread overview]
Message-ID: <20230814085113.705691574@linutronix.de> (raw)
In-Reply-To: 20230814085006.593997112@linutronix.de

detect_extended_topology() along with it's early() variant is a classic
example for duct tape engineering:

  - It evaluates an array of subleafs with a boatload of local variables
    for the relevant topology levels instead of using an array to save the
    enumerated information and propagate it to the right level

  - It has no boundary checks for subleafs

  - It prevents updating the die_id with a crude workaround instead of
    checking for leaf 0xb which does not provide die information.

  - It's broken vs. the number of dies evaluation as it uses:

      num_processors[DIE_LEVEL] / num_processors[CORE_LEVEL]

    which "works" only correctly if there is none of the intermediate
    topology levels (MODULE/TILE) enumerated.

There is zero value in trying to "fix" that code as the only proper fix is
to rewrite it from scratch.

Implement a sane parser with proper code documentation, which will be used
for the consolidated topology evaluation in the next step.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>

---
V4: Handle unknown domain types gracefully - Rui
V2: Fixed up the comment alignment for registers - Peterz
---
 arch/x86/kernel/cpu/Makefile       |    2 
 arch/x86/kernel/cpu/topology.h     |   12 +++
 arch/x86/kernel/cpu/topology_ext.c |  132 +++++++++++++++++++++++++++++++++++++
 3 files changed, 145 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -18,7 +18,7 @@ KMSAN_SANITIZE_common.o := n
 KCSAN_SANITIZE_common.o := n
 
 obj-y			:= cacheinfo.o scattered.o
-obj-y			+= topology_common.o topology.o
+obj-y			+= topology_common.o topology_ext.o topology.o
 obj-y			+= common.o
 obj-y			+= rdrand.o
 obj-y			+= match.o
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -16,6 +16,7 @@ void cpu_init_topology(struct cpuinfo_x8
 void cpu_parse_topology(struct cpuinfo_x86 *c);
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
 		      unsigned int shift, unsigned int ncpus);
+bool cpu_parse_topology_ext(struct topo_scan *tscan);
 
 static inline u32 topo_shift_apicid(u32 apicid, enum x86_topology_domains dom)
 {
@@ -36,4 +37,15 @@ static inline u32 topo_domain_mask(enum
 	return (1U << x86_topo_system.dom_shifts[dom]) - 1;
 }
 
+/*
+ * Update a domain level after the fact without propagating. Used to fixup
+ * broken CPUID enumerations.
+ */
+static inline void topology_update_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
+				       unsigned int shift, unsigned int ncpus)
+{
+	tscan->dom_shifts[dom] = shift;
+	tscan->dom_ncpus[dom] = ncpus;
+}
+
 #endif /* ARCH_X86_TOPOLOGY_H */
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology_ext.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/cpu.h>
+
+#include <asm/apic.h>
+#include <asm/memtype.h>
+#include <asm/processor.h>
+
+#include "cpu.h"
+
+enum topo_types {
+	INVALID_TYPE	= 0,
+	SMT_TYPE	= 1,
+	CORE_TYPE	= 2,
+	MODULE_TYPE	= 3,
+	TILE_TYPE	= 4,
+	DIE_TYPE	= 5,
+	DIEGRP_TYPE	= 6,
+	MAX_TYPE	= 7,
+};
+
+/*
+ * Use a lookup table for the case that there are future types > 6 which
+ * describe an intermediate domain level which does not exist today.
+ *
+ * A table will also be handy to parse the new AMD 0x80000026 leaf which
+ * has defined different domain types, but otherwise uses the same layout
+ * with some of the reserved bits used for new information.
+ */
+static const unsigned int topo_domain_map[MAX_TYPE] = {
+	[SMT_TYPE]	= TOPO_SMT_DOMAIN,
+	[CORE_TYPE]	= TOPO_CORE_DOMAIN,
+	[MODULE_TYPE]	= TOPO_MODULE_DOMAIN,
+	[TILE_TYPE]	= TOPO_TILE_DOMAIN,
+	[DIE_TYPE]	= TOPO_DIE_DOMAIN,
+	[DIEGRP_TYPE]	= TOPO_PKG_DOMAIN,
+};
+
+static inline bool topo_subleaf(struct topo_scan *tscan, u32 leaf, u32 subleaf,
+				unsigned int *last_dom)
+{
+	unsigned int dom, maxtype = leaf == 0xb ? CORE_TYPE + 1 : MAX_TYPE;
+	struct {
+		// eax
+		u32	x2apic_shift	:  5, // Number of bits to shift APIC ID right
+					      // for the topology ID at the next level
+			__rsvd0		: 27; // Reserved
+		// ebx
+		u32	num_processors	: 16, // Number of processors at current level
+			__rsvd1		: 16; // Reserved
+		// ecx
+		u32	level		:  8, // Current topology level. Same as sub leaf number
+			type		:  8, // Level type. If 0, invalid
+			__rsvd2		: 16; // Reserved
+		// edx
+		u32	x2apic_id	: 32; // X2APIC ID of the current logical processor
+	} sl;
+
+	cpuid_subleaf(leaf, subleaf, &sl);
+
+	if (!sl.num_processors || sl.type == INVALID_TYPE)
+		return false;
+
+	if (sl.type >= maxtype) {
+		pr_err_once("Topology: leaf 0x%x:%d Unknown domain type %u\n",
+			    leaf, subleaf, sl.type);
+		/*
+		 * The subleafs are ordered in domain level order so
+		 * propagate it into the next domain level carefully: if
+		 * the last domain level was PKG, then overwrite PKG
+		 * as otherwise this would end up in the root domain.
+		 *
+		 * It really would have been too obvious to make the domain
+		 * type space sparse and leave a few reserved types between
+		 * the points which might change instead of following the
+		 * usual "this can be fixed in software" principle.
+		 */
+		dom = *last_dom == TOPO_PKG_DOMAIN ? TOPO_PKG_DOMAIN : *last_dom + 1;
+	} else {
+		dom = topo_domain_map[sl.type];
+		*last_dom = dom;
+	}
+
+	if (!dom) {
+		tscan->c->topo.initial_apicid = sl.x2apic_id;
+	} else if (tscan->c->topo.initial_apicid != sl.x2apic_id) {
+		pr_warn_once(FW_BUG "CPUID leaf 0x%x subleaf %d APIC ID mismatch %x != %x\n",
+			     leaf, subleaf, tscan->c->topo.initial_apicid, sl.x2apic_id);
+	}
+
+	topology_set_dom(tscan, dom, sl.x2apic_shift, sl.num_processors);
+	return true;
+}
+
+static bool parse_topology_leaf(struct topo_scan *tscan, u32 leaf)
+{
+	unsigned int last_dom;
+	u32 subleaf;
+
+	if (tscan->c->cpuid_level < leaf)
+		return false;
+
+	/* Read all available subleafs and populate the levels */
+	for (subleaf = 0, last_dom = 0; topo_subleaf(tscan, leaf, subleaf, &last_dom); subleaf++);
+
+	/* If subleaf 0 failed to parse, give up */
+	if (!subleaf)
+		return false;
+
+	/*
+	 * There are machines in the wild which have shift 0 in the subleaf
+	 * 0, but advertise 2 logical processors at that level. They are
+	 * truly SMT.
+	 */
+	if (!tscan->dom_shifts[TOPO_SMT_DOMAIN] && tscan->dom_ncpus[TOPO_SMT_DOMAIN] > 1) {
+		unsigned int sft = get_count_order(tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+
+		pr_warn_once(FW_BUG "CPUID leaf 0x%x subleaf 0 has shift level 0 but %u CPUs\n",
+			     leaf, tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+		topology_update_dom(tscan, TOPO_SMT_DOMAIN, sft, tscan->dom_ncpus[TOPO_SMT_DOMAIN]);
+	}
+
+	set_cpu_cap(tscan->c, X86_FEATURE_XTOPOLOGY);
+	return true;
+}
+
+bool cpu_parse_topology_ext(struct topo_scan *tscan)
+{
+	/* Try lead 0x1F first. If not available try leaf 0x0b */
+	if (parse_topology_leaf(tscan, 0x1f))
+		return true;
+	return parse_topology_leaf(tscan, 0x0b);
+}


  parent reply	other threads:[~2023-08-14  8:56 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-14  8:53 [patch V4 00/41] x86/cpu: Rework the topology evaluation Thomas Gleixner
2023-08-14  8:53 ` [patch V4 01/41] x86/cpu/hygon: Fix the CPU topology evaluation for real Thomas Gleixner
2023-08-14  8:53 ` [patch V4 02/41] cpu/SMT: Make SMT control more robust against enumeration failures Thomas Gleixner
2023-08-15 21:15   ` Dave Hansen
2023-10-10 12:18     ` Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 03/41] x86/apic: Fake primary thread mask for XEN/PV Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 04/41] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 05/41] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 06/41] x86/cpu: Move cpu_die_id " Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 07/41] scsi: lpfc: Use topology_core_id() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 08/41] hwmon: (fam15h_power) " Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 09/41] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 10/41] x86/cpu: Move cu_id " Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 11/41] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 12/41] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 13/41] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 14/41] x86/apic: Use BAD_APICID consistently Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 15/41] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:53 ` [patch V4 16/41] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 17/41] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 18/41] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 19/41] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 20/41] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-10-13 10:32   ` [tip: x86/core] x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too tip-bot2 for Ingo Molnar
2023-08-14  8:54 ` [patch V4 21/41] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 22/41] x86/cpu: Provide debug interface Thomas Gleixner
2023-10-13  9:38   ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-08-14  8:54 ` [patch V4 23/41] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
2023-08-14  8:54 ` [patch V4 24/41] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
2023-08-28  6:07   ` K Prateek Nayak
2023-08-28 10:05     ` Thomas Gleixner
2023-08-28 14:03       ` Arjan van de Ven
2023-08-28 14:28       ` K Prateek Nayak
2023-08-28 14:34         ` Arjan van de Ven
2023-08-29  3:16           ` K Prateek Nayak
2023-09-15 11:54             ` Peter Zijlstra
2023-09-15 14:04               ` Arjan van de Ven
2023-09-19  3:54                 ` K Prateek Nayak
2023-09-19  8:13                   ` Thomas Gleixner
2023-09-20  3:23                     ` K Prateek Nayak
2023-09-19 13:44                   ` Arjan van de Ven
2023-09-20  3:21                     ` K Prateek Nayak
2023-09-14  9:20         ` K Prateek Nayak
2023-09-15 11:46           ` Thomas Gleixner
2023-08-14  8:54 ` [patch V4 25/41] x86/cpu: Add legacy topology parser Thomas Gleixner
2023-08-14  8:54 ` [patch V4 26/41] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
2023-08-14  8:54 ` [patch V4 27/41] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
2023-08-14  8:54 ` Thomas Gleixner [this message]
2023-08-16 12:09   ` [patch V4 28/41] x86/cpu: Provide a sane leaf 0xb/0x1f parser Zhang, Rui
2023-08-17  9:11     ` Thomas Gleixner
2023-08-14  8:54 ` [patch V4 29/41] x86/cpu: Use common topology code for Intel Thomas Gleixner
2023-08-14  8:54 ` [patch V4 30/41] x86/cpu/amd: Provide a separate accessor for Node ID Thomas Gleixner
2023-08-14  8:54 ` [patch V4 31/41] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
2023-08-14  8:54 ` [patch V4 32/41] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
2023-08-14  8:54 ` [patch V4 33/41] x86/cpu: Use common topology code for AMD Thomas Gleixner
2023-08-14  8:54 ` [patch V4 34/41] x86/cpu: Use common topology code for HYGON Thomas Gleixner
2023-08-14  8:54 ` [patch V4 35/41] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
2023-08-14  8:54 ` [patch V4 36/41] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
2023-08-14  8:54 ` [patch V4 37/41] x86/cpu: Remove topology.c Thomas Gleixner
2023-08-14  8:54 ` [patch V4 38/41] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
2023-08-14  8:54 ` [patch V4 39/41] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
2023-08-14  8:54 ` [patch V4 40/41] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
2023-08-14  8:54 ` [patch V4 41/41] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
2023-08-14 14:36 ` [patch V4 00/41] x86/cpu: Rework the topology evaluation Peter Zijlstra
2023-08-16 11:36 ` Zhang, Rui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230814085113.705691574@linutronix.de \
    --to=tglx@linutronix.de \
    --cc=andrew.cooper3@citrix.com \
    --cc=arjan@linux.intel.com \
    --cc=dimitri.sivanich@hpe.com \
    --cc=jgross@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=puwen@hygon.cn \
    --cc=qiuxu.zhuo@intel.com \
    --cc=ray.huang@amd.com \
    --cc=sohil.mehta@intel.com \
    --cc=thomas.lendacky@amd.com \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox