* [patch 00/30] x86/apic: Rework APIC registration
@ 2024-02-13 21:05 Thomas Gleixner
2024-02-13 21:05 ` [patch 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
` (29 more replies)
0 siblings, 30 replies; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
This is a follow-up to the V2 series, which can be found here:
https://lore.kernel.org/all/20240118123127.055361964@linutronix.de
addressing the issues of the current topology code:
- Wrong core count on hybrid systems
- Heuristics-based size information for packages and dies, which
fails to work correctly with certain command line parameters.
- Complete evaluation failure for a theoretical hybrid system which boots
from an E-core
- The complete insanity of manipulating global data from firmware parsers
or the XEN/PV fake SMP enumeration. The latter is really a piece of art.
This series addresses this by
- Consolidating all topology relevant functionality into one place
- Providing separate interfaces for boot time and ACPI hotplug operations
- A sane ordering of command line options and restrictions
- A sensible way to handle the BSP problem in kdump kernels instead of
the unreliable command line option.
- Confinement of topology relevant variables by replacing the XEN/PV SMP
enumeration fake with something halfway sensible.
- Evaluation of sizes by analysing the topology via the CPUID provided
APIC ID segmentation and the actual APIC IDs which are registered at
boot time.
- Removal of heuristics and broken size calculations
The idea behind this is the following:
The APIC IDs describe the system topology in multiple domain levels. The
CPUID topology parser provides the information about which part of the
APIC ID is associated with the individual levels (Intel terminology):
[ROOT][PACKAGE][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]
The root space contains the package (socket) IDs. Not enumerated levels
consume 0 bits of space, but conceptually they are always represented. If
e.g. only CORE and THREAD levels are enumerated then the DIEGRP, DIE,
MODULE and TILE have the same physical ID as the PACKAGE.
If SMT is not supported, then the THREAD domain is still used. It then
has the same physical ID as the CORE domain and is the only child of
the core domain.
This allows a unified view of the system independent of the enumerated
domain levels without requiring any conditionals in the code.
AMD only exposes 4 domain levels with obviously different terminology,
but that can be easily mapped into the Intel variant with a trivial lookup
table added to the CPUID parser.
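The segmentation described above can be sketched in user space as follows. This is an illustrative model only, not the code from this series: the names topo_domain, topo_layout and topo_domain_id() are made up for the example, and the shift values in the usage note are assumptions. Each entry in shift[] holds the number of APIC ID bits consumed by that level and all levels below it, so a level which is not enumerated simply repeats the shift of the level below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical domain levels, lowest first (Intel terminology). */
enum topo_domain {
	DOM_THREAD, DOM_CORE, DOM_MODULE, DOM_TILE,
	DOM_DIE, DOM_DIEGRP, DOM_PACKAGE, DOM_MAX
};

/*
 * shift[dom] = number of APIC ID bits used by @dom and all levels
 * below it. A level which is not enumerated consumes 0 bits, so its
 * shift equals the shift of the next lower level.
 */
struct topo_layout {
	uint8_t shift[DOM_MAX];
};

/*
 * Physical ID of @dom for @apic_id: the APIC ID with all bits that
 * belong to the levels below @dom masked off.
 */
static uint32_t topo_domain_id(const struct topo_layout *l,
			       uint32_t apic_id, enum topo_domain dom)
{
	uint8_t shift;

	/* The thread level is identified by the full APIC ID */
	if (dom == DOM_THREAD)
		return apic_id;

	shift = l->shift[dom - 1];
	assert(shift < 32);
	return apic_id & (UINT32_MAX << shift);
}
```

With an assumed layout of 1 thread bit and 6 core bits, i.e. shift = {1, 7, 7, 7, 7, 7, 7}, APIC ID 0x83 yields thread ID 0x83, core ID 0x82 and the same ID 0x80 for MODULE, TILE, DIE, DIEGRP and PACKAGE. With shift[DOM_THREAD] = 0 (no SMT) the core ID equals the thread ID. No conditionals on the enumerated levels are needed.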
The resulting topology information of an ADL hybrid system with 8 P-Cores
and 8 E-Cores looks like this:
CPU topo: Max. logical packages: 1
CPU topo: Max. logical dies: 1
CPU topo: Max. dies per package: 1
CPU topo: Max. threads per core: 2
CPU topo: Num. cores per package: 16
CPU topo: Num. threads per package: 24
CPU topo: Allowing 24 present CPUs plus 0 hotplug CPUs
CPU topo: Thread : 24
CPU topo: Core : 16
CPU topo: Module : 1
CPU topo: Tile : 1
CPU topo: Die : 1
CPU topo: Package : 1
This is happening on the boot CPU before any of the APs is started and
provides correct size information right from the start.
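How such counts can fall out of the registered APIC IDs alone can be illustrated with a small user-space sketch. It is not the implementation in this series: count_domain(), example_adl_ids() and the APIC ID values are assumptions made up for the example (8 P-cores with two threads each, 8 E-cores with one thread each, 1 thread bit and 7 bits below the package level).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_APIC_ID	256	/* hypothetical limit for the model */

/*
 * Count the distinct instances of a domain among the registered APIC
 * IDs. @shift is the number of APIC ID bits below that domain.
 */
static unsigned int count_domain(const uint32_t *ids, size_t n,
				 unsigned int shift)
{
	bool seen[MODEL_MAX_APIC_ID] = { false };
	unsigned int count = 0;

	assert(ids && shift < 32);

	for (size_t i = 0; i < n; i++) {
		uint32_t key = ids[i] >> shift;

		if (key < MODEL_MAX_APIC_ID && !seen[key]) {
			seen[key] = true;
			count++;
		}
	}
	return count;
}

/* Made-up APIC IDs resembling a hybrid system with 8 P- and 8 E-cores */
static size_t example_adl_ids(uint32_t ids[24])
{
	size_t n = 0;

	/* P-cores: two SMT threads per core */
	for (uint32_t core = 0; core < 8; core++) {
		ids[n++] = core * 8;
		ids[n++] = core * 8 + 1;
	}
	/* E-cores: one thread per core, the SMT bit stays 0 */
	for (uint32_t core = 0; core < 8; core++)
		ids[n++] = 0x40 + core * 2;
	return n;
}
```

For these IDs count_domain() with shift 0 counts 24 threads, with shift 1 it counts 16 cores, and with shift 7 it counts 1 package, which matches the numbers in the output above. Everything is derived from data which is available on the boot CPU.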
Even the XEN/PV trainwreck makes use of this now. On Dom0 it utilizes the
MADT and on DomU it provides fake APIC IDs, which combined with the
provided CPUID information make it at least look halfway realistic instead
of claiming to have one CPU per package as the current upstream code does.
This is solely addressing the core topology issues, but there is a plan for
further consolidation of other topology related information into one single
source of information instead of having a gazillion localized special
parsers and representations all over the place. There are quite a few other
things which can be simplified on top of this, like updating the various
cpumasks during CPU bringup, but that's all left for later.
Changes vs. V2:
- Rebase on topo-cleanup-v3
- Fix the SMT calculation thinko (Rui)
- Fix the BSP detection code (Michael, Sohil)
- Rename a misnamed function (Arjan)
The current series applies on top of
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cleanup-v3
and is available from git here:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-full-v3
Thanks,
tglx
---
Documentation/admin-guide/kdump/kdump.rst | 7
Documentation/admin-guide/kernel-parameters.txt | 9
Documentation/arch/x86/topology.rst | 24
arch/x86/events/intel/cstate.c | 2
arch/x86/events/intel/uncore.c | 2
arch/x86/events/intel/uncore_nhmex.c | 4
arch/x86/events/intel/uncore_snb.c | 8
arch/x86/events/intel/uncore_snbep.c | 18
arch/x86/events/rapl.c | 2
arch/x86/include/asm/apic.h | 10
arch/x86/include/asm/cpu.h | 10
arch/x86/include/asm/mpspec.h | 2
arch/x86/include/asm/perf_event_p4.h | 4
arch/x86/include/asm/processor.h | 2
arch/x86/include/asm/smp.h | 6
arch/x86/include/asm/topology.h | 53 -
arch/x86/kernel/acpi/boot.c | 59 -
arch/x86/kernel/apic/apic.c | 186 ---
arch/x86/kernel/cpu/Makefile | 12
arch/x86/kernel/cpu/cacheinfo.c | 2
arch/x86/kernel/cpu/common.c | 33
arch/x86/kernel/cpu/debugfs.c | 7
arch/x86/kernel/cpu/mce/inject.c | 3
arch/x86/kernel/cpu/microcode/intel.c | 2
arch/x86/kernel/cpu/topology.c | 489 ++++++++++
arch/x86/kernel/cpu/topology.h | 11
arch/x86/kernel/cpu/topology_common.c | 45
arch/x86/kernel/devicetree.c | 2
arch/x86/kernel/jailhouse.c | 2
arch/x86/kernel/mpparse.c | 17
arch/x86/kernel/process.c | 2
arch/x86/kernel/setup.c | 9
arch/x86/kernel/smpboot.c | 219 ----
arch/x86/xen/apic.c | 14
arch/x86/xen/enlighten_pv.c | 3
arch/x86/xen/smp.c | 2
arch/x86/xen/smp.h | 2
arch/x86/xen/smp_pv.c | 58 -
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2
drivers/hwmon/coretemp.c | 2
drivers/hwmon/fam15h_power.c | 2
drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c | 2
drivers/powercap/intel_rapl_common.c | 2
drivers/thermal/intel/intel_hfi.c | 2
drivers/thermal/intel/intel_powerclamp.c | 2
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2
46 files changed, 703 insertions(+), 655 deletions(-)
* [patch 01/30] x86/cpu/topology: Move registration out of APIC code
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 02/30] x86/cpu/topology: Provide separate APIC registration functions Thomas Gleixner
` (28 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
The APIC/CPU registration sits in the middle of the APIC code. In fact this
is a topology evaluation function and has nothing to do with the inner
workings of the local APIC.
Move it out into a file which reflects what this is about.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/apic.h | 2
arch/x86/kernel/apic/apic.c | 185 -----------------------------------------
arch/x86/kernel/cpu/Makefile | 12 +-
arch/x86/kernel/cpu/topology.c | 184 ++++++++++++++++++++++++++++++++++++++++
4 files changed, 195 insertions(+), 188 deletions(-)
create mode 100644 arch/x86/kernel/cpu/topology.c
---
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -171,6 +171,8 @@ extern bool apic_needs_pit(void);
extern void apic_send_IPI_allbutself(unsigned int vector);
+extern void topology_register_boot_apic(u32 apic_id);
+
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
#define local_apic_timer_c2_ok 1
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -68,26 +68,12 @@
#include "local.h"
-unsigned int num_processors;
-
-unsigned disabled_cpus;
-
/* Processor that is doing the boot up */
u32 boot_cpu_physical_apicid __ro_after_init = BAD_APICID;
EXPORT_SYMBOL_GPL(boot_cpu_physical_apicid);
u8 boot_cpu_apic_version __ro_after_init;
-/* Bitmap of physically present CPUs. */
-DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC);
-
-/*
- * Processor to be disabled specified by kernel parameter
- * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
- * avoid undefined behaviour caused by sending INIT from AP to BSP.
- */
-static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-
/*
* This variable controls which CPUs receive external NMIs. By default,
* external NMIs are delivered only to the BSP.
@@ -107,14 +93,6 @@ static inline bool apic_accessible(void)
return x2apic_mode || apic_mmio_base;
}
-/*
- * Map cpu index to physical APIC ID
- */
-DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid, BAD_APICID);
-DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid, CPU_ACPIID_INVALID);
-EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
-EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
-
#ifdef CONFIG_X86_32
/* Local APIC was disabled by the BIOS and enabled by the kernel */
static int enabled_via_apicbase __ro_after_init;
@@ -1676,8 +1654,6 @@ void apic_ap_setup(void)
end_local_APIC_setup();
}
-static __init void cpu_set_boot_apic(void);
-
static __init void apic_read_boot_cpu_id(bool x2apic)
{
/*
@@ -1692,7 +1668,8 @@ static __init void apic_read_boot_cpu_id
boot_cpu_physical_apicid = read_apic_id();
boot_cpu_apic_version = GET_APIC_VERSION(apic_read(APIC_LVR));
}
- cpu_set_boot_apic();
+ topology_register_boot_apic(boot_cpu_physical_apicid);
+ x86_32_probe_bigsmp_early();
}
#ifdef CONFIG_X86_X2APIC
@@ -2291,155 +2268,6 @@ void disconnect_bsp_APIC(int virt_wire_s
apic_write(APIC_LVT1, value);
}
-/*
- * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
- * contiguously, it equals to current allocated max logical CPU ID plus 1.
- * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
- * so the maximum of nr_logical_cpuids is nr_cpu_ids.
- *
- * NOTE: Reserve 0 for BSP.
- */
-static int nr_logical_cpuids = 1;
-
-/*
- * Used to store mapping between logical CPU IDs and APIC IDs.
- */
-u32 cpuid_to_apicid[] = { [0 ... NR_CPUS - 1] = BAD_APICID, };
-
-bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
-{
- return phys_id == (u64)cpuid_to_apicid[cpu];
-}
-
-#ifdef CONFIG_SMP
-static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
-{
- /* Isolate the SMT bit(s) in the APICID and check for 0 */
- u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
-
- if (smp_num_siblings == 1 || !(apicid & mask))
- cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
-}
-
-/*
- * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
- * during early boot. Initialize the primary thread mask before SMP
- * bringup.
- */
-static int __init smp_init_primary_thread_mask(void)
-{
- unsigned int cpu;
-
- /*
- * XEN/PV provides either none or useless topology information.
- * Pretend that all vCPUs are primary threads.
- */
- if (xen_pv_domain()) {
- cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
- return 0;
- }
-
- for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
- cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
- return 0;
-}
-early_initcall(smp_init_primary_thread_mask);
-#else
-static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
-#endif
-
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(int apicid)
-{
- int i;
-
- /*
- * cpuid <-> apicid mapping is persistent, so when a cpu is up,
- * check if the kernel has allocated a cpuid for it.
- */
- for (i = 0; i < nr_logical_cpuids; i++) {
- if (cpuid_to_apicid[i] == apicid)
- return i;
- }
-
- /* Allocate a new cpuid. */
- if (nr_logical_cpuids >= nr_cpu_ids) {
- WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
- "Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apicid);
- return -EINVAL;
- }
-
- cpuid_to_apicid[nr_logical_cpuids] = apicid;
- return nr_logical_cpuids++;
-}
-
-static void cpu_update_apic(int cpu, u32 apicid)
-{
-#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
-#endif
- set_cpu_possible(cpu, true);
- set_bit(apicid, phys_cpu_present_map);
- set_cpu_present(cpu, true);
- num_processors++;
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apicid);
-}
-
-static __init void cpu_set_boot_apic(void)
-{
- cpuid_to_apicid[0] = boot_cpu_physical_apicid;
- cpu_update_apic(0, boot_cpu_physical_apicid);
- x86_32_probe_bigsmp_early();
-}
-
-int generic_processor_info(int apicid)
-{
- int cpu, max = nr_cpu_ids;
-
- /* The boot CPU must be set before MADT/MPTABLE parsing happens */
- if (cpuid_to_apicid[0] == BAD_APICID)
- panic("Boot CPU APIC not registered yet\n");
-
- if (apicid == boot_cpu_physical_apicid)
- return 0;
-
- if (disabled_cpu_apicid == apicid) {
- int thiscpu = num_processors + disabled_cpus;
-
- pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
- thiscpu, apicid);
-
- disabled_cpus++;
- return -ENODEV;
- }
-
- if (num_processors >= nr_cpu_ids) {
- int thiscpu = max + disabled_cpus;
-
- pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
- "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
-
- disabled_cpus++;
- return -EINVAL;
- }
-
- cpu = allocate_logical_cpuid(apicid);
- if (cpu < 0) {
- disabled_cpus++;
- return -EINVAL;
- }
-
- cpu_update_apic(cpu, apicid);
- return cpu;
-}
-
-
void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
bool dmar)
{
@@ -2828,15 +2656,6 @@ static int __init lapic_insert_resource(
*/
late_initcall(lapic_insert_resource);
-static int __init apic_set_disabled_cpu_apicid(char *arg)
-{
- if (!arg || !get_option(&arg, &disabled_cpu_apicid))
- return -EINVAL;
-
- return 0;
-}
-early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
-
static int __init apic_set_extnmi(char *arg)
{
if (!arg)
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -26,14 +26,16 @@ obj-y += bugs.o
obj-y += aperfmperf.o
obj-y += cpuid-deps.o
obj-y += umwait.o
+obj-y += capflags.o powerflags.o
-obj-$(CONFIG_PROC_FS) += proc.o
-obj-y += capflags.o powerflags.o
+obj-$(CONFIG_X86_LOCAL_APIC) += topology.o
-obj-$(CONFIG_IA32_FEAT_CTL) += feat_ctl.o
+obj-$(CONFIG_PROC_FS) += proc.o
+
+obj-$(CONFIG_IA32_FEAT_CTL) += feat_ctl.o
ifdef CONFIG_CPU_SUP_INTEL
-obj-y += intel.o intel_pconfig.o tsx.o
-obj-$(CONFIG_PM) += intel_epb.o
+obj-y += intel.o intel_pconfig.o tsx.o
+obj-$(CONFIG_PM) += intel_epb.o
endif
obj-$(CONFIG_CPU_SUP_AMD) += amd.o
obj-$(CONFIG_CPU_SUP_HYGON) += hygon.o
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/cpu.h>
+
+#include <xen/xen.h>
+
+#include <asm/apic.h>
+#include <asm/mpspec.h>
+#include <asm/smp.h>
+
+/*
+ * Map cpu index to physical APIC ID
+ */
+DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid, BAD_APICID);
+DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid, CPU_ACPIID_INVALID);
+EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
+EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
+
+/* Bitmap of physically present CPUs. */
+DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
+
+/* Used for CPU number allocation and parallel CPU bringup */
+u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+
+/*
+ * Processor to be disabled specified by kernel parameter
+ * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
+ * avoid undefined behaviour caused by sending INIT from AP to BSP.
+ */
+static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
+
+unsigned int num_processors;
+unsigned disabled_cpus;
+
+/*
+ * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
+ * contiguously, it equals to current allocated max logical CPU ID plus 1.
+ * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
+ * so the maximum of nr_logical_cpuids is nr_cpu_ids.
+ *
+ * NOTE: Reserve 0 for BSP.
+ */
+static int nr_logical_cpuids = 1;
+
+bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
+{
+ return phys_id == (u64)cpuid_to_apicid[cpu];
+}
+
+#ifdef CONFIG_SMP
+static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
+{
+ /* Isolate the SMT bit(s) in the APICID and check for 0 */
+ u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
+
+ if (smp_num_siblings == 1 || !(apicid & mask))
+ cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
+}
+
+/*
+ * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
+ * during early boot. Initialize the primary thread mask before SMP
+ * bringup.
+ */
+static int __init smp_init_primary_thread_mask(void)
+{
+ unsigned int cpu;
+
+ /*
+ * XEN/PV provides either none or useless topology information.
+ * Pretend that all vCPUs are primary threads.
+ */
+ if (xen_pv_domain()) {
+ cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
+ return 0;
+ }
+
+ for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
+ cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
+ return 0;
+}
+early_initcall(smp_init_primary_thread_mask);
+#else
+static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
+#endif
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(int apicid)
+{
+ int i;
+
+ /*
+ * cpuid <-> apicid mapping is persistent, so when a cpu is up,
+ * check if the kernel has allocated a cpuid for it.
+ */
+ for (i = 0; i < nr_logical_cpuids; i++) {
+ if (cpuid_to_apicid[i] == apicid)
+ return i;
+ }
+
+ /* Allocate a new cpuid. */
+ if (nr_logical_cpuids >= nr_cpu_ids) {
+ WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
+ "Processor %d/0x%x and the rest are ignored.\n",
+ nr_cpu_ids, nr_logical_cpuids, apicid);
+ return -EINVAL;
+ }
+
+ cpuid_to_apicid[nr_logical_cpuids] = apicid;
+ return nr_logical_cpuids++;
+}
+
+static void cpu_update_apic(int cpu, u32 apicid)
+{
+#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
+#endif
+ set_cpu_possible(cpu, true);
+ set_bit(apicid, phys_cpu_present_map);
+ set_cpu_present(cpu, true);
+ num_processors++;
+
+ if (system_state != SYSTEM_BOOTING)
+ cpu_mark_primary_thread(cpu, apicid);
+}
+
+void __init topology_register_boot_apic(u32 apic_id)
+{
+ cpuid_to_apicid[0] = apic_id;
+ cpu_update_apic(0, apic_id);
+}
+
+int generic_processor_info(int apicid)
+{
+ int cpu, max = nr_cpu_ids;
+
+ /* The boot CPU must be set before MADT/MPTABLE parsing happens */
+ if (cpuid_to_apicid[0] == BAD_APICID)
+ panic("Boot CPU APIC not registered yet\n");
+
+ if (apicid == boot_cpu_physical_apicid)
+ return 0;
+
+ if (disabled_cpu_apicid == apicid) {
+ int thiscpu = num_processors + disabled_cpus;
+
+ pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
+ thiscpu, apicid);
+
+ disabled_cpus++;
+ return -ENODEV;
+ }
+
+ if (num_processors >= nr_cpu_ids) {
+ int thiscpu = max + disabled_cpus;
+
+ pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
+ "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
+
+ disabled_cpus++;
+ return -EINVAL;
+ }
+
+ cpu = allocate_logical_cpuid(apicid);
+ if (cpu < 0) {
+ disabled_cpus++;
+ return -EINVAL;
+ }
+
+ cpu_update_apic(cpu, apicid);
+ return cpu;
+}
+
+static int __init apic_set_disabled_cpu_apicid(char *arg)
+{
+ if (!arg || !get_option(&arg, &disabled_cpu_apicid))
+ return -EINVAL;
+
+ return 0;
+}
+early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
* [patch 02/30] x86/cpu/topology: Provide separate APIC registration functions
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
2024-02-13 21:05 ` [patch 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 03/30] x86/acpi: Use new " Thomas Gleixner
` (27 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
generic_processor_info(), aside from being a complete misnomer, is used for
both early boot registration and ACPI CPU hotplug.
While it's arguable that this can share some code, it results in code which
is hard to understand and kept around post init for no real reason.
Also the call sites do lots of manual fiddling with topology related
variables instead of having proper interfaces for the purpose which handle
the topology internals correctly.
Provide topology_register_apic(), topology_hotplug_apic() and
topology_hotunplug_apic() which have the extra magic of the call sites
incorporated and for now are wrappers around generic_processor_info().
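The split between looking up an existing CPU number and allocating a new one can be modelled in user space roughly as below. This is only a sketch of the idea, not the kernel code: struct cpu_map, the cpu_map_*() helpers and the MODEL_* constants are hypothetical names, and the model returns -1 where the kernel returns an errno value.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS		8		/* hypothetical limit */
#define MODEL_BAD_APICID	0xFFFFFFFFu	/* marks an unused slot */

struct cpu_map {
	uint32_t	apicid[MODEL_NR_CPUS];
	int		nr_ids;		/* allocated CPU numbers */
};

/* Slot 0 is reserved for the boot CPU */
static void cpu_map_init(struct cpu_map *m, uint32_t boot_apic_id)
{
	assert(m);

	for (int i = 0; i < MODEL_NR_CPUS; i++)
		m->apicid[i] = MODEL_BAD_APICID;
	m->apicid[0] = boot_apic_id;
	m->nr_ids = 1;
}

/* Pure lookup: the mapping is persistent once it is established */
static int cpu_map_lookup(const struct cpu_map *m, uint32_t apic_id)
{
	for (int i = 0; i < m->nr_ids; i++) {
		if (m->apicid[i] == apic_id)
			return i;
	}
	return -1;
}

/* Reuse an existing CPU number or allocate the next free one */
static int cpu_map_register(struct cpu_map *m, uint32_t apic_id)
{
	int cpu = cpu_map_lookup(m, apic_id);

	if (cpu >= 0)
		return cpu;
	if (m->nr_ids >= MODEL_NR_CPUS)
		return -1;

	m->apicid[m->nr_ids] = apic_id;
	return m->nr_ids++;
}
```

In this model a hotplug path would call cpu_map_lookup() first and only fall back to cpu_map_register() when the APIC ID is unknown, so a CPU which is unplugged and plugged in again keeps its CPU number.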
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/apic.h | 3 +
arch/x86/kernel/cpu/topology.c | 113 ++++++++++++++++++++++++++++++++++-------
2 files changed, 98 insertions(+), 18 deletions(-)
---
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -171,7 +171,10 @@ extern bool apic_needs_pit(void);
extern void apic_send_IPI_allbutself(unsigned int vector);
+extern void topology_register_apic(u32 apic_id, u32 acpi_id, bool present);
extern void topology_register_boot_apic(u32 apic_id);
+extern int topology_hotplug_apic(u32 apic_id, u32 acpi_id);
+extern void topology_hotunplug_apic(unsigned int cpu);
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -84,32 +84,38 @@ early_initcall(smp_init_primary_thread_m
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(int apicid)
+static int topo_lookup_cpuid(u32 apic_id)
{
int i;
- /*
- * cpuid <-> apicid mapping is persistent, so when a cpu is up,
- * check if the kernel has allocated a cpuid for it.
- */
+ /* CPU# to APICID mapping is persistent once it is established */
for (i = 0; i < nr_logical_cpuids; i++) {
- if (cpuid_to_apicid[i] == apicid)
+ if (cpuid_to_apicid[i] == apic_id)
return i;
}
+ return -ENODEV;
+}
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(u32 apic_id)
+{
+ int cpu = topo_lookup_cpuid(apic_id);
+
+ if (cpu >= 0)
+ return cpu;
/* Allocate a new cpuid. */
if (nr_logical_cpuids >= nr_cpu_ids) {
WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
"Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apicid);
+ nr_cpu_ids, nr_logical_cpuids, apic_id);
return -EINVAL;
}
- cpuid_to_apicid[nr_logical_cpuids] = apicid;
+ cpuid_to_apicid[nr_logical_cpuids] = apic_id;
return nr_logical_cpuids++;
}
@@ -127,12 +133,6 @@ static void cpu_update_apic(int cpu, u32
cpu_mark_primary_thread(cpu, apicid);
}
-void __init topology_register_boot_apic(u32 apic_id)
-{
- cpuid_to_apicid[0] = apic_id;
- cpu_update_apic(0, apic_id);
-}
-
int generic_processor_info(int apicid)
{
int cpu, max = nr_cpu_ids;
@@ -174,6 +174,83 @@ int generic_processor_info(int apicid)
return cpu;
}
+/**
+ * topology_register_apic - Register an APIC in early topology maps
+ * @apic_id: The APIC ID to set up
+ * @acpi_id: The ACPI ID associated to the APIC
+ * @present: True if the corresponding CPU is present
+ */
+void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
+{
+ int cpu;
+
+ if (apic_id >= MAX_LOCAL_APIC) {
+ pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
+ return;
+ }
+
+ if (!present) {
+ disabled_cpus++;
+ return;
+ }
+
+ cpu = generic_processor_info(apic_id);
+ if (cpu >= 0)
+ early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+}
+
+/**
+ * topology_register_boot_apic - Register the boot CPU APIC
+ * @apic_id: The APIC ID to set up
+ *
+ * Separate so CPU #0 can be assigned
+ */
+void __init topology_register_boot_apic(u32 apic_id)
+{
+ cpuid_to_apicid[0] = apic_id;
+ cpu_update_apic(0, apic_id);
+}
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+/**
+ * topology_hotplug_apic - Handle a physical hotplugged APIC after boot
+ * @apic_id: The APIC ID to set up
+ * @acpi_id: The ACPI ID associated to the APIC
+ */
+int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
+{
+ int cpu;
+
+ if (apic_id >= MAX_LOCAL_APIC)
+ return -EINVAL;
+
+ cpu = topo_lookup_cpuid(apic_id);
+ if (cpu < 0) {
+ cpu = generic_processor_info(apic_id);
+ if (cpu >= 0)
+ per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ }
+ return cpu;
+}
+
+/**
+ * topology_hotunplug_apic - Remove a physical hotplugged APIC after boot
+ * @cpu: The CPU number for which the APIC ID is removed
+ */
+void topology_hotunplug_apic(unsigned int cpu)
+{
+ u32 apic_id = cpuid_to_apicid[cpu];
+
+ if (apic_id == BAD_APICID)
+ return;
+
+ per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
+ clear_bit(apic_id, phys_cpu_present_map);
+ set_cpu_present(cpu, false);
+ num_processors--;
+}
+#endif
+
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
if (!arg || !get_option(&arg, &disabled_cpu_apicid))
* [patch 03/30] x86/acpi: Use new APIC registration functions
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
2024-02-13 21:05 ` [patch 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
2024-02-13 21:05 ` [patch 02/30] x86/cpu/topology: Provide separate APIC registration functions Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 04/30] x86/jailhouse: Use new APIC registration function Thomas Gleixner
` (26 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Use the new topology registration functions and make the early boot code
path __init. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/acpi/boot.c | 44 +++++++-------------------------------------
1 file changed, 7 insertions(+), 37 deletions(-)
---
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -164,33 +164,9 @@ static int __init acpi_parse_madt(struct
return 0;
}
-/**
- * acpi_register_lapic - register a local apic and generates a logic cpu number
- * @id: local apic id to register
- * @acpiid: ACPI id to register
- * @enabled: this cpu is enabled or not
- *
- * Returns the logic cpu number which maps to the local apic
- */
-static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
+static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
-
- if (id >= MAX_LOCAL_APIC) {
- pr_info("skipped apicid that is too big\n");
- return -EINVAL;
- }
-
- if (!enabled) {
- ++disabled_cpus;
- return -EINVAL;
- }
-
- cpu = generic_processor_info(id);
- if (cpu >= 0)
- early_per_cpu(x86_cpu_to_acpiid, cpu) = acpiid;
-
- return cpu;
+ topology_register_apic(apic_id, acpi_id, present);
}
static bool __init acpi_is_processor_usable(u32 lapic_flags)
@@ -844,12 +820,10 @@ static int acpi_map_cpu2node(acpi_handle
return 0;
}
-int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
- int *pcpu)
+int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int *pcpu)
{
- int cpu;
+ int cpu = topology_hotplug_apic(physid, acpi_id);
- cpu = acpi_register_lapic(physid, acpi_id, ACPI_MADT_ENABLED);
if (cpu < 0) {
pr_info("Unable to map lapic to logical cpu number\n");
return cpu;
@@ -868,15 +842,11 @@ int acpi_unmap_cpu(int cpu)
#ifdef CONFIG_ACPI_NUMA
set_apicid_to_node(per_cpu(x86_cpu_to_apicid, cpu), NUMA_NO_NODE);
#endif
-
- per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
- set_cpu_present(cpu, false);
- num_processors--;
-
- return (0);
+ topology_hotunplug_apic(cpu);
+ return 0;
}
EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */
int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
{
* [patch 04/30] x86/jailhouse: Use new APIC registration function
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (2 preceding siblings ...)
2024-02-13 21:05 ` [patch 03/30] x86/acpi: Use new " Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 05/30] x86/of: Use new APIC registration functions Thomas Gleixner
` (25 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/jailhouse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -102,7 +102,7 @@ static void __init jailhouse_parse_smp_c
register_lapic_address(0xfee00000);
for (cpu = 0; cpu < setup_data.v1.num_cpus; cpu++)
- generic_processor_info(setup_data.v1.cpu_ids[cpu]);
+ topology_register_apic(setup_data.v1.cpu_ids[cpu], CPU_ACPIID_INVALID, true);
smp_found_config = 1;
* [patch 05/30] x86/of: Use new APIC registration functions
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (3 preceding siblings ...)
2024-02-13 21:05 ` [patch 04/30] x86/jailhouse: Use new APIC registration function Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 06/30] x86/mpparse: Use new APIC registration function Thomas Gleixner
` (24 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/devicetree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -136,7 +136,7 @@ static void __init dtb_cpu_setup(void)
pr_warn("%pOF: missing local APIC ID\n", dn);
continue;
}
- generic_processor_info(apic_id);
+ topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
}
* [patch 06/30] x86/mpparse: Use new APIC registration function
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (4 preceding siblings ...)
2024-02-13 21:05 ` [patch 05/30] x86/of: Use new APIC registration functions Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV Thomas Gleixner
` (23 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Aside from switching over to the new interface, record the number of
registered CPUs locally, which allows num_processors and disabled_cpus
to be confined to the topology code.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/mpspec.h | 2 --
arch/x86/kernel/cpu/topology.c | 2 +-
arch/x86/kernel/mpparse.c | 17 +++++++++--------
3 files changed, 10 insertions(+), 11 deletions(-)
---
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -61,8 +61,6 @@ static inline void e820__memblock_alloc_
#define mpparse_parse_smp_config x86_init_noop
#endif
-int generic_processor_info(int apicid);
-
extern DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC);
static inline void reset_phys_cpu_present_map(u32 apicid)
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -133,7 +133,7 @@ static void cpu_update_apic(int cpu, u32
cpu_mark_primary_thread(cpu, apicid);
}
-int generic_processor_info(int apicid)
+static int generic_processor_info(int apicid)
{
int cpu, max = nr_cpu_ids;
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -36,6 +36,8 @@
* Checksum an MP configuration block.
*/
+static unsigned int num_procs __initdata;
+
static int __init mpf_checksum(unsigned char *mp, int len)
{
int sum = 0;
@@ -50,16 +52,15 @@ static void __init MP_processor_info(str
{
char *bootup_cpu = "";
- if (!(m->cpuflag & CPU_ENABLED)) {
- disabled_cpus++;
+ topology_register_apic(m->apicid, CPU_ACPIID_INVALID, m->cpuflag & CPU_ENABLED);
+ if (!(m->cpuflag & CPU_ENABLED))
return;
- }
if (m->cpuflag & CPU_BOOTPROCESSOR)
bootup_cpu = " (Bootup-CPU)";
pr_info("Processor #%d%s\n", m->apicid, bootup_cpu);
- generic_processor_info(m->apicid);
+ num_procs++;
}
#ifdef CONFIG_X86_IO_APIC
@@ -236,9 +237,9 @@ static int __init smp_read_mpc(struct mp
}
}
- if (!num_processors)
+ if (!num_procs && !acpi_lapic)
pr_err("MPTABLE: no processors registered!\n");
- return num_processors;
+ return num_procs || acpi_lapic;
}
#ifdef CONFIG_X86_IO_APIC
@@ -529,8 +530,8 @@ static __init void mpparse_get_smp_confi
} else
BUG();
- if (!early)
- pr_info("Processors: %d\n", num_processors);
+ if (!early && !acpi_lapic)
+ pr_info("Processors: %d\n", num_procs);
/*
* Only use the first configuration found.
*/
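Note on the mpparse change: the parser now hands every MP table entry to the
topology code and only keeps a private count of the enabled ones. The
following is a minimal user-space sketch of that bookkeeping, not kernel
code. struct mp_entry, register_apic_stub() and parse_entries() are
hypothetical stand-ins for struct mpc_cpu, topology_register_apic() and the
MP table walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CPU_ENABLED 0x1 /* "processor is usable" flag bit, as in the MP table */

/* Hypothetical stand-in for struct mpc_cpu: only the fields used here. */
struct mp_entry {
	unsigned int apicid;
	unsigned int cpuflag;
};

static unsigned int stub_registered; /* calls made to the registration stub */
static unsigned int num_procs;       /* parser-local count of enabled CPUs */

/* Stand-in for topology_register_apic(): only counts invocations. */
static void register_apic_stub(unsigned int apicid, bool present)
{
	(void)apicid;
	(void)present;
	stub_registered++;
}

/* Every entry is registered; only enabled entries bump the local counter. */
static unsigned int parse_entries(const struct mp_entry *e, size_t n)
{
	assert(e != NULL || n == 0);

	num_procs = 0;
	stub_registered = 0;
	for (size_t i = 0; i < n; i++) {
		register_apic_stub(e[i].apicid, e[i].cpuflag & CPU_ENABLED);
		if (e[i].cpuflag & CPU_ENABLED)
			num_procs++;
	}
	return num_procs;
}
```

With one disabled entry between two enabled ones, the stub sees three calls
while the local counter ends up at two. The accounting of disabled CPUs is
left entirely to the registration side.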
* [patch 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (5 preceding siblings ...)
2024-02-13 21:05 ` [patch 06/30] x86/mpparse: Use new APIC registration function Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 08/30] x86/xen/smp_pv: Register fake APICs Thomas Gleixner
` (22 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
The MADT table for XEN/PV dom0 is not really useful and registering the
APICs is currently a pointless exercise because XEN/PV does not use an
APIC at all.
XEN/PV overrides the x86_init.mpparse.parse_smp_config() callback, resets
num_processors and counts how many CPUs are provided by the hypervisor.
This is in the way of cleaning up the APIC registration. Prevent MADT
registration for XEN/PV temporarily until the rework is completed and
XEN/PV can use the MADT again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/acpi/boot.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
---
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -23,6 +23,8 @@
#include <linux/serial_core.h>
#include <linux/pgtable.h>
+#include <xen/xen.h>
+
#include <asm/e820/api.h>
#include <asm/irqdomain.h>
#include <asm/pci_x86.h>
@@ -166,7 +168,8 @@ static int __init acpi_parse_madt(struct
static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
{
- topology_register_apic(apic_id, acpi_id, present);
+ if (!xen_pv_domain())
+ topology_register_apic(apic_id, acpi_id, present);
}
static bool __init acpi_is_processor_usable(u32 lapic_flags)
@@ -1087,7 +1090,8 @@ static int __init early_acpi_parse_madt_
return count;
}
- register_lapic_address(acpi_lapic_addr);
+ if (!xen_pv_domain())
+ register_lapic_address(acpi_lapic_addr);
return count;
}
* [patch 08/30] x86/xen/smp_pv: Register fake APICs
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (6 preceding siblings ...)
2024-02-13 21:05 ` [patch 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 09/30] x86/cpu/topology: Confine topology information Thomas Gleixner
` (21 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
XEN/PV does not use the APIC. It's just piggybacking on the infrastructure
and fiddling with global variables as it sees fit.
These global variables are going away, so let XENPV register pseudo APIC
IDs to keep the accounting correct and keep up the illusion that XEN/PV is
something sane.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/xen/smp_pv.c | 35 +++++++++--------------------------
1 file changed, 9 insertions(+), 26 deletions(-)
---
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -29,6 +29,7 @@
#include <asm/idtentry.h>
#include <asm/desc.h>
#include <asm/cpu.h>
+#include <asm/apic.h>
#include <asm/io_apic.h>
#include <xen/interface/xen.h>
@@ -150,34 +151,16 @@ int xen_smp_intr_init_pv(unsigned int cp
static void __init xen_pv_smp_config(void)
{
- int i, rc;
- unsigned int subtract = 0;
+ u32 apicid = 0;
+ int i;
- num_processors = 0;
- disabled_cpus = 0;
- for (i = 0; i < nr_cpu_ids; i++) {
- rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
- if (rc >= 0) {
- num_processors++;
- set_cpu_possible(i, true);
- } else {
- set_cpu_possible(i, false);
- set_cpu_present(i, false);
- subtract++;
- }
+ topology_register_boot_apic(apicid++);
+
+ for (i = 1; i < nr_cpu_ids; i++) {
+ if (HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL) < 0)
+ break;
+ topology_register_apic(apicid++, CPU_ACPIID_INVALID, true);
}
-#ifdef CONFIG_HOTPLUG_CPU
- /* This is akin to using 'nr_cpus' on the Linux command line.
- * Which is OK as when we use 'dom0_max_vcpus=X' we can only
- * have up to X, while nr_cpu_ids is greater than X. This
- * normally is not a problem, except when CPU hotplugging
- * is involved and then there might be more than X CPUs
- * in the guest - which will not work as there is no
- * hypercall to expand the max number of VCPUs an already
- * running guest has. So cap it up to X. */
- if (subtract)
- set_nr_cpu_ids(nr_cpu_ids - subtract);
-#endif
/* Pretend to be a proper enumerated system */
smp_found_config = 1;
}
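Note on the new enumeration loop: pseudo APIC IDs are handed out
sequentially, starting with 0 for the boot vCPU, and the walk stops at the
first vCPU the hypervisor does not know about. A minimal user-space sketch
of that loop follows. register_stub(), enumerate_vcpus() and the
vcpu_exists predicate are hypothetical; the predicate stands in for
HYPERVISOR_vcpu_op(VCPUOP_is_up, ...) returning a non-negative value.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 16 /* size of the recording array, illustrative only */

static unsigned int registered[MAX_IDS]; /* pseudo APIC IDs in registration order */
static unsigned int nr_registered;

/* Stand-in for topology_register_boot_apic() / topology_register_apic(). */
static void register_stub(unsigned int apicid)
{
	assert(nr_registered < MAX_IDS);
	registered[nr_registered++] = apicid;
}

/* Returns the number of pseudo APIC IDs that were registered. */
static unsigned int enumerate_vcpus(unsigned int nr_cpu_ids,
				    bool (*vcpu_exists)(unsigned int))
{
	unsigned int apicid = 0;

	nr_registered = 0;
	register_stub(apicid++); /* vCPU 0 is the boot CPU */

	for (unsigned int i = 1; i < nr_cpu_ids; i++) {
		if (!vcpu_exists(i))
			break; /* stop at the first unavailable vCPU */
		register_stub(apicid++);
	}
	return nr_registered;
}

/* Example predicate: the hypervisor provides vCPUs 0..3 only. */
static bool first_four_exist(unsigned int cpu)
{
	return cpu < 4;
}
```

Because the loop breaks instead of skipping, the registered IDs are always
a contiguous range starting at 0, which is what keeps the later accounting
simple.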
* [patch 09/30] x86/cpu/topology: Confine topology information
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (7 preceding siblings ...)
2024-02-13 21:05 ` [patch 08/30] x86/xen/smp_pv: Register fake APICs Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 10/30] x86/cpu/topology: Simplify APIC registration Thomas Gleixner
` (20 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Now that all external fiddling with num_processors and disabled_cpus is
gone, move the last user, prefill_possible_map(), into the topology code
too and remove the global visibility of these variables.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/smp.h | 3 -
arch/x86/kernel/apic/apic.c | 1
arch/x86/kernel/cpu/topology.c | 76 +++++++++++++++++++++++++++++++++++++++--
arch/x86/kernel/smpboot.c | 72 --------------------------------------
4 files changed, 74 insertions(+), 78 deletions(-)
---
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -9,7 +9,6 @@
#include <asm/thread_info.h>
extern int smp_num_siblings;
-extern unsigned int num_processors;
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
@@ -174,8 +173,6 @@ static inline struct cpumask *cpu_llc_sh
}
#endif /* CONFIG_SMP */
-extern unsigned disabled_cpus;
-
#ifdef CONFIG_DEBUG_NMI_SELFTEST
extern void nmi_selftest(void);
#else
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2054,7 +2054,6 @@ void __init init_apic_mappings(void)
pr_info("APIC: disable apic facility\n");
apic_disable();
}
- num_processors = 1;
}
}
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -29,8 +29,8 @@ u32 cpuid_to_apicid[] __read_mostly = {
*/
static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-unsigned int num_processors;
-unsigned disabled_cpus;
+static unsigned int num_processors;
+static unsigned int disabled_cpus;
/*
* The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
@@ -174,6 +174,71 @@ static int generic_processor_info(int ap
return cpu;
}
+static int __initdata setup_possible_cpus = -1;
+
+/*
+ * cpu_possible_mask should be static, it cannot change as cpu's
+ * are onlined, or offlined. The reason is per-cpu data-structures
+ * are allocated by some modules at init time, and don't expect to
+ * do this dynamically on cpu arrival/departure.
+ * cpu_present_mask on the other hand can change dynamically.
+ * In case when cpu_hotplug is not compiled, then we resort to current
+ * behaviour, which is cpu_possible == cpu_present.
+ * - Ashok Raj
+ *
+ * Three ways to find out the number of additional hotplug CPUs:
+ * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
+ * - The user can overwrite it with possible_cpus=NUM
+ * - Otherwise don't reserve additional CPUs.
+ * We do this because additional CPUs waste a lot of memory.
+ * -AK
+ */
+__init void prefill_possible_map(void)
+{
+ int i, possible;
+
+ i = setup_max_cpus ?: 1;
+ if (setup_possible_cpus == -1) {
+ possible = num_processors;
+#ifdef CONFIG_HOTPLUG_CPU
+ if (setup_max_cpus)
+ possible += disabled_cpus;
+#else
+ if (possible > i)
+ possible = i;
+#endif
+ } else
+ possible = setup_possible_cpus;
+
+ total_cpus = max_t(int, possible, num_processors + disabled_cpus);
+
+ /* nr_cpu_ids could be reduced via nr_cpus= */
+ if (possible > nr_cpu_ids) {
+ pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
+ possible, nr_cpu_ids);
+ possible = nr_cpu_ids;
+ }
+
+#ifdef CONFIG_HOTPLUG_CPU
+ if (!setup_max_cpus)
+#endif
+ if (possible > i) {
+ pr_warn("%d Processors exceeds max_cpus limit of %u\n",
+ possible, setup_max_cpus);
+ possible = i;
+ }
+
+ set_nr_cpu_ids(possible);
+
+ pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
+ possible, max_t(int, possible - num_processors, 0));
+
+ reset_cpu_possible_mask();
+
+ for (i = 0; i < possible; i++)
+ set_cpu_possible(i, true);
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -251,6 +316,13 @@ void topology_hotunplug_apic(unsigned in
}
#endif
+static int __init _setup_possible_cpus(char *str)
+{
+ get_option(&str, &setup_possible_cpus);
+ return 0;
+}
+early_param("possible_cpus", _setup_possible_cpus);
+
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
if (!arg || !get_option(&arg, &disabled_cpu_apicid))
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1291,78 +1291,6 @@ void __init native_smp_cpus_done(unsigne
cache_aps_init();
}
-static int __initdata setup_possible_cpus = -1;
-static int __init _setup_possible_cpus(char *str)
-{
- get_option(&str, &setup_possible_cpus);
- return 0;
-}
-early_param("possible_cpus", _setup_possible_cpus);
-
-
-/*
- * cpu_possible_mask should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and don't expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_mask on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * - Ashok Raj
- *
- * Three ways to find out the number of additional hotplug CPUs:
- * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
- * - The user can overwrite it with possible_cpus=NUM
- * - Otherwise don't reserve additional CPUs.
- * We do this because additional CPUs waste a lot of memory.
- * -AK
- */
-__init void prefill_possible_map(void)
-{
- int i, possible;
-
- i = setup_max_cpus ?: 1;
- if (setup_possible_cpus == -1) {
- possible = num_processors;
-#ifdef CONFIG_HOTPLUG_CPU
- if (setup_max_cpus)
- possible += disabled_cpus;
-#else
- if (possible > i)
- possible = i;
-#endif
- } else
- possible = setup_possible_cpus;
-
- total_cpus = max_t(int, possible, num_processors + disabled_cpus);
-
- /* nr_cpu_ids could be reduced via nr_cpus= */
- if (possible > nr_cpu_ids) {
- pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
- possible, nr_cpu_ids);
- possible = nr_cpu_ids;
- }
-
-#ifdef CONFIG_HOTPLUG_CPU
- if (!setup_max_cpus)
-#endif
- if (possible > i) {
- pr_warn("%d Processors exceeds max_cpus limit of %u\n",
- possible, setup_max_cpus);
- possible = i;
- }
-
- set_nr_cpu_ids(possible);
-
- pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
- possible, max_t(int, possible - num_processors, 0));
-
- reset_cpu_possible_mask();
-
- for (i = 0; i < possible; i++)
- set_cpu_possible(i, true);
-}
-
/* correctly size the local cpu masks */
void __init setup_cpu_local_masks(void)
{
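Note on prefill_possible_map(): the function is moved verbatim, and its
interplay of possible_cpus=, maxcpus=, nr_cpus= and CONFIG_HOTPLUG_CPU is
easy to misread in diff form. The sketch below restates only the
computation of 'possible' as a pure user-space function. struct
possible_inputs and compute_possible() are hypothetical names, and the
'hotplug' flag models the CONFIG_HOTPLUG_CPU #ifdef.

```c
#include <assert.h>
#include <stdbool.h>

/* Inputs which prefill_possible_map() reads from globals. */
struct possible_inputs {
	int num_processors;      /* CPUs registered as present */
	int disabled_cpus;       /* CPUs which firmware listed as disabled */
	int setup_possible_cpus; /* possible_cpus=NUM, -1 when not given */
	int setup_max_cpus;      /* maxcpus=NUM */
	int nr_cpu_ids;          /* NR_CPUS, possibly reduced via nr_cpus= */
	bool hotplug;            /* models CONFIG_HOTPLUG_CPU */
};

static int compute_possible(const struct possible_inputs *in)
{
	int limit = in->setup_max_cpus ? in->setup_max_cpus : 1;
	int possible;

	assert(in != 0);

	if (in->setup_possible_cpus == -1) {
		possible = in->num_processors;
		if (in->hotplug) {
			/* Reserve room for disabled CPUs unless maxcpus=0 */
			if (in->setup_max_cpus)
				possible += in->disabled_cpus;
		} else if (possible > limit) {
			possible = limit;
		}
	} else {
		possible = in->setup_possible_cpus;
	}

	/* nr_cpu_ids could be reduced via nr_cpus= */
	if (possible > in->nr_cpu_ids)
		possible = in->nr_cpu_ids;

	/* With hotplug enabled, maxcpus only caps the result when it is 0 */
	if ((!in->hotplug || !in->setup_max_cpus) && possible > limit)
		possible = limit;

	return possible;
}
```

For example, four present and two disabled CPUs with hotplug enabled and no
command line restrictions yield six possible CPUs, while the same system
booted with nr_cpus=5 is capped at five.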
* [patch 10/30] x86/cpu/topology: Simplify APIC registration
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (8 preceding siblings ...)
2024-02-13 21:05 ` [patch 09/30] x86/cpu/topology: Confine topology information Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 11/30] x86/cpu/topology: Use a data structure for topology info Thomas Gleixner
` (19 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Having the same check whether the number of assigned CPUs has reached the
nr_cpu_ids limit twice in the same code path is pointless. Repeating the
information that CPUs are ignored over and over is also pointless noise.
Remove the redundant check and reduce the noise by using a pr_warn_once().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 23 +++--------------------
1 file changed, 3 insertions(+), 20 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -107,14 +107,6 @@ static int allocate_logical_cpuid(u32 ap
if (cpu >= 0)
return cpu;
- /* Allocate a new cpuid. */
- if (nr_logical_cpuids >= nr_cpu_ids) {
- WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
- "Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apic_id);
- return -EINVAL;
- }
-
cpuid_to_apicid[nr_logical_cpuids] = apic_id;
return nr_logical_cpuids++;
}
@@ -135,7 +127,7 @@ static void cpu_update_apic(int cpu, u32
static int generic_processor_info(int apicid)
{
- int cpu, max = nr_cpu_ids;
+ int cpu;
/* The boot CPU must be set before MADT/MPTABLE parsing happens */
if (cpuid_to_apicid[0] == BAD_APICID)
@@ -155,21 +147,12 @@ static int generic_processor_info(int ap
}
if (num_processors >= nr_cpu_ids) {
- int thiscpu = max + disabled_cpus;
-
- pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
- "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
-
+ pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
disabled_cpus++;
- return -EINVAL;
+ return -ENOSPC;
}
cpu = allocate_logical_cpuid(apicid);
- if (cpu < 0) {
- disabled_cpus++;
- return -EINVAL;
- }
-
cpu_update_apic(cpu, apicid);
return cpu;
}
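Note on the noise reduction: after this change the limit check happens in
one place and the message is emitted once, while every rejected CPU is
still accounted. A minimal user-space sketch, assuming a simplified counter
model: warn_once() is a hypothetical approximation of pr_warn_once(), and
register_cpu() is a stand-in which folds the CPU number allocation into the
same function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int warnings_printed; /* instrumentation for this sketch only */

/* Approximation of pr_warn_once(): at most one message per call site. */
#define warn_once(...)					\
	do {						\
		static bool warned;			\
		if (!warned) {				\
			warned = true;			\
			warnings_printed++;		\
			fprintf(stderr, __VA_ARGS__);	\
		}					\
	} while (0)

static unsigned int num_processors = 1; /* slot 0 is taken by the boot CPU */
static unsigned int disabled_cpus;

/* Returns the assigned CPU number, or -1 when the limit is exhausted. */
static int register_cpu(unsigned int nr_cpu_ids)
{
	if (num_processors >= nr_cpu_ids) {
		warn_once("CPU limit of %u reached. Ignoring further CPUs\n", nr_cpu_ids);
		disabled_cpus++;
		return -1;
	}
	return (int)num_processors++;
}
```

Registering three CPUs against a limit of two assigns CPU 1, rejects the
other two, prints a single warning and still counts both rejections.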
* [patch 11/30] x86/cpu/topology: Use a data structure for topology info
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (9 preceding siblings ...)
2024-02-13 21:05 ` [patch 10/30] x86/cpu/topology: Simplify APIC registration Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 12/30] x86/smpboot: Make error message actually useful Thomas Gleixner
` (18 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Put the processor accounting into a data structure, which will gain more
topology related information in the next steps, and sanitize the accounting.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 59 ++++++++++++++++++++---------------------
1 file changed, 29 insertions(+), 30 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -23,25 +23,24 @@ DECLARE_BITMAP(phys_cpu_present_map, MAX
u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
/*
+ * Keep track of assigned, disabled and rejected CPUs. Preset assigned
+ * with 1 as CPU #0 is reserved for the boot CPU.
+ */
+static struct {
+ unsigned int nr_assigned_cpus;
+ unsigned int nr_disabled_cpus;
+ unsigned int nr_rejected_cpus;
+} topo_info __read_mostly = {
+ .nr_assigned_cpus = 1,
+};
+
+/*
* Processor to be disabled specified by kernel parameter
* disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
* avoid undefined behaviour caused by sending INIT from AP to BSP.
*/
static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-static unsigned int num_processors;
-static unsigned int disabled_cpus;
-
-/*
- * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
- * contiguously, it equals to current allocated max logical CPU ID plus 1.
- * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
- * so the maximum of nr_logical_cpuids is nr_cpu_ids.
- *
- * NOTE: Reserve 0 for BSP.
- */
-static int nr_logical_cpuids = 1;
-
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -75,7 +74,7 @@ static int __init smp_init_primary_threa
return 0;
}
- for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
+ for (cpu = 0; cpu < topo_info.nr_assigned_cpus; cpu++)
cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
return 0;
}
@@ -89,7 +88,7 @@ static int topo_lookup_cpuid(u32 apic_id
int i;
/* CPU# to APICID mapping is persistent once it is established */
- for (i = 0; i < nr_logical_cpuids; i++) {
+ for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
if (cpuid_to_apicid[i] == apic_id)
return i;
}
@@ -107,22 +106,21 @@ static int allocate_logical_cpuid(u32 ap
if (cpu >= 0)
return cpu;
- cpuid_to_apicid[nr_logical_cpuids] = apic_id;
- return nr_logical_cpuids++;
+ return topo_info.nr_assigned_cpus++;
}
-static void cpu_update_apic(int cpu, u32 apicid)
+static void cpu_update_apic(unsigned int cpu, u32 apic_id)
{
#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
#endif
+ cpuid_to_apicid[cpu] = apic_id;
set_cpu_possible(cpu, true);
- set_bit(apicid, phys_cpu_present_map);
+ set_bit(apic_id, phys_cpu_present_map);
set_cpu_present(cpu, true);
- num_processors++;
if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apicid);
+ cpu_mark_primary_thread(cpu, apic_id);
}
static int generic_processor_info(int apicid)
@@ -137,18 +135,18 @@ static int generic_processor_info(int ap
return 0;
if (disabled_cpu_apicid == apicid) {
- int thiscpu = num_processors + disabled_cpus;
+ int thiscpu = topo_info.nr_assigned_cpus + topo_info.nr_disabled_cpus;
pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
thiscpu, apicid);
- disabled_cpus++;
+ topo_info.nr_rejected_cpus++;
return -ENODEV;
}
- if (num_processors >= nr_cpu_ids) {
+ if (topo_info.nr_assigned_cpus >= nr_cpu_ids) {
pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
- disabled_cpus++;
+ topo_info.nr_rejected_cpus++;
return -ENOSPC;
}
@@ -178,14 +176,16 @@ static int __initdata setup_possible_cpu
*/
__init void prefill_possible_map(void)
{
+ unsigned int num_processors = topo_info.nr_assigned_cpus;
+ unsigned int disabled_cpus = topo_info.nr_disabled_cpus;
int i, possible;
i = setup_max_cpus ?: 1;
if (setup_possible_cpus == -1) {
- possible = num_processors;
+ possible = topo_info.nr_assigned_cpus;
#ifdef CONFIG_HOTPLUG_CPU
if (setup_max_cpus)
- possible += disabled_cpus;
+ possible += topo_info.nr_disabled_cpus;
#else
if (possible > i)
possible = i;
@@ -238,7 +238,7 @@ void __init topology_register_apic(u32 a
}
if (!present) {
- disabled_cpus++;
+ topo_info.nr_disabled_cpus++;
return;
}
@@ -295,7 +295,6 @@ void topology_hotunplug_apic(unsigned in
per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
clear_bit(apic_id, phys_cpu_present_map);
set_cpu_present(cpu, false);
- num_processors--;
}
#endif
* [patch 12/30] x86/smpboot: Make error message actually useful
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (10 preceding siblings ...)
2024-02-13 21:05 ` [patch 11/30] x86/cpu/topology: Use a data structure for topology info Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 13/30] x86/cpu/topology: Sanitize the APIC admission logic Thomas Gleixner
` (17 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
"smpboot: native_kick_ap: bad cpu 33" is absolutely useless information.
Replace it with something meaningful which allows decoding the failure
condition.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/smpboot.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
---
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1072,9 +1072,13 @@ int native_kick_ap(unsigned int cpu, str
pr_debug("++++++++++++++++++++=_---CPU UP %u\n", cpu);
- if (apicid == BAD_APICID || !test_bit(apicid, phys_cpu_present_map) ||
- !apic_id_valid(apicid)) {
- pr_err("%s: bad cpu %d\n", __func__, cpu);
+ if (apicid == BAD_APICID || !apic_id_valid(apicid)) {
+ pr_err("CPU %u has invalid APIC ID %x. Aborting bringup\n", cpu, apicid);
+ return -EINVAL;
+ }
+
+ if (!test_bit(apicid, phys_cpu_present_map)) {
+ pr_err("CPU %u APIC ID %x is not present. Aborting bringup\n", cpu, apicid);
return -EINVAL;
}
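Note on the split check: the single "bad cpu" condition covered two
different failure classes, an unusable APIC ID and a valid APIC ID which
was never registered as present. A minimal user-space sketch of that
distinction follows. check_kick(), enum kick_status, the 64-bit present_map
and MAX_APICS are hypothetical simplifications; the kernel uses
apic_id_valid() and the phys_cpu_present_map bitmap instead.

```c
#include <assert.h>
#include <stdint.h>

#define BAD_APICID	0xFFFFFFFFu
#define MAX_APICS	64u /* illustrative bound, not the kernel's MAX_LOCAL_APIC */

static_assert(MAX_APICS <= 64, "present_map is a single 64-bit word");

enum kick_status { KICK_OK, KICK_INVALID_APIC_ID, KICK_NOT_PRESENT };

static uint64_t present_map; /* bit N set: APIC ID N was registered as present */

static enum kick_status check_kick(uint32_t apicid)
{
	/* First failure class: the ID itself is unusable */
	if (apicid == BAD_APICID || apicid >= MAX_APICS)
		return KICK_INVALID_APIC_ID;

	/* Second failure class: a valid ID which was never registered */
	if (!(present_map & (UINT64_C(1) << apicid)))
		return KICK_NOT_PRESENT;

	return KICK_OK;
}
```

Keeping the two results apart is what allows the two distinct error
messages in the patch.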
* [patch 13/30] x86/cpu/topology: Sanitize the APIC admission logic
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (11 preceding siblings ...)
2024-02-13 21:05 ` [patch 12/30] x86/smpboot: Make error message actually useful Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 14/30] x86/cpu/topology: Rework possible CPU management Thomas Gleixner
` (16 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Move the actually required content of generic_processor_info() into the
call sites and use common helper functions for them. This separates the
early boot registration and the ACPI hotplug mechanism completely, which
allows further cleanups and improvements.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3:
Check for disabled APIC early and exclude the boot APIC ID from
the CPUNR exhaustion check. - Sohil, Michael
Rename topo_assign_cpunr() to topo_get_cpunr() as the assignment
happens elsewhere - Arjan
---
arch/x86/kernel/cpu/topology.c | 159 +++++++++++++++++++----------------------
1 file changed, 77 insertions(+), 82 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -30,8 +30,10 @@ static struct {
unsigned int nr_assigned_cpus;
unsigned int nr_disabled_cpus;
unsigned int nr_rejected_cpus;
+ u32 boot_cpu_apic_id;
} topo_info __read_mostly = {
.nr_assigned_cpus = 1,
+ .boot_cpu_apic_id = BAD_APICID,
};
/*
@@ -83,78 +85,6 @@ early_initcall(smp_init_primary_thread_m
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-static int topo_lookup_cpuid(u32 apic_id)
-{
- int i;
-
- /* CPU# to APICID mapping is persistent once it is established */
- for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
- if (cpuid_to_apicid[i] == apic_id)
- return i;
- }
- return -ENODEV;
-}
-
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(u32 apic_id)
-{
- int cpu = topo_lookup_cpuid(apic_id);
-
- if (cpu >= 0)
- return cpu;
-
- return topo_info.nr_assigned_cpus++;
-}
-
-static void cpu_update_apic(unsigned int cpu, u32 apic_id)
-{
-#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
-#endif
- cpuid_to_apicid[cpu] = apic_id;
- set_cpu_possible(cpu, true);
- set_bit(apic_id, phys_cpu_present_map);
- set_cpu_present(cpu, true);
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apic_id);
-}
-
-static int generic_processor_info(int apicid)
-{
- int cpu;
-
- /* The boot CPU must be set before MADT/MPTABLE parsing happens */
- if (cpuid_to_apicid[0] == BAD_APICID)
- panic("Boot CPU APIC not registered yet\n");
-
- if (apicid == boot_cpu_physical_apicid)
- return 0;
-
- if (disabled_cpu_apicid == apicid) {
- int thiscpu = topo_info.nr_assigned_cpus + topo_info.nr_disabled_cpus;
-
- pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
- thiscpu, apicid);
-
- topo_info.nr_rejected_cpus++;
- return -ENODEV;
- }
-
- if (topo_info.nr_assigned_cpus >= nr_cpu_ids) {
- pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
- topo_info.nr_rejected_cpus++;
- return -ENOSPC;
- }
-
- cpu = allocate_logical_cpuid(apicid);
- cpu_update_apic(cpu, apicid);
- return cpu;
-}
-
static int __initdata setup_possible_cpus = -1;
/*
@@ -222,6 +152,43 @@ static int __initdata setup_possible_cpu
set_cpu_possible(i, true);
}
+static int topo_lookup_cpuid(u32 apic_id)
+{
+ int i;
+
+ /* CPU# to APICID mapping is persistent once it is established */
+ for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
+ if (cpuid_to_apicid[i] == apic_id)
+ return i;
+ }
+ return -ENODEV;
+}
+
+static int topo_get_cpunr(u32 apic_id)
+{
+ int cpu = topo_lookup_cpuid(apic_id);
+
+ if (cpu >= 0)
+ return cpu;
+
+ return topo_info.nr_assigned_cpus++;
+}
+
+static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
+{
+#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
+ early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+#endif
+ cpuid_to_apicid[cpu] = apic_id;
+
+ set_cpu_possible(cpu, true);
+ set_cpu_present(cpu, true);
+
+ if (system_state != SYSTEM_BOOTING)
+ cpu_mark_primary_thread(cpu, apic_id);
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -234,17 +201,40 @@ void __init topology_register_apic(u32 a
if (apic_id >= MAX_LOCAL_APIC) {
pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
+ topo_info.nr_rejected_cpus++;
return;
}
- if (!present) {
- topo_info.nr_disabled_cpus++;
+ if (disabled_cpu_apicid == apic_id) {
+ pr_info("Disabling CPU as requested via 'disable_cpu_apicid=0x%x'.\n", apic_id);
+ topo_info.nr_rejected_cpus++;
return;
}
- cpu = generic_processor_info(apic_id);
- if (cpu >= 0)
- early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ /* CPU numbers exhausted? */
+ if (apic_id != topo_info.boot_cpu_apic_id && topo_info.nr_assigned_cpus >= nr_cpu_ids) {
+ pr_warn_once("CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
+ topo_info.nr_rejected_cpus++;
+ return;
+ }
+
+ if (present) {
+ set_bit(apic_id, phys_cpu_present_map);
+
+ /*
+ * Double registration is valid in case of the boot CPU
+ * APIC because that is registered before the enumeration
+ * of the APICs via firmware parsers or VM guest
+ * mechanisms.
+ */
+ if (apic_id == topo_info.boot_cpu_apic_id)
+ cpu = 0;
+ else
+ cpu = topo_get_cpunr(apic_id);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
+ } else {
+ topo_info.nr_disabled_cpus++;
+ }
}
/**
@@ -255,8 +245,10 @@ void __init topology_register_apic(u32 a
*/
void __init topology_register_boot_apic(u32 apic_id)
{
- cpuid_to_apicid[0] = apic_id;
- cpu_update_apic(0, apic_id);
+ WARN_ON_ONCE(topo_info.boot_cpu_apic_id != BAD_APICID);
+
+ topo_info.boot_cpu_apic_id = apic_id;
+ topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
#ifdef CONFIG_ACPI_HOTPLUG_CPU
@@ -274,10 +266,13 @@ int topology_hotplug_apic(u32 apic_id, u
cpu = topo_lookup_cpuid(apic_id);
if (cpu < 0) {
- cpu = generic_processor_info(apic_id);
- if (cpu >= 0)
- per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
+ return -ENOSPC;
+
+ cpu = topo_assign_cpunr(apic_id);
}
+ set_bit(apic_id, phys_cpu_present_map);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
return cpu;
}
* [patch 14/30] x86/cpu/topology: Rework possible CPU management
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (12 preceding siblings ...)
2024-02-13 21:05 ` [patch 13/30] x86/cpu/topology: Sanitize the APIC admission logic Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 15/30] x86/cpu: Detect real BSP on crash kernels Thomas Gleixner
` (15 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Managing possible CPUs is an unreadable and incomprehensible maze. Aside from
that it's backwards because it applies command line limits after
registering all APICs.
Rewrite it so that it:
- Applies the command line limits upfront so that only the allowed number
of APIC IDs can be registered.
- Applies possible late restrictions in an understandable way
- Uses simple min_t() calculations which are trivial to follow.
- Provides a separate function for resetting to UP mode late in the
bringup process.
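For illustration only, and not part of the patch, the two sizing steps
can be sketched as plain userspace C. The names early_limit(),
late_sizing(), min_uint() and struct cpu_sizing are made up for this
sketch, and the force_up/restrict_to_up flags are assumed to summarize
the command line and APIC state checks:

```c
#include <assert.h>

/* Hypothetical result container for this sketch; not a kernel structure. */
struct cpu_sizing {
	unsigned int allowed;	/* size of the possible map */
	unsigned int assigned;	/* present CPUs which keep their number */
	unsigned int disabled;	/* slots left over for later hotplug */
};

unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Step 1: clamp the CPU number space before any APIC is registered. */
unsigned int early_limit(unsigned int nr_cpu_ids, unsigned int max_possible,
			 int force_up)
{
	unsigned int possible = force_up ? 1 : nr_cpu_ids;

	assert(nr_cpu_ids >= 1);
	return min_uint(possible, max_possible);
}

/* Step 2: after enumeration, size the maps from the registered counts. */
struct cpu_sizing late_sizing(unsigned int assigned, unsigned int disabled,
			      unsigned int nr_cpu_ids, int restrict_to_up)
{
	unsigned int total = assigned + disabled;
	struct cpu_sizing s;

	s.allowed = restrict_to_up ? 1 : min_uint(total, nr_cpu_ids);
	s.assigned = min_uint(s.allowed, assigned);
	s.disabled = s.allowed - s.assigned;
	return s;
}
```

With four present and four disabled CPUs and a limit of six, this yields
six possible CPUs, of which two remain available for hotplug.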
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/apic.h | 5 +
arch/x86/include/asm/cpu.h | 10 --
arch/x86/include/asm/topology.h | 1
arch/x86/kernel/cpu/topology.c | 176 ++++++++++++++++++++++++----------------
arch/x86/kernel/setup.c | 9 --
arch/x86/kernel/smpboot.c | 6 -
6 files changed, 118 insertions(+), 89 deletions(-)
---
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -175,6 +175,9 @@ extern void topology_register_apic(u32 a
extern void topology_register_boot_apic(u32 apic_id);
extern int topology_hotplug_apic(u32 apic_id, u32 acpi_id);
extern void topology_hotunplug_apic(unsigned int cpu);
+extern void topology_apply_cmdline_limits_early(void);
+extern void topology_init_possible_cpus(void);
+extern void topology_reset_possible_cpus_up(void);
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
@@ -190,6 +193,8 @@ static inline void apic_intr_mode_init(v
static inline void lapic_assign_system_vectors(void) { }
static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
static inline bool apic_needs_pit(void) { return true; }
+static inline void topology_apply_cmdline_limits_early(void) { }
+static inline void topology_init_possible_cpus(void) { }
#endif /* !CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_X2APIC
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -9,18 +9,10 @@
#include <linux/percpu.h>
#include <asm/ibt.h>
-#ifdef CONFIG_SMP
-
-extern void prefill_possible_map(void);
-
-#else /* CONFIG_SMP */
-
-static inline void prefill_possible_map(void) {}
-
+#ifndef CONFIG_SMP
#define cpu_physical_id(cpu) boot_cpu_physical_apicid
#define cpu_acpi_id(cpu) 0
#define safe_smp_processor_id() 0
-
#endif /* CONFIG_SMP */
#ifdef CONFIG_HOTPLUG_CPU
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -191,6 +191,7 @@ static inline bool topology_is_primary_t
{
return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
}
+
#else /* CONFIG_SMP */
#define topology_max_packages() (1)
static inline int
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -5,6 +5,7 @@
#include <xen/xen.h>
#include <asm/apic.h>
+#include <asm/io_apic.h>
#include <asm/mpspec.h>
#include <asm/smp.h>
@@ -85,73 +86,6 @@ early_initcall(smp_init_primary_thread_m
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-static int __initdata setup_possible_cpus = -1;
-
-/*
- * cpu_possible_mask should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and don't expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_mask on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * - Ashok Raj
- *
- * Three ways to find out the number of additional hotplug CPUs:
- * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
- * - The user can overwrite it with possible_cpus=NUM
- * - Otherwise don't reserve additional CPUs.
- * We do this because additional CPUs waste a lot of memory.
- * -AK
- */
-__init void prefill_possible_map(void)
-{
- unsigned int num_processors = topo_info.nr_assigned_cpus;
- unsigned int disabled_cpus = topo_info.nr_disabled_cpus;
- int i, possible;
-
- i = setup_max_cpus ?: 1;
- if (setup_possible_cpus == -1) {
- possible = topo_info.nr_assigned_cpus;
-#ifdef CONFIG_HOTPLUG_CPU
- if (setup_max_cpus)
- possible += num_processors;
-#else
- if (possible > i)
- possible = i;
-#endif
- } else
- possible = setup_possible_cpus;
-
- total_cpus = max_t(int, possible, num_processors + disabled_cpus);
-
- /* nr_cpu_ids could be reduced via nr_cpus= */
- if (possible > nr_cpu_ids) {
- pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
- possible, nr_cpu_ids);
- possible = nr_cpu_ids;
- }
-
-#ifdef CONFIG_HOTPLUG_CPU
- if (!setup_max_cpus)
-#endif
- if (possible > i) {
- pr_warn("%d Processors exceeds max_cpus limit of %u\n",
- possible, setup_max_cpus);
- possible = i;
- }
-
- set_nr_cpu_ids(possible);
-
- pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
- possible, max_t(int, possible - num_processors, 0));
-
- reset_cpu_possible_mask();
-
- for (i = 0; i < possible; i++)
- set_cpu_possible(i, true);
-}
-
static int topo_lookup_cpuid(u32 apic_id)
{
int i;
@@ -293,12 +227,114 @@ void topology_hotunplug_apic(unsigned in
}
#endif
-static int __init _setup_possible_cpus(char *str)
+#ifdef CONFIG_SMP
+static unsigned int max_possible_cpus __initdata = NR_CPUS;
+
+/**
+ * topology_apply_cmdline_limits_early - Apply topology command line limits early
+ *
+ * Ensure that command line limits are in effect before firmware parsing
+ * takes place.
+ */
+void __init topology_apply_cmdline_limits_early(void)
+{
+ unsigned int possible = nr_cpu_ids;
+
+ /* 'maxcpus=0' 'nosmp' 'nolapic' 'disableapic' 'noapic' */
+ if (!setup_max_cpus || ioapic_is_disabled || apic_is_disabled)
+ possible = 1;
+
+ /* 'possible_cpus=N' */
+ possible = min_t(unsigned int, max_possible_cpus, possible);
+
+ if (possible < nr_cpu_ids) {
+ pr_info("Limiting to %u possible CPUs\n", possible);
+ set_nr_cpu_ids(possible);
+ }
+}
+
+static __init bool restrict_to_up(void)
+{
+ if (!smp_found_config || ioapic_is_disabled)
+ return true;
+ /*
+ * XEN PV is special as it does not advertise the local APIC
+ * properly, but provides a fake topology for it so that the
+ * infrastructure works. So don't apply the restrictions vs. APIC
+ * here.
+ */
+ if (xen_pv_domain())
+ return false;
+
+ return apic_is_disabled;
+}
+
+void __init topology_init_possible_cpus(void)
+{
+ unsigned int assigned = topo_info.nr_assigned_cpus;
+ unsigned int disabled = topo_info.nr_disabled_cpus;
+ unsigned int total = assigned + disabled;
+ unsigned int cpu, allowed = 1;
+
+ if (!restrict_to_up()) {
+ if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
+ disabled += assigned - nr_cpu_ids;
+ assigned = nr_cpu_ids;
+ }
+ allowed = min_t(unsigned int, total, nr_cpu_ids);
+ }
+
+ if (total > allowed)
+ pr_warn("%u possible CPUs exceed the limit of %u\n", total, allowed);
+
+ assigned = min_t(unsigned int, allowed, assigned);
+ disabled = allowed - assigned;
+
+ topo_info.nr_assigned_cpus = assigned;
+ topo_info.nr_disabled_cpus = disabled;
+
+ total_cpus = allowed;
+ set_nr_cpu_ids(allowed);
+
+ pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
+ if (topo_info.nr_rejected_cpus)
+ pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
+
+ init_cpu_present(cpumask_of(0));
+ init_cpu_possible(cpumask_of(0));
+
+ for (cpu = 0; cpu < allowed; cpu++) {
+ u32 apicid = cpuid_to_apicid[cpu];
+
+ set_cpu_possible(cpu, true);
+
+ if (apicid == BAD_APICID)
+ continue;
+
+ set_cpu_present(cpu, test_bit(apicid, phys_cpu_present_map));
+ }
+}
+
+/*
+ * Late SMP disable after sizing CPU masks when APIC/IOAPIC setup failed.
+ */
+void __init topology_reset_possible_cpus_up(void)
{
- get_option(&str, &setup_possible_cpus);
+ init_cpu_present(cpumask_of(0));
+ init_cpu_possible(cpumask_of(0));
+
+ bitmap_zero(phys_cpu_present_map, MAX_LOCAL_APIC);
+ if (topo_info.boot_cpu_apic_id != BAD_APICID)
+ set_bit(topo_info.boot_cpu_apic_id, phys_cpu_present_map);
+}
+
+static int __init setup_possible_cpus(char *str)
+{
+ get_option(&str, &max_possible_cpus);
return 0;
}
-early_param("possible_cpus", _setup_possible_cpus);
+early_param("possible_cpus", setup_possible_cpus);
+#endif
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1131,6 +1131,8 @@ void __init setup_arch(char **cmdline_p)
early_quirks();
+ topology_apply_cmdline_limits_early();
+
/*
* Parse SMP configuration. Try ACPI first and then the platform
* specific parser.
@@ -1138,13 +1140,10 @@ void __init setup_arch(char **cmdline_p)
acpi_boot_init();
x86_init.mpparse.parse_smp_cfg();
- /*
- * Systems w/o ACPI and mptables might not have it mapped the local
- * APIC yet, but prefill_possible_map() might need to access it.
- */
+ /* Last opportunity to detect and map the local APIC */
init_apic_mappings();
- prefill_possible_map();
+ topology_init_possible_cpus();
init_cpu_to_node();
init_gi_nodes();
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1147,11 +1147,7 @@ static __init void disable_smp(void)
pr_info("SMP disabled\n");
disable_ioapic_support();
-
- init_cpu_present(cpumask_of(0));
- init_cpu_possible(cpumask_of(0));
-
- reset_phys_cpu_present_map(smp_found_config ? boot_cpu_physical_apicid : 0);
+ topology_reset_possible_cpus_up();
cpumask_set_cpu(0, topology_sibling_cpumask(0));
cpumask_set_cpu(0, topology_core_cpumask(0));
* [patch 15/30] x86/cpu: Detect real BSP on crash kernels
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (13 preceding siblings ...)
2024-02-13 21:05 ` [patch 14/30] x86/cpu/topology: Rework possible CPU management Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 16/30] x86/topology: Add a mechanism to track topology via APIC IDs Thomas Gleixner
` (14 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
When a kdump kernel is started from a crashing CPU then there is no
guarantee that this CPU is the real boot CPU (BSP). If the kdump kernel
tries to online the BSP then the INIT sequence will reset the machine.
There is a command line option to prevent this, but in case of nested kdump
kernels this is wrong.
But that command line option is not required at all because the real
BSP is enumerated as the first CPU by firmware. Support for the only
known system which was different (Voyager) got removed long ago.
Detect whether the boot CPU APIC ID is the first APIC ID enumerated by
the firmware. If the first enumerated APIC ID does not match the boot
CPU APIC ID then skip registering it.
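The detection rule can be sketched in userspace C as follows. This is a
simplified model under the assumption that firmware enumerates the real
BSP first; struct bsp_state, must_skip_real_bsp() and NO_APIC_ID are
hypothetical names standing in for the kernel's state and BAD_APICID:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_APIC_ID UINT32_MAX	/* stand-in for the kernel's BAD_APICID */

struct bsp_state {
	uint32_t boot_cpu_apic_id;	/* APIC ID of the CPU running this kernel */
	uint32_t real_bsp_apic_id;	/* first APIC ID enumerated by firmware */
};

/*
 * Returns true when @apic_id has to be skipped because it belongs to the
 * real BSP while this kernel was booted on a different CPU. Only the
 * first enumerated APIC ID is inspected; later calls return false.
 */
bool must_skip_real_bsp(struct bsp_state *st, uint32_t apic_id)
{
	/* The boot CPU APIC ID has to be known before enumeration starts. */
	assert(st->boot_cpu_apic_id != NO_APIC_ID);

	if (st->real_bsp_apic_id != NO_APIC_ID)
		return false;		/* The decision was made already */

	st->real_bsp_apic_id = apic_id;
	return apic_id != st->boot_cpu_apic_id;
}
```

In a regular boot the first enumerated APIC ID equals the boot CPU APIC
ID and nothing is skipped. In a crash kernel started on another CPU, the
first enumerated APIC ID differs and is the one which gets rejected.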
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: Adapt to prior ordering and boot APIC exclusion changes
V2: Check for the first enumerated APIC ID (Rui)
---
Documentation/admin-guide/kdump/kdump.rst | 7 -
Documentation/admin-guide/kernel-parameters.txt | 9 --
arch/x86/kernel/cpu/topology.c | 97 ++++++++++++++----------
3 files changed, 61 insertions(+), 52 deletions(-)
---
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -191,9 +191,7 @@ Dump-capture kernel config options (Arch
CPU is enough for kdump kernel to dump vmcore on most of systems.
However, you can also specify nr_cpus=X to enable multiple processors
- in kdump kernel. In this case, "disable_cpu_apicid=" is needed to
- tell kdump kernel which cpu is 1st kernel's BSP. Please refer to
- admin-guide/kernel-parameters.txt for more details.
+ in kdump kernel.
With CONFIG_SMP=n, the above things are not related.
@@ -454,8 +452,7 @@ loading dump-capture kernel.
to use multi-thread programs with it, such as parallel dump feature of
makedumpfile. Otherwise, the multi-thread program may have a great
performance degradation. To enable multi-cpu support, you should bring up an
- SMP dump-capture kernel and specify maxcpus/nr_cpus, disable_cpu_apicid=[X]
- options while loading it.
+ SMP dump-capture kernel and specify maxcpus/nr_cpus options while loading it.
* For s390x there are two kdump modes: If a ELF header is specified with
the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1100,15 +1100,6 @@
Disable TLBIE instruction. Currently does not work
with KVM, with HASH MMU, or with coherent accelerators.
- disable_cpu_apicid= [X86,APIC,SMP]
- Format: <int>
- The number of initial APIC ID for the
- corresponding CPU to be disabled at boot,
- mostly used for the kdump 2nd kernel to
- disable BSP to wake up multiple CPUs without
- causing system reset or hang due to sending
- INIT from AP to BSP.
-
disable_ddw [PPC/PSERIES]
Disable Dynamic DMA Window support. Use this
to workaround buggy firmware.
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -32,18 +32,13 @@ static struct {
unsigned int nr_disabled_cpus;
unsigned int nr_rejected_cpus;
u32 boot_cpu_apic_id;
+ u32 real_bsp_apic_id;
} topo_info __read_mostly = {
.nr_assigned_cpus = 1,
.boot_cpu_apic_id = BAD_APICID,
+ .real_bsp_apic_id = BAD_APICID,
};
-/*
- * Processor to be disabled specified by kernel parameter
- * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
- * avoid undefined behaviour caused by sending INIT from AP to BSP.
- */
-static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -123,6 +118,60 @@ static void topo_set_cpuids(unsigned int
cpu_mark_primary_thread(cpu, apic_id);
}
+static __init bool check_for_real_bsp(u32 apic_id)
+{
+ /*
+ * There is no really good way to detect whether this is a kdump
+ * kernel, but except on the Voyager SMP monstrosity which is no
+ * longer supported, the real BSP APIC ID is the first one which is
+ * enumerated by firmware. That allows detecting whether the boot
+ * CPU is the real BSP. If it is not, then do not register the APIC
+ * because sending INIT to the real BSP would reset the whole
+ * system.
+ *
+ * The first APIC ID which is enumerated by firmware is detectable
+ * because the boot CPU APIC ID is registered before that without
+ * invoking this code.
+ */
+ if (topo_info.real_bsp_apic_id != BAD_APICID)
+ return false;
+
+ if (apic_id == topo_info.boot_cpu_apic_id) {
+ topo_info.real_bsp_apic_id = apic_id;
+ return false;
+ }
+
+ pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x > %x\n",
+ topo_info.boot_cpu_apic_id, apic_id);
+ pr_warn("Crash kernel detected. Disabling real BSP to prevent machine INIT\n");
+
+ topo_info.real_bsp_apic_id = apic_id;
+ return true;
+}
+
+static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
+{
+ int cpu;
+
+ if (present) {
+ set_bit(apic_id, phys_cpu_present_map);
+
+ /*
+ * Double registration is valid in case of the boot CPU
+ * APIC because that is registered before the enumeration
+ * of the APICs via firmware parsers or VM guest
+ * mechanisms.
+ */
+ if (apic_id == topo_info.boot_cpu_apic_id)
+ cpu = 0;
+ else
+ cpu = topo_get_cpunr(apic_id);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
+ } else {
+ topo_info.nr_disabled_cpus++;
+ }
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -131,16 +180,13 @@ static void topo_set_cpuids(unsigned int
*/
void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
-
if (apic_id >= MAX_LOCAL_APIC) {
pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
topo_info.nr_rejected_cpus++;
return;
}
- if (disabled_cpu_apicid == apic_id) {
- pr_info("Disabling CPU as requested via 'disable_cpu_apicid=0x%x'.\n", apic_id);
+ if (check_for_real_bsp(apic_id)) {
topo_info.nr_rejected_cpus++;
return;
}
@@ -152,23 +198,7 @@ void __init topology_register_apic(u32 a
return;
}
- if (present) {
- set_bit(apic_id, phys_cpu_present_map);
-
- /*
- * Double registration is valid in case of the boot CPU
- * APIC because that is registered before the enumeration
- * of the APICs via firmware parsers or VM guest
- * mechanisms.
- */
- if (apic_id == topo_info.boot_cpu_apic_id)
- cpu = 0;
- else
- cpu = topo_get_cpunr(apic_id);
- topo_set_cpuids(cpu, apic_id, acpi_id);
- } else {
- topo_info.nr_disabled_cpus++;
- }
+ topo_register_apic(apic_id, acpi_id, present);
}
/**
@@ -182,7 +212,7 @@ void __init topology_register_boot_apic(
WARN_ON_ONCE(topo_info.boot_cpu_apic_id != BAD_APICID);
topo_info.boot_cpu_apic_id = apic_id;
- topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
+ topo_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
#ifdef CONFIG_ACPI_HOTPLUG_CPU
@@ -335,12 +365,3 @@ static int __init setup_possible_cpus(ch
}
early_param("possible_cpus", setup_possible_cpus);
#endif
-
-static int __init apic_set_disabled_cpu_apicid(char *arg)
-{
- if (!arg || !get_option(&arg, &disabled_cpu_apicid))
- return -EINVAL;
-
- return 0;
-}
-early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
* [patch 16/30] x86/topology: Add a mechanism to track topology via APIC IDs
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (14 preceding siblings ...)
2024-02-13 21:05 ` [patch 15/30] x86/cpu: Detect real BSP on crash kernels Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug Thomas Gleixner
` (13 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Topology on X86 is determined by the registered APIC IDs and the
segmentation information retrieved from CPUID. Depending on the granularity
of the provided CPUID information the most fine grained scheme looks like
this according to Intel terminology:
[PKG][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]
Not enumerated domain levels consume 0 bits in the APIC ID. This allows
providing a consistent view of the topology and determining other
information precisely, like the number of cores in a package on hybrid
systems, where the existing assumption that number of cores == number of
threads / threads per core does not hold.
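As a hedged illustration of the segmentation, the ID of an APIC at a
given domain level can be derived by masking out the bits of all lower
levels. The sketch below uses a reduced, hypothetical set of three
levels; enum domain, domain_id() and the shifts array are assumptions of
this example and not the kernel's definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Domain levels from lowest to highest; reduced set for this sketch. */
enum domain { DOM_THREAD, DOM_CORE, DOM_PKG, DOM_MAX };

/*
 * shifts[d] is the number of low APIC ID bits consumed by level d and
 * all levels below it. A level which is not enumerated repeats the value
 * of the level below, i.e. it consumes 0 additional bits.
 */
uint32_t domain_id(uint32_t apic_id, enum domain dom, const unsigned int *shifts)
{
	if (dom == DOM_THREAD)
		return apic_id;

	assert(shifts[dom - 1] < 32);
	return apic_id & (UINT32_MAX << shifts[dom - 1]);
}
```

Assuming one thread bit and three core bits, APIC ID 0x13 maps to core ID
0x12 and package ID 0x10.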
Provide per domain level bitmaps which record the APIC ID split into the
domain levels to make later evaluation of domain level specific information
simple. This allows calculating e.g. the logical IDs without any extra
logic.
Contrary to the existing registration mechanism, this also records disabled
CPUs, which are subject to later hotplug. That's useful for boot time
sizing of package or die dependent allocations without using heuristics.
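A toy model of the per domain bitmaps, assuming APIC IDs below 64 so that
a single 64-bit word per level is sufficient. struct topo_maps,
register_apic() and this domain_weight() are illustrative stand-ins, not
the kernel implementation:

```c
#include <assert.h>
#include <stdint.h>

enum domain { DOM_THREAD, DOM_CORE, DOM_PKG, DOM_MAX };

#define MAX_APIC_IDS 64	/* toy limit: one 64-bit word per level */

struct topo_maps {
	unsigned int shifts[DOM_MAX];	/* low bits consumed up to each level */
	uint64_t map[DOM_MAX];		/* one bit per registered ID and level */
};

/* Record @apic_id at every level, masking out the lower level bits. */
void register_apic(struct topo_maps *t, uint32_t apic_id)
{
	assert(apic_id < MAX_APIC_IDS);

	for (int dom = DOM_THREAD; dom < DOM_MAX; dom++) {
		uint32_t id = apic_id;

		if (dom != DOM_THREAD) {
			assert(t->shifts[dom - 1] < 32);
			id &= UINT32_MAX << t->shifts[dom - 1];
		}
		t->map[dom] |= UINT64_C(1) << id;
	}
}

/* Number of distinct IDs which were registered at level @dom. */
unsigned int domain_weight(const struct topo_maps *t, enum domain dom)
{
	unsigned int weight = 0;

	for (uint64_t m = t->map[dom]; m; m &= m - 1)
		weight++;
	return weight;
}
```

Registering the APIC IDs 0, 1, 2, 3, 8, 10, 12 and 14 with one thread bit
models two SMT cores plus four single threaded cores. The weights are 8
threads and 6 cores, whereas threads / threads per core would give 4.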
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 48 +++++++++++++++++++++++++++++++++++++++--
1 file changed, 46 insertions(+), 2 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -1,5 +1,27 @@
// SPDX-License-Identifier: GPL-2.0-only
-
+/*
+ * CPU/APIC topology
+ *
+ * The APIC IDs describe the system topology in multiple domain levels.
+ * The CPUID topology parser provides the information which part of the
+ * APIC ID is associated to the individual levels:
+ *
+ * [PACKAGE][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]
+ *
+ * The root space contains the package (socket) IDs.
+ *
+ * Not enumerated levels consume 0 bits space, but conceptually they are
+ * always represented. If e.g. only CORE and THREAD levels are enumerated
+ * then the DIE, MODULE and TILE have the same physical ID as the PACKAGE.
+ *
+ * If SMT is not supported, then the THREAD domain is still used. It then
+ * has the same physical ID as the CORE domain and is the only child of
+ * the core domain.
+ *
+ * This allows a unified view on the system independent of the enumerated
+ * domain levels without requiring any conditionals in the code.
+ */
+#define pr_fmt(fmt) "CPU topo: " fmt
#include <linux/cpu.h>
#include <xen/xen.h>
@@ -9,6 +31,8 @@
#include <asm/mpspec.h>
#include <asm/smp.h>
+#include "cpu.h"
+
/*
* Map cpu index to physical APIC ID
*/
@@ -23,6 +47,9 @@ DECLARE_BITMAP(phys_cpu_present_map, MAX
/* Used for CPU number allocation and parallel CPU bringup */
u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+/* Bitmaps to mark registered APICs at each topology domain */
+static struct { DECLARE_BITMAP(map, MAX_LOCAL_APIC); } apic_maps[TOPO_MAX_DOMAIN] __ro_after_init;
+
/*
* Keep track of assigned, disabled and rejected CPUs. Present assigned
* with 1 as CPU #0 is reserved for the boot CPU.
@@ -39,6 +66,8 @@ static struct {
.real_bsp_apic_id = BAD_APICID,
};
+#define domain_weight(_dom) bitmap_weight(apic_maps[_dom].map, MAX_LOCAL_APIC)
+
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -81,6 +110,17 @@ early_initcall(smp_init_primary_thread_m
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
+/*
+ * Convert the APIC ID to a domain level ID by masking out the low bits
+ * below the domain level @dom.
+ */
+static inline u32 topo_apicid(u32 apicid, enum x86_topology_domains dom)
+{
+ if (dom == TOPO_SMT_DOMAIN)
+ return apicid;
+ return apicid & (UINT_MAX << x86_topo_system.dom_shifts[dom - 1]);
+}
+
static int topo_lookup_cpuid(u32 apic_id)
{
int i;
@@ -151,7 +191,7 @@ static __init bool check_for_real_bsp(u3
static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
+ int cpu, dom;
if (present) {
set_bit(apic_id, phys_cpu_present_map);
@@ -170,6 +210,10 @@ static __init void topo_register_apic(u3
} else {
topo_info.nr_disabled_cpus++;
}
+
+ /* Register present and possible CPUs in the domain maps */
+ for (dom = TOPO_SMT_DOMAIN; dom < TOPO_MAX_DOMAIN; dom++)
+ set_bit(topo_apicid(apic_id, dom), apic_maps[dom].map);
}
/**
* [patch 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (15 preceding siblings ...)
2024-02-13 21:05 ` [patch 16/30] x86/topology: Add a mechanism to track topology via APIC IDs Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init Thomas Gleixner
` (12 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
The topology bitmaps track all possible APIC IDs which have been registered
during enumeration. As sizing and further topology information is going to
be derived from these bitmaps, reject attempts to hotplug an APIC ID which
was not registered during enumeration.
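The admission check amounts to a bitmap lookup before anything else
happens. A minimal userspace sketch, assuming a toy bitmap of 64 APIC
IDs; hotplug_check() is a hypothetical helper and not a kernel function:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_APIC_IDS 64

static_assert(MAX_APIC_IDS <= 64, "the toy bitmap is a single 64-bit word");

/* Returns 0 when hotplug may proceed, a negative errno value otherwise. */
int hotplug_check(uint64_t registered_map, uint32_t apic_id)
{
	if (apic_id >= MAX_APIC_IDS)
		return -EINVAL;

	/* Reject APIC IDs which were not seen during enumeration */
	if (!(registered_map & (UINT64_C(1) << apic_id)))
		return -ENODEV;

	return 0;
}
```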
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 4 ++++
1 file changed, 4 insertions(+)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -272,6 +272,10 @@ int topology_hotplug_apic(u32 apic_id, u
if (apic_id >= MAX_LOCAL_APIC)
return -EINVAL;
+ /* Reject if the APIC ID was not registered during enumeration. */
+ if (!test_bit(apic_id, apic_maps[TOPO_SMT_DOMAIN].map))
+ return -ENODEV;
+
cpu = topo_lookup_cpuid(apic_id);
if (cpu < 0) {
if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
* [patch 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (16 preceding siblings ...)
2024-02-13 21:05 ` [patch 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug Thomas Gleixner
@ 2024-02-13 21:05 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 19/30] x86/xen/smp_pv: Count number of vCPUs early Thomas Gleixner
` (11 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:05 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
There is no point in assigning the CPU numbers during ACPI physical
hotplug. The number of possible hotplug CPUs is known when the possible map
is initialized, so the CPU numbers can be associated with the registered
non-present APIC IDs right there.
This allows putting more code into the __init section and making the
related data __ro_after_init.
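The init time assignment can be pictured as a walk over all APIC IDs
which are registered but not present. The following userspace sketch
assumes toy 64-bit bitmaps; assign_hotplug_cpus() and its parameters are
hypothetical and only model the idea:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_APIC_IDS 64
#define NO_APIC_ID UINT32_MAX	/* marks an unused slot in the CPU map */

/*
 * Hand out CPU numbers, starting at *nr_assigned, to APIC IDs which are
 * registered but not present. At most @disabled slots are filled.
 * Returns the number of slots which were filled.
 */
unsigned int assign_hotplug_cpus(uint32_t *cpu_to_apic, unsigned int capacity,
				 unsigned int *nr_assigned, unsigned int disabled,
				 uint64_t registered, uint64_t present)
{
	uint64_t candidates = registered & ~present;
	unsigned int filled = 0;

	/* The caller has to size the map for all possible CPUs. */
	assert(*nr_assigned + disabled <= capacity);

	for (uint32_t id = 0; id < MAX_APIC_IDS && filled < disabled; id++) {
		if (candidates & (UINT64_C(1) << id)) {
			cpu_to_apic[(*nr_assigned)++] = id;
			filled++;
		}
	}
	return filled;
}
```

Because the mapping is complete once this has run, a later hotplug
operation only needs to look up the CPU number and never allocates one.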
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -45,7 +45,7 @@ EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_a
DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
/* Used for CPU number allocation and parallel CPU bringup */
-u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+u32 cpuid_to_apicid[] __ro_after_init = { [0 ... NR_CPUS - 1] = BAD_APICID, };
/* Bitmaps to mark registered APICs at each topology domain */
static struct { DECLARE_BITMAP(map, MAX_LOCAL_APIC); } apic_maps[TOPO_MAX_DOMAIN] __ro_after_init;
@@ -60,7 +60,7 @@ static struct {
unsigned int nr_rejected_cpus;
u32 boot_cpu_apic_id;
u32 real_bsp_apic_id;
-} topo_info __read_mostly = {
+} topo_info __ro_after_init = {
.nr_assigned_cpus = 1,
.boot_cpu_apic_id = BAD_APICID,
.real_bsp_apic_id = BAD_APICID,
@@ -133,7 +133,7 @@ static int topo_lookup_cpuid(u32 apic_id
return -ENODEV;
}
-static int topo_get_cpunr(u32 apic_id)
+static __init int topo_get_cpunr(u32 apic_id)
{
int cpu = topo_lookup_cpuid(apic_id);
@@ -149,8 +149,6 @@ static void topo_set_cpuids(unsigned int
early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
#endif
- cpuid_to_apicid[cpu] = apic_id;
-
set_cpu_possible(cpu, true);
set_cpu_present(cpu, true);
@@ -206,6 +204,8 @@ static __init void topo_register_apic(u3
cpu = 0;
else
cpu = topo_get_cpunr(apic_id);
+
+ cpuid_to_apicid[cpu] = apic_id;
topo_set_cpuids(cpu, apic_id, acpi_id);
} else {
topo_info.nr_disabled_cpus++;
@@ -277,12 +277,9 @@ int topology_hotplug_apic(u32 apic_id, u
return -ENODEV;
cpu = topo_lookup_cpuid(apic_id);
- if (cpu < 0) {
- if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
- return -ENOSPC;
+ if (cpu < 0)
+ return -ENOSPC;
- cpu = topo_assign_cpunr(apic_id);
- }
set_bit(apic_id, phys_cpu_present_map);
topo_set_cpuids(cpu, apic_id, acpi_id);
return cpu;
@@ -353,6 +350,7 @@ void __init topology_init_possible_cpus(
unsigned int disabled = topo_info.nr_disabled_cpus;
unsigned int total = assigned + disabled;
unsigned int cpu, allowed = 1;
+ u32 apicid;
if (!restrict_to_up()) {
if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
@@ -381,8 +379,17 @@ void __init topology_init_possible_cpus(
init_cpu_present(cpumask_of(0));
init_cpu_possible(cpumask_of(0));
+ /* Assign CPU numbers to non-present CPUs */
+ for (apicid = 0; disabled; disabled--, apicid++) {
+ apicid = find_next_andnot_bit(apic_maps[TOPO_SMT_DOMAIN].map, phys_cpu_present_map,
+ MAX_LOCAL_APIC, apicid);
+ if (apicid >= MAX_LOCAL_APIC)
+ break;
+ cpuid_to_apicid[topo_info.nr_assigned_cpus++] = apicid;
+ }
+
for (cpu = 0; cpu < allowed; cpu++) {
- u32 apicid = cpuid_to_apicid[cpu];
+ apicid = cpuid_to_apicid[cpu];
set_cpu_possible(cpu, true);
* [patch 19/30] x86/xen/smp_pv: Count number of vCPUs early
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (17 preceding siblings ...)
2024-02-13 21:05 ` [patch 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT Thomas Gleixner
` (10 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
XEN/PV has a completely broken vCPU enumeration scheme, which just works by
chance and provides zero topology information. Each vCPU ends up being a
single core package.
Dom0 provides MADT which can be used for topology information, but that
table is the unmodified host table, which means that there can be more CPUs
registered than the number of vCPUs XEN provides for the dom0 guest.
DomU does not have ACPI and both rely on counting the possible vCPUs via a
hypercall.
To prepare for using CPUID topology information either via MADT or via fake
APIC IDs, count the number of possible vCPUs during early boot and adjust
nr_cpu_ids accordingly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
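The counting step described in the changelog can be modelled in user space
roughly as follows. This is only a sketch: count_vcpus(), mock_probe() and
adjust_nr_cpu_ids() are hypothetical names, and the probe callback merely
stands in for the HYPERVISOR_vcpu_op(VCPUOP_is_up, ...) call used in the
patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the hypercall: returns a negative value
 * when the vCPU with the given number does not exist.
 */
typedef int (*vcpu_probe_fn)(unsigned int cpu, void *ctx);

/* Count consecutive vCPUs starting at 0, never going beyond @limit. */
static unsigned int count_vcpus(unsigned int limit, vcpu_probe_fn probe, void *ctx)
{
	unsigned int cpus;

	assert(probe != NULL);
	for (cpus = 0; cpus < limit; cpus++) {
		if (probe(cpus, ctx) < 0)
			break;
	}
	return cpus;
}

/* Mock hypervisor: @ctx points to the number of vCPUs it provides. */
static int mock_probe(unsigned int cpu, void *ctx)
{
	unsigned int provided = *(unsigned int *)ctx;

	return cpu < provided ? 0 : -1;
}

/* Model of the nr_cpu_ids adjustment: the limit is only ever reduced. */
static unsigned int adjust_nr_cpu_ids(unsigned int nr_cpu_ids, unsigned int provided)
{
	unsigned int cpus = count_vcpus(nr_cpu_ids, mock_probe, &provided);

	return cpus < nr_cpu_ids ? cpus : nr_cpu_ids;
}
```

In this model a guest with 4 vCPUs and a limit of 64 ends up with a limit of
4, while a guest with 8 vCPUs and a limit of 2 keeps the limit of 2.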
arch/x86/xen/enlighten_pv.c | 3 +++
arch/x86/xen/smp.h | 2 ++
arch/x86/xen/smp_pv.c | 14 ++++++++++++++
3 files changed, 19 insertions(+)
---
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -200,6 +200,9 @@ static void __init xen_pv_init_platform(
xen_set_mtrr_data();
else
mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK);
+
+ /* Adjust nr_cpu_ids before "enumeration" happens */
+ xen_smp_count_cpus();
}
static void __init xen_pv_guest_late_init(void)
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -19,6 +19,7 @@ extern void xen_smp_intr_free(unsigned i
int xen_smp_intr_init_pv(unsigned int cpu);
void xen_smp_intr_free_pv(unsigned int cpu);
+void xen_smp_count_cpus(void);
void xen_smp_cpus_done(unsigned int max_cpus);
void xen_smp_send_reschedule(int cpu);
@@ -44,6 +45,7 @@ static inline int xen_smp_intr_init_pv(u
return 0;
}
static inline void xen_smp_intr_free_pv(unsigned int cpu) {}
+static inline void xen_smp_count_cpus(void) { }
#endif /* CONFIG_SMP */
#endif
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -411,6 +411,20 @@ static irqreturn_t xen_irq_work_interrup
return IRQ_HANDLED;
}
+void __init xen_smp_count_cpus(void)
+{
+ unsigned int cpus;
+
+ for (cpus = 0; cpus < nr_cpu_ids; cpus++) {
+ if (HYPERVISOR_vcpu_op(VCPUOP_is_up, cpus, NULL) < 0)
+ break;
+ }
+
+ pr_info("Xen PV: Detected %u vCPUS\n", cpus);
+ if (cpus < nr_cpu_ids)
+ set_nr_cpu_ids(cpus);
+}
+
static const struct smp_ops xen_smp_ops __initconst = {
.smp_prepare_boot_cpu = xen_pv_smp_prepare_boot_cpu,
.smp_prepare_cpus = xen_pv_smp_prepare_cpus,
* [patch 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (18 preceding siblings ...)
2024-02-13 21:06 ` [patch 19/30] x86/xen/smp_pv: Count number of vCPUs early Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 21/30] x86/cpu/topology: Use topology bitmaps for sizing Thomas Gleixner
` (9 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
It turns out that XEN/PV Dom0 has halfway usable CPUID/MADT enumeration,
except that it cannot deal with CPUs which are enumerated as disabled in
MADT.
DomU has no MADT and provides at least rudimentary topology information in
CPUID leaves 1 and 4.
For both it's important that there are not more possible Linux CPUs than
vCPUs provided by the hypervisor.
As this is ensured by counting the vCPUs before enumeration happens:
- Lift the restrictions in the CPUID evaluation and the MADT parser
- Utilize MADT registration for Dom0
- Keep the fake APIC ID registration for DomU
- Fix the XEN APIC fake so the readout of the local APIC ID works for
Dom0 via the hypercall and for DomU by returning the registered
fake APIC IDs.
With that the XEN/PV fake approximates usefulness.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
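For illustration, the DomU side of the APIC ID readout can be sketched as
below. The xAPIC ID register keeps the ID in bits 31:24, which is why the
patch shifts the registered fake APIC ID by 24. The helper names and the
demo table are hypothetical and only model the behaviour of the changed
xen_apic_read() for a non-initial domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* xAPIC: the APIC ID lives in bits 31:24 of the ID register */
#define XAPIC_ID_SHIFT	24

/* Hypothetical CPU number to fake APIC ID table (stand-in for cpuid_to_apicid[]) */
static const uint32_t demo_cpuid_to_apicid[4] = { 0, 1, 2, 3 };

/* Model of the DomU path: CPU 0 reads back 0, others their registered ID */
static uint32_t fake_apic_id_reg(const uint32_t *cpu_to_apicid, unsigned int cpu)
{
	assert(cpu_to_apicid != NULL);
	return cpu ? cpu_to_apicid[cpu] << XAPIC_ID_SHIFT : 0;
}

/* Extract the 8-bit xAPIC ID from a register value */
static uint32_t apic_id_from_reg(uint32_t reg)
{
	return (reg >> XAPIC_ID_SHIFT) & 0xff;
}
```

Decoding the value produced for CPU 3 of the demo table yields APIC ID 3
again, so a reader of the fake register sees the ID which was registered
during enumeration.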
arch/x86/kernel/acpi/boot.c | 25 ++++++++-----------------
arch/x86/kernel/cpu/topology_common.c | 2 +-
arch/x86/xen/apic.c | 14 +++++++-------
arch/x86/xen/smp_pv.c | 13 ++++++++-----
4 files changed, 24 insertions(+), 30 deletions(-)
---
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -23,8 +23,6 @@
#include <linux/serial_core.h>
#include <linux/pgtable.h>
-#include <xen/xen.h>
-
#include <asm/e820/api.h>
#include <asm/irqdomain.h>
#include <asm/pci_x86.h>
@@ -166,12 +164,6 @@ static int __init acpi_parse_madt(struct
return 0;
}
-static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
-{
- if (!xen_pv_domain())
- topology_register_apic(apic_id, acpi_id, present);
-}
-
static bool __init acpi_is_processor_usable(u32 lapic_flags)
{
if (lapic_flags & ACPI_MADT_ENABLED)
@@ -233,7 +225,7 @@ acpi_parse_x2apic(union acpi_subtable_he
return 0;
}
- acpi_register_lapic(apic_id, processor->uid, enabled);
+ topology_register_apic(apic_id, processor->uid, enabled);
#else
pr_warn("x2apic entry ignored\n");
#endif
@@ -268,9 +260,9 @@ acpi_parse_lapic(union acpi_subtable_hea
* to not preallocating memory for all NR_CPUS
* when we use CPU hotplug.
*/
- acpi_register_lapic(processor->id, /* APIC ID */
- processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ topology_register_apic(processor->id, /* APIC ID */
+ processor->processor_id, /* ACPI ID */
+ processor->lapic_flags & ACPI_MADT_ENABLED);
has_lapic_cpus = true;
return 0;
@@ -288,9 +280,9 @@ acpi_parse_sapic(union acpi_subtable_hea
acpi_table_print_madt_entry(&header->common);
- acpi_register_lapic((processor->id << 8) | processor->eid,/* APIC ID */
- processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ topology_register_apic((processor->id << 8) | processor->eid,/* APIC ID */
+ processor->processor_id, /* ACPI ID */
+ processor->lapic_flags & ACPI_MADT_ENABLED);
return 0;
}
@@ -1090,8 +1082,7 @@ static int __init early_acpi_parse_madt_
return count;
}
- if (!xen_pv_domain())
- register_lapic_address(acpi_lapic_addr);
+ register_lapic_address(acpi_lapic_addr);
return count;
}
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -77,7 +77,7 @@ static bool fake_topology(struct topo_sc
topology_set_dom(tscan, TOPO_SMT_DOMAIN, 0, 1);
topology_set_dom(tscan, TOPO_CORE_DOMAIN, 0, 1);
- return tscan->c->cpuid_level < 1 || xen_pv_domain();
+ return tscan->c->cpuid_level < 1;
}
static void parse_topology(struct topo_scan *tscan, bool early)
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -43,20 +43,20 @@ static u32 xen_apic_read(u32 reg)
struct xen_platform_op op = {
.cmd = XENPF_get_cpuinfo,
.interface_version = XENPF_INTERFACE_VERSION,
- .u.pcpu_info.xen_cpuid = 0,
};
- int ret;
-
- /* Shouldn't need this as APIC is turned off for PV, and we only
- * get called on the bootup processor. But just in case. */
- if (!xen_initial_domain() || smp_processor_id())
- return 0;
+ int ret, cpu;
if (reg == APIC_LVR)
return 0x14;
if (reg != APIC_ID)
return 0;
+ cpu = smp_processor_id();
+ if (!xen_initial_domain())
+ return cpu ? cpuid_to_apicid[cpu] << 24 : 0;
+
+ op.u.pcpu_info.xen_cpuid = cpu;
+
ret = HYPERVISOR_platform_op(&op);
if (ret)
op.u.pcpu_info.apic_id = BAD_APICID;
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -156,11 +156,9 @@ static void __init xen_pv_smp_config(voi
topology_register_boot_apic(apicid++);
- for (i = 1; i < nr_cpu_ids; i++) {
- if (HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL) < 0)
- break;
+ for (i = 1; i < nr_cpu_ids; i++)
topology_register_apic(apicid++, CPU_ACPIID_INVALID, true);
- }
+
/* Pretend to be a proper enumerated system */
smp_found_config = 1;
}
@@ -451,5 +449,10 @@ void __init xen_smp_init(void)
/* Avoid searching for BIOS MP tables */
x86_init.mpparse.find_mptable = x86_init_noop;
x86_init.mpparse.early_parse_smp_cfg = x86_init_noop;
- x86_init.mpparse.parse_smp_cfg = xen_pv_smp_config;
+
+ /* XEN/PV Dom0 has halfways sane topology information via CPUID/MADT */
+ if (xen_initial_domain())
+ x86_init.mpparse.parse_smp_cfg = x86_init_noop;
+ else
+ x86_init.mpparse.parse_smp_cfg = xen_pv_smp_config;
}
* [patch 21/30] x86/cpu/topology: Use topology bitmaps for sizing
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (19 preceding siblings ...)
2024-02-13 21:06 ` [patch 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 22/30] x86/cpu/topology: Mop up primary thread mask handling Thomas Gleixner
` (8 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Now that all possible APIC IDs are tracked in the topology bitmaps, it's
trivial to retrieve the real information from there.
This gets rid of the guesstimates for the maximal packages and dies per
package as the actual numbers can be determined before a single AP has been
brought up.
The number of SMT threads can now be determined correctly from the bitmaps
in all situations. Up to now a system which has SMT disabled in the BIOS
still claims to be SMT capable, because the lowest APIC ID bit is
reserved for that and CPUID leaf 0xb/0x1f still enumerates the SMT domain
accordingly. Relating the bitmap weight of the SMT domain to the bitmap
weight of the CORE domain yields the actual number of threads per core, so
a system with SMT disabled in the BIOS is now correctly reported as not
SMT capable.
This also handles the case where the boot CPU of a hybrid system does not
have SMT, because the SMT capability of the APs is fully taken into account.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: Fix the SMT capable calculation - Rui
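A user space sketch of the sizing arithmetic, assuming all APIC IDs are
below 64 so that a single 64-bit word can stand in for a topology bitmap.
level_map(), weight64(), count_order() and the demo tables are hypothetical
names; count_order() is meant to model get_count_order(), i.e. the smallest
order for which (1 << order) >= count.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical APIC ID sets, one SMT bit (core shift == 1) */
static const uint32_t demo_smt_off[] = { 0, 2, 4, 6 };
static const uint32_t demo_smt_on[]  = { 0, 1, 2, 3, 4, 5, 6, 7 };
/* Two SMT cores (0-3) plus four cores without SMT (8, 10, 12, 14) */
static const uint32_t demo_hybrid[]  = { 0, 1, 2, 3, 8, 10, 12, 14 };

/* Number of set bits in a 64-bit map */
static unsigned int weight64(uint64_t map)
{
	unsigned int w = 0;

	for (; map; map &= map - 1)
		w++;
	return w;
}

/* Build the map of one domain level by clearing the bits below @shift */
static uint64_t level_map(const uint32_t *apicids, unsigned int n, unsigned int shift)
{
	uint64_t map = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		uint32_t lvlid = apicids[i] & ~((1U << shift) - 1);

		assert(lvlid < 64);
		map |= 1ULL << lvlid;
	}
	return map;
}

/* Threads per core: weight of the SMT map relative to the CORE map */
static unsigned int threads_per_core(const uint32_t *apicids, unsigned int n,
				     unsigned int core_shift)
{
	unsigned int threads = weight64(level_map(apicids, n, 0));
	unsigned int cores = weight64(level_map(apicids, n, core_shift));

	return DIV_ROUND_UP(threads, cores);
}

/* Model of get_count_order(): smallest order with (1 << order) >= count */
static unsigned int count_order(unsigned int count)
{
	unsigned int order = 0;

	assert(count > 0);
	while ((1U << order) < count)
		order++;
	return order;
}

/* Dies per package via the order delta, as done for the PKG/DIE domains */
static unsigned int dies_per_package(unsigned int dies, unsigned int pkgs)
{
	return 1U << (count_order(dies) - count_order(pkgs));
}
```

The hybrid table shows why the thread count cannot use the order delta:
8 threads and 6 cores have the same order, so the delta would claim one
thread per core, while DIV_ROUND_UP(8, 6) yields 2.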
---
arch/x86/include/asm/smp.h | 3 +--
arch/x86/include/asm/topology.h | 23 ++++++++++++-----------
arch/x86/kernel/cpu/common.c | 9 ++++++---
arch/x86/kernel/cpu/debugfs.c | 2 +-
arch/x86/kernel/cpu/topology.c | 20 +++++++++++++++++++-
arch/x86/kernel/cpu/topology_common.c | 24 ------------------------
arch/x86/kernel/smpboot.c | 16 ----------------
arch/x86/xen/smp.c | 2 --
8 files changed, 39 insertions(+), 60 deletions(-)
---
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -8,7 +8,7 @@
#include <asm/current.h>
#include <asm/thread_info.h>
-extern int smp_num_siblings;
+extern unsigned int smp_num_siblings;
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
@@ -109,7 +109,6 @@ void cpu_disable_common(void);
void native_smp_prepare_boot_cpu(void);
void smp_prepare_cpus_common(void);
void native_smp_prepare_cpus(unsigned int max_cpus);
-void calculate_max_logical_packages(void);
void native_smp_cpus_done(unsigned int max_cpus);
int common_cpu_up(unsigned int cpunum, struct task_struct *tidle);
int native_kick_ap(unsigned int cpu, struct task_struct *tidle);
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -143,7 +143,18 @@ extern const struct cpumask *cpu_cluster
#define topology_amd_node_id(cpu) (cpu_data(cpu).topo.amd_node_id)
-extern unsigned int __max_die_per_package;
+extern unsigned int __max_dies_per_package;
+extern unsigned int __max_logical_packages;
+
+static inline unsigned int topology_max_packages(void)
+{
+ return __max_logical_packages;
+}
+
+static inline unsigned int topology_max_die_per_package(void)
+{
+ return __max_dies_per_package;
+}
#ifdef CONFIG_SMP
#define topology_cluster_id(cpu) (cpu_data(cpu).topo.l2c_id)
@@ -152,14 +163,6 @@ extern unsigned int __max_die_per_packag
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
-extern unsigned int __max_logical_packages;
-#define topology_max_packages() (__max_logical_packages)
-
-static inline int topology_max_die_per_package(void)
-{
- return __max_die_per_package;
-}
-
extern int __max_smt_threads;
static inline int topology_max_smt_threads(void)
@@ -193,13 +196,11 @@ static inline bool topology_is_primary_t
}
#else /* CONFIG_SMP */
-#define topology_max_packages() (1)
static inline int
topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
static inline int
topology_update_die_map(unsigned int dieid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
-static inline int topology_max_die_per_package(void) { return 1; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline unsigned int topology_amd_nodes_per_pkg(void) { return 0; };
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -73,11 +73,14 @@
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
+unsigned int smp_num_siblings __ro_after_init = 1;
EXPORT_SYMBOL(smp_num_siblings);
-unsigned int __max_die_per_package __read_mostly = 1;
-EXPORT_SYMBOL(__max_die_per_package);
+unsigned int __max_dies_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__max_dies_per_package);
+
+unsigned int __max_logical_packages __ro_after_init = 1;
+EXPORT_SYMBOL(__max_logical_packages);
static struct ppin_info {
int feature;
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -29,7 +29,7 @@ static int cpu_debug_show(struct seq_fil
seq_printf(m, "amd_node_id: %u\n", c->topo.amd_node_id);
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
- seq_printf(m, "max_die_per_pkg: %u\n", __max_die_per_package);
+ seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
seq_printf(m, "smp_num_siblings: %u\n", smp_num_siblings);
return 0;
}
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -348,8 +348,8 @@ void __init topology_init_possible_cpus(
{
unsigned int assigned = topo_info.nr_assigned_cpus;
unsigned int disabled = topo_info.nr_disabled_cpus;
+ unsigned int cnta, cntb, cpu, allowed = 1;
unsigned int total = assigned + disabled;
- unsigned int cpu, allowed = 1;
u32 apicid;
if (!restrict_to_up()) {
@@ -372,6 +372,24 @@ void __init topology_init_possible_cpus(
total_cpus = allowed;
set_nr_cpu_ids(allowed);
+ cnta = domain_weight(TOPO_PKG_DOMAIN);
+ cntb = domain_weight(TOPO_DIE_DOMAIN);
+ __max_logical_packages = cnta;
+ __max_dies_per_package = 1U << (get_count_order(cntb) - get_count_order(cnta));
+
+ pr_info("Max. logical packages: %3u\n", cnta);
+ pr_info("Max. logical dies: %3u\n", cntb);
+ pr_info("Max. dies per package: %3u\n", __max_dies_per_package);
+
+ cnta = domain_weight(TOPO_CORE_DOMAIN);
+ cntb = domain_weight(TOPO_SMT_DOMAIN);
+ /*
+ * Can't use order delta here as order(cnta) can be equal
+ * order(cntb) even if cnta != cntb.
+ */
+ smp_num_siblings = DIV_ROUND_UP(cntb, cnta);
+ pr_info("Max. threads per core: %3u\n", smp_num_siblings);
+
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -196,16 +196,6 @@ void cpu_parse_topology(struct cpuinfo_x
tscan.dom_shifts[dom], x86_topo_system.dom_shifts[dom]);
}
- /* Bug compatible with the existing parsers */
- if (tscan.dom_ncpus[TOPO_SMT_DOMAIN] > smp_num_siblings) {
- if (system_state == SYSTEM_BOOTING) {
- pr_warn_once("CPU%d: SMT detected and enabled late\n", cpu);
- smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
- } else {
- pr_warn_once("CPU%d: SMT detected after init. Too late!\n", cpu);
- }
- }
-
topo_set_ids(&tscan);
topo_set_max_cores(&tscan);
}
@@ -232,20 +222,6 @@ void __init cpu_init_topology(struct cpu
topo_set_max_cores(&tscan);
/*
- * Bug compatible with the existing code. If the boot CPU does not
- * have SMT this ends up with one sibling. This needs way deeper
- * changes further down the road to get it right during early boot.
- */
- smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
-
- /*
- * Neither it's clear whether there are as many dies as the APIC
- * space indicating die level is. But assume that the actual number
- * of CPUs gives a proper indication for now to stay bug compatible.
- */
- __max_die_per_package = tscan.dom_ncpus[TOPO_DIE_DOMAIN] /
- tscan.dom_ncpus[TOPO_DIE_DOMAIN - 1];
- /*
* AMD systems have Nodes per package which cannot be mapped to
* APIC ID.
*/
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -139,8 +139,6 @@ static DEFINE_PER_CPU_READ_MOSTLY(struct
.phys_die_id = U32_MAX,
};
-unsigned int __max_logical_packages __read_mostly;
-EXPORT_SYMBOL(__max_logical_packages);
static unsigned int logical_packages __read_mostly;
static unsigned int logical_die __read_mostly;
@@ -1267,24 +1265,10 @@ void __init native_smp_prepare_boot_cpu(
native_pv_lock_init();
}
-void __init calculate_max_logical_packages(void)
-{
- int ncpus;
-
- /*
- * Today neither Intel nor AMD support heterogeneous systems so
- * extrapolate the boot cpu's data to all packages.
- */
- ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
- __max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
- pr_info("Max logical packages: %u\n", __max_logical_packages);
-}
-
void __init native_smp_cpus_done(unsigned int max_cpus)
{
pr_debug("Boot done\n");
- calculate_max_logical_packages();
build_sched_topology();
nmi_selftest();
impress_friends();
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -123,8 +123,6 @@ void __init xen_smp_cpus_done(unsigned i
{
if (xen_hvm_domain())
native_smp_cpus_done(max_cpus);
- else
- calculate_max_logical_packages();
}
void xen_smp_send_reschedule(int cpu)
* [patch 22/30] x86/cpu/topology: Mop up primary thread mask handling
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (20 preceding siblings ...)
2024-02-13 21:06 ` [patch 21/30] x86/cpu/topology: Use topology bitmaps for sizing Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread() Thomas Gleixner
` (7 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
The early initcall to initialize the primary thread mask is no longer
required because topology_init_possible_cpus() can mark primary threads
correctly when initializing the possible and present map as the number of
SMT threads is already determined correctly.
The XENPV workaround is no longer required because XENPV now registers
fake APIC IDs which will just work like any other enumeration.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/topology.c | 29 ++---------------------------
1 file changed, 2 insertions(+), 27 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -82,30 +82,6 @@ static void cpu_mark_primary_thread(unsi
if (smp_num_siblings == 1 || !(apicid & mask))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
-
-/*
- * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
- * during early boot. Initialize the primary thread mask before SMP
- * bringup.
- */
-static int __init smp_init_primary_thread_mask(void)
-{
- unsigned int cpu;
-
- /*
- * XEN/PV provides either none or useless topology information.
- * Pretend that all vCPUs are primary threads.
- */
- if (xen_pv_domain()) {
- cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
- return 0;
- }
-
- for (cpu = 0; cpu < topo_info.nr_assigned_cpus; cpu++)
- cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
- return 0;
-}
-early_initcall(smp_init_primary_thread_mask);
#else
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
@@ -151,9 +127,6 @@ static void topo_set_cpuids(unsigned int
#endif
set_cpu_possible(cpu, true);
set_cpu_present(cpu, true);
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apic_id);
}
static __init bool check_for_real_bsp(u32 apic_id)
@@ -276,6 +249,7 @@ int topology_hotplug_apic(u32 apic_id, u
set_bit(apic_id, phys_cpu_present_map);
topo_set_cpuids(cpu, apic_id, acpi_id);
+ cpu_mark_primary_thread(cpu, apic_id);
return cpu;
}
@@ -411,6 +385,7 @@ void __init topology_init_possible_cpus(
if (apicid == BAD_APICID)
continue;
+ cpu_mark_primary_thread(cpu, apicid);
set_cpu_present(cpu, test_bit(apicid, phys_cpu_present_map));
}
}
* [patch 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread()
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (21 preceding siblings ...)
2024-02-13 21:06 ` [patch 22/30] x86/cpu/topology: Mop up primary thread mask handling Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 24/30] x86/cpu/topology: Provide logical pkg/die mapping Thomas Gleixner
` (6 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
No point in creating a mask via fls(). smp_num_siblings is guaranteed to be
a power of 2. So just using (smp_num_siblings - 1) has the same effect.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
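The equivalence claimed in the changelog can be checked with a small user
space sketch. fls32() is a portable model of fls(), and the other helper
names are made up for this illustration. The two mask variants only agree
when the sibling count is a power of 2, which is the guarantee the
changelog relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Portable fls(): 1-based index of the most significant set bit, 0 for 0 */
static unsigned int fls32(uint32_t x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Mask as computed by the removed code */
static uint32_t smt_mask_fls(uint32_t siblings)
{
	assert(siblings != 0);
	return (1U << (fls32(siblings) - 1)) - 1;
}

/* Mask as computed by the new code */
static uint32_t smt_mask_simple(uint32_t siblings)
{
	return siblings - 1;
}

/* A thread is primary when all SMT bits of its APIC ID are zero */
static bool is_primary_thread(uint32_t apicid, uint32_t siblings)
{
	return !(apicid & smt_mask_simple(siblings));
}

/* Compare both variants for every power of 2 which fits into 32 bits */
static bool masks_agree_for_powers_of_two(void)
{
	unsigned int shift;

	for (shift = 0; shift < 32; shift++) {
		if (smt_mask_fls(1U << shift) != smt_mask_simple(1U << shift))
			return false;
	}
	return true;
}
```

With one sibling the mask is 0, so every CPU counts as a primary thread and
the separate smp_num_siblings == 1 check becomes redundant.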
arch/x86/kernel/cpu/topology.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -76,10 +76,7 @@ bool arch_match_cpu_phys_id(int cpu, u64
#ifdef CONFIG_SMP
static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
{
- /* Isolate the SMT bit(s) in the APICID and check for 0 */
- u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
-
- if (smp_num_siblings == 1 || !(apicid & mask))
+ if (!(apicid & (smp_num_siblings - 1)))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
#else
* [patch 24/30] x86/cpu/topology: Provide logical pkg/die mapping
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (22 preceding siblings ...)
2024-02-13 21:06 ` [patch 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread() Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 25/30] x86/cpu/topology: Use topology logical mapping mechanism Thomas Gleixner
` (5 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
With the topology bitmaps in place the logical package and die IDs can
trivially be retrieved by counting the bits which are set in the bitmap of
the relevant topology domain level below the physical ID in question.
Provide a function to that effect.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
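A minimal user space sketch of the lookup, assuming a 64 entry APIC ID
space so that one 64-bit word can stand in for a domain bitmap.
logical_id(), weight_below(), DEMO_MAX_APIC and demo_pkg_map are
hypothetical names. level_shift is the number of APIC ID bits below the
requested level.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_APIC	64

/* Hypothetical package map: packages start at APIC ID 0x00, 0x10 and 0x30 */
static const uint64_t demo_pkg_map = (1ULL << 0x00) | (1ULL << 0x10) | (1ULL << 0x30);

/* Count the set bits in positions [0, nbits) of @map */
static unsigned int weight_below(uint64_t map, unsigned int nbits)
{
	unsigned int i, w = 0;

	for (i = 0; i < nbits && i < 64; i++)
		w += (map >> i) & 1;
	return w;
}

/* Model of the lookup: the logical ID is the number of set bits below the level ID */
static int logical_id(uint64_t map, uint32_t apicid, unsigned int level_shift)
{
	uint32_t lvlid;

	assert(level_shift < 32);
	/* Remove the bits below the requested level */
	lvlid = apicid & ~((1U << level_shift) - 1);

	if (lvlid >= DEMO_MAX_APIC)
		return -ERANGE;
	if (!(map & (1ULL << lvlid)))
		return -ENODEV;
	return (int)weight_below(map, lvlid);
}
```

In the demo map APIC ID 0x35 belongs to the package starting at 0x30, which
has two registered packages below it and therefore gets logical ID 2.
APIC ID 0x20 maps to a package which was never registered and results in
-ENODEV.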
arch/x86/include/asm/topology.h | 9 +++++++++
arch/x86/kernel/cpu/topology.c | 28 ++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
---
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -156,6 +156,15 @@ static inline unsigned int topology_max_
return __max_dies_per_package;
}
+#ifdef CONFIG_X86_LOCAL_APIC
+int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level);
+#else
+static inline int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_SMP
#define topology_cluster_id(cpu) (cpu_data(cpu).topo.l2c_id)
#define topology_die_cpumask(cpu) (per_cpu(cpu_die_map, cpu))
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -230,6 +230,34 @@ void __init topology_register_boot_apic(
topo_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
+/**
+ * topology_get_logical_id - Retrieve the logical ID at a given topology domain level
+ * @apicid: The APIC ID for which to lookup the logical ID
+ * @at_level: The topology domain level to use
+ *
+ * @apicid must be a full APIC ID, not the normalized variant. It's valid to have
+ * all bits below the domain level specified by @at_level to be clear. So both
+ * real APIC IDs and backshifted normalized APIC IDs work correctly.
+ *
+ * Returns:
+ * - >= 0: The requested logical ID
+ * - -ERANGE: @apicid is out of range
+ * - -ENODEV: @apicid is not registered
+ */
+int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
+{
+ /* Remove the bits below @at_level to get the proper level ID of @apicid */
+ unsigned int lvlid = topo_apicid(apicid, at_level);
+
+ if (lvlid >= MAX_LOCAL_APIC)
+ return -ERANGE;
+ if (!test_bit(lvlid, apic_maps[at_level].map))
+ return -ENODEV;
+ /* Get the number of set bits before @lvlid. */
+ return bitmap_weight(apic_maps[at_level].map, lvlid);
+}
+EXPORT_SYMBOL_GPL(topology_get_logical_id);
+
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/**
* topology_hotplug_apic - Handle a physical hotplugged APIC after boot
* [patch 25/30] x86/cpu/topology: Use topology logical mapping mechanism
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (23 preceding siblings ...)
2024-02-13 21:06 ` [patch 24/30] x86/cpu/topology: Provide logical pkg/die mapping Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps Thomas Gleixner
` (4 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Replace the logical package and die management functionality and retrieve
the logical IDs from the topology bitmaps.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/topology.h | 15 ++--
arch/x86/kernel/cpu/common.c | 13 ---
arch/x86/kernel/cpu/topology_common.c | 4 +
arch/x86/kernel/smpboot.c | 111 ----------------------------------
4 files changed, 12 insertions(+), 131 deletions(-)
---
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -172,6 +172,13 @@ static inline int topology_get_logical_i
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+
+static inline int topology_phys_to_logical_pkg(unsigned int pkg)
+{
+ return topology_get_logical_id(pkg << x86_topo_system.dom_shifts[TOPO_PKG_DOMAIN],
+ TOPO_PKG_DOMAIN);
+}
+
extern int __max_smt_threads;
static inline int topology_max_smt_threads(void)
@@ -181,10 +188,6 @@ static inline int topology_max_smt_threa
#include <linux/cpu_smt.h>
-int topology_update_package_map(unsigned int apicid, unsigned int cpu);
-int topology_update_die_map(unsigned int dieid, unsigned int cpu);
-int topology_phys_to_logical_pkg(unsigned int pkg);
-
extern unsigned int __amd_nodes_per_pkg;
static inline unsigned int topology_amd_nodes_per_pkg(void)
@@ -205,10 +208,6 @@ static inline bool topology_is_primary_t
}
#else /* CONFIG_SMP */
-static inline int
-topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
-static inline int
-topology_update_die_map(unsigned int dieid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1718,18 +1718,6 @@ static void generic_identify(struct cpui
#endif
}
-static void update_package_map(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
- unsigned int cpu = smp_processor_id();
-
- BUG_ON(topology_update_package_map(c->topo.pkg_id, cpu));
- BUG_ON(topology_update_die_map(c->topo.die_id, cpu));
-#else
- c->topo.logical_pkg_id = 0;
-#endif
-}
-
/*
* This does the hard work of actually picking apart the CPU stuff...
*/
@@ -1913,7 +1901,6 @@ void identify_secondary_cpu(struct cpuin
#ifdef CONFIG_X86_32
enable_sep_cpu();
#endif
- update_package_map(c);
x86_spec_ctrl_setup_ap();
update_srbds_msr();
if (boot_cpu_has_bug(X86_BUG_GDS))
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -10,6 +10,7 @@
#include "cpu.h"
struct x86_topology_system x86_topo_system __ro_after_init;
+EXPORT_SYMBOL_GPL(x86_topo_system);
unsigned int __amd_nodes_per_pkg __ro_after_init;
EXPORT_SYMBOL_GPL(__amd_nodes_per_pkg);
@@ -147,6 +148,9 @@ static void topo_set_ids(struct topo_sca
c->topo.pkg_id = topo_shift_apicid(apicid, TOPO_PKG_DOMAIN);
c->topo.die_id = topo_shift_apicid(apicid, TOPO_DIE_DOMAIN);
+ c->topo.logical_pkg_id = topology_get_logical_id(apicid, TOPO_PKG_DOMAIN);
+ c->topo.logical_die_id = topology_get_logical_id(apicid, TOPO_DIE_DOMAIN);
+
/* Package relative core ID */
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -125,23 +125,6 @@ struct mwait_cpu_dead {
*/
static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
-/* Logical package management. */
-struct logical_maps {
- u32 phys_pkg_id;
- u32 phys_die_id;
- u32 logical_pkg_id;
- u32 logical_die_id;
-};
-
-/* Temporary workaround until the full topology mechanics is in place */
-static DEFINE_PER_CPU_READ_MOSTLY(struct logical_maps, logical_maps) = {
- .phys_pkg_id = U32_MAX,
- .phys_die_id = U32_MAX,
-};
-
-static unsigned int logical_packages __read_mostly;
-static unsigned int logical_die __read_mostly;
-
/* Maximum number of SMT threads on any online core */
int __read_mostly __max_smt_threads = 1;
@@ -334,103 +317,11 @@ static void notrace start_secondary(void
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}
-/**
- * topology_phys_to_logical_pkg - Map a physical package id to a logical
- * @phys_pkg: The physical package id to map
- *
- * Returns logical package id or -1 if not found
- */
-int topology_phys_to_logical_pkg(unsigned int phys_pkg)
-{
- int cpu;
-
- for_each_possible_cpu(cpu) {
- if (per_cpu(logical_maps.phys_pkg_id, cpu) == phys_pkg)
- return per_cpu(logical_maps.logical_pkg_id, cpu);
- }
- return -1;
-}
-EXPORT_SYMBOL(topology_phys_to_logical_pkg);
-
-/**
- * topology_phys_to_logical_die - Map a physical die id to logical
- * @die_id: The physical die id to map
- * @cur_cpu: The CPU for which the mapping is done
- *
- * Returns logical die id or -1 if not found
- */
-static int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
-{
- int cpu, proc_id = cpu_data(cur_cpu).topo.pkg_id;
-
- for_each_possible_cpu(cpu) {
- if (per_cpu(logical_maps.phys_pkg_id, cpu) == proc_id &&
- per_cpu(logical_maps.phys_die_id, cpu) == die_id)
- return per_cpu(logical_maps.logical_die_id, cpu);
- }
- return -1;
-}
-
-/**
- * topology_update_package_map - Update the physical to logical package map
- * @pkg: The physical package id as retrieved via CPUID
- * @cpu: The cpu for which this is updated
- */
-int topology_update_package_map(unsigned int pkg, unsigned int cpu)
-{
- int new;
-
- /* Already available somewhere? */
- new = topology_phys_to_logical_pkg(pkg);
- if (new >= 0)
- goto found;
-
- new = logical_packages++;
- if (new != pkg) {
- pr_info("CPU %u Converting physical %u to logical package %u\n",
- cpu, pkg, new);
- }
-found:
- per_cpu(logical_maps.phys_pkg_id, cpu) = pkg;
- per_cpu(logical_maps.logical_pkg_id, cpu) = new;
- cpu_data(cpu).topo.logical_pkg_id = new;
- return 0;
-}
-/**
- * topology_update_die_map - Update the physical to logical die map
- * @die: The die id as retrieved via CPUID
- * @cpu: The cpu for which this is updated
- */
-int topology_update_die_map(unsigned int die, unsigned int cpu)
-{
- int new;
-
- /* Already available somewhere? */
- new = topology_phys_to_logical_die(die, cpu);
- if (new >= 0)
- goto found;
-
- new = logical_die++;
- if (new != die) {
- pr_info("CPU %u Converting physical %u to logical die %u\n",
- cpu, die, new);
- }
-found:
- per_cpu(logical_maps.phys_die_id, cpu) = die;
- per_cpu(logical_maps.logical_die_id, cpu) = new;
- cpu_data(cpu).topo.logical_die_id = new;
- return 0;
-}
-
static void __init smp_store_boot_cpu_info(void)
{
- int id = 0; /* CPU 0 */
- struct cpuinfo_x86 *c = &cpu_data(id);
+ struct cpuinfo_x86 *c = &cpu_data(0);
*c = boot_cpu_data;
- c->cpu_index = id;
- topology_update_package_map(c->topo.pkg_id, id);
- topology_update_die_map(c->topo.die_id, id);
c->initialized = true;
}
* [patch 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (24 preceding siblings ...)
2024-02-13 21:06 ` [patch 25/30] x86/cpu/topology: Use topology logical mapping mechanism Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 27/30] x86/cpu/topology: Rename smp_num_siblings Thomas Gleixner
` (3 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Similar to other sizing information, the number of cores per package can be
established from the topology bitmap.
Provide a function for retrieving that information and replace the buggy
hack in the CPUID evaluation with it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
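Not part of the patch: a minimal user-space sketch of the counting idea,
assuming each domain bitmap fits into one 64-bit word and a fixed APIC ID
layout (1 SMT bit, 5 bits below the package ID). struct topo_model,
register_apic(), count_units(), build_hybrid_package(), SMT_SHIFT and
PKG_SHIFT are hypothetical names for this model only. It does not use the
kernel's apic_maps[] or x86_topo_system. The layout is the hybrid case from
the comment this patch removes: 8 cores with SMT plus 8 cores without.

```c
#include <stdint.h>

/* Assumed APIC ID layout for this sketch only */
#define SMT_SHIFT	1u	/* one SMT bit */
#define PKG_SHIFT	5u	/* five bits below the package ID */
#define MAX_IDS		64u	/* the model holds APIC IDs 0..63 */

/* Hypothetical model: one 64-bit word per domain, indexed by APIC ID */
struct topo_model {
	uint64_t threads;	/* one bit per registered APIC ID */
	uint64_t cores;		/* one bit per core, at the core's first APIC ID */
};

static void register_apic(struct topo_model *t, unsigned int apicid)
{
	t->threads |= 1ULL << apicid;
	/* Clear the SMT bits to get the ID which represents the core */
	t->cores |= 1ULL << (apicid & ~((1u << SMT_SHIFT) - 1));
}

/* Count the bits set in the range [start, start + 2^span_shift) */
static unsigned int count_units(uint64_t map, unsigned int start,
				unsigned int span_shift)
{
	unsigned int end = start + (1u << span_shift);
	unsigned int id, cnt = 0;

	for (id = start; id < end && id < MAX_IDS; id++)
		cnt += (unsigned int)((map >> id) & 1);
	return cnt;
}

/* 8 cores with two threads each, followed by 8 cores with one thread each */
static struct topo_model build_hybrid_package(void)
{
	struct topo_model t = { 0, 0 };
	unsigned int core;

	for (core = 0; core < 8; core++) {
		register_apic(&t, core << SMT_SHIFT);
		register_apic(&t, (core << SMT_SHIFT) + 1);
	}
	for (core = 8; core < 16; core++)
		register_apic(&t, core << SMT_SHIFT);
	return t;
}
```

In this model count_units(t.cores, 0, PKG_SHIFT) yields 16 and
count_units(t.threads, 0, PKG_SHIFT) yields 24, while shifting the thread
count right by SMT_SHIFT yields 12, which is the miscount the removed
comment describes.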
arch/x86/kernel/cpu/topology.c | 43 ++++++++++++++++++++++++++++++++++
arch/x86/kernel/cpu/topology.h | 11 ++++++++
arch/x86/kernel/cpu/topology_common.c | 18 ++------------
3 files changed, 57 insertions(+), 15 deletions(-)
---
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -217,6 +217,49 @@ int topology_get_logical_id(u32 apicid,
}
EXPORT_SYMBOL_GPL(topology_get_logical_id);
+/**
+ * topology_unit_count - Retrieve the count of specified units at a given topology domain level
+ * @apicid: The APIC ID which specifies the search range
+ * @which_units: The domain level specifying the units to count
+ * @at_level: The domain level at which @which_units have to be counted
+ *
+ * This returns the number of possible units according to the enumerated
+ * information.
+ *
+ * E.g. topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN)
+ * counts the number of possible cores in the package to which @apicid
+ * belongs.
+ *
+ * @at_level must obviously be greater than @which_units to produce useful
+ * results. If @at_level is equal to @which_units the result is
+ * unsurprisingly 1. If @at_level is less than @which_units the result
+ * is by definition undefined and the function returns 0.
+ */
+unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level)
+{
+ /* Remove the bits below @at_level to get the proper level ID of @apicid */
+ unsigned int lvlid = topo_apicid(apicid, at_level);
+ unsigned int id, end, cnt = 0;
+
+ if (lvlid >= MAX_LOCAL_APIC)
+ return 0;
+ if (!test_bit(lvlid, apic_maps[at_level].map))
+ return 0;
+ if (which_units > at_level)
+ return 0;
+ if (which_units == at_level)
+ return 1;
+
+ /* Calculate the exclusive end */
+ end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]);
+ /* Unfortunately there is no bitmap_weight_range() */
+ for (id = find_next_bit(apic_maps[which_units].map, end, lvlid);
+ id < end; id = find_next_bit(apic_maps[which_units].map, end, ++id))
+ cnt++;
+ return cnt;
+}
+
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/**
* topology_hotplug_apic - Handle a physical hotplugged APIC after boot
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -53,4 +53,15 @@ static inline void topology_update_dom(s
tscan->dom_ncpus[dom] = ncpus;
}
+#ifdef CONFIG_X86_LOCAL_APIC
+unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level);
+#else
+static inline unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level)
+{
+ return 1;
+}
+#endif
+
#endif /* ARCH_X86_TOPOLOGY_H */
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -155,25 +155,15 @@ static void topo_set_ids(struct topo_sca
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
+ /* Maximum number of cores on this package */
+ c->x86_max_cores = topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
+
c->topo.amd_node_id = tscan->amd_node_id;
if (c->x86_vendor == X86_VENDOR_AMD)
cpu_topology_fixup_amd(tscan);
}
-static void topo_set_max_cores(struct topo_scan *tscan)
-{
- /*
- * Bug compatible for now. This is broken on hybrid systems:
- * 8 cores SMT + 8 cores w/o SMT
- * tscan.dom_ncpus[TOPO_DIEGRP_DOMAIN] = 24; 24 / 2 = 12 !!
- *
- * Cannot be fixed without further topology enumeration changes.
- */
- tscan->c->x86_max_cores = tscan->dom_ncpus[TOPO_DIEGRP_DOMAIN] >>
- x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
-}
-
void cpu_parse_topology(struct cpuinfo_x86 *c)
{
unsigned int dom, cpu = smp_processor_id();
@@ -201,7 +191,6 @@ void cpu_parse_topology(struct cpuinfo_x
}
topo_set_ids(&tscan);
- topo_set_max_cores(&tscan);
}
void __init cpu_init_topology(struct cpuinfo_x86 *c)
@@ -223,7 +212,6 @@ void __init cpu_init_topology(struct cpu
}
topo_set_ids(&tscan);
- topo_set_max_cores(&tscan);
/*
* AMD systems have Nodes per package which cannot be mapped to
* [patch 27/30] x86/cpu/topology: Rename smp_num_siblings
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (25 preceding siblings ...)
2024-02-13 21:06 ` [patch 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 28/30] x86/cpu/topology: Rename topology_max_die_per_package() Thomas Gleixner
` (2 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
smp_num_siblings is a really non-intuitive name. Rename it to
__max_threads_per_core, which is obvious.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
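Not part of the patch: a small user-space sketch of why the renamed variable
is computed with DIV_ROUND_UP() rather than from an order delta, as the
comment in topology_init_possible_cpus() notes. It assumes cnta is the
number of cores and cntb the number of threads, which this mail does not
show. count_order() and the two threads_per_core_*() helpers are
hypothetical stand-ins, and DIV_ROUND_UP() is redefined locally.

```c
/* Local copy of the usual round-up division helper */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* ceil(log2(n)) for small n >= 1; user-space stand-in for an order helper */
static unsigned int count_order(unsigned int n)
{
	unsigned int order = 0;

	while ((1u << order) < n)
		order++;
	return order;
}

/* Order delta: breaks when both counts round up to the same power of two */
static unsigned int threads_per_core_by_order(unsigned int cores,
					      unsigned int threads)
{
	return 1u << (count_order(threads) - count_order(cores));
}

/* Round-up division: does not depend on the counts being powers of two */
static unsigned int threads_per_core_by_div(unsigned int cores,
					    unsigned int threads)
{
	return DIV_ROUND_UP(threads, cores);
}
```

With 5 cores and 8 threads (for instance 3 SMT cores plus 2 cores without
SMT) both counts have order 3, so the order delta reports 1 thread per core,
while the division reports 2.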
arch/x86/include/asm/perf_event_p4.h | 4 ++--
arch/x86/include/asm/smp.h | 2 --
arch/x86/include/asm/topology.h | 1 +
arch/x86/kernel/cpu/common.c | 6 +++---
arch/x86/kernel/cpu/debugfs.c | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/cpu/topology.c | 6 +++---
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
9 files changed, 13 insertions(+), 14 deletions(-)
---
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -181,7 +181,7 @@ static inline u64 p4_clear_ht_bit(u64 co
static inline int p4_ht_active(void)
{
#ifdef CONFIG_SMP
- return smp_num_siblings > 1;
+ return __max_threads_per_core > 1;
#endif
return 0;
}
@@ -189,7 +189,7 @@ static inline int p4_ht_active(void)
static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
- if (smp_num_siblings == 2)
+ if (__max_threads_per_core == 2)
return cpu != cpumask_first(this_cpu_cpumask_var_ptr(cpu_sibling_map));
#endif
return 0;
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -8,8 +8,6 @@
#include <asm/current.h>
#include <asm/thread_info.h>
-extern unsigned int smp_num_siblings;
-
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -145,6 +145,7 @@ extern const struct cpumask *cpu_cluster
extern unsigned int __max_dies_per_package;
extern unsigned int __max_logical_packages;
+extern unsigned int __max_threads_per_core;
static inline unsigned int topology_max_packages(void)
{
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -73,8 +73,8 @@
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
-unsigned int smp_num_siblings __ro_after_init = 1;
-EXPORT_SYMBOL(smp_num_siblings);
+unsigned int __max_threads_per_core __ro_after_init = 1;
+EXPORT_SYMBOL(__max_threads_per_core);
unsigned int __max_dies_per_package __ro_after_init = 1;
EXPORT_SYMBOL(__max_dies_per_package);
@@ -2251,7 +2251,7 @@ void __init arch_cpu_finalize_init(void)
* identify_boot_cpu() initialized SMT support information, let the
* core code know.
*/
- cpu_smt_set_num_threads(smp_num_siblings, smp_num_siblings);
+ cpu_smt_set_num_threads(__max_threads_per_core, __max_threads_per_core);
if (!IS_ENABLED(CONFIG_SMP)) {
pr_info("CPU: ");
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -30,7 +30,7 @@ static int cpu_debug_show(struct seq_fil
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
- seq_printf(m, "smp_num_siblings: %u\n", smp_num_siblings);
+ seq_printf(m, "max_threads_per_core:%u\n", __max_threads_per_core);
return 0;
}
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -433,7 +433,7 @@ static u32 get_nbc_for_node(int node_id)
struct cpuinfo_x86 *c = &boot_cpu_data;
u32 cores_per_node;
- cores_per_node = (c->x86_max_cores * smp_num_siblings) / topology_amd_nodes_per_pkg();
+ cores_per_node = (c->x86_max_cores * __max_threads_per_core) / topology_amd_nodes_per_pkg();
return cores_per_node * node_id;
}
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -76,7 +76,7 @@ bool arch_match_cpu_phys_id(int cpu, u64
#ifdef CONFIG_SMP
static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
{
- if (!(apicid & (smp_num_siblings - 1)))
+ if (!(apicid & (__max_threads_per_core - 1)))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
#else
@@ -429,8 +429,8 @@ void __init topology_init_possible_cpus(
* Can't use order delta here as order(cnta) can be equal
* order(cntb) even if cnta != cntb.
*/
- smp_num_siblings = DIV_ROUND_UP(cntb, cnta);
- pr_info("Max. threads per core: %3u\n", smp_num_siblings);
+ __max_threads_per_core = DIV_ROUND_UP(cntb, cnta);
+ pr_info("Max. threads per core: %3u\n", __max_threads_per_core);
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -936,7 +936,7 @@ static __cpuidle void mwait_idle(void)
void select_idle_routine(const struct cpuinfo_x86 *c)
{
#ifdef CONFIG_SMP
- if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
+ if (boot_option_idle_override == IDLE_POLL && __max_threads_per_core > 1)
pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
#endif
if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -563,7 +563,7 @@ static void __init build_sched_topology(
void set_cpu_sibling_map(int cpu)
{
- bool has_smt = smp_num_siblings > 1;
+ bool has_smt = __max_threads_per_core > 1;
bool has_mp = has_smt || boot_cpu_data.x86_max_cores > 1;
struct cpuinfo_x86 *c = &cpu_data(cpu);
struct cpuinfo_x86 *o;
* [patch 28/30] x86/cpu/topology: Rename topology_max_die_per_package()
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (26 preceding siblings ...)
2024-02-13 21:06 ` [patch 27/30] x86/cpu/topology: Rename smp_num_siblings Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package Thomas Gleixner
2024-02-13 21:06 ` [patch 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores Thomas Gleixner
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
The plural of die is dies.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/events/intel/cstate.c | 2 +-
arch/x86/events/intel/uncore.c | 2 +-
arch/x86/events/intel/uncore_snbep.c | 2 +-
arch/x86/events/rapl.c | 2 +-
arch/x86/include/asm/topology.h | 2 +-
drivers/hwmon/coretemp.c | 2 +-
drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c | 2 +-
drivers/powercap/intel_rapl_common.c | 2 +-
drivers/thermal/intel/intel_hfi.c | 2 +-
drivers/thermal/intel/intel_powerclamp.c | 2 +-
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2 +-
11 files changed, 11 insertions(+), 11 deletions(-)
---
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -834,7 +834,7 @@ static int __init cstate_init(void)
}
if (has_cstate_pkg) {
- if (topology_max_die_per_package() > 1) {
+ if (topology_max_dies_per_package() > 1) {
err = perf_pmu_register(&cstate_pkg_pmu,
"cstate_die", -1);
} else {
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1893,7 +1893,7 @@ static int __init intel_uncore_init(void
return -ENODEV;
__uncore_max_dies =
- topology_max_packages() * topology_max_die_per_package();
+ topology_max_packages() * topology_max_dies_per_package();
id = x86_match_cpu(intel_uncore_match);
if (!id) {
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1406,7 +1406,7 @@ static int topology_gidnid_map(int nodei
*/
for (i = 0; i < 8; i++) {
if (nodeid == GIDNIDMAP(gidnid, i)) {
- if (topology_max_die_per_package() > 1)
+ if (topology_max_dies_per_package() > 1)
die_id = i;
else
die_id = topology_phys_to_logical_pkg(i);
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -674,7 +674,7 @@ static const struct attribute_group *rap
static int __init init_rapl_pmus(void)
{
- int maxdie = topology_max_packages() * topology_max_die_per_package();
+ int maxdie = topology_max_packages() * topology_max_dies_per_package();
size_t size;
size = sizeof(*rapl_pmus) + maxdie * sizeof(struct rapl_pmu *);
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -152,7 +152,7 @@ static inline unsigned int topology_max_
return __max_logical_packages;
}
-static inline unsigned int topology_max_die_per_package(void)
+static inline unsigned int topology_max_dies_per_package(void)
{
return __max_dies_per_package;
}
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -780,7 +780,7 @@ static int __init coretemp_init(void)
if (!x86_match_cpu(coretemp_ids))
return -ENODEV;
- max_zones = topology_max_packages() * topology_max_die_per_package();
+ max_zones = topology_max_packages() * topology_max_dies_per_package();
zone_devices = kcalloc(max_zones, sizeof(struct platform_device *),
GFP_KERNEL);
if (!zone_devices)
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c
@@ -242,7 +242,7 @@ static int __init intel_uncore_init(void
return -ENODEV;
uncore_max_entries = topology_max_packages() *
- topology_max_die_per_package();
+ topology_max_dies_per_package();
uncore_instances = kcalloc(uncore_max_entries,
sizeof(*uncore_instances), GFP_KERNEL);
if (!uncore_instances)
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1564,7 +1564,7 @@ struct rapl_package *rapl_add_package(in
if (id_is_cpu) {
rp->id = topology_logical_die_id(id);
rp->lead_cpu = id;
- if (topology_max_die_per_package() > 1)
+ if (topology_max_dies_per_package() > 1)
snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
topology_physical_package_id(id), topology_die_id(id));
else
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -581,7 +581,7 @@ void __init intel_hfi_init(void)
/* There is one HFI instance per die/package. */
max_hfi_instances = topology_max_packages() *
- topology_max_die_per_package();
+ topology_max_dies_per_package();
/*
* This allocation may fail. CPU hotplug callbacks must check
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -616,7 +616,7 @@ static int powerclamp_idle_injection_reg
poll_pkg_cstate_enable = false;
if (cpumask_equal(cpu_present_mask, idle_injection_cpu_mask)) {
ii_dev = idle_inject_register_full(idle_injection_cpu_mask, idle_inject_update);
- if (topology_max_packages() == 1 && topology_max_die_per_package() == 1)
+ if (topology_max_packages() == 1 && topology_max_dies_per_package() == 1)
poll_pkg_cstate_enable = true;
} else {
ii_dev = idle_inject_register(idle_injection_cpu_mask);
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -494,7 +494,7 @@ static int __init pkg_temp_thermal_init(
if (!x86_match_cpu(pkg_temp_thermal_ids))
return -ENODEV;
- max_id = topology_max_packages() * topology_max_die_per_package();
+ max_id = topology_max_packages() * topology_max_dies_per_package();
zones = kcalloc(max_id, sizeof(struct zone_device *),
GFP_KERNEL);
if (!zones)
* [patch 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (27 preceding siblings ...)
2024-02-13 21:06 ` [patch 28/30] x86/cpu/topology: Rename topology_max_die_per_package() Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores Thomas Gleixner
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Expose properly accounted information and accessors so the fiddling with
other topology variables can be replaced.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
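Not part of the patch: a user-space sketch of how the two system-wide values
are derived from the first registered APIC ID, assuming each bitmap fits
into one 64-bit word and a fixed layout (1 SMT bit, 5 bits below the package
ID). struct topo_model, register_apic(), first_id() and units_per_package()
are hypothetical names for this model only. Because a single package is
inspected, the result implicitly treats that package as representative of
the system.

```c
#include <stdint.h>

/* Assumed APIC ID layout for this sketch only */
#define SMT_SHIFT	1u
#define PKG_SHIFT	5u
#define MAX_IDS		64u

/* Hypothetical model: one 64-bit word per domain, indexed by APIC ID */
struct topo_model {
	uint64_t threads;	/* one bit per registered APIC ID */
	uint64_t cores;		/* one bit per core, at the core's first APIC ID */
};

static void register_apic(struct topo_model *t, unsigned int apicid)
{
	t->threads |= 1ULL << apicid;
	t->cores |= 1ULL << (apicid & ~((1u << SMT_SHIFT) - 1));
}

/* Lowest registered APIC ID, or MAX_IDS if the map is empty */
static unsigned int first_id(uint64_t map)
{
	unsigned int id;

	for (id = 0; id < MAX_IDS; id++) {
		if ((map >> id) & 1)
			return id;
	}
	return MAX_IDS;
}

/* Count the units of @map inside the package which contains @apicid */
static unsigned int units_per_package(uint64_t map, unsigned int apicid)
{
	unsigned int base, end, id, cnt = 0;

	if (apicid >= MAX_IDS)
		return 0;

	/* Remove the bits below the package ID to get the package base */
	base = apicid & ~((1u << PKG_SHIFT) - 1);
	end = base + (1u << PKG_SHIFT);

	for (id = base; id < end && id < MAX_IDS; id++)
		cnt += (unsigned int)((map >> id) & 1);
	return cnt;
}
```

Registering APIC IDs 34 to 37 in this model makes 34 the first ID, 32 the
package base, and gives 2 cores and 4 threads per package. The first ID does
not have to be the package base for the count to cover the whole package.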
arch/x86/include/asm/topology.h | 12 ++++++++++++
arch/x86/kernel/cpu/common.c | 6 ++++++
arch/x86/kernel/cpu/topology.c | 8 +++++++-
3 files changed, 25 insertions(+), 1 deletion(-)
---
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -146,6 +146,8 @@ extern const struct cpumask *cpu_cluster
extern unsigned int __max_dies_per_package;
extern unsigned int __max_logical_packages;
extern unsigned int __max_threads_per_core;
+extern unsigned int __num_threads_per_package;
+extern unsigned int __num_cores_per_package;
static inline unsigned int topology_max_packages(void)
{
@@ -157,6 +159,16 @@ static inline unsigned int topology_max_
return __max_dies_per_package;
}
+static inline unsigned int topology_num_cores_per_package(void)
+{
+ return __num_cores_per_package;
+}
+
+static inline unsigned int topology_num_threads_per_package(void)
+{
+ return __num_threads_per_package;
+}
+
#ifdef CONFIG_X86_LOCAL_APIC
int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level);
#else
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -82,6 +82,12 @@ EXPORT_SYMBOL(__max_dies_per_package);
unsigned int __max_logical_packages __ro_after_init = 1;
EXPORT_SYMBOL(__max_logical_packages);
+unsigned int __num_cores_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__num_cores_per_package);
+
+unsigned int __num_threads_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__num_threads_per_package);
+
static struct ppin_info {
int feature;
int msr_ppin_ctl;
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -392,7 +392,7 @@ void __init topology_init_possible_cpus(
unsigned int disabled = topo_info.nr_disabled_cpus;
unsigned int cnta, cntb, cpu, allowed = 1;
unsigned int total = assigned + disabled;
- u32 apicid;
+ u32 apicid, firstid;
if (!restrict_to_up()) {
if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
@@ -432,6 +432,12 @@ void __init topology_init_possible_cpus(
__max_threads_per_core = DIV_ROUND_UP(cntb, cnta);
pr_info("Max. threads per core: %3u\n", __max_threads_per_core);
+ firstid = find_first_bit(apic_maps[TOPO_SMT_DOMAIN].map, MAX_LOCAL_APIC);
+ __num_cores_per_package = topology_unit_count(firstid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
+ pr_info("Num. cores per package: %3u\n", __num_cores_per_package);
+ __num_threads_per_package = topology_unit_count(firstid, TOPO_SMT_DOMAIN, TOPO_PKG_DOMAIN);
+ pr_info("Num. threads per package: %3u\n", __num_threads_per_package);
+
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
* [patch 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
` (28 preceding siblings ...)
2024-02-13 21:06 ` [patch 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package Thomas Gleixner
@ 2024-02-13 21:06 ` Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores tip-bot2 for Thomas Gleixner
29 siblings, 1 reply; 61+ messages in thread
From: Thomas Gleixner @ 2024-02-13 21:06 UTC (permalink / raw)
To: LKML
Cc: x86, Tom Lendacky, Andrew Cooper, Arjan van de Ven, Huang Rui,
Juergen Gross, Dimitri Sivanich, Sohil Mehta, K Prateek Nayak,
Kan Liang, Zhang Rui, Paul E. McKenney, Feng Tang,
Andy Shevchenko, Michael Kelley, Peter Zijlstra (Intel)
From: Thomas Gleixner <tglx@linutronix.de>
Now that __num_cores_per_package and __num_threads_per_package are
available, cpuinfo::x86_max_cores and the related math all over the place
can be replaced with the ready-to-consume data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
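Not part of the patch: a short user-space sketch contrasting the formula
this patch removes from topology.rst (x86_max_cores * smp_num_siblings) with
a directly counted thread number. struct pkg_layout and all helpers are
hypothetical, and the sketch assumes SMT cores run exactly two threads.

```c
/* Hypothetical description of one package */
struct pkg_layout {
	unsigned int smt_cores;		/* cores which run two threads */
	unsigned int plain_cores;	/* cores which run one thread */
};

static unsigned int num_cores(const struct pkg_layout *p)
{
	return p->smt_cores + p->plain_cores;
}

/* Directly counted threads, the kind of value the new accessor hands out */
static unsigned int num_threads(const struct pkg_layout *p)
{
	return 2 * p->smt_cores + p->plain_cores;
}

static unsigned int max_threads_per_core(const struct pkg_layout *p)
{
	return p->smt_cores ? 2 : 1;
}

/* The removed documentation formula: cores times threads per core */
static unsigned int threads_by_product(const struct pkg_layout *p)
{
	return num_cores(p) * max_threads_per_core(p);
}
```

For a uniform package (8 SMT cores, no plain cores) both approaches give 16.
For 8 SMT cores plus 8 plain cores the product gives 32 while the counted
value is 24, so the two only agree when every core has the same number of
threads.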
Documentation/arch/x86/topology.rst | 24 ++++++++---------------
arch/x86/events/intel/uncore_nhmex.c | 4 +--
arch/x86/events/intel/uncore_snb.c | 8 +++----
arch/x86/events/intel/uncore_snbep.c | 16 +++++++--------
arch/x86/include/asm/processor.h | 2 -
arch/x86/kernel/cpu/cacheinfo.c | 2 -
arch/x86/kernel/cpu/common.c | 1
arch/x86/kernel/cpu/debugfs.c | 3 +-
arch/x86/kernel/cpu/mce/inject.c | 3 --
arch/x86/kernel/cpu/microcode/intel.c | 2 -
arch/x86/kernel/cpu/topology_common.c | 3 --
arch/x86/kernel/smpboot.c | 2 -
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 -
drivers/hwmon/fam15h_power.c | 2 -
14 files changed, 31 insertions(+), 43 deletions(-)
---
--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -47,17 +47,21 @@ AMD nomenclature for package is 'Node'.
Package-related topology information in the kernel:
- - cpuinfo_x86.x86_max_cores:
+ - topology_num_threads_per_package()
- The number of cores in a package. This information is retrieved via CPUID.
+ The number of threads in a package.
- - cpuinfo_x86.x86_max_dies:
+ - topology_num_cores_per_package()
- The number of dies in a package. This information is retrieved via CPUID.
+ The number of cores in a package.
+
+ - topology_max_dies_per_package()
+
+ The maximum number of dies in a package.
- cpuinfo_x86.topo.die_id:
- The physical ID of the die. This information is retrieved via CPUID.
+ The physical ID of the die.
- cpuinfo_x86.topo.pkg_id:
@@ -96,16 +100,6 @@ are SMT- or CMT-type threads.
AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
"core".
-Core-related topology information in the kernel:
-
- - smp_num_siblings:
-
- The number of threads in a core. The number of threads in a package can be
- calculated by::
-
- threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
-
-
Threads
=======
A thread is a single scheduling unit. It's the equivalent to a logical Linux
--- a/arch/x86/events/intel/uncore_nhmex.c
+++ b/arch/x86/events/intel/uncore_nhmex.c
@@ -1221,8 +1221,8 @@ void nhmex_uncore_cpu_init(void)
uncore_nhmex = true;
else
nhmex_uncore_mbox.event_descs = wsmex_uncore_mbox_events;
- if (nhmex_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- nhmex_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (nhmex_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ nhmex_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = nhmex_msr_uncores;
}
/* end of Nehalem-EX uncore support */
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -364,8 +364,8 @@ static struct intel_uncore_type *snb_msr
void snb_uncore_cpu_init(void)
{
uncore_msr_uncores = snb_msr_uncores;
- if (snb_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- snb_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (snb_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ snb_uncore_cbox.num_boxes = topology_num_cores_per_package();
}
static void skl_uncore_msr_init_box(struct intel_uncore_box *box)
@@ -428,8 +428,8 @@ static struct intel_uncore_type *skl_msr
void skl_uncore_cpu_init(void)
{
uncore_msr_uncores = skl_msr_uncores;
- if (skl_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- skl_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (skl_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ skl_uncore_cbox.num_boxes = topology_num_cores_per_package();
snb_uncore_arb.ops = &skl_uncore_msr_ops;
}
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1172,8 +1172,8 @@ static struct intel_uncore_type *snbep_m
void snbep_uncore_cpu_init(void)
{
- if (snbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- snbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (snbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ snbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = snbep_msr_uncores;
}
@@ -1845,8 +1845,8 @@ static struct intel_uncore_type *ivbep_m
void ivbep_uncore_cpu_init(void)
{
- if (ivbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- ivbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (ivbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ ivbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = ivbep_msr_uncores;
}
@@ -2917,8 +2917,8 @@ static bool hswep_has_limit_sbox(unsigne
void hswep_uncore_cpu_init(void)
{
- if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (hswep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ hswep_uncore_cbox.num_boxes = topology_num_cores_per_package();
/* Detect 6-8 core systems with only two SBOXes */
if (hswep_has_limit_sbox(HSWEP_PCU_DID))
@@ -3280,8 +3280,8 @@ static struct event_constraint bdx_uncor
void bdx_uncore_cpu_init(void)
{
- if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (bdx_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ bdx_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = bdx_msr_uncores;
/* Detect systems with no SBOXes */
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -149,8 +149,6 @@ struct cpuinfo_x86 {
unsigned long loops_per_jiffy;
/* protected processor identification number */
u64 ppin;
- /* cpuid returned max cores value: */
- u16 x86_max_cores;
u16 x86_clflush_size;
/* number of cores as seen by the OS: */
u16 booted_cores;
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -301,7 +301,7 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_
eax->split.type = types[leaf];
eax->split.level = levels[leaf];
eax->split.num_threads_sharing = 0;
- eax->split.num_cores_on_die = __this_cpu_read(cpu_info.x86_max_cores) - 1;
+ eax->split.num_cores_on_die = topology_num_cores_per_package();
if (assoc == 0xffff)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1738,7 +1738,6 @@ static void identify_cpu(struct cpuinfo_
c->x86_model = c->x86_stepping = 0; /* So far unknown... */
c->x86_vendor_id[0] = '\0'; /* Unset */
c->x86_model_id[0] = '\0'; /* Unset */
- c->x86_max_cores = 1;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -28,7 +28,8 @@ static int cpu_debug_show(struct seq_fil
seq_printf(m, "l2c_id: %u\n", c->topo.l2c_id);
seq_printf(m, "amd_node_id: %u\n", c->topo.amd_node_id);
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
- seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
+ seq_printf(m, "num_threads: %u\n", __num_threads_per_package);
+ seq_printf(m, "num_cores: %u\n", __num_cores_per_package);
seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
seq_printf(m, "max_threads_per_core:%u\n", __max_threads_per_core);
return 0;
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -430,10 +430,9 @@ static void trigger_thr_int(void *info)
static u32 get_nbc_for_node(int node_id)
{
- struct cpuinfo_x86 *c = &boot_cpu_data;
u32 cores_per_node;
- cores_per_node = (c->x86_max_cores * __max_threads_per_core) / topology_amd_nodes_per_pkg();
+ cores_per_node = topology_num_threads_per_package() / topology_amd_nodes_per_pkg();
return cores_per_node * node_id;
}
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -641,7 +641,7 @@ static __init void calc_llc_size_per_cor
{
u64 llc_size = c->x86_cache_size * 1024ULL;
- do_div(llc_size, c->x86_max_cores);
+ do_div(llc_size, topology_num_cores_per_package());
llc_size_per_core = (unsigned int)llc_size;
}
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -155,9 +155,6 @@ static void topo_set_ids(struct topo_sca
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
- /* Maximum number of cores on this package */
- c->x86_max_cores = topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
-
c->topo.amd_node_id = tscan->amd_node_id;
if (c->x86_vendor == X86_VENDOR_AMD)
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -564,7 +564,7 @@ static void __init build_sched_topology(
void set_cpu_sibling_map(int cpu)
{
bool has_smt = __max_threads_per_core > 1;
- bool has_mp = has_smt || boot_cpu_data.x86_max_cores > 1;
+ bool has_mp = has_smt || topology_num_cores_per_package() > 1;
struct cpuinfo_x86 *c = &cpu_data(cpu);
struct cpuinfo_x86 *o;
int i, threads;
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -451,7 +451,7 @@ static int vangogh_init_smc_tables(struc
#ifdef CONFIG_X86
/* AMD x86 APU only */
- smu->cpu_core_num = boot_cpu_data.x86_max_cores;
+ smu->cpu_core_num = topology_num_cores_per_package();
#else
smu->cpu_core_num = 4;
#endif
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -209,7 +209,7 @@ static ssize_t power1_average_show(struc
* With the new x86 topology modelling, x86_max_cores is the
* compute unit number.
*/
- cu_num = boot_cpu_data.x86_max_cores;
+ cu_num = topology_num_cores_per_package();
ret = read_registers(data);
if (ret)
* [tip: x86/apic] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores
2024-02-13 21:06 ` [patch 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 89b0f15f408f7c4ee98c1ec4c3224852fcbc3274
Gitweb: https://git.kernel.org/tip/89b0f15f408f7c4ee98c1ec4c3224852fcbc3274
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:16 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 16 Feb 2024 15:51:32 +01:00
x86/cpu/topology: Get rid of cpuinfo::x86_max_cores
Now that __num_cores_per_package and __num_threads_per_package are
available, cpuinfo::x86_max_cores and the related math all over the place
can be replaced with the ready to consume data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.176147806@linutronix.de
---
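Note (illustrative only, not part of the commit): the sketch below is a
userspace model of why the old math is replaced. It assumes a made-up
hybrid package described by a per-core thread count array and contrasts
the removed product (x86_max_cores * __max_threads_per_core) with a count
of the threads which actually exist. The function names are hypothetical
and do not exist in the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical model of the removed math:
 * x86_max_cores * __max_threads_per_core
 */
static unsigned int legacy_threads_per_package(unsigned int cores,
					       unsigned int max_threads_per_core)
{
	return cores * max_threads_per_core;
}

/* Count the threads which are really there, one core at a time. */
static unsigned int counted_threads_per_package(const unsigned int *threads_in_core,
						size_t nr_cores)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < nr_cores; i++)
		total += threads_in_core[i];
	return total;
}
```

For a made-up package with four 2-thread cores and four 1-thread cores the
product yields 8 * 2 = 16, while the count yields 12. When every core has
the same number of threads, both agree.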
Documentation/arch/x86/topology.rst | 24 +++++----------
arch/x86/events/intel/uncore_nhmex.c | 4 +--
arch/x86/events/intel/uncore_snb.c | 8 ++---
arch/x86/events/intel/uncore_snbep.c | 16 +++++-----
arch/x86/include/asm/processor.h | 2 +-
arch/x86/kernel/cpu/cacheinfo.c | 2 +-
arch/x86/kernel/cpu/common.c | 1 +-
arch/x86/kernel/cpu/debugfs.c | 3 +-
arch/x86/kernel/cpu/mce/inject.c | 3 +--
arch/x86/kernel/cpu/microcode/intel.c | 2 +-
arch/x86/kernel/cpu/topology_common.c | 3 +--
arch/x86/kernel/smpboot.c | 2 +-
drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 2 +-
drivers/hwmon/fam15h_power.c | 2 +-
14 files changed, 31 insertions(+), 43 deletions(-)
diff --git a/Documentation/arch/x86/topology.rst b/Documentation/arch/x86/topology.rst
index 08ebf9e..7352ab8 100644
--- a/Documentation/arch/x86/topology.rst
+++ b/Documentation/arch/x86/topology.rst
@@ -47,17 +47,21 @@ AMD nomenclature for package is 'Node'.
Package-related topology information in the kernel:
- - cpuinfo_x86.x86_max_cores:
+ - topology_num_threads_per_package()
- The number of cores in a package. This information is retrieved via CPUID.
+ The number of threads in a package.
- - cpuinfo_x86.x86_max_dies:
+ - topology_num_cores_per_package()
- The number of dies in a package. This information is retrieved via CPUID.
+ The number of cores in a package.
+
+ - topology_max_dies_per_package()
+
+ The maximum number of dies in a package.
- cpuinfo_x86.topo.die_id:
- The physical ID of the die. This information is retrieved via CPUID.
+ The physical ID of the die.
- cpuinfo_x86.topo.pkg_id:
@@ -96,16 +100,6 @@ are SMT- or CMT-type threads.
AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
"core".
-Core-related topology information in the kernel:
-
- - smp_num_siblings:
-
- The number of threads in a core. The number of threads in a package can be
- calculated by::
-
- threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
-
-
Threads
=======
A thread is a single scheduling unit. It's the equivalent to a logical Linux
diff --git a/arch/x86/events/intel/uncore_nhmex.c b/arch/x86/events/intel/uncore_nhmex.c
index 56eea2c..92da8aa 100644
--- a/arch/x86/events/intel/uncore_nhmex.c
+++ b/arch/x86/events/intel/uncore_nhmex.c
@@ -1221,8 +1221,8 @@ void nhmex_uncore_cpu_init(void)
uncore_nhmex = true;
else
nhmex_uncore_mbox.event_descs = wsmex_uncore_mbox_events;
- if (nhmex_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- nhmex_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (nhmex_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ nhmex_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = nhmex_msr_uncores;
}
/* end of Nehalem-EX uncore support */
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 7fd4334..9462fd9 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -364,8 +364,8 @@ static struct intel_uncore_type *snb_msr_uncores[] = {
void snb_uncore_cpu_init(void)
{
uncore_msr_uncores = snb_msr_uncores;
- if (snb_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- snb_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (snb_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ snb_uncore_cbox.num_boxes = topology_num_cores_per_package();
}
static void skl_uncore_msr_init_box(struct intel_uncore_box *box)
@@ -428,8 +428,8 @@ static struct intel_uncore_type *skl_msr_uncores[] = {
void skl_uncore_cpu_init(void)
{
uncore_msr_uncores = skl_msr_uncores;
- if (skl_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- skl_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (skl_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ skl_uncore_cbox.num_boxes = topology_num_cores_per_package();
snb_uncore_arb.ops = &skl_uncore_msr_ops;
}
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 3f6bd3e..2eaf0f3 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1172,8 +1172,8 @@ static struct intel_uncore_type *snbep_msr_uncores[] = {
void snbep_uncore_cpu_init(void)
{
- if (snbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- snbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (snbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ snbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = snbep_msr_uncores;
}
@@ -1845,8 +1845,8 @@ static struct intel_uncore_type *ivbep_msr_uncores[] = {
void ivbep_uncore_cpu_init(void)
{
- if (ivbep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- ivbep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (ivbep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ ivbep_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = ivbep_msr_uncores;
}
@@ -2917,8 +2917,8 @@ static bool hswep_has_limit_sbox(unsigned int device)
void hswep_uncore_cpu_init(void)
{
- if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (hswep_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ hswep_uncore_cbox.num_boxes = topology_num_cores_per_package();
/* Detect 6-8 core systems with only two SBOXes */
if (hswep_has_limit_sbox(HSWEP_PCU_DID))
@@ -3280,8 +3280,8 @@ static struct event_constraint bdx_uncore_pcu_constraints[] = {
void bdx_uncore_cpu_init(void)
{
- if (bdx_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores)
- bdx_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
+ if (bdx_uncore_cbox.num_boxes > topology_num_cores_per_package())
+ bdx_uncore_cbox.num_boxes = topology_num_cores_per_package();
uncore_msr_uncores = bdx_msr_uncores;
/* Detect systems with no SBOXes */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index de1648e..326581d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -149,8 +149,6 @@ struct cpuinfo_x86 {
unsigned long loops_per_jiffy;
/* protected processor identification number */
u64 ppin;
- /* cpuid returned max cores value: */
- u16 x86_max_cores;
u16 x86_clflush_size;
/* number of cores as seen by the OS: */
u16 booted_cores;
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index e1d118e..f2241e7 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -301,7 +301,7 @@ amd_cpuid4(int leaf, union _cpuid4_leaf_eax *eax,
eax->split.type = types[leaf];
eax->split.level = levels[leaf];
eax->split.num_threads_sharing = 0;
- eax->split.num_cores_on_die = __this_cpu_read(cpu_info.x86_max_cores) - 1;
+ eax->split.num_cores_on_die = topology_num_cores_per_package();
if (assoc == 0xffff)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c9a1014..05e0b31 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1738,7 +1738,6 @@ static void identify_cpu(struct cpuinfo_x86 *c)
c->x86_model = c->x86_stepping = 0; /* So far unknown... */
c->x86_vendor_id[0] = '\0'; /* Unset */
c->x86_model_id[0] = '\0'; /* Unset */
- c->x86_max_cores = 1;
#ifdef CONFIG_X86_64
c->x86_clflush_size = 64;
c->x86_phys_bits = 36;
diff --git a/arch/x86/kernel/cpu/debugfs.c b/arch/x86/kernel/cpu/debugfs.c
index f40f3ee..3baf3e4 100644
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -28,7 +28,8 @@ static int cpu_debug_show(struct seq_file *m, void *p)
seq_printf(m, "l2c_id: %u\n", c->topo.l2c_id);
seq_printf(m, "amd_node_id: %u\n", c->topo.amd_node_id);
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
- seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
+ seq_printf(m, "num_threads: %u\n", __num_threads_per_package);
+ seq_printf(m, "num_cores: %u\n", __num_cores_per_package);
seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
seq_printf(m, "max_threads_per_core:%u\n", __max_threads_per_core);
return 0;
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 1e32788..94953d7 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -430,10 +430,9 @@ static void trigger_thr_int(void *info)
static u32 get_nbc_for_node(int node_id)
{
- struct cpuinfo_x86 *c = &boot_cpu_data;
u32 cores_per_node;
- cores_per_node = (c->x86_max_cores * __max_threads_per_core) / topology_amd_nodes_per_pkg();
+ cores_per_node = topology_num_threads_per_package() / topology_amd_nodes_per_pkg();
return cores_per_node * node_id;
}
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 857e608..5f04144 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -641,7 +641,7 @@ static __init void calc_llc_size_per_core(struct cpuinfo_x86 *c)
{
u64 llc_size = c->x86_cache_size * 1024ULL;
- do_div(llc_size, c->x86_max_cores);
+ do_div(llc_size, topology_num_cores_per_package());
llc_size_per_core = (unsigned int)llc_size;
}
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index a2c3f8f..a50ae8d 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -155,9 +155,6 @@ static void topo_set_ids(struct topo_scan *tscan)
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
- /* Maximum number of cores on this package */
- c->x86_max_cores = topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
-
c->topo.amd_node_id = tscan->amd_node_id;
if (c->x86_vendor == X86_VENDOR_AMD)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 35c272c..9c1e121 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -564,7 +564,7 @@ static void __init build_sched_topology(void)
void set_cpu_sibling_map(int cpu)
{
bool has_smt = __max_threads_per_core > 1;
- bool has_mp = has_smt || boot_cpu_data.x86_max_cores > 1;
+ bool has_mp = has_smt || topology_num_cores_per_package() > 1;
struct cpuinfo_x86 *c = &cpu_data(cpu);
struct cpuinfo_x86 *o;
int i, threads;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
index 2ff6dee..da1f439 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c
@@ -451,7 +451,7 @@ static int vangogh_init_smc_tables(struct smu_context *smu)
#ifdef CONFIG_X86
/* AMD x86 APU only */
- smu->cpu_core_num = boot_cpu_data.x86_max_cores;
+ smu->cpu_core_num = topology_num_cores_per_package();
#else
smu->cpu_core_num = 4;
#endif
diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index 6307112..9ed2c4b 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -209,7 +209,7 @@ static ssize_t power1_average_show(struct device *dev,
* With the new x86 topology modelling, x86_max_cores is the
* compute unit number.
*/
- cu_num = boot_cpu_data.x86_max_cores;
+ cu_num = topology_num_cores_per_package();
ret = read_registers(data);
if (ret)
* [tip: x86/apic] x86/cpu/topology: Provide __num_[cores|threads]_per_package
2024-02-13 21:06 ` [patch 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: fd43b8ae76e903c76f14d06eb939449bcc3f614f
Gitweb: https://git.kernel.org/tip/fd43b8ae76e903c76f14d06eb939449bcc3f614f
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:14 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:45 +01:00
x86/cpu/topology: Provide __num_[cores|threads]_per_package
Expose properly accounted information and accessors so the fiddling with
other topology variables can be replaced.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.120958987@linutronix.de
---
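Note (illustrative only, not part of the commit): the change below takes
the first registered APIC ID and lets topology_unit_count() count cores
and threads within that package. The internals of topology_unit_count()
are not shown in this mail, so the following is a simplified userspace
model under stated assumptions: a 64-bit map of registered APIC IDs, and
hypothetical smt_shift/pkg_shift values giving the number of APIC ID bits
below the core ID and below the package ID. All model_*() names are made up.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model, not the kernel implementation.
 * Bit N of @map set means APIC ID N is registered. Only APIC IDs
 * 0..63 are modelled, and smt_shift/pkg_shift must be below 6.
 */

/* Model of find_first_bit(): lowest registered APIC ID, 64 if none. */
static unsigned int model_first_id(uint64_t map)
{
	unsigned int id;

	for (id = 0; id < 64; id++) {
		if ((map >> id) & 1)
			return id;
	}
	return 64;
}

/* Count registered APIC IDs in the package which contains @first_id. */
static unsigned int model_threads_per_package(uint64_t map, unsigned int first_id,
					      unsigned int pkg_shift)
{
	unsigned int start = first_id & ~((1u << pkg_shift) - 1);
	unsigned int end = start + (1u << pkg_shift);
	unsigned int id, cnt = 0;

	for (id = start; id < end && id < 64; id++)
		cnt += (unsigned int)((map >> id) & 1);
	return cnt;
}

/* Count cores with at least one registered thread in that package. */
static unsigned int model_cores_per_package(uint64_t map, unsigned int first_id,
					    unsigned int smt_shift, unsigned int pkg_shift)
{
	unsigned int start = first_id & ~((1u << pkg_shift) - 1);
	unsigned int end = start + (1u << pkg_shift);
	unsigned int step = 1u << smt_shift;
	unsigned int id, cnt = 0;

	for (id = start; id < end && id < 64; id += step) {
		/* All APIC IDs which belong to the core starting at @id */
		uint64_t core_mask = ((UINT64_C(1) << step) - 1) << id;

		if (map & core_mask)
			cnt++;
	}
	return cnt;
}
```

With smt_shift = 1, pkg_shift = 4 and the made-up map 0x55ff (APIC IDs 0-7
for four 2-thread cores, 8/10/12/14 for four 1-thread cores) the model
reports 8 cores and 12 threads for the package of the first registered ID.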
arch/x86/include/asm/topology.h | 12 ++++++++++++
arch/x86/kernel/cpu/common.c | 6 ++++++
arch/x86/kernel/cpu/topology.c | 8 +++++++-
3 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 6a71794..76b1d87 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -146,6 +146,8 @@ extern const struct cpumask *cpu_clustergroup_mask(int cpu);
extern unsigned int __max_dies_per_package;
extern unsigned int __max_logical_packages;
extern unsigned int __max_threads_per_core;
+extern unsigned int __num_threads_per_package;
+extern unsigned int __num_cores_per_package;
static inline unsigned int topology_max_packages(void)
{
@@ -157,6 +159,16 @@ static inline unsigned int topology_max_dies_per_package(void)
return __max_dies_per_package;
}
+static inline unsigned int topology_num_cores_per_package(void)
+{
+ return __num_cores_per_package;
+}
+
+static inline unsigned int topology_num_threads_per_package(void)
+{
+ return __num_threads_per_package;
+}
+
#ifdef CONFIG_X86_LOCAL_APIC
int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level);
#else
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb22cb8..c9a1014 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -82,6 +82,12 @@ EXPORT_SYMBOL(__max_dies_per_package);
unsigned int __max_logical_packages __ro_after_init = 1;
EXPORT_SYMBOL(__max_logical_packages);
+unsigned int __num_cores_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__num_cores_per_package);
+
+unsigned int __num_threads_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__num_threads_per_package);
+
static struct ppin_info {
int feature;
int msr_ppin_ctl;
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index b078fac..41dd8e0 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -392,7 +392,7 @@ void __init topology_init_possible_cpus(void)
unsigned int disabled = topo_info.nr_disabled_cpus;
unsigned int cnta, cntb, cpu, allowed = 1;
unsigned int total = assigned + disabled;
- u32 apicid;
+ u32 apicid, firstid;
if (!restrict_to_up()) {
if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
@@ -432,6 +432,12 @@ void __init topology_init_possible_cpus(void)
__max_threads_per_core = DIV_ROUND_UP(cntb, cnta);
pr_info("Max. threads per core: %3u\n", __max_threads_per_core);
+ firstid = find_first_bit(apic_maps[TOPO_SMT_DOMAIN].map, MAX_LOCAL_APIC);
+ __num_cores_per_package = topology_unit_count(firstid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
+ pr_info("Num. cores per package: %3u\n", __num_cores_per_package);
+ __num_threads_per_package = topology_unit_count(firstid, TOPO_SMT_DOMAIN, TOPO_PKG_DOMAIN);
+ pr_info("Num. threads per package: %3u\n", __num_threads_per_package);
+
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
* [tip: x86/apic] x86/cpu/topology: Rename topology_max_die_per_package()
2024-02-13 21:06 ` [patch 28/30] x86/cpu/topology: Rename topology_max_die_per_package() Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: bd745d1c41e7fa56242889eb5dc6df2d7dd5df32
Gitweb: https://git.kernel.org/tip/bd745d1c41e7fa56242889eb5dc6df2d7dd5df32
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:13 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:45 +01:00
x86/cpu/topology: Rename topology_max_die_per_package()
The plural of die is dies.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.065874205@linutronix.de
---
arch/x86/events/intel/cstate.c | 2 +-
arch/x86/events/intel/uncore.c | 2 +-
arch/x86/events/intel/uncore_snbep.c | 2 +-
arch/x86/events/rapl.c | 2 +-
arch/x86/include/asm/topology.h | 2 +-
drivers/hwmon/coretemp.c | 2 +-
drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c | 2 +-
drivers/powercap/intel_rapl_common.c | 2 +-
drivers/thermal/intel/intel_hfi.c | 2 +-
drivers/thermal/intel/intel_powerclamp.c | 2 +-
drivers/thermal/intel/x86_pkg_temp_thermal.c | 2 +-
11 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 4b50a3a..326c8cd 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -834,7 +834,7 @@ static int __init cstate_init(void)
}
if (has_cstate_pkg) {
- if (topology_max_die_per_package() > 1) {
+ if (topology_max_dies_per_package() > 1) {
err = perf_pmu_register(&cstate_pkg_pmu,
"cstate_die", -1);
} else {
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 7927c0b..258e2cd 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1893,7 +1893,7 @@ static int __init intel_uncore_init(void)
return -ENODEV;
__uncore_max_dies =
- topology_max_packages() * topology_max_die_per_package();
+ topology_max_packages() * topology_max_dies_per_package();
id = x86_match_cpu(intel_uncore_match);
if (!id) {
diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index a96496b..3f6bd3e 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1406,7 +1406,7 @@ static int topology_gidnid_map(int nodeid, u32 gidnid)
*/
for (i = 0; i < 8; i++) {
if (nodeid == GIDNIDMAP(gidnid, i)) {
- if (topology_max_die_per_package() > 1)
+ if (topology_max_dies_per_package() > 1)
die_id = i;
else
die_id = topology_phys_to_logical_pkg(i);
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 8d98d46..fb2b196 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -674,7 +674,7 @@ static const struct attribute_group *rapl_attr_update[] = {
static int __init init_rapl_pmus(void)
{
- int maxdie = topology_max_packages() * topology_max_die_per_package();
+ int maxdie = topology_max_packages() * topology_max_dies_per_package();
size_t size;
size = sizeof(*rapl_pmus) + maxdie * sizeof(struct rapl_pmu *);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index f9eb7a7..6a71794 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -152,7 +152,7 @@ static inline unsigned int topology_max_packages(void)
return __max_logical_packages;
}
-static inline unsigned int topology_max_die_per_package(void)
+static inline unsigned int topology_max_dies_per_package(void)
{
return __max_dies_per_package;
}
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index b8fc8d1..b0991dd 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -782,7 +782,7 @@ static int __init coretemp_init(void)
if (!x86_match_cpu(coretemp_ids))
return -ENODEV;
- max_zones = topology_max_packages() * topology_max_die_per_package();
+ max_zones = topology_max_packages() * topology_max_dies_per_package();
zone_devices = kcalloc(max_zones, sizeof(struct platform_device *),
GFP_KERNEL);
if (!zone_devices)
diff --git a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c
index a5e0f5c..b89c0dd 100644
--- a/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c
+++ b/drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c
@@ -242,7 +242,7 @@ static int __init intel_uncore_init(void)
return -ENODEV;
uncore_max_entries = topology_max_packages() *
- topology_max_die_per_package();
+ topology_max_dies_per_package();
uncore_instances = kcalloc(uncore_max_entries,
sizeof(*uncore_instances), GFP_KERNEL);
if (!uncore_instances)
diff --git a/drivers/powercap/intel_rapl_common.c b/drivers/powercap/intel_rapl_common.c
index 2feed03..00c8618 100644
--- a/drivers/powercap/intel_rapl_common.c
+++ b/drivers/powercap/intel_rapl_common.c
@@ -1564,7 +1564,7 @@ struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id
if (id_is_cpu) {
rp->id = topology_logical_die_id(id);
rp->lead_cpu = id;
- if (topology_max_die_per_package() > 1)
+ if (topology_max_dies_per_package() > 1)
snprintf(rp->name, PACKAGE_DOMAIN_NAME_LENGTH, "package-%d-die-%d",
topology_physical_package_id(id), topology_die_id(id));
else
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 3b04c6e..40d664a 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -607,7 +607,7 @@ void __init intel_hfi_init(void)
/* There is one HFI instance per die/package. */
max_hfi_instances = topology_max_packages() *
- topology_max_die_per_package();
+ topology_max_dies_per_package();
/*
* This allocation may fail. CPU hotplug callbacks must check
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index bc6eb0d..4ba6493 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -587,7 +587,7 @@ static int powerclamp_idle_injection_register(void)
poll_pkg_cstate_enable = false;
if (cpumask_equal(cpu_present_mask, idle_injection_cpu_mask)) {
ii_dev = idle_inject_register_full(idle_injection_cpu_mask, idle_inject_update);
- if (topology_max_packages() == 1 && topology_max_die_per_package() == 1)
+ if (topology_max_packages() == 1 && topology_max_dies_per_package() == 1)
poll_pkg_cstate_enable = true;
} else {
ii_dev = idle_inject_register(idle_injection_cpu_mask);
diff --git a/drivers/thermal/intel/x86_pkg_temp_thermal.c b/drivers/thermal/intel/x86_pkg_temp_thermal.c
index 11a7f81..f6c2e59 100644
--- a/drivers/thermal/intel/x86_pkg_temp_thermal.c
+++ b/drivers/thermal/intel/x86_pkg_temp_thermal.c
@@ -494,7 +494,7 @@ static int __init pkg_temp_thermal_init(void)
if (!x86_match_cpu(pkg_temp_thermal_ids))
return -ENODEV;
- max_id = topology_max_packages() * topology_max_die_per_package();
+ max_id = topology_max_packages() * topology_max_dies_per_package();
zones = kcalloc(max_id, sizeof(struct zone_device *),
GFP_KERNEL);
if (!zones)
* [tip: x86/apic] x86/cpu/topology: Rename smp_num_siblings
2024-02-13 21:06 ` [patch 27/30] x86/cpu/topology: Rename smp_num_siblings Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 8078f4d6102f9370b3b7436d25717735d21f5c09
Gitweb: https://git.kernel.org/tip/8078f4d6102f9370b3b7436d25717735d21f5c09
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:12 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:45 +01:00
x86/cpu/topology: Rename smp_num_siblings
It's really a non-intuitive name. Rename it to __max_threads_per_core which
is obvious.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.011307973@linutronix.de
---
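Note (illustrative only, not part of the commit): two places touched by
this rename are worth a small userspace sketch. The comment in
topology_init_possible_cpus() says that an order delta cannot be used to
derive the value, and cpu_mark_primary_thread() masks the APIC ID with
__max_threads_per_core - 1. The sketch assumes that cnta is the number of
cores and cntb the number of threads, which is not visible in this mail.
All model_*() names are hypothetical.

```c
/* Hypothetical helper: smallest k with (1 << k) >= n, for 1 <= n <= 2^31. */
static unsigned int model_count_order(unsigned int n)
{
	unsigned int k = 0;

	while ((1u << k) < n)
		k++;
	return k;
}

/* Round-up integer division, same result as the kernel's DIV_ROUND_UP(). */
static unsigned int model_div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/*
 * Model of the check in cpu_mark_primary_thread(). The mask only
 * works when max_threads_per_core is a power of two.
 */
static int model_is_primary_thread(unsigned int apicid,
				   unsigned int max_threads_per_core)
{
	return !(apicid & (max_threads_per_core - 1));
}
```

For a made-up system with 12 cores and 16 threads both counts have order 4,
so an order delta would yield 1 << 0 = 1 thread per core, while the
round-up division yields 2. With 2 threads per core, APIC IDs 0 and 2 are
primary threads and APIC ID 1 is not.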
arch/x86/include/asm/perf_event_p4.h | 4 ++--
arch/x86/include/asm/smp.h | 2 --
arch/x86/include/asm/topology.h | 1 +
arch/x86/kernel/cpu/common.c | 6 +++---
arch/x86/kernel/cpu/debugfs.c | 2 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/cpu/topology.c | 6 +++---
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/smpboot.c | 2 +-
9 files changed, 13 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/perf_event_p4.h b/arch/x86/include/asm/perf_event_p4.h
index 94de1a0..d65e338 100644
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -181,7 +181,7 @@ static inline u64 p4_clear_ht_bit(u64 config)
static inline int p4_ht_active(void)
{
#ifdef CONFIG_SMP
- return smp_num_siblings > 1;
+ return __max_threads_per_core > 1;
#endif
return 0;
}
@@ -189,7 +189,7 @@ static inline int p4_ht_active(void)
static inline int p4_ht_thread(int cpu)
{
#ifdef CONFIG_SMP
- if (smp_num_siblings == 2)
+ if (__max_threads_per_core == 2)
return cpu != cpumask_first(this_cpu_cpumask_var_ptr(cpu_sibling_map));
#endif
return 0;
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 5318470..54d6d71 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -8,8 +8,6 @@
#include <asm/current.h>
#include <asm/thread_info.h>
-extern unsigned int smp_num_siblings;
-
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 5ce06f3..f9eb7a7 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -145,6 +145,7 @@ extern const struct cpumask *cpu_clustergroup_mask(int cpu);
extern unsigned int __max_dies_per_package;
extern unsigned int __max_logical_packages;
+extern unsigned int __max_threads_per_core;
static inline unsigned int topology_max_packages(void)
{
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 58f9cc7..cb22cb8 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -73,8 +73,8 @@
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
-unsigned int smp_num_siblings __ro_after_init = 1;
-EXPORT_SYMBOL(smp_num_siblings);
+unsigned int __max_threads_per_core __ro_after_init = 1;
+EXPORT_SYMBOL(__max_threads_per_core);
unsigned int __max_dies_per_package __ro_after_init = 1;
EXPORT_SYMBOL(__max_dies_per_package);
@@ -2251,7 +2251,7 @@ void __init arch_cpu_finalize_init(void)
* identify_boot_cpu() initialized SMT support information, let the
* core code know.
*/
- cpu_smt_set_num_threads(smp_num_siblings, smp_num_siblings);
+ cpu_smt_set_num_threads(__max_threads_per_core, __max_threads_per_core);
if (!IS_ENABLED(CONFIG_SMP)) {
pr_info("CPU: ");
diff --git a/arch/x86/kernel/cpu/debugfs.c b/arch/x86/kernel/cpu/debugfs.c
index 543efc4..f40f3ee 100644
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -30,7 +30,7 @@ static int cpu_debug_show(struct seq_file *m, void *p)
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
- seq_printf(m, "smp_num_siblings: %u\n", smp_num_siblings);
+ seq_printf(m, "max_threads_per_core:%u\n", __max_threads_per_core);
return 0;
}
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 2b29045..1e32788 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -433,7 +433,7 @@ static u32 get_nbc_for_node(int node_id)
struct cpuinfo_x86 *c = &boot_cpu_data;
u32 cores_per_node;
- cores_per_node = (c->x86_max_cores * smp_num_siblings) / topology_amd_nodes_per_pkg();
+ cores_per_node = (c->x86_max_cores * __max_threads_per_core) / topology_amd_nodes_per_pkg();
return cores_per_node * node_id;
}
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 7db9df5..b078fac 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -76,7 +76,7 @@ bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
#ifdef CONFIG_SMP
static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
{
- if (!(apicid & (smp_num_siblings - 1)))
+ if (!(apicid & (__max_threads_per_core - 1)))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
#else
@@ -429,8 +429,8 @@ void __init topology_init_possible_cpus(void)
* Can't use order delta here as order(cnta) can be equal
* order(cntb) even if cnta != cntb.
*/
- smp_num_siblings = DIV_ROUND_UP(cntb, cnta);
- pr_info("Max. threads per core: %3u\n", smp_num_siblings);
+ __max_threads_per_core = DIV_ROUND_UP(cntb, cnta);
+ pr_info("Max. threads per core: %3u\n", __max_threads_per_core);
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ab49ade..6121c2b 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -936,7 +936,7 @@ static __cpuidle void mwait_idle(void)
void select_idle_routine(const struct cpuinfo_x86 *c)
{
#ifdef CONFIG_SMP
- if (boot_option_idle_override == IDLE_POLL && smp_num_siblings > 1)
+ if (boot_option_idle_override == IDLE_POLL && __max_threads_per_core > 1)
pr_warn_once("WARNING: polling idle and HT enabled, performance may degrade\n");
#endif
if (x86_idle_set() || boot_option_idle_override == IDLE_POLL)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 9ade685..35c272c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -563,7 +563,7 @@ static void __init build_sched_topology(void)
void set_cpu_sibling_map(int cpu)
{
- bool has_smt = smp_num_siblings > 1;
+ bool has_smt = __max_threads_per_core > 1;
bool has_mp = has_smt || boot_cpu_data.x86_max_cores > 1;
struct cpuinfo_x86 *c = &cpu_data(cpu);
struct cpuinfo_x86 *o;
* [tip: x86/apic] x86/cpu/topology: Retrieve cores per package from topology bitmaps
2024-02-13 21:06 ` [patch 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 3205c9833d69b97e8694efe3e193312dea4c571f
Gitweb: https://git.kernel.org/tip/3205c9833d69b97e8694efe3e193312dea4c571f
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:10 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:45 +01:00
x86/cpu/topology: Retrieve cores per package from topology bitmaps
Similar to other sizing information the number of cores per package can be
established from the topology bitmap.
Provide a function for retrieving that information and replace the buggy
hack in the CPUID evaluation with it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.956858282@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 43 ++++++++++++++++++++++++++-
arch/x86/kernel/cpu/topology.h | 11 +++++++-
arch/x86/kernel/cpu/topology_common.c | 18 +----------
3 files changed, 57 insertions(+), 15 deletions(-)
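Note (not part of the commit): a minimal userspace sketch of the counting idea, assuming a 64-bit word as a stand-in for the kernel's per-domain APIC ID bitmaps. The names count_units(), build_hybrid_maps() and the range_shift parameter are hypothetical and do not mirror the kernel's dom_shifts[] indexing. It models the hybrid layout from the removed comment (8 cores with SMT plus 8 cores without SMT) and shows why counting registered cores gives 16 where the old "threads >> SMT shift" calculation gave 12.

```c
#include <stdint.h>

/* Hypothetical stand-in for a per-domain APIC ID bitmap: one bit per APIC ID, 64 IDs max */
typedef uint64_t apic_map_t;

/*
 * Count the bits set in @unit_map inside the aligned APIC ID range which
 * contains @apicid. @range_shift is the number of APIC ID bits below the
 * enclosing level (an assumption of this sketch).
 */
static unsigned int count_units(apic_map_t unit_map, uint32_t apicid, unsigned int range_shift)
{
	uint32_t start = apicid & ~((1U << range_shift) - 1);
	uint32_t end = start + (1U << range_shift);	/* exclusive end */
	unsigned int cnt = 0;

	for (uint32_t id = start; id < end && id < 64; id++)
		cnt += (unit_map >> id) & 1;
	return cnt;
}

/* One package: 8 SMT cores on APIC IDs 0-15, 8 non-SMT cores on even IDs 16-30 */
static void build_hybrid_maps(apic_map_t *thread_map, apic_map_t *core_map)
{
	*thread_map = 0;
	*core_map = 0;

	for (uint32_t id = 0; id < 16; id++)
		*thread_map |= 1ULL << id;
	for (uint32_t id = 16; id < 32; id += 2)
		*thread_map |= 1ULL << id;

	/* A core is represented by the APIC ID with the SMT bit cleared */
	for (uint32_t id = 0; id < 32; id += 2)
		*core_map |= 1ULL << id;
}
```

With a 5-bit package range, count_units() on the core map yields 16 for any APIC ID in the package, while the thread map yields 24, and 24 >> 1 is the misleading 12 of the old calculation.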
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 29759b4..7db9df5 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -257,6 +257,49 @@ int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
}
EXPORT_SYMBOL_GPL(topology_get_logical_id);
+/**
+ * topology_unit_count - Retrieve the count of specified units at a given topology domain level
+ * @apicid: The APIC ID which specifies the search range
+ * @which_units: The domain level specifying the units to count
+ * @at_level: The domain level at which @which_units have to be counted
+ *
+ * This returns the number of possible units according to the enumerated
+ * information.
+ *
+ * E.g. topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN)
+ * counts the number of possible cores in the package to which @apicid
+ * belongs.
+ *
+ * @at_level must obviously be greater than @which_units to produce useful
+ * results. If @at_level is equal to @which_units the result is
+ * unsurprisingly 1. If @at_level is less than @which_units the result
+ * is by definition undefined and the function returns 0.
+ */
+unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level)
+{
+ /* Remove the bits below @at_level to get the proper level ID of @apicid */
+ unsigned int lvlid = topo_apicid(apicid, at_level);
+ unsigned int id, end, cnt = 0;
+
+ if (lvlid >= MAX_LOCAL_APIC)
+ return 0;
+ if (!test_bit(lvlid, apic_maps[at_level].map))
+ return 0;
+ if (which_units > at_level)
+ return 0;
+ if (which_units == at_level)
+ return 1;
+
+ /* Calculate the exclusive end */
+ end = lvlid + (1U << x86_topo_system.dom_shifts[at_level]);
+ /* Unfortunately there is no bitmap_weight_range() */
+ for (id = find_next_bit(apic_maps[which_units].map, end, lvlid);
+ id < end; id = find_next_bit(apic_maps[which_units].map, end, ++id))
+ cnt++;
+ return cnt;
+}
+
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/**
* topology_hotplug_apic - Handle a physical hotplugged APIC after boot
diff --git a/arch/x86/kernel/cpu/topology.h b/arch/x86/kernel/cpu/topology.h
index 2a3c838..3732629 100644
--- a/arch/x86/kernel/cpu/topology.h
+++ b/arch/x86/kernel/cpu/topology.h
@@ -53,4 +53,15 @@ static inline void topology_update_dom(struct topo_scan *tscan, enum x86_topolog
tscan->dom_ncpus[dom] = ncpus;
}
+#ifdef CONFIG_X86_LOCAL_APIC
+unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level);
+#else
+static inline unsigned int topology_unit_count(u32 apicid, enum x86_topology_domains which_units,
+ enum x86_topology_domains at_level)
+{
+ return 1;
+}
+#endif
+
#endif /* ARCH_X86_TOPOLOGY_H */
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index c21a387..a2c3f8f 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -155,25 +155,15 @@ static void topo_set_ids(struct topo_scan *tscan)
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
+ /* Maximum number of cores on this package */
+ c->x86_max_cores = topology_unit_count(apicid, TOPO_CORE_DOMAIN, TOPO_PKG_DOMAIN);
+
c->topo.amd_node_id = tscan->amd_node_id;
if (c->x86_vendor == X86_VENDOR_AMD)
cpu_topology_fixup_amd(tscan);
}
-static void topo_set_max_cores(struct topo_scan *tscan)
-{
- /*
- * Bug compatible for now. This is broken on hybrid systems:
- * 8 cores SMT + 8 cores w/o SMT
- * tscan.dom_ncpus[TOPO_DIEGRP_DOMAIN] = 24; 24 / 2 = 12 !!
- *
- * Cannot be fixed without further topology enumeration changes.
- */
- tscan->c->x86_max_cores = tscan->dom_ncpus[TOPO_DIEGRP_DOMAIN] >>
- x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
-}
-
void cpu_parse_topology(struct cpuinfo_x86 *c)
{
unsigned int dom, cpu = smp_processor_id();
@@ -201,7 +191,6 @@ void cpu_parse_topology(struct cpuinfo_x86 *c)
}
topo_set_ids(&tscan);
- topo_set_max_cores(&tscan);
}
void __init cpu_init_topology(struct cpuinfo_x86 *c)
@@ -223,7 +212,6 @@ void __init cpu_init_topology(struct cpuinfo_x86 *c)
}
topo_set_ids(&tscan);
- topo_set_max_cores(&tscan);
/*
* AMD systems have Nodes per package which cannot be mapped to
* [tip: x86/apic] x86/cpu/topology: Provide logical pkg/die mapping
2024-02-13 21:06 ` [patch 24/30] x86/cpu/topology: Provide logical pkg/die mapping Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: b7065f4f844c7876ed071b67e2ba57838152bd63
Gitweb: https://git.kernel.org/tip/b7065f4f844c7876ed071b67e2ba57838152bd63
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:07 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Provide logical pkg/die mapping
With the topology bitmaps in place the logical package and die IDs can
trivially be retrieved by determining the bitmap weight of the relevant
topology domain level below the physical ID in question, which yields a
zero-based logical ID.
Provide a function to that effect.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.846136196@linutronix.de
---
arch/x86/include/asm/topology.h | 9 +++++++++
arch/x86/kernel/cpu/topology.c | 28 ++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
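Note (not part of the commit): a minimal userspace sketch of the lookup, assuming a 64-bit word in place of the kernel bitmap. The function name logical_id() and the level_shift parameter (number of APIC ID bits below the level) are hypothetical. The logical ID is the number of registered entries below the masked APIC ID, so sparse physical IDs map to dense zero-based logical IDs.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Map @apicid to a dense logical ID at one topology level.
 * @level_map has one bit per registered level ID, 64 APIC IDs max.
 */
static int logical_id(uint64_t level_map, uint32_t apicid, unsigned int level_shift)
{
	/* Clear the bits below the level to get the level ID */
	uint32_t lvlid = apicid & ~((1U << level_shift) - 1);
	int weight = 0;

	if (lvlid >= 64)
		return -ERANGE;
	if (!((level_map >> lvlid) & 1))
		return -ENODEV;

	/* Count the registered entries strictly below lvlid */
	for (uint32_t id = 0; id < lvlid; id++)
		weight += (level_map >> id) & 1;
	return weight;
}
```

For example, with 16 APIC IDs per package and packages registered at APIC ID 0 and 32 only, APIC ID 35 resolves to logical package 1 although its physical package ID is 2.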
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 94ef1a6..bdd6a98 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -156,6 +156,15 @@ static inline unsigned int topology_max_die_per_package(void)
return __max_dies_per_package;
}
+#ifdef CONFIG_X86_LOCAL_APIC
+int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level);
+#else
+static inline int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
+{
+ return 0;
+}
+#endif
+
#ifdef CONFIG_SMP
#define topology_cluster_id(cpu) (cpu_data(cpu).topo.l2c_id)
#define topology_die_cpumask(cpu) (per_cpu(cpu_die_map, cpu))
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index aea408d..29759b4 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -229,6 +229,34 @@ void __init topology_register_boot_apic(u32 apic_id)
topo_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
+/**
+ * topology_get_logical_id - Retrieve the logical ID at a given topology domain level
+ * @apicid: The APIC ID for which to lookup the logical ID
+ * @at_level: The topology domain level to use
+ *
+ * @apicid must be a full APIC ID, not the normalized variant. It's valid to have
+ * all bits below the domain level specified by @at_level to be clear. So both
+ * real APIC IDs and backshifted normalized APIC IDs work correctly.
+ *
+ * Returns:
+ * - >= 0: The requested logical ID
+ * - -ERANGE: @apicid is out of range
+ * - -ENODEV: @apicid is not registered
+ */
+int topology_get_logical_id(u32 apicid, enum x86_topology_domains at_level)
+{
+ /* Remove the bits below @at_level to get the proper level ID of @apicid */
+ unsigned int lvlid = topo_apicid(apicid, at_level);
+
+ if (lvlid >= MAX_LOCAL_APIC)
+ return -ERANGE;
+ if (!test_bit(lvlid, apic_maps[at_level].map))
+ return -ENODEV;
+ /* Get the number of set bits before @lvlid. */
+ return bitmap_weight(apic_maps[at_level].map, lvlid);
+}
+EXPORT_SYMBOL_GPL(topology_get_logical_id);
+
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/**
* topology_hotplug_apic - Handle a physical hotplugged APIC after boot
* [tip: x86/apic] x86/cpu/topology: Use topology logical mapping mechanism
2024-02-13 21:06 ` [patch 25/30] x86/cpu/topology: Use topology logical mapping mechanism Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 380414be78bf8dbe1c3ed98feb75e2579c4a1bae
Gitweb: https://git.kernel.org/tip/380414be78bf8dbe1c3ed98feb75e2579c4a1bae
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:09 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Use topology logical mapping mechanism
Replace the logical package and die management functionality and retrieve
the logical IDs from the topology bitmaps.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.901865302@linutronix.de
---
arch/x86/include/asm/topology.h | 15 +--
arch/x86/kernel/cpu/common.c | 13 +---
arch/x86/kernel/cpu/topology_common.c | 4 +-
arch/x86/kernel/smpboot.c | 111 +-------------------------
4 files changed, 12 insertions(+), 131 deletions(-)
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index bdd6a98..5ce06f3 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -172,6 +172,13 @@ static inline int topology_get_logical_id(u32 apicid, enum x86_topology_domains
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
+
+static inline int topology_phys_to_logical_pkg(unsigned int pkg)
+{
+ return topology_get_logical_id(pkg << x86_topo_system.dom_shifts[TOPO_PKG_DOMAIN],
+ TOPO_PKG_DOMAIN);
+}
+
extern int __max_smt_threads;
static inline int topology_max_smt_threads(void)
@@ -181,10 +188,6 @@ static inline int topology_max_smt_threads(void)
#include <linux/cpu_smt.h>
-int topology_update_package_map(unsigned int apicid, unsigned int cpu);
-int topology_update_die_map(unsigned int dieid, unsigned int cpu);
-int topology_phys_to_logical_pkg(unsigned int pkg);
-
extern unsigned int __amd_nodes_per_pkg;
static inline unsigned int topology_amd_nodes_per_pkg(void)
@@ -205,10 +208,6 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
}
#else /* CONFIG_SMP */
-static inline int
-topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
-static inline int
-topology_update_die_map(unsigned int dieid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4babe3c..58f9cc7 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1719,18 +1719,6 @@ static void generic_identify(struct cpuinfo_x86 *c)
#endif
}
-static void update_package_map(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_SMP
- unsigned int cpu = smp_processor_id();
-
- BUG_ON(topology_update_package_map(c->topo.pkg_id, cpu));
- BUG_ON(topology_update_die_map(c->topo.die_id, cpu));
-#else
- c->topo.logical_pkg_id = 0;
-#endif
-}
-
/*
* This does the hard work of actually picking apart the CPU stuff...
*/
@@ -1915,7 +1903,6 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
#ifdef CONFIG_X86_32
enable_sep_cpu();
#endif
- update_package_map(c);
x86_spec_ctrl_setup_ap();
update_srbds_msr();
if (boot_cpu_has_bug(X86_BUG_GDS))
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index 0276978..c21a387 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -10,6 +10,7 @@
#include "cpu.h"
struct x86_topology_system x86_topo_system __ro_after_init;
+EXPORT_SYMBOL_GPL(x86_topo_system);
unsigned int __amd_nodes_per_pkg __ro_after_init;
EXPORT_SYMBOL_GPL(__amd_nodes_per_pkg);
@@ -147,6 +148,9 @@ static void topo_set_ids(struct topo_scan *tscan)
c->topo.pkg_id = topo_shift_apicid(apicid, TOPO_PKG_DOMAIN);
c->topo.die_id = topo_shift_apicid(apicid, TOPO_DIE_DOMAIN);
+ c->topo.logical_pkg_id = topology_get_logical_id(apicid, TOPO_PKG_DOMAIN);
+ c->topo.logical_die_id = topology_get_logical_id(apicid, TOPO_DIE_DOMAIN);
+
/* Package relative core ID */
c->topo.core_id = (apicid & topo_domain_mask(TOPO_PKG_DOMAIN)) >>
x86_topo_system.dom_shifts[TOPO_SMT_DOMAIN];
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 93470eb..9ade685 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -125,23 +125,6 @@ struct mwait_cpu_dead {
*/
static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
-/* Logical package management. */
-struct logical_maps {
- u32 phys_pkg_id;
- u32 phys_die_id;
- u32 logical_pkg_id;
- u32 logical_die_id;
-};
-
-/* Temporary workaround until the full topology mechanics is in place */
-static DEFINE_PER_CPU_READ_MOSTLY(struct logical_maps, logical_maps) = {
- .phys_pkg_id = U32_MAX,
- .phys_die_id = U32_MAX,
-};
-
-static unsigned int logical_packages __read_mostly;
-static unsigned int logical_die __read_mostly;
-
/* Maximum number of SMT threads on any online core */
int __read_mostly __max_smt_threads = 1;
@@ -334,103 +317,11 @@ static void notrace start_secondary(void *unused)
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}
-/**
- * topology_phys_to_logical_pkg - Map a physical package id to a logical
- * @phys_pkg: The physical package id to map
- *
- * Returns logical package id or -1 if not found
- */
-int topology_phys_to_logical_pkg(unsigned int phys_pkg)
-{
- int cpu;
-
- for_each_possible_cpu(cpu) {
- if (per_cpu(logical_maps.phys_pkg_id, cpu) == phys_pkg)
- return per_cpu(logical_maps.logical_pkg_id, cpu);
- }
- return -1;
-}
-EXPORT_SYMBOL(topology_phys_to_logical_pkg);
-
-/**
- * topology_phys_to_logical_die - Map a physical die id to logical
- * @die_id: The physical die id to map
- * @cur_cpu: The CPU for which the mapping is done
- *
- * Returns logical die id or -1 if not found
- */
-static int topology_phys_to_logical_die(unsigned int die_id, unsigned int cur_cpu)
-{
- int cpu, proc_id = cpu_data(cur_cpu).topo.pkg_id;
-
- for_each_possible_cpu(cpu) {
- if (per_cpu(logical_maps.phys_pkg_id, cpu) == proc_id &&
- per_cpu(logical_maps.phys_die_id, cpu) == die_id)
- return per_cpu(logical_maps.logical_die_id, cpu);
- }
- return -1;
-}
-
-/**
- * topology_update_package_map - Update the physical to logical package map
- * @pkg: The physical package id as retrieved via CPUID
- * @cpu: The cpu for which this is updated
- */
-int topology_update_package_map(unsigned int pkg, unsigned int cpu)
-{
- int new;
-
- /* Already available somewhere? */
- new = topology_phys_to_logical_pkg(pkg);
- if (new >= 0)
- goto found;
-
- new = logical_packages++;
- if (new != pkg) {
- pr_info("CPU %u Converting physical %u to logical package %u\n",
- cpu, pkg, new);
- }
-found:
- per_cpu(logical_maps.phys_pkg_id, cpu) = pkg;
- per_cpu(logical_maps.logical_pkg_id, cpu) = new;
- cpu_data(cpu).topo.logical_pkg_id = new;
- return 0;
-}
-/**
- * topology_update_die_map - Update the physical to logical die map
- * @die: The die id as retrieved via CPUID
- * @cpu: The cpu for which this is updated
- */
-int topology_update_die_map(unsigned int die, unsigned int cpu)
-{
- int new;
-
- /* Already available somewhere? */
- new = topology_phys_to_logical_die(die, cpu);
- if (new >= 0)
- goto found;
-
- new = logical_die++;
- if (new != die) {
- pr_info("CPU %u Converting physical %u to logical die %u\n",
- cpu, die, new);
- }
-found:
- per_cpu(logical_maps.phys_die_id, cpu) = die;
- per_cpu(logical_maps.logical_die_id, cpu) = new;
- cpu_data(cpu).topo.logical_die_id = new;
- return 0;
-}
-
static void __init smp_store_boot_cpu_info(void)
{
- int id = 0; /* CPU 0 */
- struct cpuinfo_x86 *c = &cpu_data(id);
+ struct cpuinfo_x86 *c = &cpu_data(0);
*c = boot_cpu_data;
- c->cpu_index = id;
- topology_update_package_map(c->topo.pkg_id, id);
- topology_update_die_map(c->topo.die_id, id);
c->initialized = true;
}
* [tip: x86/apic] x86/cpu/topology: Simplify cpu_mark_primary_thread()
2024-02-13 21:06 ` [patch 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread() Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 5e40fb2d4a4c7503cab4f923b7d985dbcf583581
Gitweb: https://git.kernel.org/tip/5e40fb2d4a4c7503cab4f923b7d985dbcf583581
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:06 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Simplify cpu_mark_primary_thread()
No point in creating a mask via fls(). smp_num_siblings is guaranteed to be
a power of 2. So just using (smp_num_siblings - 1) has the same effect.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.791176581@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
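Note (not part of the commit): a small userspace check of the equivalence claimed above. fls_u32() is a hypothetical stand-in for the kernel's fls(), and the helper names are made up for this sketch. Both variants agree for every power of 2, including 1, and diverge as soon as the sibling count is not a power of 2, which is why the guarantee matters.

```c
#include <stdbool.h>

/* Stand-in for fls(): 1-based position of the highest set bit, 0 for x == 0 */
static unsigned int fls_u32(unsigned int x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* The removed variant: build the SMT mask via fls(). @siblings must be >= 1 */
static bool is_primary_old(unsigned int apicid, unsigned int siblings)
{
	unsigned int mask = (1U << (fls_u32(siblings) - 1)) - 1;

	return siblings == 1 || !(apicid & mask);
}

/* The simplified variant: valid only if @siblings is a power of 2 */
static bool is_primary_new(unsigned int apicid, unsigned int siblings)
{
	return !(apicid & (siblings - 1));
}

/* True if both variants agree for all APIC IDs below 256 */
static bool variants_agree(unsigned int siblings)
{
	for (unsigned int apicid = 0; apicid < 256; apicid++) {
		if (is_primary_old(apicid, siblings) != is_primary_new(apicid, siblings))
			return false;
	}
	return true;
}
```

With siblings == 3 the old mask is 1 and the new mask is 2, so APIC ID 1 is classified differently by the two variants.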
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index f3397e2..aea408d 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -76,10 +76,7 @@ bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
#ifdef CONFIG_SMP
static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
{
- /* Isolate the SMT bit(s) in the APICID and check for 0 */
- u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
-
- if (smp_num_siblings == 1 || !(apicid & mask))
+ if (!(apicid & (smp_num_siblings - 1)))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
#else
* [tip: x86/apic] x86/cpu/topology: Mop up primary thread mask handling
2024-02-13 21:06 ` [patch 22/30] x86/cpu/topology: Mop up primary thread mask handling Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 882e0cff9ef340e7a47659a9aab9da64f4b9b847
Gitweb: https://git.kernel.org/tip/882e0cff9ef340e7a47659a9aab9da64f4b9b847
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:05 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Mop up primary thread mask handling
The early initcall to initialize the primary thread mask is no longer
required because topology_init_possible_cpus() can mark primary threads
correctly when initializing the possible and present map as the number of
SMT threads is already determined correctly.
The XENPV workaround is no longer required because XENPV now registers
fake APIC IDs which will just work like any other enumeration.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.736104257@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 29 ++---------------------------
1 file changed, 2 insertions(+), 27 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 630ebe5..f3397e2 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -82,30 +82,6 @@ static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
if (smp_num_siblings == 1 || !(apicid & mask))
cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
}
-
-/*
- * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
- * during early boot. Initialize the primary thread mask before SMP
- * bringup.
- */
-static int __init smp_init_primary_thread_mask(void)
-{
- unsigned int cpu;
-
- /*
- * XEN/PV provides either none or useless topology information.
- * Pretend that all vCPUs are primary threads.
- */
- if (xen_pv_domain()) {
- cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
- return 0;
- }
-
- for (cpu = 0; cpu < topo_info.nr_assigned_cpus; cpu++)
- cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
- return 0;
-}
-early_initcall(smp_init_primary_thread_mask);
#else
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
@@ -151,9 +127,6 @@ static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
#endif
set_cpu_possible(cpu, true);
set_cpu_present(cpu, true);
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apic_id);
}
static __init bool check_for_real_bsp(u32 apic_id)
@@ -282,6 +255,7 @@ int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
set_bit(apic_id, phys_cpu_present_map);
topo_set_cpuids(cpu, apic_id, acpi_id);
+ cpu_mark_primary_thread(cpu, apic_id);
return cpu;
}
@@ -414,6 +388,7 @@ void __init topology_init_possible_cpus(void)
if (apicid == BAD_APICID)
continue;
+ cpu_mark_primary_thread(cpu, apicid);
set_cpu_present(cpu, test_bit(apicid, phys_cpu_present_map));
}
}
* [tip: x86/apic] x86/cpu/topology: Use topology bitmaps for sizing
2024-02-13 21:06 ` [patch 21/30] x86/cpu/topology: Use topology bitmaps for sizing Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 090610ba704a66d7a58919be3bad195f24499ecb
Gitweb: https://git.kernel.org/tip/090610ba704a66d7a58919be3bad195f24499ecb
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:03 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Use topology bitmaps for sizing
Now that all possible APIC IDs are tracked in the topology bitmaps, it's
trivial to retrieve the real information from there.
This gets rid of the guesstimates for the maximal packages and dies per
package as the actual numbers can be determined before a single AP has been
brought up.
The number of SMT threads can now be determined correctly from the bitmaps
in all situations. Up to now a system which has SMT disabled in the BIOS
still claimed to be SMT capable, because the lowest APIC ID bit is
reserved for that and CPUID leaf 0xb/0x1f still enumerates the SMT domain
accordingly. By calculating the bitmap weights of the SMT and the CORE
domains and putting them into relation, the SMT disabled in BIOS situation
is reported correctly: the system is not SMT capable.
It also handles the situation correctly when a hybrid system's boot CPU does
not have SMT as it takes the SMT capability of the APs fully into account.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.681709880@linutronix.de
---
arch/x86/include/asm/smp.h | 3 +--
arch/x86/include/asm/topology.h | 23 ++++++++++++-----------
arch/x86/kernel/cpu/common.c | 9 ++++++---
arch/x86/kernel/cpu/debugfs.c | 2 +-
arch/x86/kernel/cpu/topology.c | 20 +++++++++++++++++++-
arch/x86/kernel/cpu/topology_common.c | 24 ------------------------
arch/x86/kernel/smpboot.c | 16 ----------------
arch/x86/xen/smp.c | 2 --
8 files changed, 39 insertions(+), 60 deletions(-)
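Note (not part of the commit): a userspace sketch of the two sizing calculations, taking the bitmap weights as plain numbers. count_order() is a hypothetical stand-in for get_count_order() restricted to n >= 1, and the other helper names are made up for illustration. It shows why the threads per core value uses a rounded-up division rather than an order delta.

```c
/* Stand-in for get_count_order(): smallest order with (1 << order) >= n, for n >= 1 */
static unsigned int count_order(unsigned int n)
{
	unsigned int order = 0;

	while ((1U << order) < n)
		order++;
	return order;
}

/* Dies per package from the weights of the DIE and PKG domains */
static unsigned int dies_per_package(unsigned int dies, unsigned int pkgs)
{
	return 1U << (count_order(dies) - count_order(pkgs));
}

/* Threads per core from the weights of the SMT and CORE domains */
static unsigned int threads_per_core(unsigned int threads, unsigned int cores)
{
	return (threads + cores - 1) / cores;	/* DIV_ROUND_UP() */
}

/* The order delta variant which the code comment rules out */
static unsigned int threads_per_core_by_order(unsigned int threads, unsigned int cores)
{
	return 1U << (count_order(threads) - count_order(cores));
}
```

Take a hypothetical hybrid part with 2 SMT cores and 8 cores without SMT: 12 threads on 10 cores. Both weights have order 4, so the order delta claims 1 thread per core, while the rounded-up division reports 2. With SMT disabled in the BIOS the weights are equal (e.g. 8 and 8) and the result is 1.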
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index f1510d6..5318470 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -8,7 +8,7 @@
#include <asm/current.h>
#include <asm/thread_info.h>
-extern int smp_num_siblings;
+extern unsigned int smp_num_siblings;
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
@@ -109,7 +109,6 @@ void cpu_disable_common(void);
void native_smp_prepare_boot_cpu(void);
void smp_prepare_cpus_common(void);
void native_smp_prepare_cpus(unsigned int max_cpus);
-void calculate_max_logical_packages(void);
void native_smp_cpus_done(unsigned int max_cpus);
int common_cpu_up(unsigned int cpunum, struct task_struct *tidle);
int native_kick_ap(unsigned int cpu, struct task_struct *tidle);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 3e11a5a..94ef1a6 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -143,7 +143,18 @@ extern const struct cpumask *cpu_clustergroup_mask(int cpu);
#define topology_amd_node_id(cpu) (cpu_data(cpu).topo.amd_node_id)
-extern unsigned int __max_die_per_package;
+extern unsigned int __max_dies_per_package;
+extern unsigned int __max_logical_packages;
+
+static inline unsigned int topology_max_packages(void)
+{
+ return __max_logical_packages;
+}
+
+static inline unsigned int topology_max_die_per_package(void)
+{
+ return __max_dies_per_package;
+}
#ifdef CONFIG_SMP
#define topology_cluster_id(cpu) (cpu_data(cpu).topo.l2c_id)
@@ -152,14 +163,6 @@ extern unsigned int __max_die_per_package;
#define topology_core_cpumask(cpu) (per_cpu(cpu_core_map, cpu))
#define topology_sibling_cpumask(cpu) (per_cpu(cpu_sibling_map, cpu))
-extern unsigned int __max_logical_packages;
-#define topology_max_packages() (__max_logical_packages)
-
-static inline int topology_max_die_per_package(void)
-{
- return __max_die_per_package;
-}
-
extern int __max_smt_threads;
static inline int topology_max_smt_threads(void)
@@ -193,13 +196,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
}
#else /* CONFIG_SMP */
-#define topology_max_packages() (1)
static inline int
topology_update_package_map(unsigned int apicid, unsigned int cpu) { return 0; }
static inline int
topology_update_die_map(unsigned int dieid, unsigned int cpu) { return 0; }
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
-static inline int topology_max_die_per_package(void) { return 1; }
static inline int topology_max_smt_threads(void) { return 1; }
static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline unsigned int topology_amd_nodes_per_pkg(void) { return 0; };
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index b221e14..4babe3c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -73,11 +73,14 @@
u32 elf_hwcap2 __read_mostly;
/* Number of siblings per CPU package */
-int smp_num_siblings = 1;
+unsigned int smp_num_siblings __ro_after_init = 1;
EXPORT_SYMBOL(smp_num_siblings);
-unsigned int __max_die_per_package __read_mostly = 1;
-EXPORT_SYMBOL(__max_die_per_package);
+unsigned int __max_dies_per_package __ro_after_init = 1;
+EXPORT_SYMBOL(__max_dies_per_package);
+
+unsigned int __max_logical_packages __ro_after_init = 1;
+EXPORT_SYMBOL(__max_logical_packages);
static struct ppin_info {
int feature;
diff --git a/arch/x86/kernel/cpu/debugfs.c b/arch/x86/kernel/cpu/debugfs.c
index 86de544..543efc4 100644
--- a/arch/x86/kernel/cpu/debugfs.c
+++ b/arch/x86/kernel/cpu/debugfs.c
@@ -29,7 +29,7 @@ static int cpu_debug_show(struct seq_file *m, void *p)
seq_printf(m, "amd_node_id: %u\n", c->topo.amd_node_id);
seq_printf(m, "amd_nodes_per_pkg: %u\n", topology_amd_nodes_per_pkg());
seq_printf(m, "max_cores: %u\n", c->x86_max_cores);
- seq_printf(m, "max_die_per_pkg: %u\n", __max_die_per_package);
+ seq_printf(m, "max_dies_per_pkg: %u\n", __max_dies_per_package);
seq_printf(m, "smp_num_siblings: %u\n", smp_num_siblings);
return 0;
}
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 07124da..630ebe5 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -348,8 +348,8 @@ void __init topology_init_possible_cpus(void)
{
unsigned int assigned = topo_info.nr_assigned_cpus;
unsigned int disabled = topo_info.nr_disabled_cpus;
+ unsigned int cnta, cntb, cpu, allowed = 1;
unsigned int total = assigned + disabled;
- unsigned int cpu, allowed = 1;
u32 apicid;
if (!restrict_to_up()) {
@@ -372,6 +372,24 @@ void __init topology_init_possible_cpus(void)
total_cpus = allowed;
set_nr_cpu_ids(allowed);
+ cnta = domain_weight(TOPO_PKG_DOMAIN);
+ cntb = domain_weight(TOPO_DIE_DOMAIN);
+ __max_logical_packages = cnta;
+ __max_dies_per_package = 1U << (get_count_order(cntb) - get_count_order(cnta));
+
+ pr_info("Max. logical packages: %3u\n", cnta);
+ pr_info("Max. logical dies: %3u\n", cntb);
+ pr_info("Max. dies per package: %3u\n", __max_dies_per_package);
+
+ cnta = domain_weight(TOPO_CORE_DOMAIN);
+ cntb = domain_weight(TOPO_SMT_DOMAIN);
+ /*
+ * Can't use order delta here as order(cnta) can be equal
+ * order(cntb) even if cnta != cntb.
+ */
+ smp_num_siblings = DIV_ROUND_UP(cntb, cnta);
+ pr_info("Max. threads per core: %3u\n", smp_num_siblings);
+
pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
if (topo_info.nr_rejected_cpus)
pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index b0b68c8..0276978 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -196,16 +196,6 @@ void cpu_parse_topology(struct cpuinfo_x86 *c)
tscan.dom_shifts[dom], x86_topo_system.dom_shifts[dom]);
}
- /* Bug compatible with the existing parsers */
- if (tscan.dom_ncpus[TOPO_SMT_DOMAIN] > smp_num_siblings) {
- if (system_state == SYSTEM_BOOTING) {
- pr_warn_once("CPU%d: SMT detected and enabled late\n", cpu);
- smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
- } else {
- pr_warn_once("CPU%d: SMT detected after init. Too late!\n", cpu);
- }
- }
-
topo_set_ids(&tscan);
topo_set_max_cores(&tscan);
}
@@ -232,20 +222,6 @@ void __init cpu_init_topology(struct cpuinfo_x86 *c)
topo_set_max_cores(&tscan);
/*
- * Bug compatible with the existing code. If the boot CPU does not
- * have SMT this ends up with one sibling. This needs way deeper
- * changes further down the road to get it right during early boot.
- */
- smp_num_siblings = tscan.dom_ncpus[TOPO_SMT_DOMAIN];
-
- /*
- * Neither it's clear whether there are as many dies as the APIC
- * space indicating die level is. But assume that the actual number
- * of CPUs gives a proper indication for now to stay bug compatible.
- */
- __max_die_per_package = tscan.dom_ncpus[TOPO_DIE_DOMAIN] /
- tscan.dom_ncpus[TOPO_DIE_DOMAIN - 1];
- /*
* AMD systems have Nodes per package which cannot be mapped to
* APIC ID.
*/
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 7f85f17..93470eb 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -139,8 +139,6 @@ static DEFINE_PER_CPU_READ_MOSTLY(struct logical_maps, logical_maps) = {
.phys_die_id = U32_MAX,
};
-unsigned int __max_logical_packages __read_mostly;
-EXPORT_SYMBOL(__max_logical_packages);
static unsigned int logical_packages __read_mostly;
static unsigned int logical_die __read_mostly;
@@ -1267,24 +1265,10 @@ void __init native_smp_prepare_boot_cpu(void)
native_pv_lock_init();
}
-void __init calculate_max_logical_packages(void)
-{
- int ncpus;
-
- /*
- * Today neither Intel nor AMD support heterogeneous systems so
- * extrapolate the boot cpu's data to all packages.
- */
- ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
- __max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
- pr_info("Max logical packages: %u\n", __max_logical_packages);
-}
-
void __init native_smp_cpus_done(unsigned int max_cpus)
{
pr_debug("Boot done\n");
- calculate_max_logical_packages();
build_sched_topology();
nmi_selftest();
impress_friends();
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 4b0d6ff..114b362 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -123,8 +123,6 @@ void __init xen_smp_cpus_done(unsigned int max_cpus)
{
if (xen_hvm_domain())
native_smp_cpus_done(max_cpus);
- else
- calculate_max_logical_packages();
}
void xen_smp_send_reschedule(int cpu)
* [tip: x86/apic] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT
2024-02-13 21:06 ` [patch 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 354da4cf57af5d8b5302251204d6077600b6d3d6
Gitweb: https://git.kernel.org/tip/354da4cf57af5d8b5302251204d6077600b6d3d6
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:02 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT
It turns out that XEN/PV Dom0 has halfway usable CPUID/MADT enumeration,
except that it cannot deal with CPUs which are enumerated as disabled in
MADT.
DomU has no MADT and provides at least rudimentary topology information in
CPUID leaves 1 and 4.
For both it's important that there are not more possible Linux CPUs than
vCPUs provided by the hypervisor.
As this is ensured by counting the vCPUs before enumeration happens:
- Lift the restrictions in the CPUID evaluation and the MADT parser
- Utilize MADT registration for Dom0
- Keep the fake APIC ID registration for DomU
- Fix the XEN APIC fake so the readout of the local APIC ID works for
Dom0 via the hypercall and for DomU by returning the registered
fake APIC IDs.
With that the XEN/PV fake approximates usefulness.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.626195405@linutronix.de
---
arch/x86/kernel/acpi/boot.c | 25 ++++++++-----------------
arch/x86/kernel/cpu/topology_common.c | 2 +-
arch/x86/xen/apic.c | 14 +++++++-------
arch/x86/xen/smp_pv.c | 13 ++++++++-----
4 files changed, 24 insertions(+), 30 deletions(-)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index df741fb..4bf82db 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -23,8 +23,6 @@
#include <linux/serial_core.h>
#include <linux/pgtable.h>
-#include <xen/xen.h>
-
#include <asm/e820/api.h>
#include <asm/irqdomain.h>
#include <asm/pci_x86.h>
@@ -166,12 +164,6 @@ static int __init acpi_parse_madt(struct acpi_table_header *table)
return 0;
}
-static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
-{
- if (!xen_pv_domain())
- topology_register_apic(apic_id, acpi_id, present);
-}
-
static bool __init acpi_is_processor_usable(u32 lapic_flags)
{
if (lapic_flags & ACPI_MADT_ENABLED)
@@ -233,7 +225,7 @@ acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
return 0;
}
- acpi_register_lapic(apic_id, processor->uid, enabled);
+ topology_register_apic(apic_id, processor->uid, enabled);
#else
pr_warn("x2apic entry ignored\n");
#endif
@@ -268,9 +260,9 @@ acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
* to not preallocating memory for all NR_CPUS
* when we use CPU hotplug.
*/
- acpi_register_lapic(processor->id, /* APIC ID */
- processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ topology_register_apic(processor->id, /* APIC ID */
+ processor->processor_id, /* ACPI ID */
+ processor->lapic_flags & ACPI_MADT_ENABLED);
has_lapic_cpus = true;
return 0;
@@ -288,9 +280,9 @@ acpi_parse_sapic(union acpi_subtable_headers *header, const unsigned long end)
acpi_table_print_madt_entry(&header->common);
- acpi_register_lapic((processor->id << 8) | processor->eid,/* APIC ID */
- processor->processor_id, /* ACPI ID */
- processor->lapic_flags & ACPI_MADT_ENABLED);
+ topology_register_apic((processor->id << 8) | processor->eid,/* APIC ID */
+ processor->processor_id, /* ACPI ID */
+ processor->lapic_flags & ACPI_MADT_ENABLED);
return 0;
}
@@ -1090,8 +1082,7 @@ static int __init early_acpi_parse_madt_lapic_addr_ovr(void)
return count;
}
- if (!xen_pv_domain())
- register_lapic_address(acpi_lapic_addr);
+ register_lapic_address(acpi_lapic_addr);
return count;
}
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index 3876a33..b0b68c8 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -77,7 +77,7 @@ static bool fake_topology(struct topo_scan *tscan)
topology_set_dom(tscan, TOPO_SMT_DOMAIN, 0, 1);
topology_set_dom(tscan, TOPO_CORE_DOMAIN, 0, 1);
- return tscan->c->cpuid_level < 1 || xen_pv_domain();
+ return tscan->c->cpuid_level < 1;
}
static void parse_topology(struct topo_scan *tscan, bool early)
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 8835d1c..8b045dd 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -43,20 +43,20 @@ static u32 xen_apic_read(u32 reg)
struct xen_platform_op op = {
.cmd = XENPF_get_cpuinfo,
.interface_version = XENPF_INTERFACE_VERSION,
- .u.pcpu_info.xen_cpuid = 0,
};
- int ret;
-
- /* Shouldn't need this as APIC is turned off for PV, and we only
- * get called on the bootup processor. But just in case. */
- if (!xen_initial_domain() || smp_processor_id())
- return 0;
+ int ret, cpu;
if (reg == APIC_LVR)
return 0x14;
if (reg != APIC_ID)
return 0;
+ cpu = smp_processor_id();
+ if (!xen_initial_domain())
+ return cpu ? cpuid_to_apicid[cpu] << 24 : 0;
+
+ op.u.pcpu_info.xen_cpuid = cpu;
+
ret = HYPERVISOR_platform_op(&op);
if (ret)
op.u.pcpu_info.apic_id = BAD_APICID;
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 44706f0..27d1a5b 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -156,11 +156,9 @@ static void __init xen_pv_smp_config(void)
topology_register_boot_apic(apicid++);
- for (i = 1; i < nr_cpu_ids; i++) {
- if (HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL) < 0)
- break;
+ for (i = 1; i < nr_cpu_ids; i++)
topology_register_apic(apicid++, CPU_ACPIID_INVALID, true);
- }
+
/* Pretend to be a proper enumerated system */
smp_found_config = 1;
}
@@ -451,5 +449,10 @@ void __init xen_smp_init(void)
/* Avoid searching for BIOS MP tables */
x86_init.mpparse.find_mptable = x86_init_noop;
x86_init.mpparse.early_parse_smp_cfg = x86_init_noop;
- x86_init.mpparse.parse_smp_cfg = xen_pv_smp_config;
+
+ /* XEN/PV Dom0 has halfways sane topology information via CPUID/MADT */
+ if (xen_initial_domain())
+ x86_init.mpparse.parse_smp_cfg = x86_init_noop;
+ else
+ x86_init.mpparse.parse_smp_cfg = xen_pv_smp_config;
}
* [tip: x86/apic] x86/xen/smp_pv: Count number of vCPUs early
2024-02-13 21:06 ` [patch 19/30] x86/xen/smp_pv: Count number of vCPUs early Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: c8f808231f1fb63553f90d4b3796cb6804d1e693
Gitweb: https://git.kernel.org/tip/c8f808231f1fb63553f90d4b3796cb6804d1e693
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:06:00 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/xen/smp_pv: Count number of vCPUs early
XEN/PV has a completely broken vCPU enumeration scheme, which just works by
chance and provides zero topology information. Each vCPU ends up being a
single core package.
Dom0 provides MADT which can be used for topology information, but that
table is the unmodified host table, which means that there can be more CPUs
registered than the number of vCPUs XEN provides for the dom0 guest.
DomU does not have ACPI. Both Dom0 and DomU rely on counting the possible
vCPUs via a hypercall.
To prepare for using CPUID topology information either via MADT or via fake
APIC IDs, count the number of possible CPUs during early boot and adjust
nr_cpu_ids accordingly.
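The counting step can be illustrated with a minimal user-space sketch. The
probe callback stands in for the VCPUOP_is_up hypercall and, like all demo_
names below, is an assumption made for illustration and not part of the
patch:

```c
#include <assert.h>

/* Hypothetical probe: returns < 0 when the vCPU does not exist. */
typedef int (*demo_vcpu_probe_t)(unsigned int vcpu);

/* Example probe which pretends that the hypervisor provides four vCPUs. */
static int demo_probe_four(unsigned int vcpu)
{
	return vcpu < 4 ? 0 : -1;
}

/*
 * Count vCPUs from 0 upwards and stop at the first one the probe rejects
 * or when @limit (the current upper bound on CPU numbers) is reached.
 */
static unsigned int demo_count_vcpus(demo_vcpu_probe_t probe, unsigned int limit)
{
	unsigned int cpus;

	assert(probe);
	for (cpus = 0; cpus < limit; cpus++) {
		if (probe(cpus) < 0)
			break;
	}
	return cpus;
}
```

The result can only lower the limit, never raise it, which matches the
"if (cpus < nr_cpu_ids)" check in the patch.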
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.571795063@linutronix.de
---
arch/x86/xen/enlighten_pv.c | 3 +++
arch/x86/xen/smp.h | 2 ++
arch/x86/xen/smp_pv.c | 14 ++++++++++++++
3 files changed, 19 insertions(+)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index aeb33e0..ace2eb0 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -200,6 +200,9 @@ static void __init xen_pv_init_platform(void)
xen_set_mtrr_data();
else
mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK);
+
+ /* Adjust nr_cpu_ids before "enumeration" happens */
+ xen_smp_count_cpus();
}
static void __init xen_pv_guest_late_init(void)
diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index c20cbb1..b8efdbc 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -19,6 +19,7 @@ extern void xen_smp_intr_free(unsigned int cpu);
int xen_smp_intr_init_pv(unsigned int cpu);
void xen_smp_intr_free_pv(unsigned int cpu);
+void xen_smp_count_cpus(void);
void xen_smp_cpus_done(unsigned int max_cpus);
void xen_smp_send_reschedule(int cpu);
@@ -44,6 +45,7 @@ static inline int xen_smp_intr_init_pv(unsigned int cpu)
return 0;
}
static inline void xen_smp_intr_free_pv(unsigned int cpu) {}
+static inline void xen_smp_count_cpus(void) { }
#endif /* CONFIG_SMP */
#endif
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 98849b3..44706f0 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -411,6 +411,20 @@ static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
}
+void __init xen_smp_count_cpus(void)
+{
+ unsigned int cpus;
+
+ for (cpus = 0; cpus < nr_cpu_ids; cpus++) {
+ if (HYPERVISOR_vcpu_op(VCPUOP_is_up, cpus, NULL) < 0)
+ break;
+ }
+
+ pr_info("Xen PV: Detected %u vCPUS\n", cpus);
+ if (cpus < nr_cpu_ids)
+ set_nr_cpu_ids(cpus);
+}
+
static const struct smp_ops xen_smp_ops __initconst = {
.smp_prepare_boot_cpu = xen_pv_smp_prepare_boot_cpu,
.smp_prepare_cpus = xen_pv_smp_prepare_cpus,
* [tip: x86/apic] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug
2024-02-13 21:05 ` [patch 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 7cdcdab1a660bbe9f98bf1591c048ce7ccee59e0
Gitweb: https://git.kernel.org/tip/7cdcdab1a660bbe9f98bf1591c048ce7ccee59e0
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:57 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug
The topology bitmaps track all possible APIC IDs which have been registered
during enumeration. As sizing and further topology information is going to
be derived from these bitmaps, reject attempts to hotplug an APIC ID which
was not registered during enumeration.
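A minimal user-space sketch of the check, assuming a single 64-bit word as
the registration bitmap. The demo_ names and the DEMO_MAX_APIC limit are
made up for illustration; the kernel uses test_bit() on the SMT domain
bitmap and MAX_LOCAL_APIC:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_APIC	64

/* One bit per APIC ID which was seen during boot time enumeration. */
static uint64_t demo_registered;

static void demo_enumerate(uint32_t apic_id)
{
	assert(apic_id < DEMO_MAX_APIC);
	demo_registered |= UINT64_C(1) << apic_id;
}

/* Returns 0 when hotplug may proceed, a negative errno value otherwise. */
static int demo_hotplug(uint32_t apic_id)
{
	if (apic_id >= DEMO_MAX_APIC)
		return -EINVAL;

	/* Reject if the APIC ID was not registered during enumeration. */
	if (!(demo_registered & (UINT64_C(1) << apic_id)))
		return -ENODEV;

	return 0;
}
```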
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.462231229@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index c671206..a6d045b 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -272,6 +272,10 @@ int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
if (apic_id >= MAX_LOCAL_APIC)
return -EINVAL;
+ /* Reject if the APIC ID was not registered during enumeration. */
+ if (!test_bit(apic_id, apic_maps[TOPO_SMT_DOMAIN].map))
+ return -ENODEV;
+
cpu = topo_lookup_cpuid(apic_id);
if (cpu < 0) {
if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
* [tip: x86/apic] x86/cpu/topology: Assign hotpluggable CPUIDs during init
2024-02-13 21:05 ` [patch 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: ea2dd8a5d4361ef0b000196043fa407f05b16f1d
Gitweb: https://git.kernel.org/tip/ea2dd8a5d4361ef0b000196043fa407f05b16f1d
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:59 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:44 +01:00
x86/cpu/topology: Assign hotpluggable CPUIDs during init
There is no point in assigning the CPU numbers during ACPI physical
hotplug. The number of possible hotplug CPUs is known when the possible map
is initialized, so the CPU numbers can be associated with the registered
non-present APIC IDs right there.
This allows more code to be put into the __init section and the related
data to be made __ro_after_init.
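The assignment can be sketched in user space as follows. The demo_ names,
the 64-bit bitmaps and the fixed array size are assumptions made for
illustration; the patch itself walks the bitmaps with
find_next_andnot_bit() and is bounded by the number of disabled CPUs:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_APIC	64
#define DEMO_NR_CPUS	8
#define DEMO_BAD_APICID	UINT32_MAX

/*
 * Store every APIC ID which is set in @registered but not in @present
 * into @cpu_to_apicid[], starting at slot @assigned. Returns the new
 * number of assigned slots.
 */
static unsigned int demo_assign_hotplug_ids(uint64_t registered, uint64_t present,
					    uint32_t *cpu_to_apicid,
					    unsigned int assigned)
{
	uint64_t pending = registered & ~present;

	assert(assigned <= DEMO_NR_CPUS);
	for (uint32_t id = 0; id < DEMO_MAX_APIC && assigned < DEMO_NR_CPUS; id++) {
		if (pending & (UINT64_C(1) << id))
			cpu_to_apicid[assigned++] = id;
	}
	return assigned;
}
```

Because the table is complete after init, a later hotplug operation only
has to look up the CPU number and never has to allocate one.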
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.517339971@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index a6d045b..07124da 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -45,7 +45,7 @@ EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
/* Used for CPU number allocation and parallel CPU bringup */
-u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+u32 cpuid_to_apicid[] __ro_after_init = { [0 ... NR_CPUS - 1] = BAD_APICID, };
/* Bitmaps to mark registered APICs at each topology domain */
static struct { DECLARE_BITMAP(map, MAX_LOCAL_APIC); } apic_maps[TOPO_MAX_DOMAIN] __ro_after_init;
@@ -60,7 +60,7 @@ static struct {
unsigned int nr_rejected_cpus;
u32 boot_cpu_apic_id;
u32 real_bsp_apic_id;
-} topo_info __read_mostly = {
+} topo_info __ro_after_init = {
.nr_assigned_cpus = 1,
.boot_cpu_apic_id = BAD_APICID,
.real_bsp_apic_id = BAD_APICID,
@@ -133,7 +133,7 @@ static int topo_lookup_cpuid(u32 apic_id)
return -ENODEV;
}
-static int topo_get_cpunr(u32 apic_id)
+static __init int topo_get_cpunr(u32 apic_id)
{
int cpu = topo_lookup_cpuid(apic_id);
@@ -149,8 +149,6 @@ static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
#endif
- cpuid_to_apicid[cpu] = apic_id;
-
set_cpu_possible(cpu, true);
set_cpu_present(cpu, true);
@@ -206,6 +204,8 @@ static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
cpu = 0;
else
cpu = topo_get_cpunr(apic_id);
+
+ cpuid_to_apicid[cpu] = apic_id;
topo_set_cpuids(cpu, apic_id, acpi_id);
} else {
topo_info.nr_disabled_cpus++;
@@ -277,12 +277,9 @@ int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
return -ENODEV;
cpu = topo_lookup_cpuid(apic_id);
- if (cpu < 0) {
- if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
- return -ENOSPC;
+ if (cpu < 0)
+ return -ENOSPC;
- cpu = topo_assign_cpunr(apic_id);
- }
set_bit(apic_id, phys_cpu_present_map);
topo_set_cpuids(cpu, apic_id, acpi_id);
return cpu;
@@ -353,6 +350,7 @@ void __init topology_init_possible_cpus(void)
unsigned int disabled = topo_info.nr_disabled_cpus;
unsigned int total = assigned + disabled;
unsigned int cpu, allowed = 1;
+ u32 apicid;
if (!restrict_to_up()) {
if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
@@ -381,8 +379,17 @@ void __init topology_init_possible_cpus(void)
init_cpu_present(cpumask_of(0));
init_cpu_possible(cpumask_of(0));
+ /* Assign CPU numbers to non-present CPUs */
+ for (apicid = 0; disabled; disabled--, apicid++) {
+ apicid = find_next_andnot_bit(apic_maps[TOPO_SMT_DOMAIN].map, phys_cpu_present_map,
+ MAX_LOCAL_APIC, apicid);
+ if (apicid >= MAX_LOCAL_APIC)
+ break;
+ cpuid_to_apicid[topo_info.nr_assigned_cpus++] = apicid;
+ }
+
for (cpu = 0; cpu < allowed; cpu++) {
- u32 apicid = cpuid_to_apicid[cpu];
+ apicid = cpuid_to_apicid[cpu];
set_cpu_possible(cpu, true);
* [tip: x86/apic] x86/topology: Add a mechanism to track topology via APIC IDs
2024-02-13 21:05 ` [patch 16/30] x86/topology: Add a mechanism to track topology via APIC IDs Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: f1f758a80516775b5d12d7c93cbedb2a08cd4c98
Gitweb: https://git.kernel.org/tip/f1f758a80516775b5d12d7c93cbedb2a08cd4c98
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:56 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/topology: Add a mechanism to track topology via APIC IDs
Topology on X86 is determined by the registered APIC IDs and the
segmentation information retrieved from CPUID. Depending on the granularity
of the provided CPUID information the most fine grained scheme looks like
this according to Intel terminology:
[PKG][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]
Domain levels which are not enumerated consume 0 bits in the APIC ID. This
provides a consistent view of the topology and allows other information to
be determined precisely, like the number of cores in a package on hybrid
systems, where the existing assumption that number of cores == number of
threads / threads per core does not hold.
Provide per domain level bitmaps which record the APIC ID split into the
domain levels to make later evaluation of domain level specific information
simple. This allows e.g. the logical IDs to be calculated without any extra
logic.
Contrary to the existing registration mechanism this records disabled CPUs,
which are subject to later hotplug as well. That's useful for boot time
sizing of package or die dependent allocations without using heuristics.
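The per domain bitmaps can be illustrated with a minimal user-space sketch.
Everything prefixed with demo_ or DEMO_ is made up for illustration, as are
the three domain levels and the shift values (one thread bit, two core
bits). The kernel code uses x86_topo_system.dom_shifts[] and bitmaps of
MAX_LOCAL_APIC bits instead of a single 64-bit word:

```c
#include <assert.h>
#include <stdint.h>

/* Reduced set of domain levels, lowest level first. */
enum demo_domain { DEMO_SMT, DEMO_CORE, DEMO_PKG, DEMO_MAX_DOMAIN };

#define DEMO_MAX_APIC	64

/* Number of low APIC ID bits consumed up to and including each level. */
static const unsigned int demo_dom_shifts[DEMO_MAX_DOMAIN] = { 1, 3, 6 };

/* One bitmap per domain level, one bit per (masked) APIC ID. */
static uint64_t demo_maps[DEMO_MAX_DOMAIN];

/* Mask out the APIC ID bits below domain level @dom. */
static uint32_t demo_apicid(uint32_t apicid, enum demo_domain dom)
{
	assert(dom < DEMO_MAX_DOMAIN);
	if (dom == DEMO_SMT)
		return apicid;
	return apicid & (UINT32_MAX << demo_dom_shifts[dom - 1]);
}

/* Record @apicid in the bitmap of every domain level. */
static void demo_register(uint32_t apicid)
{
	assert(apicid < DEMO_MAX_APIC);
	for (int dom = DEMO_SMT; dom < DEMO_MAX_DOMAIN; dom++)
		demo_maps[dom] |= UINT64_C(1) << demo_apicid(apicid, (enum demo_domain)dom);
}

/* Number of distinct IDs registered at domain level @dom. */
static unsigned int demo_weight(enum demo_domain dom)
{
	unsigned int n = 0;

	for (uint64_t w = demo_maps[dom]; w; w &= w - 1)
		n++;
	return n;
}
```

Registering the IDs 0, 1, 2, 4, 8 and 9 models two packages where package 0
has one SMT core and two single threaded cores. The bitmap weights are then
6 threads, 4 cores and 2 packages, while threads / threads per core would
claim 3 cores.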
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.406985021@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 48 +++++++++++++++++++++++++++++++--
1 file changed, 46 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index d555331..c671206 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -1,5 +1,27 @@
// SPDX-License-Identifier: GPL-2.0-only
-
+/*
+ * CPU/APIC topology
+ *
+ * The APIC IDs describe the system topology in multiple domain levels.
+ * The CPUID topology parser provides the information which part of the
+ * APIC ID is associated to the individual levels:
+ *
+ * [PACKAGE][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]
+ *
+ * The root space contains the package (socket) IDs.
+ *
+ * Not enumerated levels consume 0 bits space, but conceptually they are
+ * always represented. If e.g. only CORE and THREAD levels are enumerated
+ * then the DIE, MODULE and TILE have the same physical ID as the PACKAGE.
+ *
+ * If SMT is not supported, then the THREAD domain is still used. It then
+ * has the same physical ID as the CORE domain and is the only child of
+ * the core domain.
+ *
+ * This allows a unified view on the system independent of the enumerated
+ * domain levels without requiring any conditionals in the code.
+ */
+#define pr_fmt(fmt) "CPU topo: " fmt
#include <linux/cpu.h>
#include <xen/xen.h>
@@ -9,6 +31,8 @@
#include <asm/mpspec.h>
#include <asm/smp.h>
+#include "cpu.h"
+
/*
* Map cpu index to physical APIC ID
*/
@@ -23,6 +47,9 @@ DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
/* Used for CPU number allocation and parallel CPU bringup */
u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+/* Bitmaps to mark registered APICs at each topology domain */
+static struct { DECLARE_BITMAP(map, MAX_LOCAL_APIC); } apic_maps[TOPO_MAX_DOMAIN] __ro_after_init;
+
/*
* Keep track of assigned, disabled and rejected CPUs. Present assigned
* with 1 as CPU #0 is reserved for the boot CPU.
@@ -39,6 +66,8 @@ static struct {
.real_bsp_apic_id = BAD_APICID,
};
+#define domain_weight(_dom) bitmap_weight(apic_maps[_dom].map, MAX_LOCAL_APIC)
+
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -81,6 +110,17 @@ early_initcall(smp_init_primary_thread_mask);
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
+/*
+ * Convert the APIC ID to a domain level ID by masking out the low bits
+ * below the domain level @dom.
+ */
+static inline u32 topo_apicid(u32 apicid, enum x86_topology_domains dom)
+{
+ if (dom == TOPO_SMT_DOMAIN)
+ return apicid;
+ return apicid & (UINT_MAX << x86_topo_system.dom_shifts[dom - 1]);
+}
+
static int topo_lookup_cpuid(u32 apic_id)
{
int i;
@@ -151,7 +191,7 @@ static __init bool check_for_real_bsp(u32 apic_id)
static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
+ int cpu, dom;
if (present) {
set_bit(apic_id, phys_cpu_present_map);
@@ -170,6 +210,10 @@ static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
} else {
topo_info.nr_disabled_cpus++;
}
+
+ /* Register present and possible CPUs in the domain maps */
+ for (dom = TOPO_SMT_DOMAIN; dom < TOPO_MAX_DOMAIN; dom++)
+ set_bit(topo_apicid(apic_id, dom), apic_maps[dom].map);
}
/**
* [tip: x86/apic] x86/cpu: Detect real BSP on crash kernels
2024-02-13 21:05 ` [patch 15/30] x86/cpu: Detect real BSP on crash kernels Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 5c5682b9f87a3b7bd4833884f300ec673685f6a6
Gitweb: https://git.kernel.org/tip/5c5682b9f87a3b7bd4833884f300ec673685f6a6
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:54 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu: Detect real BSP on crash kernels
When a kdump kernel is started from a crashing CPU then there is no
guarantee that this CPU is the real boot CPU (BSP). If the kdump kernel
tries to online the BSP then the INIT sequence will reset the machine.
There is a command line option to prevent this, but in case of nested kdump
kernels this is wrong.
But that command line option is not required at all because the real
BSP is enumerated as the first CPU by firmware. Support for the only
known system which was different (Voyager) got removed long ago.
Detect whether the boot CPU APIC ID is the first APIC ID enumerated by
the firmware. If the first enumerated APIC ID does not match the boot
CPU APIC ID, then skip registering it, as it belongs to the real BSP.
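The detection logic boils down to a small state machine, sketched here in
user space. The struct and the demo_ names are assumptions made for
illustration; the patch keeps the two IDs in topo_info and additionally
prints warnings:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BAD_APICID	UINT32_MAX

struct demo_bsp_state {
	uint32_t boot_cpu_apic_id;	/* APIC ID of the CPU running this kernel */
	uint32_t real_bsp_apic_id;	/* First APIC ID enumerated by firmware */
};

/* Returns true when @apic_id must be skipped because it is the real BSP. */
static bool demo_check_for_real_bsp(struct demo_bsp_state *s, uint32_t apic_id)
{
	/* The boot CPU APIC ID has to be known before enumeration starts. */
	assert(s->boot_cpu_apic_id != DEMO_BAD_APICID);

	/* Only the first enumerated APIC ID is of interest. */
	if (s->real_bsp_apic_id != DEMO_BAD_APICID)
		return false;

	s->real_bsp_apic_id = apic_id;

	/* A mismatch means this kernel was not started on the real BSP. */
	return apic_id != s->boot_cpu_apic_id;
}
```

On a regular boot the first enumerated ID equals the boot CPU ID and nothing
is skipped. In a crash kernel started on the CPU with APIC ID 4, the first
enumerated ID (say 0) is skipped and all later IDs are registered as usual.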
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.348542071@linutronix.de
---
Documentation/admin-guide/kdump/kdump.rst | 7 +-
Documentation/admin-guide/kernel-parameters.txt | 9 +-
arch/x86/kernel/cpu/topology.c | 97 +++++++++-------
3 files changed, 61 insertions(+), 52 deletions(-)
diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 5762e74..0302a93 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -191,9 +191,7 @@ Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
CPU is enough for kdump kernel to dump vmcore on most of systems.
However, you can also specify nr_cpus=X to enable multiple processors
- in kdump kernel. In this case, "disable_cpu_apicid=" is needed to
- tell kdump kernel which cpu is 1st kernel's BSP. Please refer to
- admin-guide/kernel-parameters.txt for more details.
+ in kdump kernel.
With CONFIG_SMP=n, the above things are not related.
@@ -454,8 +452,7 @@ Notes on loading the dump-capture kernel:
to use multi-thread programs with it, such as parallel dump feature of
makedumpfile. Otherwise, the multi-thread program may have a great
performance degradation. To enable multi-cpu support, you should bring up an
- SMP dump-capture kernel and specify maxcpus/nr_cpus, disable_cpu_apicid=[X]
- options while loading it.
+ SMP dump-capture kernel and specify maxcpus/nr_cpus options while loading it.
* For s390x there are two kdump modes: If a ELF header is specified with
the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 31b3a25..4b9b4d6 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1100,15 +1100,6 @@
Disable TLBIE instruction. Currently does not work
with KVM, with HASH MMU, or with coherent accelerators.
- disable_cpu_apicid= [X86,APIC,SMP]
- Format: <int>
- The number of initial APIC ID for the
- corresponding CPU to be disabled at boot,
- mostly used for the kdump 2nd kernel to
- disable BSP to wake up multiple CPUs without
- causing system reset or hang due to sending
- INIT from AP to BSP.
-
disable_ddw [PPC/PSERIES]
Disable Dynamic DMA Window support. Use this
to workaround buggy firmware.
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index fc47f52..d555331 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -32,18 +32,13 @@ static struct {
unsigned int nr_disabled_cpus;
unsigned int nr_rejected_cpus;
u32 boot_cpu_apic_id;
+ u32 real_bsp_apic_id;
} topo_info __read_mostly = {
.nr_assigned_cpus = 1,
.boot_cpu_apic_id = BAD_APICID,
+ .real_bsp_apic_id = BAD_APICID,
};
-/*
- * Processor to be disabled specified by kernel parameter
- * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
- * avoid undefined behaviour caused by sending INIT from AP to BSP.
- */
-static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -123,6 +118,60 @@ static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
cpu_mark_primary_thread(cpu, apic_id);
}
+static __init bool check_for_real_bsp(u32 apic_id)
+{
+ /*
+ * There is no really good way to detect whether this is a kdump
+ * kernel, but except on the Voyager SMP monstrosity, which is no
+ * longer supported, the real BSP APIC ID is the first one which is
+ * enumerated by firmware. That allows detecting whether the boot
+ * CPU is the real BSP. If it is not, then do not register the APIC
+ * because sending INIT to the real BSP would reset the whole
+ * system.
+ *
+ * The first APIC ID which is enumerated by firmware is detectable
+ * because the boot CPU APIC ID is registered before that without
+ * invoking this code.
+ */
+ if (topo_info.real_bsp_apic_id != BAD_APICID)
+ return false;
+
+ if (apic_id == topo_info.boot_cpu_apic_id) {
+ topo_info.real_bsp_apic_id = apic_id;
+ return false;
+ }
+
+ pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x > %x\n",
+ topo_info.boot_cpu_apic_id, apic_id);
+ pr_warn("Crash kernel detected. Disabling real BSP to prevent machine INIT\n");
+
+ topo_info.real_bsp_apic_id = apic_id;
+ return true;
+}
+
+static __init void topo_register_apic(u32 apic_id, u32 acpi_id, bool present)
+{
+ int cpu;
+
+ if (present) {
+ set_bit(apic_id, phys_cpu_present_map);
+
+ /*
+ * Double registration is valid in case of the boot CPU
+ * APIC because that is registered before the enumeration
+ * of the APICs via firmware parsers or VM guest
+ * mechanisms.
+ */
+ if (apic_id == topo_info.boot_cpu_apic_id)
+ cpu = 0;
+ else
+ cpu = topo_get_cpunr(apic_id);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
+ } else {
+ topo_info.nr_disabled_cpus++;
+ }
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -131,16 +180,13 @@ static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
*/
void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
-
if (apic_id >= MAX_LOCAL_APIC) {
pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
topo_info.nr_rejected_cpus++;
return;
}
- if (disabled_cpu_apicid == apic_id) {
- pr_info("Disabling CPU as requested via 'disable_cpu_apicid=0x%x'.\n", apic_id);
+ if (check_for_real_bsp(apic_id)) {
topo_info.nr_rejected_cpus++;
return;
}
@@ -152,23 +198,7 @@ void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
return;
}
- if (present) {
- set_bit(apic_id, phys_cpu_present_map);
-
- /*
- * Double registration is valid in case of the boot CPU
- * APIC because that is registered before the enumeration
- * of the APICs via firmware parsers or VM guest
- * mechanisms.
- */
- if (apic_id == topo_info.boot_cpu_apic_id)
- cpu = 0;
- else
- cpu = topo_get_cpunr(apic_id);
- topo_set_cpuids(cpu, apic_id, acpi_id);
- } else {
- topo_info.nr_disabled_cpus++;
- }
+ topo_register_apic(apic_id, acpi_id, present);
}
/**
@@ -182,7 +212,7 @@ void __init topology_register_boot_apic(u32 apic_id)
WARN_ON_ONCE(topo_info.boot_cpu_apic_id != BAD_APICID);
topo_info.boot_cpu_apic_id = apic_id;
- topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
+ topo_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
#ifdef CONFIG_ACPI_HOTPLUG_CPU
@@ -335,12 +365,3 @@ static int __init setup_possible_cpus(char *str)
}
early_param("possible_cpus", setup_possible_cpus);
#endif
-
-static int __init apic_set_disabled_cpu_apicid(char *arg)
-{
- if (!arg || !get_option(&arg, &disabled_cpu_apicid))
- return -EINVAL;
-
- return 0;
-}
-early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
^ permalink raw reply related [flat|nested] 61+ messages in thread
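Note (not part of the patch): the real-BSP detection described in the check_for_real_bsp() comment can be modelled in user space. This is only a sketch under stated assumptions: struct bsp_state, is_foreign_real_bsp() and MODEL_BAD_APICID are hypothetical stand-ins for topo_info, check_for_real_bsp() and BAD_APICID, and the warning output is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BAD_APICID 0xFFFFFFFFu	/* stand-in for the kernel's BAD_APICID */

/* Hypothetical model of the two topo_info fields involved */
struct bsp_state {
	uint32_t boot_apicid;		/* APIC ID of the CPU we booted on */
	uint32_t real_bsp_apicid;	/* first APIC ID enumerated by firmware */
};

/*
 * Returns true when @apicid is the first firmware-enumerated APIC ID and
 * differs from the boot CPU's APIC ID. In that case the caller must not
 * register it, because that APIC belongs to the real BSP.
 */
static bool is_foreign_real_bsp(struct bsp_state *s, uint32_t apicid)
{
	/* The boot CPU APIC ID has to be known before enumeration starts */
	assert(s->boot_apicid != MODEL_BAD_APICID);

	/* Only the first enumerated APIC ID is of interest */
	if (s->real_bsp_apicid != MODEL_BAD_APICID)
		return false;

	s->real_bsp_apicid = apicid;
	return apicid != s->boot_apicid;
}
```

In a regular boot the first enumerated APIC ID equals the boot CPU's ID, so nothing is rejected. In a crash kernel which booted on an AP, only the first enumerated ID is rejected and every later one passes.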
* [tip: x86/apic] x86/cpu/topology: Rework possible CPU management
2024-02-13 21:05 ` [patch 14/30] x86/cpu/topology: Rework possible CPU management Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 7c0edad3643f4493c4dafa6f5dfcfb1a86432156
Gitweb: https://git.kernel.org/tip/7c0edad3643f4493c4dafa6f5dfcfb1a86432156
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:53 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu/topology: Rework possible CPU management
Managing possible CPUs is an unreadable and incomprehensible maze. Aside from
that, it's backwards because it applies command line limits after
registering all APICs.
Rewrite it so that it:
- Applies the command line limits upfront so that only the allowed number
of APIC IDs can be registered.
- Applies any late restrictions in an understandable way
- Uses simple min_t() calculations which are trivial to follow.
- Provides a separate function for resetting to UP mode late in the
bringup process.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.290098853@linutronix.de
---
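Note (not part of the commit): the min_t() based sizing described above can be sketched as a small user-space model. It assumes hypothetical names (struct topo_counts, min_uint(), size_possible_cpus()) and leaves out the WARN_ON_ONCE() path, the log output and the CPU mask updates of topology_init_possible_cpus().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's topo_info counters */
struct topo_counts {
	unsigned int assigned;	/* CPUs with a registered, present APIC */
	unsigned int disabled;	/* CPUs enumerated as not present (hotpluggable) */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Clamp assigned + disabled to the limit which the command line left in
 * @nr_cpu_ids, or to a single CPU when SMP is not usable (@up_only).
 * Returns the number of possible CPUs and updates @tc in place.
 */
static unsigned int size_possible_cpus(struct topo_counts *tc,
				       unsigned int nr_cpu_ids, bool up_only)
{
	unsigned int total = tc->assigned + tc->disabled;
	unsigned int allowed = 1;

	/* The boot CPU is always accounted for */
	assert(tc->assigned >= 1 && nr_cpu_ids >= 1);

	if (!up_only)
		allowed = min_uint(total, nr_cpu_ids);

	/* Present CPUs are served first, the remainder is hotplug space */
	tc->assigned = min_uint(allowed, tc->assigned);
	tc->disabled = allowed - tc->assigned;
	return allowed;
}
```

With 8 assigned and 4 disabled CPUs, a limit of 10 keeps all 8 present CPUs and leaves 2 hotplug slots, while a limit of 6 keeps 6 present CPUs and no hotplug slots.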
arch/x86/include/asm/apic.h | 5 +-
arch/x86/include/asm/cpu.h | 10 +--
arch/x86/include/asm/topology.h | 1 +-
arch/x86/kernel/cpu/topology.c | 176 ++++++++++++++++++-------------
arch/x86/kernel/setup.c | 9 +--
arch/x86/kernel/smpboot.c | 6 +-
6 files changed, 118 insertions(+), 89 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 28e9aa4..94ce0f7 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -175,6 +175,9 @@ extern void topology_register_apic(u32 apic_id, u32 acpi_id, bool present);
extern void topology_register_boot_apic(u32 apic_id);
extern int topology_hotplug_apic(u32 apic_id, u32 acpi_id);
extern void topology_hotunplug_apic(unsigned int cpu);
+extern void topology_apply_cmdline_limits_early(void);
+extern void topology_init_possible_cpus(void);
+extern void topology_reset_possible_cpus_up(void);
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
@@ -190,6 +193,8 @@ static inline void apic_intr_mode_init(void) { }
static inline void lapic_assign_system_vectors(void) { }
static inline void lapic_assign_legacy_vector(unsigned int i, bool r) { }
static inline bool apic_needs_pit(void) { return true; }
+static inline void topology_apply_cmdline_limits_early(void) { }
+static inline void topology_init_possible_cpus(void) { }
#endif /* !CONFIG_X86_LOCAL_APIC */
#ifdef CONFIG_X86_X2APIC
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index f8f9a9b..aa30fd8 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -9,18 +9,10 @@
#include <linux/percpu.h>
#include <asm/ibt.h>
-#ifdef CONFIG_SMP
-
-extern void prefill_possible_map(void);
-
-#else /* CONFIG_SMP */
-
-static inline void prefill_possible_map(void) {}
-
+#ifndef CONFIG_SMP
#define cpu_physical_id(cpu) boot_cpu_physical_apicid
#define cpu_acpi_id(cpu) 0
#define safe_smp_processor_id() 0
-
#endif /* CONFIG_SMP */
#ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index cb6bafd..3e11a5a 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -191,6 +191,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
}
+
#else /* CONFIG_SMP */
#define topology_max_packages() (1)
static inline int
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 80625f4..fc47f52 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -5,6 +5,7 @@
#include <xen/xen.h>
#include <asm/apic.h>
+#include <asm/io_apic.h>
#include <asm/mpspec.h>
#include <asm/smp.h>
@@ -85,73 +86,6 @@ early_initcall(smp_init_primary_thread_mask);
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-static int __initdata setup_possible_cpus = -1;
-
-/*
- * cpu_possible_mask should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and don't expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_mask on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * - Ashok Raj
- *
- * Three ways to find out the number of additional hotplug CPUs:
- * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
- * - The user can overwrite it with possible_cpus=NUM
- * - Otherwise don't reserve additional CPUs.
- * We do this because additional CPUs waste a lot of memory.
- * -AK
- */
-__init void prefill_possible_map(void)
-{
- unsigned int num_processors = topo_info.nr_assigned_cpus;
- unsigned int disabled_cpus = topo_info.nr_disabled_cpus;
- int i, possible;
-
- i = setup_max_cpus ?: 1;
- if (setup_possible_cpus == -1) {
- possible = topo_info.nr_assigned_cpus;
-#ifdef CONFIG_HOTPLUG_CPU
- if (setup_max_cpus)
- possible += num_processors;
-#else
- if (possible > i)
- possible = i;
-#endif
- } else
- possible = setup_possible_cpus;
-
- total_cpus = max_t(int, possible, num_processors + disabled_cpus);
-
- /* nr_cpu_ids could be reduced via nr_cpus= */
- if (possible > nr_cpu_ids) {
- pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
- possible, nr_cpu_ids);
- possible = nr_cpu_ids;
- }
-
-#ifdef CONFIG_HOTPLUG_CPU
- if (!setup_max_cpus)
-#endif
- if (possible > i) {
- pr_warn("%d Processors exceeds max_cpus limit of %u\n",
- possible, setup_max_cpus);
- possible = i;
- }
-
- set_nr_cpu_ids(possible);
-
- pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
- possible, max_t(int, possible - num_processors, 0));
-
- reset_cpu_possible_mask();
-
- for (i = 0; i < possible; i++)
- set_cpu_possible(i, true);
-}
-
static int topo_lookup_cpuid(u32 apic_id)
{
int i;
@@ -293,12 +227,114 @@ void topology_hotunplug_apic(unsigned int cpu)
}
#endif
-static int __init _setup_possible_cpus(char *str)
+#ifdef CONFIG_SMP
+static unsigned int max_possible_cpus __initdata = NR_CPUS;
+
+/**
+ * topology_apply_cmdline_limits_early - Apply topology command line limits early
+ *
+ * Ensure that command line limits are in effect before firmware parsing
+ * takes place.
+ */
+void __init topology_apply_cmdline_limits_early(void)
{
- get_option(&str, &setup_possible_cpus);
+ unsigned int possible = nr_cpu_ids;
+
+ /* 'maxcpus=0' 'nosmp' 'nolapic' 'disableapic' 'noapic' */
+ if (!setup_max_cpus || ioapic_is_disabled || apic_is_disabled)
+ possible = 1;
+
+ /* 'possible_cpus=N' */
+ possible = min_t(unsigned int, max_possible_cpus, possible);
+
+ if (possible < nr_cpu_ids) {
+ pr_info("Limiting to %u possible CPUs\n", possible);
+ set_nr_cpu_ids(possible);
+ }
+}
+
+static __init bool restrict_to_up(void)
+{
+ if (!smp_found_config || ioapic_is_disabled)
+ return true;
+ /*
+ * XEN PV is special as it does not advertise the local APIC
+ * properly, but provides a fake topology for it so that the
+ * infrastructure works. So don't apply the restrictions vs. APIC
+ * here.
+ */
+ if (xen_pv_domain())
+ return false;
+
+ return apic_is_disabled;
+}
+
+void __init topology_init_possible_cpus(void)
+{
+ unsigned int assigned = topo_info.nr_assigned_cpus;
+ unsigned int disabled = topo_info.nr_disabled_cpus;
+ unsigned int total = assigned + disabled;
+ unsigned int cpu, allowed = 1;
+
+ if (!restrict_to_up()) {
+ if (WARN_ON_ONCE(assigned > nr_cpu_ids)) {
+ disabled += assigned - nr_cpu_ids;
+ assigned = nr_cpu_ids;
+ }
+ allowed = min_t(unsigned int, total, nr_cpu_ids);
+ }
+
+ if (total > allowed)
+ pr_warn("%u possible CPUs exceed the limit of %u\n", total, allowed);
+
+ assigned = min_t(unsigned int, allowed, assigned);
+ disabled = allowed - assigned;
+
+ topo_info.nr_assigned_cpus = assigned;
+ topo_info.nr_disabled_cpus = disabled;
+
+ total_cpus = allowed;
+ set_nr_cpu_ids(allowed);
+
+ pr_info("Allowing %u present CPUs plus %u hotplug CPUs\n", assigned, disabled);
+ if (topo_info.nr_rejected_cpus)
+ pr_info("Rejected CPUs %u\n", topo_info.nr_rejected_cpus);
+
+ init_cpu_present(cpumask_of(0));
+ init_cpu_possible(cpumask_of(0));
+
+ for (cpu = 0; cpu < allowed; cpu++) {
+ u32 apicid = cpuid_to_apicid[cpu];
+
+ set_cpu_possible(cpu, true);
+
+ if (apicid == BAD_APICID)
+ continue;
+
+ set_cpu_present(cpu, test_bit(apicid, phys_cpu_present_map));
+ }
+}
+
+/*
+ * Late SMP disable after sizing CPU masks when APIC/IOAPIC setup failed.
+ */
+void __init topology_reset_possible_cpus_up(void)
+{
+ init_cpu_present(cpumask_of(0));
+ init_cpu_possible(cpumask_of(0));
+
+ bitmap_zero(phys_cpu_present_map, MAX_LOCAL_APIC);
+ if (topo_info.boot_cpu_apic_id != BAD_APICID)
+ set_bit(topo_info.boot_cpu_apic_id, phys_cpu_present_map);
+}
+
+static int __init setup_possible_cpus(char *str)
+{
+ get_option(&str, &max_possible_cpus);
return 0;
}
-early_param("possible_cpus", _setup_possible_cpus);
+early_param("possible_cpus", setup_possible_cpus);
+#endif
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index b1e52ac..4e320d4 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1131,6 +1131,8 @@ void __init setup_arch(char **cmdline_p)
early_quirks();
+ topology_apply_cmdline_limits_early();
+
/*
* Parse SMP configuration. Try ACPI first and then the platform
* specific parser.
@@ -1138,13 +1140,10 @@ void __init setup_arch(char **cmdline_p)
acpi_boot_init();
x86_init.mpparse.parse_smp_cfg();
- /*
- * Systems w/o ACPI and mptables might not have it mapped the local
- * APIC yet, but prefill_possible_map() might need to access it.
- */
+ /* Last opportunity to detect and map the local APIC */
init_apic_mappings();
- prefill_possible_map();
+ topology_init_possible_cpus();
init_cpu_to_node();
init_gi_nodes();
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d850fac..7f85f17 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1147,11 +1147,7 @@ static __init void disable_smp(void)
pr_info("SMP disabled\n");
disable_ioapic_support();
-
- init_cpu_present(cpumask_of(0));
- init_cpu_possible(cpumask_of(0));
-
- reset_phys_cpu_present_map(smp_found_config ? boot_cpu_physical_apicid : 0);
+ topology_reset_possible_cpus_up();
cpumask_set_cpu(0, topology_sibling_cpumask(0));
cpumask_set_cpu(0, topology_core_cpumask(0));
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [tip: x86/apic] x86/cpu/topology: Sanitize the APIC admission logic
2024-02-13 21:05 ` [patch 13/30] x86/cpu/topology: Sanitize the APIC admission logic Thomas Gleixner
@ 2024-02-16 15:16 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:16 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 0e53e7b656cf5aa67c08eca381cec858478195a7
Gitweb: https://git.kernel.org/tip/0e53e7b656cf5aa67c08eca381cec858478195a7
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:52 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu/topology: Sanitize the APIC admission logic
Move the actually required content of generic_processor_info() into the call
sites and use common helper functions for them. This separates the early
boot registration and the ACPI hotplug mechanism completely, which allows
further cleanups and improvements.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.230433953@linutronix.de
---
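Note (not part of the commit): the CPU number assignment which the helpers in this patch implement can be modelled in user space. This is a sketch under assumptions: struct cpu_map, cpu_map_init(), cpu_map_lookup(), cpu_map_register(), MODEL_NR_CPUS and MODEL_BAD_APICID are hypothetical names, and the per-CPU data and CPU mask updates are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS		4		/* hypothetical table size */
#define MODEL_BAD_APICID	0xFFFFFFFFu	/* stand-in for BAD_APICID */

/* Hypothetical model of cpuid_to_apicid[] plus the assignment counter */
struct cpu_map {
	uint32_t apicid[MODEL_NR_CPUS];
	unsigned int nr_assigned;
	uint32_t boot_apicid;
};

/* CPU #0 is reserved for the boot CPU, so the counter starts at 1 */
static void cpu_map_init(struct cpu_map *m, uint32_t boot_apicid)
{
	for (unsigned int i = 0; i < MODEL_NR_CPUS; i++)
		m->apicid[i] = MODEL_BAD_APICID;
	m->apicid[0] = boot_apicid;
	m->nr_assigned = 1;
	m->boot_apicid = boot_apicid;
}

/* The CPU number to APIC ID mapping is persistent once established */
static int cpu_map_lookup(const struct cpu_map *m, uint32_t apicid)
{
	for (unsigned int i = 0; i < m->nr_assigned; i++) {
		if (m->apicid[i] == apicid)
			return (int)i;
	}
	return -1;
}

/* Returns the CPU number for @apicid, or -1 when the table is full */
static int cpu_map_register(struct cpu_map *m, uint32_t apicid)
{
	int cpu;

	assert(apicid != MODEL_BAD_APICID);

	/* Double registration of the boot CPU APIC is valid */
	if (apicid == m->boot_apicid)
		return 0;

	cpu = cpu_map_lookup(m, apicid);
	if (cpu < 0) {
		if (m->nr_assigned >= MODEL_NR_CPUS)
			return -1;
		cpu = (int)m->nr_assigned++;
	}
	m->apicid[cpu] = apicid;
	return cpu;
}
```

Registering the same APIC ID twice yields the same CPU number, and the boot CPU always maps to CPU 0 regardless of where firmware lists it.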
arch/x86/kernel/cpu/topology.c | 159 +++++++++++++++-----------------
1 file changed, 77 insertions(+), 82 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 815e3ee..80625f4 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -30,8 +30,10 @@ static struct {
unsigned int nr_assigned_cpus;
unsigned int nr_disabled_cpus;
unsigned int nr_rejected_cpus;
+ u32 boot_cpu_apic_id;
} topo_info __read_mostly = {
.nr_assigned_cpus = 1,
+ .boot_cpu_apic_id = BAD_APICID,
};
/*
@@ -83,78 +85,6 @@ early_initcall(smp_init_primary_thread_mask);
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-static int topo_lookup_cpuid(u32 apic_id)
-{
- int i;
-
- /* CPU# to APICID mapping is persistent once it is established */
- for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
- if (cpuid_to_apicid[i] == apic_id)
- return i;
- }
- return -ENODEV;
-}
-
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(u32 apic_id)
-{
- int cpu = topo_lookup_cpuid(apic_id);
-
- if (cpu >= 0)
- return cpu;
-
- return topo_info.nr_assigned_cpus++;
-}
-
-static void cpu_update_apic(unsigned int cpu, u32 apic_id)
-{
-#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
-#endif
- cpuid_to_apicid[cpu] = apic_id;
- set_cpu_possible(cpu, true);
- set_bit(apic_id, phys_cpu_present_map);
- set_cpu_present(cpu, true);
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apic_id);
-}
-
-static int generic_processor_info(int apicid)
-{
- int cpu;
-
- /* The boot CPU must be set before MADT/MPTABLE parsing happens */
- if (cpuid_to_apicid[0] == BAD_APICID)
- panic("Boot CPU APIC not registered yet\n");
-
- if (apicid == boot_cpu_physical_apicid)
- return 0;
-
- if (disabled_cpu_apicid == apicid) {
- int thiscpu = topo_info.nr_assigned_cpus + topo_info.nr_disabled_cpus;
-
- pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
- thiscpu, apicid);
-
- topo_info.nr_rejected_cpus++;
- return -ENODEV;
- }
-
- if (topo_info.nr_assigned_cpus >= nr_cpu_ids) {
- pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
- topo_info.nr_rejected_cpus++;
- return -ENOSPC;
- }
-
- cpu = allocate_logical_cpuid(apicid);
- cpu_update_apic(cpu, apicid);
- return cpu;
-}
-
static int __initdata setup_possible_cpus = -1;
/*
@@ -222,6 +152,43 @@ __init void prefill_possible_map(void)
set_cpu_possible(i, true);
}
+static int topo_lookup_cpuid(u32 apic_id)
+{
+ int i;
+
+ /* CPU# to APICID mapping is persistent once it is established */
+ for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
+ if (cpuid_to_apicid[i] == apic_id)
+ return i;
+ }
+ return -ENODEV;
+}
+
+static int topo_get_cpunr(u32 apic_id)
+{
+ int cpu = topo_lookup_cpuid(apic_id);
+
+ if (cpu >= 0)
+ return cpu;
+
+ return topo_info.nr_assigned_cpus++;
+}
+
+static void topo_set_cpuids(unsigned int cpu, u32 apic_id, u32 acpi_id)
+{
+#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
+ early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+#endif
+ cpuid_to_apicid[cpu] = apic_id;
+
+ set_cpu_possible(cpu, true);
+ set_cpu_present(cpu, true);
+
+ if (system_state != SYSTEM_BOOTING)
+ cpu_mark_primary_thread(cpu, apic_id);
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -234,17 +201,40 @@ void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
if (apic_id >= MAX_LOCAL_APIC) {
pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
+ topo_info.nr_rejected_cpus++;
return;
}
- if (!present) {
- topo_info.nr_disabled_cpus++;
+ if (disabled_cpu_apicid == apic_id) {
+ pr_info("Disabling CPU as requested via 'disable_cpu_apicid=0x%x'.\n", apic_id);
+ topo_info.nr_rejected_cpus++;
return;
}
- cpu = generic_processor_info(apic_id);
- if (cpu >= 0)
- early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ /* CPU numbers exhausted? */
+ if (apic_id != topo_info.boot_cpu_apic_id && topo_info.nr_assigned_cpus >= nr_cpu_ids) {
+ pr_warn_once("CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
+ topo_info.nr_rejected_cpus++;
+ return;
+ }
+
+ if (present) {
+ set_bit(apic_id, phys_cpu_present_map);
+
+ /*
+ * Double registration is valid in case of the boot CPU
+ * APIC because that is registered before the enumeration
+ * of the APICs via firmware parsers or VM guest
+ * mechanisms.
+ */
+ if (apic_id == topo_info.boot_cpu_apic_id)
+ cpu = 0;
+ else
+ cpu = topo_get_cpunr(apic_id);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
+ } else {
+ topo_info.nr_disabled_cpus++;
+ }
}
/**
@@ -255,8 +245,10 @@ void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
*/
void __init topology_register_boot_apic(u32 apic_id)
{
- cpuid_to_apicid[0] = apic_id;
- cpu_update_apic(0, apic_id);
+ WARN_ON_ONCE(topo_info.boot_cpu_apic_id != BAD_APICID);
+
+ topo_info.boot_cpu_apic_id = apic_id;
+ topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
#ifdef CONFIG_ACPI_HOTPLUG_CPU
@@ -274,10 +266,13 @@ int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
cpu = topo_lookup_cpuid(apic_id);
if (cpu < 0) {
- cpu = generic_processor_info(apic_id);
- if (cpu >= 0)
- per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ if (topo_info.nr_assigned_cpus >= nr_cpu_ids)
+ return -ENOSPC;
+
+ cpu = topo_get_cpunr(apic_id);
}
+ set_bit(apic_id, phys_cpu_present_map);
+ topo_set_cpuids(cpu, apic_id, acpi_id);
return cpu;
}
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [tip: x86/apic] x86/smpboot: Make error message actually useful
2024-02-13 21:05 ` [patch 12/30] x86/smpboot: Make error message actually useful Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 6055f6cf0d462fa0d9212a8279b6b0d1130552e1
Gitweb: https://git.kernel.org/tip/6055f6cf0d462fa0d9212a8279b6b0d1130552e1
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:50 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/smpboot: Make error message actually useful
"smpboot: native_kick_ap: bad cpu 33" is absolutely useless information.
Replace it with something meaningful which allows decoding the failure
condition.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.170806023@linutronix.de
---
arch/x86/kernel/smpboot.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index bfb99b5..d850fac 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1072,9 +1072,13 @@ int native_kick_ap(unsigned int cpu, struct task_struct *tidle)
pr_debug("++++++++++++++++++++=_---CPU UP %u\n", cpu);
- if (apicid == BAD_APICID || !test_bit(apicid, phys_cpu_present_map) ||
- !apic_id_valid(apicid)) {
- pr_err("%s: bad cpu %d\n", __func__, cpu);
+ if (apicid == BAD_APICID || !apic_id_valid(apicid)) {
+ pr_err("CPU %u has invalid APIC ID %x. Aborting bringup\n", cpu, apicid);
+ return -EINVAL;
+ }
+
+ if (!test_bit(apicid, phys_cpu_present_map)) {
+ pr_err("CPU %u APIC ID %x is not present. Aborting bringup\n", cpu, apicid);
return -EINVAL;
}
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [tip: x86/apic] x86/cpu/topology: Simplify APIC registration
2024-02-13 21:05 ` [patch 10/30] x86/cpu/topology: Simplify APIC registration Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 4c4c6f38704ab0e3f85f660b7479de7aa559d79a
Gitweb: https://git.kernel.org/tip/4c4c6f38704ab0e3f85f660b7479de7aa559d79a
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:47 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu/topology: Simplify APIC registration
Having the same check whether the number of assigned CPUs has reached the
nr_cpu_ids limit twice in the same code path is pointless. Repeating the
information that CPUs are ignored over and over is also pointless noise.
Remove the redundant check and reduce the noise by using a pr_warn_once().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.050264369@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 23 +++--------------------
1 file changed, 3 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index a6c9314..8b42918 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -107,14 +107,6 @@ static int allocate_logical_cpuid(u32 apic_id)
if (cpu >= 0)
return cpu;
- /* Allocate a new cpuid. */
- if (nr_logical_cpuids >= nr_cpu_ids) {
- WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
- "Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apic_id);
- return -EINVAL;
- }
-
cpuid_to_apicid[nr_logical_cpuids] = apic_id;
return nr_logical_cpuids++;
}
@@ -135,7 +127,7 @@ static void cpu_update_apic(int cpu, u32 apicid)
static int generic_processor_info(int apicid)
{
- int cpu, max = nr_cpu_ids;
+ int cpu;
/* The boot CPU must be set before MADT/MPTABLE parsing happens */
if (cpuid_to_apicid[0] == BAD_APICID)
@@ -155,21 +147,12 @@ static int generic_processor_info(int apicid)
}
if (num_processors >= nr_cpu_ids) {
- int thiscpu = max + disabled_cpus;
-
- pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
- "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
-
+ pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
disabled_cpus++;
- return -EINVAL;
+ return -ENOSPC;
}
cpu = allocate_logical_cpuid(apicid);
- if (cpu < 0) {
- disabled_cpus++;
- return -EINVAL;
- }
-
cpu_update_apic(cpu, apicid);
return cpu;
}
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [tip: x86/apic] x86/cpu/topology: Use a data structure for topology info
2024-02-13 21:05 ` [patch 11/30] x86/cpu/topology: Use a data structure for topology info Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 72530464ed609bcdd49a760e9d0bc3b16717ff2b
Gitweb: https://git.kernel.org/tip/72530464ed609bcdd49a760e9d0bc3b16717ff2b
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:49 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:43 +01:00
x86/cpu/topology: Use a data structure for topology info
Put the processor accounting into a data structure, which will gain more
topology related information in the next steps, and sanitize the accounting.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210252.111451909@linutronix.de
---
arch/x86/kernel/cpu/topology.c | 59 ++++++++++++++++-----------------
1 file changed, 29 insertions(+), 30 deletions(-)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 8b42918..815e3ee 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -23,25 +23,24 @@ DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
/*
+ * Keep track of assigned, disabled and rejected CPUs. Preset assigned
+ * to 1 as CPU #0 is reserved for the boot CPU.
+ */
+static struct {
+ unsigned int nr_assigned_cpus;
+ unsigned int nr_disabled_cpus;
+ unsigned int nr_rejected_cpus;
+} topo_info __read_mostly = {
+ .nr_assigned_cpus = 1,
+};
+
+/*
* Processor to be disabled specified by kernel parameter
* disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
* avoid undefined behaviour caused by sending INIT from AP to BSP.
*/
static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-static unsigned int num_processors;
-static unsigned int disabled_cpus;
-
-/*
- * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
- * contiguously, it equals to current allocated max logical CPU ID plus 1.
- * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
- * so the maximum of nr_logical_cpuids is nr_cpu_ids.
- *
- * NOTE: Reserve 0 for BSP.
- */
-static int nr_logical_cpuids = 1;
-
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
{
return phys_id == (u64)cpuid_to_apicid[cpu];
@@ -75,7 +74,7 @@ static int __init smp_init_primary_thread_mask(void)
return 0;
}
- for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
+ for (cpu = 0; cpu < topo_info.nr_assigned_cpus; cpu++)
cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
return 0;
}
@@ -89,7 +88,7 @@ static int topo_lookup_cpuid(u32 apic_id)
int i;
/* CPU# to APICID mapping is persistent once it is established */
- for (i = 0; i < nr_logical_cpuids; i++) {
+ for (i = 0; i < topo_info.nr_assigned_cpus; i++) {
if (cpuid_to_apicid[i] == apic_id)
return i;
}
@@ -107,22 +106,21 @@ static int allocate_logical_cpuid(u32 apic_id)
if (cpu >= 0)
return cpu;
- cpuid_to_apicid[nr_logical_cpuids] = apic_id;
- return nr_logical_cpuids++;
+ return topo_info.nr_assigned_cpus++;
}
-static void cpu_update_apic(int cpu, u32 apicid)
+static void cpu_update_apic(unsigned int cpu, u32 apic_id)
{
#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apic_id;
#endif
+ cpuid_to_apicid[cpu] = apic_id;
set_cpu_possible(cpu, true);
- set_bit(apicid, phys_cpu_present_map);
+ set_bit(apic_id, phys_cpu_present_map);
set_cpu_present(cpu, true);
- num_processors++;
if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apicid);
+ cpu_mark_primary_thread(cpu, apic_id);
}
static int generic_processor_info(int apicid)
@@ -137,18 +135,18 @@ static int generic_processor_info(int apicid)
return 0;
if (disabled_cpu_apicid == apicid) {
- int thiscpu = num_processors + disabled_cpus;
+ int thiscpu = topo_info.nr_assigned_cpus + topo_info.nr_disabled_cpus;
pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
thiscpu, apicid);
- disabled_cpus++;
+ topo_info.nr_rejected_cpus++;
return -ENODEV;
}
- if (num_processors >= nr_cpu_ids) {
+ if (topo_info.nr_assigned_cpus >= nr_cpu_ids) {
pr_warn_once("APIC: CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
- disabled_cpus++;
+ topo_info.nr_rejected_cpus++;
return -ENOSPC;
}
@@ -178,14 +176,16 @@ static int __initdata setup_possible_cpus = -1;
*/
__init void prefill_possible_map(void)
{
+ unsigned int num_processors = topo_info.nr_assigned_cpus;
+ unsigned int disabled_cpus = topo_info.nr_disabled_cpus;
int i, possible;
i = setup_max_cpus ?: 1;
if (setup_possible_cpus == -1) {
- possible = num_processors;
+ possible = topo_info.nr_assigned_cpus;
#ifdef CONFIG_HOTPLUG_CPU
if (setup_max_cpus)
- possible += disabled_cpus;
+ possible += num_processors;
#else
if (possible > i)
possible = i;
@@ -238,7 +238,7 @@ void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
}
if (!present) {
- disabled_cpus++;
+ topo_info.nr_disabled_cpus++;
return;
}
@@ -295,7 +295,6 @@ void topology_hotunplug_apic(unsigned int cpu)
per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
clear_bit(apic_id, phys_cpu_present_map);
set_cpu_present(cpu, false);
- num_processors--;
}
#endif
* [tip: x86/apic] x86/cpu/topology: Confine topology information
2024-02-13 21:05 ` [patch 09/30] x86/cpu/topology: Confine topology information Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 58aa34abe9954cd5dfbf322fc612146c5f45e52b
Gitweb: https://git.kernel.org/tip/58aa34abe9954cd5dfbf322fc612146c5f45e52b
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:46 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/cpu/topology: Confine topology information
Now that all external fiddling with num_processors and disabled_cpus is
gone, move the last user prefill_possible_map() into the topology code too
and remove the global visibility of these variables.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.994756960@linutronix.de
---
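For readers who want to experiment with the sizing rules that move along with prefill_possible_map(), the calculation can be modelled in user space. This is only a sketch under stated assumptions: struct possible_input and calc_possible_cpus() are hypothetical names invented here, the hotplug flag stands in for CONFIG_HOTPLUG_CPU, and the pr_warn() output and the cpu_possible_mask update are left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model, not kernel code. The fields mirror the
 * kernel variables of the same name used by prefill_possible_map().
 */
struct possible_input {
	int num_processors;      /* CPUs registered as present */
	int disabled_cpus;       /* CPUs the firmware marked as disabled */
	int setup_possible_cpus; /* possible_cpus=NUM, -1 if not given */
	int setup_max_cpus;      /* value of maxcpus= */
	int nr_cpu_ids;          /* upper limit, may be reduced via nr_cpus= */
	bool hotplug;            /* stands in for CONFIG_HOTPLUG_CPU */
};

int calc_possible_cpus(const struct possible_input *in)
{
	int limit, possible;

	assert(in);
	limit = in->setup_max_cpus ? in->setup_max_cpus : 1;

	if (in->setup_possible_cpus == -1) {
		possible = in->num_processors;
		if (in->hotplug) {
			/* Reserve room for disabled CPUs unless maxcpus=0 */
			if (in->setup_max_cpus)
				possible += in->disabled_cpus;
		} else if (possible > limit) {
			possible = limit;
		}
	} else {
		possible = in->setup_possible_cpus;
	}

	/* nr_cpu_ids could have been reduced via nr_cpus= */
	if (possible > in->nr_cpu_ids)
		possible = in->nr_cpu_ids;

	/* Without hotplug, or with maxcpus=0, maxcpus is a hard cap */
	if ((!in->hotplug || !in->setup_max_cpus) && possible > limit)
		possible = limit;

	return possible;
}
```

With hotplug enabled, four present and two disabled CPUs yield six possible CPUs. The same input without hotplug yields four, because disabled CPUs are only reserved when they can be brought up later.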
arch/x86/include/asm/smp.h | 3 +-
arch/x86/kernel/apic/apic.c | 1 +-
arch/x86/kernel/cpu/topology.c | 76 ++++++++++++++++++++++++++++++++-
arch/x86/kernel/smpboot.c | 72 +-------------------------------
4 files changed, 74 insertions(+), 78 deletions(-)
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4fab2ed..f1510d6 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -9,7 +9,6 @@
#include <asm/thread_info.h>
extern int smp_num_siblings;
-extern unsigned int num_processors;
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);
@@ -174,8 +173,6 @@ static inline struct cpumask *cpu_llc_shared_mask(int cpu)
}
#endif /* CONFIG_SMP */
-extern unsigned disabled_cpus;
-
#ifdef CONFIG_DEBUG_NMI_SELFTEST
extern void nmi_selftest(void);
#else
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 5814b80..a42d8a6 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2054,7 +2054,6 @@ void __init init_apic_mappings(void)
pr_info("APIC: disable apic facility\n");
apic_disable();
}
- num_processors = 1;
}
}
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 669e258..a6c9314 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -29,8 +29,8 @@ u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
*/
static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-unsigned int num_processors;
-unsigned disabled_cpus;
+static unsigned int num_processors;
+static unsigned int disabled_cpus;
/*
* The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
@@ -174,6 +174,71 @@ static int generic_processor_info(int apicid)
return cpu;
}
+static int __initdata setup_possible_cpus = -1;
+
+/*
+ * cpu_possible_mask should be static, it cannot change as cpu's
+ * are onlined, or offlined. The reason is per-cpu data-structures
+ * are allocated by some modules at init time, and don't expect to
+ * do this dynamically on cpu arrival/departure.
+ * cpu_present_mask on the other hand can change dynamically.
+ * In case when cpu_hotplug is not compiled, then we resort to current
+ * behaviour, which is cpu_possible == cpu_present.
+ * - Ashok Raj
+ *
+ * Three ways to find out the number of additional hotplug CPUs:
+ * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
+ * - The user can overwrite it with possible_cpus=NUM
+ * - Otherwise don't reserve additional CPUs.
+ * We do this because additional CPUs waste a lot of memory.
+ * -AK
+ */
+__init void prefill_possible_map(void)
+{
+ int i, possible;
+
+ i = setup_max_cpus ?: 1;
+ if (setup_possible_cpus == -1) {
+ possible = num_processors;
+#ifdef CONFIG_HOTPLUG_CPU
+ if (setup_max_cpus)
+ possible += disabled_cpus;
+#else
+ if (possible > i)
+ possible = i;
+#endif
+ } else
+ possible = setup_possible_cpus;
+
+ total_cpus = max_t(int, possible, num_processors + disabled_cpus);
+
+ /* nr_cpu_ids could be reduced via nr_cpus= */
+ if (possible > nr_cpu_ids) {
+ pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
+ possible, nr_cpu_ids);
+ possible = nr_cpu_ids;
+ }
+
+#ifdef CONFIG_HOTPLUG_CPU
+ if (!setup_max_cpus)
+#endif
+ if (possible > i) {
+ pr_warn("%d Processors exceeds max_cpus limit of %u\n",
+ possible, setup_max_cpus);
+ possible = i;
+ }
+
+ set_nr_cpu_ids(possible);
+
+ pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
+ possible, max_t(int, possible - num_processors, 0));
+
+ reset_cpu_possible_mask();
+
+ for (i = 0; i < possible; i++)
+ set_cpu_possible(i, true);
+}
+
/**
* topology_register_apic - Register an APIC in early topology maps
* @apic_id: The APIC ID to set up
@@ -251,6 +316,13 @@ void topology_hotunplug_apic(unsigned int cpu)
}
#endif
+static int __init _setup_possible_cpus(char *str)
+{
+ get_option(&str, &setup_possible_cpus);
+ return 0;
+}
+early_param("possible_cpus", _setup_possible_cpus);
+
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
if (!arg || !get_option(&arg, &disabled_cpu_apicid))
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 000b856..bfb99b5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1291,78 +1291,6 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
cache_aps_init();
}
-static int __initdata setup_possible_cpus = -1;
-static int __init _setup_possible_cpus(char *str)
-{
- get_option(&str, &setup_possible_cpus);
- return 0;
-}
-early_param("possible_cpus", _setup_possible_cpus);
-
-
-/*
- * cpu_possible_mask should be static, it cannot change as cpu's
- * are onlined, or offlined. The reason is per-cpu data-structures
- * are allocated by some modules at init time, and don't expect to
- * do this dynamically on cpu arrival/departure.
- * cpu_present_mask on the other hand can change dynamically.
- * In case when cpu_hotplug is not compiled, then we resort to current
- * behaviour, which is cpu_possible == cpu_present.
- * - Ashok Raj
- *
- * Three ways to find out the number of additional hotplug CPUs:
- * - If the BIOS specified disabled CPUs in ACPI/mptables use that.
- * - The user can overwrite it with possible_cpus=NUM
- * - Otherwise don't reserve additional CPUs.
- * We do this because additional CPUs waste a lot of memory.
- * -AK
- */
-__init void prefill_possible_map(void)
-{
- int i, possible;
-
- i = setup_max_cpus ?: 1;
- if (setup_possible_cpus == -1) {
- possible = num_processors;
-#ifdef CONFIG_HOTPLUG_CPU
- if (setup_max_cpus)
- possible += disabled_cpus;
-#else
- if (possible > i)
- possible = i;
-#endif
- } else
- possible = setup_possible_cpus;
-
- total_cpus = max_t(int, possible, num_processors + disabled_cpus);
-
- /* nr_cpu_ids could be reduced via nr_cpus= */
- if (possible > nr_cpu_ids) {
- pr_warn("%d Processors exceeds NR_CPUS limit of %u\n",
- possible, nr_cpu_ids);
- possible = nr_cpu_ids;
- }
-
-#ifdef CONFIG_HOTPLUG_CPU
- if (!setup_max_cpus)
-#endif
- if (possible > i) {
- pr_warn("%d Processors exceeds max_cpus limit of %u\n",
- possible, setup_max_cpus);
- possible = i;
- }
-
- set_nr_cpu_ids(possible);
-
- pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
- possible, max_t(int, possible - num_processors, 0));
-
- reset_cpu_possible_mask();
-
- for (i = 0; i < possible; i++)
- set_cpu_possible(i, true);
-}
-
/* correctly size the local cpu masks */
void __init setup_cpu_local_masks(void)
{
* [tip: x86/apic] x86/xen/smp_pv: Register fake APICs
2024-02-13 21:05 ` [patch 08/30] x86/xen/smp_pv: Register fake APICs Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: e7530702346637af46bca1d114e6d63312eb3461
Gitweb: https://git.kernel.org/tip/e7530702346637af46bca1d114e6d63312eb3461
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:44 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/xen/smp_pv: Register fake APICs
XENPV does not use the APIC. It just piggybacks on the infrastructure
and fiddles with global variables as it sees fit.
These global variables are going away, so let XENPV register pseudo APIC
IDs to keep the accounting correct and keep up the illusion that XEN/PV is
something sane.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.940043512@linutronix.de
---
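The registration scheme described above can be illustrated with a small user-space model. This is a sketch under stated assumptions rather than the kernel implementation: struct topo_model, MODEL_MAX_CPUS and register_pseudo_apics() are hypothetical names, and the vcpu_exists array stands in for the result of HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL) being non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 16 /* hypothetical limit for this model */

struct topo_model {
	uint32_t apic_ids[MODEL_MAX_CPUS]; /* pseudo APIC ID per logical CPU */
	unsigned int nr_registered;
};

/*
 * Register the boot CPU unconditionally, then hand out consecutive
 * pseudo APIC IDs until the first vCPU which the hypervisor does not know.
 */
unsigned int register_pseudo_apics(struct topo_model *t,
				   const bool *vcpu_exists,
				   unsigned int nr_cpu_ids)
{
	uint32_t apicid = 0;
	unsigned int i;

	assert(t && vcpu_exists);
	t->nr_registered = 0;

	/* Boot CPU: corresponds to topology_register_boot_apic() */
	t->apic_ids[t->nr_registered++] = apicid++;

	for (i = 1; i < nr_cpu_ids && i < MODEL_MAX_CPUS; i++) {
		if (!vcpu_exists[i])
			break;
		/* Corresponds to topology_register_apic(..., true) */
		t->apic_ids[t->nr_registered++] = apicid++;
	}
	return t->nr_registered;
}
```

Note that the loop stops at the first missing vCPU, so a vCPU located after a gap is not registered. That matches the break in the new xen_pv_smp_config() loop, whereas the old code visited every index up to nr_cpu_ids.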
arch/x86/xen/smp_pv.c | 37 ++++++++++---------------------------
1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 7f6f340..98849b3 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -29,6 +29,7 @@
#include <asm/idtentry.h>
#include <asm/desc.h>
#include <asm/cpu.h>
+#include <asm/apic.h>
#include <asm/io_apic.h>
#include <xen/interface/xen.h>
@@ -150,34 +151,16 @@ int xen_smp_intr_init_pv(unsigned int cpu)
static void __init xen_pv_smp_config(void)
{
- int i, rc;
- unsigned int subtract = 0;
-
- num_processors = 0;
- disabled_cpus = 0;
- for (i = 0; i < nr_cpu_ids; i++) {
- rc = HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL);
- if (rc >= 0) {
- num_processors++;
- set_cpu_possible(i, true);
- } else {
- set_cpu_possible(i, false);
- set_cpu_present(i, false);
- subtract++;
- }
+ u32 apicid = 0;
+ int i;
+
+ topology_register_boot_apic(apicid++);
+
+ for (i = 1; i < nr_cpu_ids; i++) {
+ if (HYPERVISOR_vcpu_op(VCPUOP_is_up, i, NULL) < 0)
+ break;
+ topology_register_apic(apicid++, CPU_ACPIID_INVALID, true);
}
-#ifdef CONFIG_HOTPLUG_CPU
- /* This is akin to using 'nr_cpus' on the Linux command line.
- * Which is OK as when we use 'dom0_max_vcpus=X' we can only
- * have up to X, while nr_cpu_ids is greater than X. This
- * normally is not a problem, except when CPU hotplugging
- * is involved and then there might be more than X CPUs
- * in the guest - which will not work as there is no
- * hypercall to expand the max number of VCPUs an already
- * running guest has. So cap it up to X. */
- if (subtract)
- set_nr_cpu_ids(nr_cpu_ids - subtract);
-#endif
/* Pretend to be a proper enumerated system */
smp_found_config = 1;
}
* [tip: x86/apic] x86/acpi: Dont invoke topology_register_apic() for XEN PV
2024-02-13 21:05 ` [patch 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: cab8e164a49c0ee5c9acb7edec33d76422d831bf
Gitweb: https://git.kernel.org/tip/cab8e164a49c0ee5c9acb7edec33d76422d831bf
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:43 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/acpi: Dont invoke topology_register_apic() for XEN PV
The MADT table for XEN/PV dom0 is not really useful and registering the
APICs is currently a pointless exercise because XENPV does not use an
APIC at all.
XENPV overrides the x86_init.mpparse.parse_smp_config() callback, resets
num_processors and counts how many CPUs are provided by the hypervisor.
This is in the way of cleaning up the APIC registration. Prevent MADT
registration for XEN/PV temporarily until the rework is completed and
XEN/PV can use the MADT again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.885489468@linutronix.de
---
arch/x86/kernel/acpi/boot.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 42799d5..df741fb 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -23,6 +23,8 @@
#include <linux/serial_core.h>
#include <linux/pgtable.h>
+#include <xen/xen.h>
+
#include <asm/e820/api.h>
#include <asm/irqdomain.h>
#include <asm/pci_x86.h>
@@ -166,7 +168,8 @@ static int __init acpi_parse_madt(struct acpi_table_header *table)
static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
{
- topology_register_apic(apic_id, acpi_id, present);
+ if (!xen_pv_domain())
+ topology_register_apic(apic_id, acpi_id, present);
}
static bool __init acpi_is_processor_usable(u32 lapic_flags)
@@ -1087,7 +1090,8 @@ static int __init early_acpi_parse_madt_lapic_addr_ovr(void)
return count;
}
- register_lapic_address(acpi_lapic_addr);
+ if (!xen_pv_domain())
+ register_lapic_address(acpi_lapic_addr);
return count;
}
* [tip: x86/apic] x86/mpparse: Use new APIC registration function
2024-02-13 21:05 ` [patch 06/30] x86/mpparse: Use new APIC registration function Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 8098428c541212e9835c1771ee90caa968ffef4f
Gitweb: https://git.kernel.org/tip/8098428c541212e9835c1771ee90caa968ffef4f
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:42 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/mpparse: Use new APIC registration function
Aside from switching over to the new interface, record the number of
registered CPUs locally, which allows num_processors and disabled_cpus
to be confined to the topology code.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.830955273@linutronix.de
---
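A user-space sketch of the accounting change may help when reading the diff below. It assumes hypothetical names throughout (struct mp_cpu_entry, struct mp_counts, mp_count_entries(), mp_found_cpus()), and the MODEL_CPU_* flag values are assumptions chosen for the model rather than values quoted from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CPU_ENABLED       0x01 /* assumed flag value for this model */
#define MODEL_CPU_BOOTPROCESSOR 0x02 /* assumed flag value for this model */

struct mp_cpu_entry {
	unsigned char apicid;
	unsigned char cpuflag;
};

struct mp_counts {
	unsigned int reported; /* entries handed to the topology code */
	unsigned int enabled;  /* parser-local count, like num_procs */
};

/*
 * Every processor entry is reported to the topology code with its
 * enabled state. Only enabled entries bump the parser-local counter.
 */
void mp_count_entries(const struct mp_cpu_entry *e, unsigned int n,
		      struct mp_counts *c)
{
	unsigned int i;

	assert(e && c);
	c->reported = 0;
	c->enabled = 0;

	for (i = 0; i < n; i++) {
		c->reported++;
		if (e[i].cpuflag & MODEL_CPU_ENABLED)
			c->enabled++;
	}
}

/* Models the new smp_read_mpc() result: MP table or ACPI supplied CPUs */
bool mp_found_cpus(unsigned int num_procs, bool acpi_lapic)
{
	return num_procs || acpi_lapic;
}
```

The point of the local counter is that the MP table parser no longer needs to read the global num_processors to decide whether it found anything.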
arch/x86/include/asm/mpspec.h | 2 --
arch/x86/kernel/cpu/topology.c | 2 +-
arch/x86/kernel/mpparse.c | 17 +++++++++--------
3 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/mpspec.h b/arch/x86/include/asm/mpspec.h
index 1b79d0e..c72c7ff 100644
--- a/arch/x86/include/asm/mpspec.h
+++ b/arch/x86/include/asm/mpspec.h
@@ -61,8 +61,6 @@ static inline void e820__memblock_alloc_reserved_mpc_new(void) { }
#define mpparse_parse_smp_config x86_init_noop
#endif
-int generic_processor_info(int apicid);
-
extern DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC);
static inline void reset_phys_cpu_present_map(u32 apicid)
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index 3dd7e6c..669e258 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -133,7 +133,7 @@ static void cpu_update_apic(int cpu, u32 apicid)
cpu_mark_primary_thread(cpu, apicid);
}
-int generic_processor_info(int apicid)
+static int generic_processor_info(int apicid)
{
int cpu, max = nr_cpu_ids;
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 9c000c4..1ccd30c 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -36,6 +36,8 @@
* Checksum an MP configuration block.
*/
+static unsigned int num_procs __initdata;
+
static int __init mpf_checksum(unsigned char *mp, int len)
{
int sum = 0;
@@ -50,16 +52,15 @@ static void __init MP_processor_info(struct mpc_cpu *m)
{
char *bootup_cpu = "";
- if (!(m->cpuflag & CPU_ENABLED)) {
- disabled_cpus++;
+ topology_register_apic(m->apicid, CPU_ACPIID_INVALID, m->cpuflag & CPU_ENABLED);
+ if (!(m->cpuflag & CPU_ENABLED))
return;
- }
if (m->cpuflag & CPU_BOOTPROCESSOR)
bootup_cpu = " (Bootup-CPU)";
pr_info("Processor #%d%s\n", m->apicid, bootup_cpu);
- generic_processor_info(m->apicid);
+ num_procs++;
}
#ifdef CONFIG_X86_IO_APIC
@@ -236,9 +237,9 @@ static int __init smp_read_mpc(struct mpc_table *mpc, unsigned early)
}
}
- if (!num_processors)
+ if (!num_procs && !acpi_lapic)
pr_err("MPTABLE: no processors registered!\n");
- return num_processors;
+ return num_procs || acpi_lapic;
}
#ifdef CONFIG_X86_IO_APIC
@@ -529,8 +530,8 @@ static __init void mpparse_get_smp_config(unsigned int early)
} else
BUG();
- if (!early)
- pr_info("Processors: %d\n", num_processors);
+ if (!early && !acpi_lapic)
+ pr_info("Processors: %d\n", num_procs);
/*
* Only use the first configuration found.
*/
* [tip: x86/apic] x86/of: Use new APIC registration functions
2024-02-13 21:05 ` [patch 05/30] x86/of: Use new APIC registration functions Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 7d319c0fcae68e489fcf806cdea46a795062eaf7
Gitweb: https://git.kernel.org/tip/7d319c0fcae68e489fcf806cdea46a795062eaf7
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:40 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/of: Use new APIC registration functions
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.776009244@linutronix.de
---
arch/x86/kernel/devicetree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/devicetree.c b/arch/x86/kernel/devicetree.c
index c05b900..4aeafe6 100644
--- a/arch/x86/kernel/devicetree.c
+++ b/arch/x86/kernel/devicetree.c
@@ -136,7 +136,7 @@ static void __init dtb_cpu_setup(void)
pr_warn("%pOF: missing local APIC ID\n", dn);
continue;
}
- generic_processor_info(apic_id);
+ topology_register_apic(apic_id, CPU_ACPIID_INVALID, true);
}
}
* [tip: x86/apic] x86/jailhouse: Use new APIC registration function
2024-02-13 21:05 ` [patch 04/30] x86/jailhouse: Use new APIC registration function Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 8cd01c8a68b083e2a0af601c5464e2dfa64f1421
Gitweb: https://git.kernel.org/tip/8cd01c8a68b083e2a0af601c5464e2dfa64f1421
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:39 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/jailhouse: Use new APIC registration function
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.720970412@linutronix.de
---
arch/x86/kernel/jailhouse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index 5128ac8..df33786 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -102,7 +102,7 @@ static void __init jailhouse_parse_smp_config(void)
register_lapic_address(0xfee00000);
for (cpu = 0; cpu < setup_data.v1.num_cpus; cpu++)
- generic_processor_info(setup_data.v1.cpu_ids[cpu]);
+ topology_register_apic(setup_data.v1.cpu_ids[cpu], CPU_ACPIID_INVALID, true);
smp_found_config = 1;
* [tip: x86/apic] x86/acpi: Use new APIC registration functions
2024-02-13 21:05 ` [patch 03/30] x86/acpi: Use new " Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: ff37b09c8495ed897ea470014d1461660db6a942
Gitweb: https://git.kernel.org/tip/ff37b09c8495ed897ea470014d1461660db6a942
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:37 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/acpi: Use new APIC registration functions
Use the new topology registration functions and make the early boot code
path __init. No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.664738831@linutronix.de
---
arch/x86/kernel/acpi/boot.c | 44 +++++-------------------------------
1 file changed, 7 insertions(+), 37 deletions(-)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 85a3ce2..42799d5 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -164,33 +164,9 @@ static int __init acpi_parse_madt(struct acpi_table_header *table)
return 0;
}
-/**
- * acpi_register_lapic - register a local apic and generates a logic cpu number
- * @id: local apic id to register
- * @acpiid: ACPI id to register
- * @enabled: this cpu is enabled or not
- *
- * Returns the logic cpu number which maps to the local apic
- */
-static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
+static __init void acpi_register_lapic(u32 apic_id, u32 acpi_id, bool present)
{
- int cpu;
-
- if (id >= MAX_LOCAL_APIC) {
- pr_info("skipped apicid that is too big\n");
- return -EINVAL;
- }
-
- if (!enabled) {
- ++disabled_cpus;
- return -EINVAL;
- }
-
- cpu = generic_processor_info(id);
- if (cpu >= 0)
- early_per_cpu(x86_cpu_to_acpiid, cpu) = acpiid;
-
- return cpu;
+ topology_register_apic(apic_id, acpi_id, present);
}
static bool __init acpi_is_processor_usable(u32 lapic_flags)
@@ -844,12 +820,10 @@ static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
return 0;
}
-int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
- int *pcpu)
+int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id, int *pcpu)
{
- int cpu;
+ int cpu = topology_hotplug_apic(physid, acpi_id);
- cpu = acpi_register_lapic(physid, acpi_id, ACPI_MADT_ENABLED);
if (cpu < 0) {
pr_info("Unable to map lapic to logical cpu number\n");
return cpu;
@@ -868,15 +842,11 @@ int acpi_unmap_cpu(int cpu)
#ifdef CONFIG_ACPI_NUMA
set_apicid_to_node(per_cpu(x86_cpu_to_apicid, cpu), NUMA_NO_NODE);
#endif
-
- per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
- set_cpu_present(cpu, false);
- num_processors--;
-
- return (0);
+ topology_hotunplug_apic(cpu);
+ return 0;
}
EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_CPU */
int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
{
* [tip: x86/apic] x86/cpu/topology: Provide separate APIC registration functions
2024-02-13 21:05 ` [patch 02/30] x86/cpu/topology: Provide separate APIC registration functions Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: 4176b541c2c68bf79d0a05f316713ed8f0c9cdb4
Gitweb: https://git.kernel.org/tip/4176b541c2c68bf79d0a05f316713ed8f0c9cdb4
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:36 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:42 +01:00
x86/cpu/topology: Provide separate APIC registration functions
generic_processor_info(), aside from being a complete misnomer, is used for
both early boot registration and ACPI CPU hotplug.
While it's arguable that this can share some code, it results in code which
is hard to understand and which is kept around post init for no real reason.
Also the call sites do lots of manual fiddling with topology related
variables instead of having proper interfaces for the purpose which handle
the topology internals correctly.
Provide topology_register_apic(), topology_hotplug_apic() and
topology_hotunplug_apic() which have the extra magic of the call sites
incorporated and for now are wrappers around generic_processor_info().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.605007456@linutronix.de
---
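The split into a lookup step and an allocation step, which the hotplug path relies on, can be sketched in user space as follows. All names here (struct cpu_map, cpu_map_init(), cpu_map_lookup(), cpu_map_allocate(), MODEL_NR_CPU_IDS, MODEL_BAD_APICID) are hypothetical and exist only for this model; it assumes a fixed-size table and leaves out the present/possible mask updates.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MODEL_NR_CPU_IDS 4          /* hypothetical limit for this model */
#define MODEL_BAD_APICID UINT32_MAX /* marker for an unused slot */

struct cpu_map {
	uint32_t cpuid_to_apicid[MODEL_NR_CPU_IDS];
	unsigned int nr_logical_cpuids;
};

void cpu_map_init(struct cpu_map *m)
{
	unsigned int i;

	assert(m);
	for (i = 0; i < MODEL_NR_CPU_IDS; i++)
		m->cpuid_to_apicid[i] = MODEL_BAD_APICID;
	m->nr_logical_cpuids = 0;
}

/* The CPU number to APIC ID mapping is persistent once established */
int cpu_map_lookup(const struct cpu_map *m, uint32_t apic_id)
{
	unsigned int i;

	for (i = 0; i < m->nr_logical_cpuids; i++) {
		if (m->cpuid_to_apicid[i] == apic_id)
			return (int)i;
	}
	return -ENODEV;
}

/* Reuse an existing CPU number if there is one, otherwise take the next */
int cpu_map_allocate(struct cpu_map *m, uint32_t apic_id)
{
	int cpu = cpu_map_lookup(m, apic_id);

	if (cpu >= 0)
		return cpu;

	if (m->nr_logical_cpuids >= MODEL_NR_CPU_IDS)
		return -EINVAL;

	m->cpuid_to_apicid[m->nr_logical_cpuids] = apic_id;
	return (int)m->nr_logical_cpuids++;
}
```

Because the table entry is never cleared, an APIC which is unplugged and plugged in again ends up with the CPU number it had before. That is why topology_hotplug_apic() can try the lookup first and only fall back to a fresh registration when it fails.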
arch/x86/include/asm/apic.h | 3 +-
arch/x86/kernel/cpu/topology.c | 113 ++++++++++++++++++++++++++------
2 files changed, 98 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 11938d5..28e9aa4 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -171,7 +171,10 @@ extern bool apic_needs_pit(void);
extern void apic_send_IPI_allbutself(unsigned int vector);
+extern void topology_register_apic(u32 apic_id, u32 acpi_id, bool present);
extern void topology_register_boot_apic(u32 apic_id);
+extern int topology_hotplug_apic(u32 apic_id, u32 acpi_id);
+extern void topology_hotunplug_apic(unsigned int cpu);
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index b99cd19..3dd7e6c 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -84,32 +84,38 @@ early_initcall(smp_init_primary_thread_mask);
static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
#endif
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(int apicid)
+static int topo_lookup_cpuid(u32 apic_id)
{
int i;
- /*
- * cpuid <-> apicid mapping is persistent, so when a cpu is up,
- * check if the kernel has allocated a cpuid for it.
- */
+ /* CPU# to APICID mapping is persistent once it is established */
for (i = 0; i < nr_logical_cpuids; i++) {
- if (cpuid_to_apicid[i] == apicid)
+ if (cpuid_to_apicid[i] == apic_id)
return i;
}
+ return -ENODEV;
+}
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(u32 apic_id)
+{
+ int cpu = topo_lookup_cpuid(apic_id);
+
+ if (cpu >= 0)
+ return cpu;
/* Allocate a new cpuid. */
if (nr_logical_cpuids >= nr_cpu_ids) {
WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
"Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apicid);
+ nr_cpu_ids, nr_logical_cpuids, apic_id);
return -EINVAL;
}
- cpuid_to_apicid[nr_logical_cpuids] = apicid;
+ cpuid_to_apicid[nr_logical_cpuids] = apic_id;
return nr_logical_cpuids++;
}
@@ -127,12 +133,6 @@ static void cpu_update_apic(int cpu, u32 apicid)
cpu_mark_primary_thread(cpu, apicid);
}
-void __init topology_register_boot_apic(u32 apic_id)
-{
- cpuid_to_apicid[0] = apic_id;
- cpu_update_apic(0, apic_id);
-}
-
int generic_processor_info(int apicid)
{
int cpu, max = nr_cpu_ids;
@@ -174,6 +174,83 @@ int generic_processor_info(int apicid)
return cpu;
}
+/**
+ * topology_register_apic - Register an APIC in early topology maps
+ * @apic_id: The APIC ID to set up
+ * @acpi_id: The ACPI ID associated to the APIC
+ * @present: True if the corresponding CPU is present
+ */
+void __init topology_register_apic(u32 apic_id, u32 acpi_id, bool present)
+{
+ int cpu;
+
+ if (apic_id >= MAX_LOCAL_APIC) {
+ pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
+ return;
+ }
+
+ if (!present) {
+ disabled_cpus++;
+ return;
+ }
+
+ cpu = generic_processor_info(apic_id);
+ if (cpu >= 0)
+ early_per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+}
+
+/**
+ * topology_register_boot_apic - Register the boot CPU APIC
+ * @apic_id: The APIC ID to set up
+ *
+ * Separate so CPU #0 can be assigned
+ */
+void __init topology_register_boot_apic(u32 apic_id)
+{
+ cpuid_to_apicid[0] = apic_id;
+ cpu_update_apic(0, apic_id);
+}
+
+#ifdef CONFIG_ACPI_HOTPLUG_CPU
+/**
+ * topology_hotplug_apic - Handle a physical hotplugged APIC after boot
+ * @apic_id: The APIC ID to set up
+ * @acpi_id: The ACPI ID associated to the APIC
+ */
+int topology_hotplug_apic(u32 apic_id, u32 acpi_id)
+{
+ int cpu;
+
+ if (apic_id >= MAX_LOCAL_APIC)
+ return -EINVAL;
+
+ cpu = topo_lookup_cpuid(apic_id);
+ if (cpu < 0) {
+ cpu = generic_processor_info(apic_id);
+ if (cpu >= 0)
+ per_cpu(x86_cpu_to_acpiid, cpu) = acpi_id;
+ }
+ return cpu;
+}
+
+/**
+ * topology_hotunplug_apic - Remove a physical hotplugged APIC after boot
+ * @cpu: The CPU number for which the APIC ID is removed
+ */
+void topology_hotunplug_apic(unsigned int cpu)
+{
+ u32 apic_id = cpuid_to_apicid[cpu];
+
+ if (apic_id == BAD_APICID)
+ return;
+
+ per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
+ clear_bit(apic_id, phys_cpu_present_map);
+ set_cpu_present(cpu, false);
+ num_processors--;
+}
+#endif
+
static int __init apic_set_disabled_cpu_apicid(char *arg)
{
if (!arg || !get_option(&arg, &disabled_cpu_apicid))
* [tip: x86/apic] x86/cpu/topology: Move registration out of APIC code
2024-02-13 21:05 ` [patch 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
@ 2024-02-16 15:17 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 61+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2024-02-16 15:17 UTC (permalink / raw)
To: linux-tip-commits
Cc: Thomas Gleixner, Michael Kelley, Sohil Mehta, x86, linux-kernel
The following commit has been merged into the x86/apic branch of tip:
Commit-ID: c0a66c2847908e41c771ca2355fba935a82a9f62
Gitweb: https://git.kernel.org/tip/c0a66c2847908e41c771ca2355fba935a82a9f62
Author: Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 13 Feb 2024 22:05:35 +01:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 15 Feb 2024 22:07:41 +01:00
x86/cpu/topology: Move registration out of APIC code
The APIC/CPU registration sits in the middle of the APIC code. In fact this
is a topology evaluation function and has nothing to do with the inner
workings of the local APIC.
Move it out into a file which reflects what this is about.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210251.543948812@linutronix.de
---
arch/x86/include/asm/apic.h | 2 +-
arch/x86/kernel/apic/apic.c | 185 +--------------------------------
arch/x86/kernel/cpu/Makefile | 12 +-
arch/x86/kernel/cpu/topology.c | 184 ++++++++++++++++++++++++++++++++-
4 files changed, 195 insertions(+), 188 deletions(-)
create mode 100644 arch/x86/kernel/cpu/topology.c
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 109f980..11938d5 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -171,6 +171,8 @@ extern bool apic_needs_pit(void);
extern void apic_send_IPI_allbutself(unsigned int vector);
+extern void topology_register_boot_apic(u32 apic_id);
+
#else /* !CONFIG_X86_LOCAL_APIC */
static inline void lapic_shutdown(void) { }
#define local_apic_timer_c2_ok 1
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fa11e25..5814b80 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -68,26 +68,12 @@
#include "local.h"
-unsigned int num_processors;
-
-unsigned disabled_cpus;
-
/* Processor that is doing the boot up */
u32 boot_cpu_physical_apicid __ro_after_init = BAD_APICID;
EXPORT_SYMBOL_GPL(boot_cpu_physical_apicid);
u8 boot_cpu_apic_version __ro_after_init;
-/* Bitmap of physically present CPUs. */
-DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC);
-
-/*
- * Processor to be disabled specified by kernel parameter
- * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
- * avoid undefined behaviour caused by sending INIT from AP to BSP.
- */
-static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
-
/*
* This variable controls which CPUs receive external NMIs. By default,
* external NMIs are delivered only to the BSP.
@@ -107,14 +93,6 @@ static inline bool apic_accessible(void)
return x2apic_mode || apic_mmio_base;
}
-/*
- * Map cpu index to physical APIC ID
- */
-DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid, BAD_APICID);
-DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid, CPU_ACPIID_INVALID);
-EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
-EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
-
#ifdef CONFIG_X86_32
/* Local APIC was disabled by the BIOS and enabled by the kernel */
static int enabled_via_apicbase __ro_after_init;
@@ -1676,8 +1654,6 @@ void apic_ap_setup(void)
end_local_APIC_setup();
}
-static __init void cpu_set_boot_apic(void);
-
static __init void apic_read_boot_cpu_id(bool x2apic)
{
/*
@@ -1692,7 +1668,8 @@ static __init void apic_read_boot_cpu_id(bool x2apic)
boot_cpu_physical_apicid = read_apic_id();
boot_cpu_apic_version = GET_APIC_VERSION(apic_read(APIC_LVR));
}
- cpu_set_boot_apic();
+ topology_register_boot_apic(boot_cpu_physical_apicid);
+ x86_32_probe_bigsmp_early();
}
#ifdef CONFIG_X86_X2APIC
@@ -2291,155 +2268,6 @@ void disconnect_bsp_APIC(int virt_wire_setup)
apic_write(APIC_LVT1, value);
}
-/*
- * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
- * contiguously, it equals to current allocated max logical CPU ID plus 1.
- * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
- * so the maximum of nr_logical_cpuids is nr_cpu_ids.
- *
- * NOTE: Reserve 0 for BSP.
- */
-static int nr_logical_cpuids = 1;
-
-/*
- * Used to store mapping between logical CPU IDs and APIC IDs.
- */
-u32 cpuid_to_apicid[] = { [0 ... NR_CPUS - 1] = BAD_APICID, };
-
-bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
-{
- return phys_id == (u64)cpuid_to_apicid[cpu];
-}
-
-#ifdef CONFIG_SMP
-static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
-{
- /* Isolate the SMT bit(s) in the APICID and check for 0 */
- u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
-
- if (smp_num_siblings == 1 || !(apicid & mask))
- cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
-}
-
-/*
- * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
- * during early boot. Initialize the primary thread mask before SMP
- * bringup.
- */
-static int __init smp_init_primary_thread_mask(void)
-{
- unsigned int cpu;
-
- /*
- * XEN/PV provides either none or useless topology information.
- * Pretend that all vCPUs are primary threads.
- */
- if (xen_pv_domain()) {
- cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
- return 0;
- }
-
- for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
- cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
- return 0;
-}
-early_initcall(smp_init_primary_thread_mask);
-#else
-static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
-#endif
-
-/*
- * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
- * and cpuid_to_apicid[] synchronized.
- */
-static int allocate_logical_cpuid(int apicid)
-{
- int i;
-
- /*
- * cpuid <-> apicid mapping is persistent, so when a cpu is up,
- * check if the kernel has allocated a cpuid for it.
- */
- for (i = 0; i < nr_logical_cpuids; i++) {
- if (cpuid_to_apicid[i] == apicid)
- return i;
- }
-
- /* Allocate a new cpuid. */
- if (nr_logical_cpuids >= nr_cpu_ids) {
- WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
- "Processor %d/0x%x and the rest are ignored.\n",
- nr_cpu_ids, nr_logical_cpuids, apicid);
- return -EINVAL;
- }
-
- cpuid_to_apicid[nr_logical_cpuids] = apicid;
- return nr_logical_cpuids++;
-}
-
-static void cpu_update_apic(int cpu, u32 apicid)
-{
-#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
- early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
-#endif
- set_cpu_possible(cpu, true);
- set_bit(apicid, phys_cpu_present_map);
- set_cpu_present(cpu, true);
- num_processors++;
-
- if (system_state != SYSTEM_BOOTING)
- cpu_mark_primary_thread(cpu, apicid);
-}
-
-static __init void cpu_set_boot_apic(void)
-{
- cpuid_to_apicid[0] = boot_cpu_physical_apicid;
- cpu_update_apic(0, boot_cpu_physical_apicid);
- x86_32_probe_bigsmp_early();
-}
-
-int generic_processor_info(int apicid)
-{
- int cpu, max = nr_cpu_ids;
-
- /* The boot CPU must be set before MADT/MPTABLE parsing happens */
- if (cpuid_to_apicid[0] == BAD_APICID)
- panic("Boot CPU APIC not registered yet\n");
-
- if (apicid == boot_cpu_physical_apicid)
- return 0;
-
- if (disabled_cpu_apicid == apicid) {
- int thiscpu = num_processors + disabled_cpus;
-
- pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
- thiscpu, apicid);
-
- disabled_cpus++;
- return -ENODEV;
- }
-
- if (num_processors >= nr_cpu_ids) {
- int thiscpu = max + disabled_cpus;
-
- pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
- "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
-
- disabled_cpus++;
- return -EINVAL;
- }
-
- cpu = allocate_logical_cpuid(apicid);
- if (cpu < 0) {
- disabled_cpus++;
- return -EINVAL;
- }
-
- cpu_update_apic(cpu, apicid);
- return cpu;
-}
-
-
void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
bool dmar)
{
@@ -2828,15 +2656,6 @@ static int __init lapic_insert_resource(void)
*/
late_initcall(lapic_insert_resource);
-static int __init apic_set_disabled_cpu_apicid(char *arg)
-{
- if (!arg || !get_option(&arg, &disabled_cpu_apicid))
- return -EINVAL;
-
- return 0;
-}
-early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
-
static int __init apic_set_extnmi(char *arg)
{
if (!arg)
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 9e0a1c1..eb4dbcd 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -26,14 +26,16 @@ obj-y += bugs.o
obj-y += aperfmperf.o
obj-y += cpuid-deps.o
obj-y += umwait.o
+obj-y += capflags.o powerflags.o
-obj-$(CONFIG_PROC_FS) += proc.o
-obj-y += capflags.o powerflags.o
+obj-$(CONFIG_X86_LOCAL_APIC) += topology.o
-obj-$(CONFIG_IA32_FEAT_CTL) += feat_ctl.o
+obj-$(CONFIG_PROC_FS) += proc.o
+
+obj-$(CONFIG_IA32_FEAT_CTL) += feat_ctl.o
ifdef CONFIG_CPU_SUP_INTEL
-obj-y += intel.o intel_pconfig.o tsx.o
-obj-$(CONFIG_PM) += intel_epb.o
+obj-y += intel.o intel_pconfig.o tsx.o
+obj-$(CONFIG_PM) += intel_epb.o
endif
obj-$(CONFIG_CPU_SUP_AMD) += amd.o
obj-$(CONFIG_CPU_SUP_HYGON) += hygon.o
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
new file mode 100644
index 0000000..b99cd19
--- /dev/null
+++ b/arch/x86/kernel/cpu/topology.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/cpu.h>
+
+#include <xen/xen.h>
+
+#include <asm/apic.h>
+#include <asm/mpspec.h>
+#include <asm/smp.h>
+
+/*
+ * Map cpu index to physical APIC ID
+ */
+DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_apicid, BAD_APICID);
+DEFINE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid, CPU_ACPIID_INVALID);
+EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_apicid);
+EXPORT_EARLY_PER_CPU_SYMBOL(x86_cpu_to_acpiid);
+
+/* Bitmap of physically present CPUs. */
+DECLARE_BITMAP(phys_cpu_present_map, MAX_LOCAL_APIC) __read_mostly;
+
+/* Used for CPU number allocation and parallel CPU bringup */
+u32 cpuid_to_apicid[] __read_mostly = { [0 ... NR_CPUS - 1] = BAD_APICID, };
+
+/*
+ * Processor to be disabled specified by kernel parameter
+ * disable_cpu_apicid=<int>, mostly used for the kdump 2nd kernel to
+ * avoid undefined behaviour caused by sending INIT from AP to BSP.
+ */
+static u32 disabled_cpu_apicid __ro_after_init = BAD_APICID;
+
+unsigned int num_processors;
+unsigned disabled_cpus;
+
+/*
+ * The number of allocated logical CPU IDs. Since logical CPU IDs are allocated
+ * contiguously, it equals to current allocated max logical CPU ID plus 1.
+ * All allocated CPU IDs should be in the [0, nr_logical_cpuids) range,
+ * so the maximum of nr_logical_cpuids is nr_cpu_ids.
+ *
+ * NOTE: Reserve 0 for BSP.
+ */
+static int nr_logical_cpuids = 1;
+
+bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
+{
+ return phys_id == (u64)cpuid_to_apicid[cpu];
+}
+
+#ifdef CONFIG_SMP
+static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
+{
+ /* Isolate the SMT bit(s) in the APICID and check for 0 */
+ u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
+
+ if (smp_num_siblings == 1 || !(apicid & mask))
+ cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
+}
+
+/*
+ * Due to the utter mess of CPUID evaluation smp_num_siblings is not valid
+ * during early boot. Initialize the primary thread mask before SMP
+ * bringup.
+ */
+static int __init smp_init_primary_thread_mask(void)
+{
+ unsigned int cpu;
+
+ /*
+ * XEN/PV provides either none or useless topology information.
+ * Pretend that all vCPUs are primary threads.
+ */
+ if (xen_pv_domain()) {
+ cpumask_copy(&__cpu_primary_thread_mask, cpu_possible_mask);
+ return 0;
+ }
+
+ for (cpu = 0; cpu < nr_logical_cpuids; cpu++)
+ cpu_mark_primary_thread(cpu, cpuid_to_apicid[cpu]);
+ return 0;
+}
+early_initcall(smp_init_primary_thread_mask);
+#else
+static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
+#endif
+
+/*
+ * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids
+ * and cpuid_to_apicid[] synchronized.
+ */
+static int allocate_logical_cpuid(int apicid)
+{
+ int i;
+
+ /*
+ * cpuid <-> apicid mapping is persistent, so when a cpu is up,
+ * check if the kernel has allocated a cpuid for it.
+ */
+ for (i = 0; i < nr_logical_cpuids; i++) {
+ if (cpuid_to_apicid[i] == apicid)
+ return i;
+ }
+
+ /* Allocate a new cpuid. */
+ if (nr_logical_cpuids >= nr_cpu_ids) {
+ WARN_ONCE(1, "APIC: NR_CPUS/possible_cpus limit of %u reached. "
+ "Processor %d/0x%x and the rest are ignored.\n",
+ nr_cpu_ids, nr_logical_cpuids, apicid);
+ return -EINVAL;
+ }
+
+ cpuid_to_apicid[nr_logical_cpuids] = apicid;
+ return nr_logical_cpuids++;
+}
+
+static void cpu_update_apic(int cpu, u32 apicid)
+{
+#if defined(CONFIG_SMP) || defined(CONFIG_X86_64)
+ early_per_cpu(x86_cpu_to_apicid, cpu) = apicid;
+#endif
+ set_cpu_possible(cpu, true);
+ set_bit(apicid, phys_cpu_present_map);
+ set_cpu_present(cpu, true);
+ num_processors++;
+
+ if (system_state != SYSTEM_BOOTING)
+ cpu_mark_primary_thread(cpu, apicid);
+}
+
+void __init topology_register_boot_apic(u32 apic_id)
+{
+ cpuid_to_apicid[0] = apic_id;
+ cpu_update_apic(0, apic_id);
+}
+
+int generic_processor_info(int apicid)
+{
+ int cpu, max = nr_cpu_ids;
+
+ /* The boot CPU must be set before MADT/MPTABLE parsing happens */
+ if (cpuid_to_apicid[0] == BAD_APICID)
+ panic("Boot CPU APIC not registered yet\n");
+
+ if (apicid == boot_cpu_physical_apicid)
+ return 0;
+
+ if (disabled_cpu_apicid == apicid) {
+ int thiscpu = num_processors + disabled_cpus;
+
+ pr_warn("APIC: Disabling requested cpu. Processor %d/0x%x ignored.\n",
+ thiscpu, apicid);
+
+ disabled_cpus++;
+ return -ENODEV;
+ }
+
+ if (num_processors >= nr_cpu_ids) {
+ int thiscpu = max + disabled_cpus;
+
+ pr_warn("APIC: NR_CPUS/possible_cpus limit of %i reached. "
+ "Processor %d/0x%x ignored.\n", max, thiscpu, apicid);
+
+ disabled_cpus++;
+ return -EINVAL;
+ }
+
+ cpu = allocate_logical_cpuid(apicid);
+ if (cpu < 0) {
+ disabled_cpus++;
+ return -EINVAL;
+ }
+
+ cpu_update_apic(cpu, apicid);
+ return cpu;
+}
+
+static int __init apic_set_disabled_cpu_apicid(char *arg)
+{
+ if (!arg || !get_option(&arg, &disabled_cpu_apicid))
+ return -EINVAL;
+
+ return 0;
+}
+early_param("disable_cpu_apicid", apic_set_disabled_cpu_apicid);
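
The SMT mask arithmetic in cpu_mark_primary_thread() is compact, so a
userspace rendering may help when reading it. This is a sketch, not kernel
code: fls_u32() and is_primary_thread() are hypothetical names, fls_u32()
is a stand-in for the kernel's fls(), and the explicit early return for a
sibling count below 2 is an addition that keeps the shift well defined.

```c
#include <stdbool.h>

/* 1-based index of the highest set bit, 0 when x is 0 (like fls()). */
static int fls_u32(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/*
 * A thread is treated as primary when the SMT bits of its APIC ID are
 * all zero. With 2 siblings the mask is 0x1, with 4 siblings it is 0x3.
 */
static bool is_primary_thread(unsigned int siblings, unsigned int apicid)
{
	unsigned int mask;

	/* No SMT: every CPU is a primary thread. Also avoids a shift by -1. */
	if (siblings <= 1)
		return true;

	mask = (1U << (fls_u32(siblings) - 1)) - 1;
	return !(apicid & mask);
}
```

With two siblings per core, APIC ID 0x10 counts as a primary thread and
0x11 as its secondary sibling, since only bit 0 is inspected.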
Thread overview: 61+ messages
2024-02-13 21:05 [patch 00/30] x86/apic: Rework APIC registration Thomas Gleixner
2024-02-13 21:05 ` [patch 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 02/30] x86/cpu/topology: Provide separate APIC registration functions Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 03/30] x86/acpi: Use new " Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 04/30] x86/jailhouse: Use new APIC registration function Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 05/30] x86/of: Use new APIC registration functions Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 06/30] x86/mpparse: Use new APIC registration function Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 08/30] x86/xen/smp_pv: Register fake APICs Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 09/30] x86/cpu/topology: Confine topology information Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 10/30] x86/cpu/topology: Simplify APIC registration Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 11/30] x86/cpu/topology: Use a data structure for topology info Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 12/30] x86/smpboot: Make error message actually useful Thomas Gleixner
2024-02-16 15:17 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 13/30] x86/cpu/topology: Sanitize the APIC admission logic Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 14/30] x86/cpu/topology: Rework possible CPU management Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 15/30] x86/cpu: Detect real BSP on crash kernels Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 16/30] x86/topology: Add a mechanism to track topology via APIC IDs Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:05 ` [patch 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 19/30] x86/xen/smp_pv: Count number of vCPUs early Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 21/30] x86/cpu/topology: Use topology bitmaps for sizing Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 22/30] x86/cpu/topology: Mop up primary thread mask handling Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread() Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 24/30] x86/cpu/topology: Provide logical pkg/die mapping Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 25/30] x86/cpu/topology: Use topology logical mapping mechanism Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 27/30] x86/cpu/topology: Rename smp_num_siblings Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 28/30] x86/cpu/topology: Rename topology_max_die_per_package() Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner
2024-02-13 21:06 ` [patch 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores Thomas Gleixner
2024-02-16 15:16 ` [tip: x86/apic] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores tip-bot2 for Thomas Gleixner