From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Tom Lendacky <thomas.lendacky@amd.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Huang Rui <ray.huang@amd.com>, Juergen Gross <jgross@suse.com>,
	Dimitri Sivanich <dimitri.sivanich@hpe.com>,
	Sohil Mehta <sohil.mehta@intel.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Zhang Rui <rui.zhang@intel.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Feng Tang <feng.tang@intel.com>,
	Andy Shevchenko <andy@infradead.org>,
	Michael Kelley <mhklinux@outlook.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>
Subject: [patch v2 00/30] x86/apic: Rework APIC registration
Date: Tue, 23 Jan 2024 14:10:49 +0100 (CET)
Message-ID: <20240118123127.055361964@linutronix.de>

This is a breakout from:

  https://lore.kernel.org/all/20230807130108.853357011@linutronix.de

addressing the issues of the current topology code:

  - Wrong core count on hybrid systems

  - Heuristics-based size information for packages and dies, which
    fails to work correctly with certain command line parameters.

  - Complete evaluation failure on a theoretical hybrid system which
    boots from an E-core

  - The complete insanity of manipulating global data from firmware parsers
    or the XEN/PV fake SMP enumeration. The latter is really a piece of art.

This series addresses this by

  - Consolidating all topology relevant functionality into one place

  - Providing separate interfaces for boot time and ACPI hotplug operations

  - A sane ordering of command line options and restrictions

  - A sensible way to handle the BSP problem in kdump kernels instead of
    the unreliable command line option.

  - Confinement of topology relevant variables by replacing the XEN/PV SMP
    enumeration fake with something halfway sensible.

  - Evaluation of sizes by analysing the topology via the CPUID provided
    APIC ID segmentation and the actual APIC IDs which are registered at
    boot time.

  - Removal of heuristics and broken size calculations

The idea behind this is the following:

The APIC IDs describe the system topology in multiple domain levels. The
CPUID topology parser provides the information about which part of the
APIC ID is associated with the individual levels (Intel terminology):

   [ROOT][PACKAGE][DIEGRP][DIE][TILE][MODULE][CORE][THREAD]

The root space contains the package (socket) IDs. Not enumerated levels
consume 0 bits space, but conceptually they are always represented. If
e.g. only CORE and THREAD levels are enumerated then the DIEGRP, DIE,
MODULE and TILE have the same physical ID as the PACKAGE.

If SMT is not supported, then the THREAD domain is still used. It then
has the same physical ID as the CORE domain and is the only child of
the core domain.

This allows a unified view of the system independent of the enumerated
domain levels without requiring any conditionals in the code.
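
To illustrate the segmentation described above, here is a minimal sketch of
how a per-level physical ID could be derived from an APIC ID. It is not the
code from the series: the enum, the shifts[] array and domain_id() are
hypothetical names, and the bit widths in the usage note are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Domain levels, lowest first (Intel terminology). Illustrative names only. */
enum dom {
	DOM_THREAD, DOM_CORE, DOM_MODULE, DOM_TILE,
	DOM_DIE, DOM_DIEGRP, DOM_PKG, DOM_MAX
};

/*
 * shifts[d] is the number of low APIC ID bits consumed by level d and all
 * levels below it. A level which is not enumerated consumes 0 bits, so it
 * simply inherits the shift of the level below. The DOM_PKG entry is not
 * consulted here.
 *
 * The physical ID of a level is the APIC ID with the bits of all lower
 * levels masked off. No conditionals on "is this level enumerated" needed.
 */
static uint32_t domain_id(uint32_t apicid, const unsigned int shifts[DOM_MAX],
			  enum dom d)
{
	if (d == DOM_THREAD)
		return apicid;

	assert(shifts[d - 1] < 32);
	return apicid & (UINT32_MAX << shifts[d - 1]);
}
```

With only THREAD (1 bit) and CORE (6 bits) enumerated, the shifts are
{1, 7, 7, 7, 7, 7, 7} and every level from MODULE up to PACKAGE yields the
same physical ID. Without SMT the THREAD width is 0, so the THREAD and CORE
IDs are identical.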

AMD exposes only 4 domain levels with obviously different terminology,
but that can be easily mapped into the Intel variant with a trivial lookup
table added to the CPUID parser.
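
Such a lookup table could look roughly like the sketch below. Everything in
it is an assumption made for illustration: the vendor level numbering, the
VENDOR_LVL_* names and the choice of target domains are hypothetical and are
not taken from the series or from any CPUID specification.

```c
#include <assert.h>

/* Unified domain levels (Intel terminology). Illustrative names only. */
enum dom {
	DOM_INVALID = -1,
	DOM_THREAD, DOM_CORE, DOM_MODULE, DOM_TILE,
	DOM_DIE, DOM_DIEGRP, DOM_PKG
};

/* Hypothetical level types as a 4-level vendor enumeration might report them */
enum vendor_level {
	VENDOR_LVL_NONE,
	VENDOR_LVL_1,
	VENDOR_LVL_2,
	VENDOR_LVL_3,
	VENDOR_LVL_4,
	VENDOR_LVL_MAX
};

/* Assumed mapping of the vendor levels into the unified domain space */
static const enum dom vendor_domain_map[VENDOR_LVL_MAX] = {
	[VENDOR_LVL_NONE] = DOM_INVALID,
	[VENDOR_LVL_1]    = DOM_THREAD,
	[VENDOR_LVL_2]    = DOM_CORE,
	[VENDOR_LVL_3]    = DOM_TILE,
	[VENDOR_LVL_4]    = DOM_DIE,
};

/* Translate a reported level type, rejecting anything outside the table */
static enum dom map_vendor_level(unsigned int type)
{
	if (type >= VENDOR_LVL_MAX)
		return DOM_INVALID;
	return vendor_domain_map[type];
}
```

The parser only has to translate the reported level type once. Everything
after that operates on the unified levels and does not need to know which
vendor enumerated them.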

The resulting topology information of an ADL hybrid system with 8 P-Cores
and 8 E-Cores looks like this:

 CPU topo: Max. logical packages:   1
 CPU topo: Max. logical dies:       1
 CPU topo: Max. dies per package:   1
 CPU topo: Max. threads per core:   2
 CPU topo: Num. cores per package:    16
 CPU topo: Num. threads per package:  24
 CPU topo: Allowing 24 present CPUs plus 0 hotplug CPUs
 CPU topo: Thread    :    24
 CPU topo: Core      :    16
 CPU topo: Module    :     1
 CPU topo: Tile      :     1
 CPU topo: Die       :     1
 CPU topo: Package   :     1

This is happening on the boot CPU before any of the APs is started and
provides correct size information right from the start.
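
As a rough sketch of why the registered APIC IDs are sufficient for sizing:
two APIC IDs which differ only in the bits below a given level belong to the
same instance of that level, so counting the distinct upper parts gives the
number of instances. count_units() and MAX_APIC_ID are hypothetical names
and the approach is simplified; this is not the implementation in the
series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_APIC_ID	256u	/* arbitrary limit for this sketch */

/*
 * Count the distinct instances of a domain level among the registered
 * APIC IDs. 'shift' is the number of APIC ID bits below that level:
 * 0 counts threads, the THREAD width counts cores, and so on.
 */
static unsigned int count_units(const uint32_t *ids, size_t n,
				unsigned int shift)
{
	uint8_t seen[MAX_APIC_ID] = { 0 };
	unsigned int count = 0;

	assert(shift < 32);
	for (size_t i = 0; i < n; i++) {
		uint32_t unit;

		assert(ids[i] < MAX_APIC_ID);
		unit = ids[i] >> shift;
		if (!seen[unit]) {
			seen[unit] = 1;
			count++;
		}
	}
	return count;
}
```

Feeding this with made-up APIC IDs for 8 cores with two threads each and 8
cores with one thread each, using a THREAD width of 1 bit, gives 24 for
shift 0 and 16 for shift 1, which corresponds to the thread and core counts
in the output above. No AP has to be running for that.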

Even the XEN/PV trainwreck makes use of this now. On Dom0 it utilizes the
MADT and on DomU it provides fake APIC IDs, which combined with the
provided CPUID information make it at least look halfway realistic instead
of claiming to have one CPU per package as the current upstream code does.

This is solely addressing the core topology issues, but there is a plan for
further consolidation of other topology related information into one single
source of information instead of having a gazillion of localized special
parsers and representations all over the place. There are quite a few other
things which can be simplified on top of this, like updating the various
cpumasks during CPU bringup, but that's all left for later.

Changes vs. V1:

	- Breakout of the actual topology management changes

	- Adopt DIEGRP

	- Different approach to identify the BSP on enumeration (Rui)

The current series applies on top of 

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-cleanup-v2

and is available from git here:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git topo-full-v2

Thanks,

	tglx
---
 Documentation/admin-guide/kdump/kdump.rst                      |    7 
 Documentation/admin-guide/kernel-parameters.txt                |    9 
 Documentation/arch/x86/topology.rst                            |   24 
 arch/x86/events/intel/cstate.c                                 |    2 
 arch/x86/events/intel/uncore.c                                 |    2 
 arch/x86/events/intel/uncore_nhmex.c                           |    4 
 arch/x86/events/intel/uncore_snb.c                             |    8 
 arch/x86/events/intel/uncore_snbep.c                           |   18 
 arch/x86/events/rapl.c                                         |    2 
 arch/x86/include/asm/apic.h                                    |   10 
 arch/x86/include/asm/cpu.h                                     |   10 
 arch/x86/include/asm/mpspec.h                                  |    2 
 arch/x86/include/asm/perf_event_p4.h                           |    4 
 arch/x86/include/asm/processor.h                               |    2 
 arch/x86/include/asm/smp.h                                     |    6 
 arch/x86/include/asm/topology.h                                |   53 -
 arch/x86/kernel/acpi/boot.c                                    |   59 -
 arch/x86/kernel/apic/apic.c                                    |  186 ---
 arch/x86/kernel/cpu/Makefile                                   |   12 
 arch/x86/kernel/cpu/cacheinfo.c                                |    2 
 arch/x86/kernel/cpu/common.c                                   |   33 
 arch/x86/kernel/cpu/debugfs.c                                  |    7 
 arch/x86/kernel/cpu/mce/inject.c                               |    3 
 arch/x86/kernel/cpu/microcode/intel.c                          |    2 
 arch/x86/kernel/cpu/topology.c                                 |  484 ++++++++++
 arch/x86/kernel/cpu/topology.h                                 |   11 
 arch/x86/kernel/cpu/topology_common.c                          |   45 
 arch/x86/kernel/devicetree.c                                   |    2 
 arch/x86/kernel/jailhouse.c                                    |    2 
 arch/x86/kernel/mpparse.c                                      |   17 
 arch/x86/kernel/process.c                                      |    2 
 arch/x86/kernel/setup.c                                        |    9 
 arch/x86/kernel/smpboot.c                                      |  219 ----
 arch/x86/xen/apic.c                                            |   14 
 arch/x86/xen/enlighten_pv.c                                    |    3 
 arch/x86/xen/smp.c                                             |    2 
 arch/x86/xen/smp.h                                             |    2 
 arch/x86/xen/smp_pv.c                                          |   58 -
 drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c               |    2 
 drivers/hwmon/coretemp.c                                       |    2 
 drivers/hwmon/fam15h_power.c                                   |    2 
 drivers/platform/x86/intel/uncore-frequency/uncore-frequency.c |    2 
 drivers/powercap/intel_rapl_common.c                           |    2 
 drivers/thermal/intel/intel_hfi.c                              |    2 
 drivers/thermal/intel/intel_powerclamp.c                       |    2 
 drivers/thermal/intel/x86_pkg_temp_thermal.c                   |    2 
 46 files changed, 698 insertions(+), 655 deletions(-)




Thread overview: 44+ messages
2024-01-23 13:10 Thomas Gleixner [this message]
2024-01-23 13:10 ` [patch v2 01/30] x86/cpu/topology: Move registration out of APIC code Thomas Gleixner
2024-01-23 13:10 ` [patch v2 02/30] x86/cpu/topology: Provide separate APIC registration functions Thomas Gleixner
2024-01-23 13:10 ` [patch v2 03/30] x86/acpi: Use new " Thomas Gleixner
2024-01-23 13:10 ` [patch v2 04/30] x86/jailhouse: Use new APIC registration function Thomas Gleixner
2024-01-23 13:10 ` [patch v2 05/30] x86/of: Use new APIC registration functions Thomas Gleixner
2024-01-23 13:10 ` [patch v2 06/30] x86/mpparse: Use new APIC registration function Thomas Gleixner
2024-01-23 13:11 ` [patch v2 07/30] x86/acpi: Dont invoke topology_register_apic() for XEN PV Thomas Gleixner
2024-01-23 13:11 ` [patch v2 08/30] x86/xen/smp_pv: Register fake APICs Thomas Gleixner
2024-01-23 13:11 ` [patch v2 09/30] x86/cpu/topology: Confine topology information Thomas Gleixner
2024-01-23 13:11 ` [patch v2 10/30] x86/cpu/topology: Simplify APIC registration Thomas Gleixner
2024-01-23 13:11 ` [patch v2 11/30] x86/cpu/topology: Use a data structure for topology info Thomas Gleixner
2024-01-23 13:11 ` [patch v2 12/30] x86/smpboot: Make error message actually useful Thomas Gleixner
2024-01-23 13:11 ` [patch v2 13/30] x86/cpu/topology: Sanitize the APIC admission logic Thomas Gleixner
2024-01-23 13:11 ` [patch v2 14/30] x86/cpu/topology: Rework possible CPU management Thomas Gleixner
2024-01-31 23:47   ` Sohil Mehta
2024-01-23 13:11 ` [patch v2 15/30] x86/cpu: Detect real BSP on crash kernels Thomas Gleixner
2024-01-31 17:59   ` Michael Kelley
2024-01-23 13:11 ` [patch v2 16/30] x86/topology: Add a mechanism to track topology via APIC IDs Thomas Gleixner
2024-01-23 13:11 ` [patch v2 17/30] x86/cpu/topology: Reject unknown APIC IDs on ACPI hotplug Thomas Gleixner
2024-01-23 13:11 ` [patch v2 18/30] x86/cpu/topology: Assign hotpluggable CPUIDs during init Thomas Gleixner
2024-01-23 13:11 ` [patch v2 19/30] x86/xen/smp_pv: Count number of vCPUs early Thomas Gleixner
2024-01-23 13:11 ` [patch v2 20/30] x86/cpu/topology: Let XEN/PV use topology from CPUID/MADT Thomas Gleixner
2024-01-23 13:11 ` [patch v2 21/30] x86/cpu/topology: Use topology bitmaps for sizing Thomas Gleixner
2024-01-26  7:07   ` Zhang, Rui
2024-01-26 20:22     ` Thomas Gleixner
2024-01-28 20:01       ` Paul E. McKenney
2024-02-12 16:40       ` Thomas Gleixner
2024-02-12 19:49         ` Michael Kelley
2024-02-13 20:23         ` Sohil Mehta
2024-01-23 13:11 ` [patch v2 22/30] x86/cpu/topology: Mop up primary thread mask handling Thomas Gleixner
2024-01-23 13:11 ` [patch v2 23/30] x86/cpu/topology: Simplify cpu_mark_primary_thread() Thomas Gleixner
2024-01-23 13:11 ` [patch v2 24/30] x86/cpu/topology: Provide logical pkg/die mapping Thomas Gleixner
2024-01-23 13:11 ` [patch v2 25/30] x86/cpu/topology: Use topology logical mapping mechanism Thomas Gleixner
2024-02-01 22:31   ` Sohil Mehta
2024-02-02  6:45   ` Zhang, Rui
2024-02-12 16:21     ` Thomas Gleixner
2024-01-23 13:11 ` [patch v2 26/30] x86/cpu/topology: Retrieve cores per package from topology bitmaps Thomas Gleixner
2024-01-23 13:11 ` [patch v2 27/30] x86/cpu/topology: Rename smp_num_siblings Thomas Gleixner
2024-01-23 13:11 ` [patch v2 28/30] x86/cpu/topology: Rename topology_max_die_per_package() Thomas Gleixner
2024-01-23 13:11 ` [patch v2 29/30] x86/cpu/topology: Provide __num_[cores|threads]_per_package Thomas Gleixner
2024-01-23 13:11 ` [patch v2 30/30] x86/cpu/topology: Get rid of cpuinfo::x86_max_cores Thomas Gleixner
2024-01-24 14:31 ` [patch v2 00/30] x86/apic: Rework APIC registration Zhang, Rui
2024-02-01 22:10 ` Sohil Mehta
