public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: "Brown, Len" <len.brown@intel.com>,
	"Zhang, Rui" <rui.zhang@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: "Gross, Jurgen" <jgross@suse.com>,
	"mikelley@microsoft.com" <mikelley@microsoft.com>,
	"arjan@linux.intel.com" <arjan@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"ray.huang@amd.com" <ray.huang@amd.com>,
	"andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	"Sivanich, Dimitri" <dimitri.sivanich@hpe.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>
Subject: RE: [patch V3 27/40] x86/cpu: Provide a sane leaf 0xb/0x1f parser
Date: Mon, 14 Aug 2023 18:13:49 +0200	[thread overview]
Message-ID: <877cpxioua.ffs@tglx> (raw)
In-Reply-To: <MN0PR11MB6010CB72411BA723BEFC5565E017A@MN0PR11MB6010.namprd11.prod.outlook.com>

Len!

On Mon, Aug 14 2023 at 14:48, Len Brown wrote:
> First, with multiple cores having the same core_id, a consumer of
> core_id must know about packages to understand core_id.
>
> This is the original sin of the current interface -- which should
> never have used the word "sibling" *anyplace*, because to make sense
> of the word sibling, you must know what *contains* the siblings.

You're conflating things. The fact that core_id is relative to the
package has absolutely nothing to do with the concept of siblings.

Whether sibling is the right terminology or not is a problem on its own.

Fact is that CPUs are structured in different domain levels and that
structuring makes some sense as it reflects the actual hardware.

The question whether this structuring has a relevance for software or
not, is independent of that. That's something which needs to be defined
and there are certainly aspects which affect scheduling, while others
affect power or other entities and some affect all of them.

> We introduced "core_cpus" a number of years ago to address this for
> core_ids (and for other levels, such as die_cpus).  Unfortunately, we
> can probably never actually delete threads_siblings and core_siblings
> without breaking some program someplace...

Sorry, but "core_cpus" is just a different terminology. I'm not seeing
how this is solving anything.

> core_id should be system-wide global, just like the cpu number is
> system-wide global.  Today, it is the only level id that is not
> system-wide global.

That's simply not true. cpu_info::die_id is package relative too, which
is silly to begin with and caused us to add this noisy logical_die_id
muck.

> This could be implemented by simply not masking off the package_id
> bits when creating the core_id,

You have to shift it right by one if the system is SMT capable, or just
use the APIC ID as is if it's not (i.e. 0x1f subleaf 0 reports a shift
of 0). Not more, not less.

Alternatively, no ID is shifted right at all and just the bits below
the actual level are masked off.
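A minimal sketch of the two options, with made-up APIC ID values and a
hypothetical smt_shift parameter standing in for the bit width reported
by leaf 0x1f subleaf 0 (EAX[4:0]), which is 0 on non-SMT parts:

```c
#include <stdint.h>

/* Option 1: shift the SMT bits out; the result is system-wide unique. */
static uint32_t unique_core_id_shifted(uint32_t apicid, unsigned int smt_shift)
{
	return apicid >> smt_shift;
}

/* Option 2: keep the bit positions; just mask off the bits below the level. */
static uint32_t unique_core_id_masked(uint32_t apicid, unsigned int smt_shift)
{
	return apicid & ~((1u << smt_shift) - 1);
}
```

Either way the result no longer needs the package ID for context, which
is the whole point of deriving it from the APIC ID.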

> like we have done for other levels.

Which one exactly? The only level ID which is truly system wide unique
is package ID.

Die ID is not and core ID is not and there are no other levels the
current upstream code is dealing with.

> Yes, this could be awkward for some existing code that indexes arrays
> with core_id, and doesn't like them to be sparse.  But that rough edge
> is a much smaller problem than having to comprehend a level (package)
> that you may otherwise not care about.  Besides, core_ids can already
> be sparse today.

It's not awkward. It's just a matter of auditing all the places which
care about core ID and fixing them up in case they can't deal with it.
I went through the die ID usage before making die ID unique to ensure
that it won't break anything.

I surely have mopped up more complex things than that, so where is the
problem doing the same for core ID?

> Secondly, with the obsolescence of CPUID.0b and its replacement with
> CPUID.1F, the contract between the hardware and the software is that
> a level can appear in between any existing levels.  (The only
> exception is that SMT is married to core.)

In theory, yes. But what's the practical relevance of a new level
appearing between CORE and MODULE, or MODULE and TILE, etc.?

> It is not possible for an old kernel to know the name or position of
> a new level in the hierarchy, going forward.

Again, where is the practical problem? These new levels are not going to
be declared willy-nilly and every other week, right?

> Today, this manifests with a (currently) latent bug that I caused.
> For I implemented die_id in the style of package_id, and I shouldn't
> have followed that example.

You did NOT. You implemented die_id relative to the package, which does
not make it unique, in the same way that core_id is relative to the
package and therefore not unique.

Package ID is unique and the only reason why logical package ID exists
is because there have been systems with massive gaps between the package
IDs. That could have been handled differently, but that's again a
different story.

> Today, if CPUID.1F doesn't know anything about multiple DIEs, Linux
> conjures up a die_id 0 in sysfs.  It should not.  The reason is that
> when CPUID.1F enumerates a level that legacy code doesn't know about,
> we can't possibly tell if it is above DIE or below DIE.  If it is
> above DIE, then our default die_id 0 becomes bogus.

That's an implementation problem and the code I posted fixes this by
making die_id unique and taking the documented domain levels into
account.

So if 0x1f does not enumerate dies, then each package has one die and
the die ID is the same as the package ID. It's that simple.
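That fallback rule could be sketched like this (a hypothetical helper,
not the posted patch; die_shift/pkg_shift stand for the APIC-ID bit
positions where the DIE and PACKAGE domains start, with -1 meaning leaf
0x1f did not enumerate a DIE level):

```c
#include <stdint.h>

static uint32_t unique_die_id(uint32_t apicid, int die_shift, int pkg_shift)
{
	/* No DIE level enumerated: one die per package, so the die ID
	 * degenerates to the package ID. */
	if (die_shift < 0)
		die_shift = pkg_shift;
	return apicid >> die_shift;
}
```

No sysfs entry has to be conjured up from nothing; the ID falls out of
the APIC ID either way.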

> That said, I have voiced my objection inside Intel to the creation of
> random levels which do not have an architectural (software)
> definition; and I'm advocating that they be *removed* from the SDM
> until a software programming definition that spans all generations is
> documented.
>
> SMT, core, module, die and the (implicit) package may not be well
> documented, but they do have existing uses and will thus live on.  The
> others maybe not.

Why remove them? If there is no documentation for using them then
software will ignore them, but they reflect how the hardware is built
and according to conversations with various people this topology
reflects other things which are undocumented.

What do you win by removing them from the SDM?

Absolutely nothing. Actually you lose, because if they get added along
with information about how software should use those levels, then the
whole problem I discussed with Rui about imaginary new domain levels
surfacing in the future is there sooner rather than later. If we deal
with them correctly today and ignore them for now, then kernels will
just work on systems which enumerate them; they just won't make use of
these levels.
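A rough sketch of that "record what we know, skip what we don't" rule,
using the level type encoding from leaf 0x1f (ECX[15:8]: 1=SMT, 2=Core,
3=Module, 4=Tile, 5=Die); the arrays here are hypothetical stand-ins
for the per-subleaf CPUID output:

```c
#include <stdint.h>

/* Domain level types as enumerated by CPUID leaf 0x1f in ECX[15:8]. */
enum { LVL_INVALID = 0, LVL_SMT = 1, LVL_CORE = 2, LVL_MODULE = 3,
       LVL_TILE = 4, LVL_DIE = 5, LVL_KNOWN_MAX = LVL_DIE };

/*
 * Record the APIC-ID bit width for every level type we know about and
 * step over anything else, so a kernel built today keeps working when
 * a new level type shows up between the existing ones.
 */
static void parse_levels(const uint8_t *types, const uint8_t *widths, int n,
			 uint8_t shifts[LVL_KNOWN_MAX + 1])
{
	for (int i = 0; i < n; i++) {
		if (types[i] >= LVL_SMT && types[i] <= LVL_KNOWN_MAX)
			shifts[types[i]] = widths[i];
		/* else: unknown level type -- ignore, don't fail */
	}
}
```

The unknown level still contributes its bits to the APIC ID layout of
the levels above it, which is exactly why ignoring it is safe.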

The amount of extra storage to handle them is marginal and really not
worth debating.

Thanks,

        tglx

Thread overview: 88+ messages
2023-08-02 10:20 [patch V3 00/40] x86/cpu: Rework the topology evaluation Thomas Gleixner
2023-08-02 10:20 ` [patch V3 01/40] cpu/SMT: Make SMT control more robust against enumeration failures Thomas Gleixner
2023-08-04 17:50   ` Borislav Petkov
2023-08-04 20:01     ` Thomas Gleixner
2023-08-02 10:21 ` [patch V3 02/40] x86/apic: Fake primary thread mask for XEN/PV Thomas Gleixner
2023-08-04 18:12   ` Borislav Petkov
2023-08-04 20:02     ` Thomas Gleixner
2023-08-02 10:21 ` [patch V3 03/40] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
2023-08-02 10:21 ` [patch V3 04/40] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
2023-08-02 10:21 ` [patch V3 05/40] x86/cpu: Move cpu_die_id " Thomas Gleixner
2023-08-09 14:32   ` Zhang, Rui
2023-08-09 15:14     ` Thomas Gleixner
2023-08-02 10:21 ` [patch V3 06/40] scsi: lpfc: Use topology_core_id() Thomas Gleixner
2023-08-02 10:21 ` [patch V3 07/40] hwmon: (fam15h_power) " Thomas Gleixner
2023-08-02 10:21 ` [patch V3 08/40] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
2023-08-02 10:21 ` [patch V3 09/40] x86/cpu: Move cu_id " Thomas Gleixner
2023-08-02 10:21 ` [patch V3 10/40] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
2023-08-02 10:21 ` [patch V3 11/40] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
2023-08-02 10:21 ` [patch V3 12/40] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
2023-08-02 10:21 ` [patch V3 13/40] x86/apic: Use BAD_APICID consistently Thomas Gleixner
2023-08-02 10:21 ` [patch V3 14/40] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
2023-08-02 10:21 ` [patch V3 15/40] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
2023-08-02 10:21 ` [patch V3 16/40] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
2023-08-02 10:21 ` [patch V3 17/40] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
2023-08-02 10:21 ` [patch V3 18/40] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
2023-08-02 10:21 ` [patch V3 19/40] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
2023-08-10  7:58   ` Qiuxu Zhuo
2023-08-02 10:21 ` [patch V3 20/40] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
2023-08-02 10:21 ` [patch V3 21/40] x86/cpu: Provide debug interface Thomas Gleixner
2023-08-02 10:21 ` [patch V3 22/40] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
2023-08-02 10:21 ` [patch V3 23/40] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
2023-08-04  8:14   ` K Prateek Nayak
2023-08-04  8:28     ` Thomas Gleixner
2023-08-04  8:34       ` K Prateek Nayak
2023-08-12  6:41   ` Zhang, Rui
2023-08-12  8:00     ` Zhang, Rui
2023-08-14  6:26       ` Thomas Gleixner
2023-08-14  7:11         ` Zhang, Rui
2023-08-13 13:30     ` Zhang, Rui
2023-08-13 14:36       ` Thomas Gleixner
2023-08-14  6:20         ` Thomas Gleixner
2023-08-14  6:42           ` Zhang, Rui
2023-08-02 10:21 ` [patch V3 24/40] x86/cpu: Add legacy topology parser Thomas Gleixner
2023-08-02 10:21 ` [patch V3 25/40] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
2023-08-02 10:21 ` [patch V3 26/40] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
2023-08-02 10:21 ` [patch V3 27/40] x86/cpu: Provide a sane leaf 0xb/0x1f parser Thomas Gleixner
2023-08-12  8:21   ` Zhang, Rui
2023-08-12 20:04     ` Thomas Gleixner
2023-08-13 15:04       ` Thomas Gleixner
2023-08-14  8:25         ` Zhang, Rui
2023-08-14 12:26           ` Thomas Gleixner
2023-08-14 14:48             ` Brown, Len
2023-08-14 16:13               ` Thomas Gleixner [this message]
2023-08-15 19:30                 ` Brown, Len
2023-08-17  9:09                   ` Thomas Gleixner
2023-08-18  5:01                     ` Brown, Len
2023-08-21 10:27                       ` Thomas Gleixner
2023-08-30  2:46                         ` Brown, Len
2023-08-30 12:39                           ` Thomas Gleixner
2023-09-01  3:09                             ` Brown, Len
2023-09-01  7:45                               ` Thomas Gleixner
2023-08-14 15:28             ` Zhang, Rui
2023-08-14 17:12               ` Thomas Gleixner
2023-08-02 10:21 ` [patch V3 28/40] x86/cpu: Use common topology code for Intel Thomas Gleixner
2023-08-02 10:21 ` [patch V3 29/40] x86/cpu/amd: Provide a separate accessor for Node ID Thomas Gleixner
2023-08-02 10:21 ` [patch V3 30/40] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
2023-08-02 19:28   ` Michael Kelley (LINUX)
2023-08-02 19:46     ` Thomas Gleixner
2023-08-02 19:51   ` [patch V3a " Thomas Gleixner
2023-08-11 12:58     ` Pu Wen
2023-08-11 17:11       ` Thomas Gleixner
2023-08-12  3:58         ` Pu Wen
2023-10-13  9:38           ` [tip: x86/core] x86/cpu/hygon: Fix the CPU topology evaluation for real tip-bot2 for Pu Wen
2023-08-02 10:21 ` [patch V3 31/40] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
2023-08-02 10:21 ` [patch V3 32/40] x86/cpu: Use common topology code for AMD Thomas Gleixner
2023-08-02 10:21 ` [patch V3 33/40] x86/cpu: Use common topology code for HYGON Thomas Gleixner
2023-08-11 13:00   ` Pu Wen
2023-08-11 17:11     ` Thomas Gleixner
2023-08-02 10:21 ` [patch V3 34/40] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
2023-08-02 10:21 ` [patch V3 35/40] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
2023-08-02 10:21 ` [patch V3 36/40] x86/cpu: Remove topology.c Thomas Gleixner
2023-08-02 10:21 ` [patch V3 37/40] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
2023-08-02 10:21 ` [patch V3 38/40] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
2023-08-02 10:21 ` [patch V3 39/40] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
2023-08-02 10:22 ` [patch V3 40/40] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
2023-08-02 11:56 ` [patch V3 00/40] x86/cpu: Rework the topology evaluation Juergen Gross
2023-08-03  0:07 ` Sohil Mehta
2023-08-03  3:44 ` Michael Kelley (LINUX)
