linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: "Michael Kelley (LINUX)" <mikelley@microsoft.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Arjan van de Ven <arjan@linux.intel.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Dick Kennedy <dick.kennedy@broadcom.com>,
	James Smart <james.smart@broadcom.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	Guenter Roeck <linux@roeck-us.net>,
	"linux-hwmon@vger.kernel.org" <linux-hwmon@vger.kernel.org>,
	Jean Delvare <jdelvare@suse.com>, Huang Rui <ray.huang@amd.com>,
	Juergen Gross <jgross@suse.com>, Steve Wahl <steve.wahl@hpe.com>,
	Mike Travis <mike.travis@hpe.com>,
	Dimitri Sivanich <dimitri.sivanich@hpe.com>,
	Russ Anderson <russ.anderson@hpe.com>,
	linux-hyperv@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
Date: Wed, 02 Aug 2023 00:25:07 +0200	[thread overview]
Message-ID: <87r0omjt8c.ffs@tglx> (raw)
In-Reply-To: <873513n31m.ffs@tglx>

Michael!

On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> Clearly the hyper-v BIOS people put a lot of thoughts into this:
>
>>    x2APIC features / processor topology (0xb):
>>       extended APIC ID                      = 0
>>       --- level 0 ---
>>       level number                          = 0x0 (0)
>>       level type                            = thread (1)
>>       bit width of level                    = 0x1 (1)
>>       number of logical processors at level = 0x2 (2)
>>       --- level 1 ---
>>       level number                          = 0x1 (1)
>>       level type                            = core (2)
>>       bit width of level                    = 0x6 (6)
>>       number of logical processors at level = 0x40 (64)
>
> FAIL:                                           ^^^^^
>
> While that field is not meant for topology evaluation it is at least
> expected to tell the actual number of logical processors at that level
> which are actually available. 
>
> The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> 36 logical CPUs in package 0 and 36 in package 1 according to your
> table.
>
> On real hardware this looks like this:
>
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x38 (56)
>
> Which corresponds to reality and is consistent. But sure, consistency is
> overrated.

So I looked really hard to find some hint how to detect this situation
on the boot CPU, which allows us to mitigate it, but there is none at
all.

So we are caught between a rock and a hard place, which provides us two
mutually exclusive options to chose from:

  1) Have a sane topology evaluation mechanism which solves the known
     problems of hybrid systems, wrong sizing estimates and other
     unpleasantries.

  2) Support the Hyper-V BIOS trainwreck forever.

Unsurprisingly #2 is not really an option as #1 is a crucial issue for
the kernel and we need it resolved urgently as of yesterday.

So while I'm definitely a strong supporter of no-regression policy, I
have to make an argument here why this particular issue is _not_
covered:

 1) Hyper-V BIOS/firmware violates the firmware specification and
    requirements which are clearly spelled out in the SDM.

 2) This violatation is reported on every boot with one promiment
    message per brought up AP where the initial APIC ID as provided by
    CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
    also provided by MADT starting with CPU 36 in the provided example:

    "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"

    repeating itself up to CPU71 with the relevant diverging APIC IDs.

    At least that's what the upstream kernel produces according to
    validate_apic_and_package_id() in such an situation.

 3) This is known for years and the Hyper-V Linux team tried to get this
    resolved, but obviously their arguments fell on deaf ears.

    IOW, the firmware BUG message has been ignored willfully for years
    due to "works for me, why should I care?" attitude.

Seriously, kernel development cannot be held hostage forever by the
wilful ignorance of a BIOS team, which refuses to adhere to
specifications and defines their own world order.

The x86 maintainer team is chosing the lesser of two evils and lets
those who created the problem and refused to resolve it deal with the
outcome.

Just to clarify. This is not preventing affected guests from booting.
The worst consequence is a slight performance regression because the
firmware provided topology information is not matching reality and
therefore the scheduler placement vs. L3 affinity sucks. That's clearly
not a kernel problem.

I'm happy to aid accelerating this thought process by elevating the
existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->topo.apicid) {
+	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}

  reply	other threads:[~2023-08-01 22:25 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-28 12:12 [patch v2 00/38] x86/cpu: Rework the topology evaluation Thomas Gleixner
2023-07-28 12:12 ` [patch v2 01/38] x86/cpu: Encapsulate topology information in cpuinfo_x86 Thomas Gleixner
2023-07-28 12:12 ` [patch v2 02/38] x86/cpu: Move phys_proc_id into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 03/38] x86/cpu: Move cpu_die_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 04/38] scsi: lpfc: Use topology_core_id() Thomas Gleixner
2023-07-28 12:12 ` [patch v2 05/38] hwmon: (fam15h_power) " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 06/38] x86/cpu: Move cpu_core_id into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 07/38] x86/cpu: Move cu_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 08/38] x86/cpu: Remove pointless evaluation of x86_coreid_bits Thomas Gleixner
2023-07-28 12:12 ` [patch v2 09/38] x86/cpu: Move logical package and die IDs into topology info Thomas Gleixner
2023-07-28 12:12 ` [patch v2 10/38] x86/cpu: Move cpu_l[l2]c_id " Thomas Gleixner
2023-07-28 12:12 ` [patch v2 11/38] x86/apic: Use BAD_APICID consistently; Thomas Gleixner
2023-07-28 22:36   ` Sohil Mehta
2023-07-28 22:50     ` Thomas Gleixner
2023-07-28 12:12 ` [patch v2 12/38] x86/apic: Use u32 for APIC IDs in global data Thomas Gleixner
2023-07-28 12:12 ` [patch v2 13/38] x86/apic: Use u32 for check_apicid_used() Thomas Gleixner
2023-07-28 12:13 ` [patch v2 14/38] x86/apic: Use u32 for cpu_present_to_apicid() Thomas Gleixner
2023-07-28 12:13 ` [patch v2 15/38] x86/apic: Use u32 for phys_pkg_id() Thomas Gleixner
2023-08-08 20:14   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 16/38] x86/apic: Use u32 for [gs]et_apic_id() Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 17/38] x86/apic: Use u32 for wakeup_secondary_cpu[_64]() Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 18/38] x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids Thomas Gleixner
2023-07-28 12:13 ` [patch v2 19/38] x86/cpu: Provide debug interface Thomas Gleixner
2023-07-28 23:57   ` Sohil Mehta
2023-07-28 12:13 ` [patch v2 20/38] x86/cpu: Provide cpuid_read() et al Thomas Gleixner
2023-07-28 12:13 ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
2023-07-29  0:02   ` Sohil Mehta
2023-07-31  4:05   ` Michael Kelley (LINUX)
2023-07-31 12:34     ` Thomas Gleixner
2023-07-31 13:27       ` Peter Zijlstra
2023-07-31 15:38         ` Thomas Gleixner
2023-07-31 16:10           ` Michael Kelley (LINUX)
2023-07-31 20:48             ` Thomas Gleixner
2023-07-31 21:27               ` Michael Kelley (LINUX)
2023-07-31 22:12                 ` Thomas Gleixner
2023-08-01 22:25                   ` Thomas Gleixner [this message]
2023-08-01 22:35                     ` Andrew Cooper
2023-08-02 14:43                     ` Michael Kelley (LINUX)
2023-07-31 16:25       ` Michael Kelley (LINUX)
2023-07-31 20:41         ` Thomas Gleixner
2023-07-31 13:47     ` Arjan van de Ven
2023-07-31 14:08       ` Andrew Cooper
2023-08-01  7:05   ` Gautham R. Shenoy
2023-08-01  7:34     ` Thomas Gleixner
2023-07-28 12:13 ` [patch v2 22/38] x86/cpu: Add legacy topology parser Thomas Gleixner
2023-07-28 21:33   ` [patch v2A " Thomas Gleixner
2023-07-28 12:13 ` [patch v2 23/38] x86/cpu: Use common topology code for Centaur and Zhaoxin Thomas Gleixner
2023-07-28 12:13 ` [patch v2 24/38] x86/cpu: Move __max_die_per_package to common.c Thomas Gleixner
2023-07-28 12:13 ` [patch v2 25/38] x86/cpu: Provide a sane leaf 0xb/0x1f parser Thomas Gleixner
2023-07-28 12:13 ` [patch v2 26/38] x86/cpu: Use common topology code for Intel Thomas Gleixner
2023-07-28 12:13 ` [patch v2 27/38] x86/cpu/amd: Provide a separate acessor for Node ID Thomas Gleixner
2023-07-29  0:07   ` Sohil Mehta
2023-07-28 12:13 ` [patch v2 28/38] x86/cpu: Provide an AMD/HYGON specific topology parser Thomas Gleixner
2023-07-29  0:05   ` Sohil Mehta
2023-07-30  5:20   ` Michael Kelley (LINUX)
2023-07-30  7:41     ` Thomas Gleixner
2023-07-28 12:13 ` [patch v2 29/38] x86/smpboot: Teach it about topo.amd_node_id Thomas Gleixner
2023-07-28 12:13 ` [patch v2 30/38] x86/cpu: Use common topology code for AMD Thomas Gleixner
2023-07-28 12:13 ` [patch v2 31/38] x86/cpu: Use common topology code for HYGON Thomas Gleixner
2023-07-28 12:13 ` [patch v2 32/38] x86/mm/numa: Use core domain size on AMD Thomas Gleixner
2023-07-28 12:13 ` [patch v2 33/38] x86/cpu: Make topology_amd_node_id() use the actual node info Thomas Gleixner
2023-07-28 12:13 ` [patch v2 34/38] x86/cpu: Remove topology.c Thomas Gleixner
2023-07-28 12:13 ` [patch v2 35/38] x86/cpu: Remove x86_coreid_bits Thomas Gleixner
2023-07-28 12:13 ` [patch v2 36/38] x86/apic: Remove unused phys_pkg_id() callback Thomas Gleixner
2023-08-08 20:15   ` Steve Wahl
2023-07-28 12:13 ` [patch v2 37/38] x86/xen/smp_pv: Remove cpudata fiddling Thomas Gleixner
2023-07-28 12:13 ` [patch v2 38/38] x86/apic/uv: Remove the private leaf 0xb parser Thomas Gleixner
2023-08-08 20:16   ` Steve Wahl
2023-07-28 15:41 ` [patch v2 00/38] x86/cpu: Rework the topology evaluation Dimitri Sivanich
2023-07-28 19:01 ` Sohil Mehta

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87r0omjt8c.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=andrew.cooper3@citrix.com \
    --cc=arjan@linux.intel.com \
    --cc=dick.kennedy@broadcom.com \
    --cc=dimitri.sivanich@hpe.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=james.smart@broadcom.com \
    --cc=jdelvare@suse.com \
    --cc=jejb@linux.ibm.com \
    --cc=jgross@suse.com \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=martin.petersen@oracle.com \
    --cc=mike.travis@hpe.com \
    --cc=mikelley@microsoft.com \
    --cc=peterz@infradead.org \
    --cc=ray.huang@amd.com \
    --cc=russ.anderson@hpe.com \
    --cc=steve.wahl@hpe.com \
    --cc=thomas.lendacky@amd.com \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).