linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
       [not found]                 ` <873513n31m.ffs@tglx>
@ 2023-08-01 22:25                   ` Thomas Gleixner
  2023-08-01 22:35                     ` Andrew Cooper
  2023-08-02 14:43                     ` Michael Kelley (LINUX)
  0 siblings, 2 replies; 3+ messages in thread
From: Thomas Gleixner @ 2023-08-01 22:25 UTC (permalink / raw)
  To: Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86@kernel.org, Tom Lendacky, Andrew Cooper,
	Arjan van de Ven, James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi@vger.kernel.org, Guenter Roeck,
	linux-hwmon@vger.kernel.org, Jean Delvare, Huang Rui,
	Juergen Gross, Steve Wahl, Mike Travis, Dimitri Sivanich,
	Russ Anderson, linux-hyperv, Linus Torvalds, Greg Kroah-Hartman

Michael!

On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> Clearly the hyper-v BIOS people put a lot of thoughts into this:
>
>>    x2APIC features / processor topology (0xb):
>>       extended APIC ID                      = 0
>>       --- level 0 ---
>>       level number                          = 0x0 (0)
>>       level type                            = thread (1)
>>       bit width of level                    = 0x1 (1)
>>       number of logical processors at level = 0x2 (2)
>>       --- level 1 ---
>>       level number                          = 0x1 (1)
>>       level type                            = core (2)
>>       bit width of level                    = 0x6 (6)
>>       number of logical processors at level = 0x40 (64)
>
> FAIL:                                           ^^^^^
>
> While that field is not meant for topology evaluation it is at least
> expected to tell the actual number of logical processors at that level
> which are actually available. 
>
> The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> 36 logical CPUs in package 0 and 36 in package 1 according to your
> table.
>
> On real hardware this looks like this:
>
>       --- level 1 ---
>       level number                          = 0x1 (1)
>       level type                            = core (2)
>       bit width of level                    = 0x6 (6)
>       number of logical processors at level = 0x38 (56)
>
> Which corresponds to reality and is consistent. But sure, consistency is
> overrated.

So I looked really hard to find some hint how to detect this situation
on the boot CPU, which allows us to mitigate it, but there is none at
all.

So we are caught between a rock and a hard place, which provides us two
mutually exclusive options to chose from:

  1) Have a sane topology evaluation mechanism which solves the known
     problems of hybrid systems, wrong sizing estimates and other
     unpleasantries.

  2) Support the Hyper-V BIOS trainwreck forever.

Unsurprisingly #2 is not really an option as #1 is a crucial issue for
the kernel and we need it resolved urgently as of yesterday.

So while I'm definitely a strong supporter of no-regression policy, I
have to make an argument here why this particular issue is _not_
covered:

 1) Hyper-V BIOS/firmware violates the firmware specification and
    requirements which are clearly spelled out in the SDM.

 2) This violatation is reported on every boot with one promiment
    message per brought up AP where the initial APIC ID as provided by
    CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
    also provided by MADT starting with CPU 36 in the provided example:

    "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"

    repeating itself up to CPU71 with the relevant diverging APIC IDs.

    At least that's what the upstream kernel produces according to
    validate_apic_and_package_id() in such an situation.

 3) This is known for years and the Hyper-V Linux team tried to get this
    resolved, but obviously their arguments fell on deaf ears.

    IOW, the firmware BUG message has been ignored willfully for years
    due to "works for me, why should I care?" attitude.

Seriously, kernel development cannot be held hostage forever by the
wilful ignorance of a BIOS team, which refuses to adhere to
specifications and defines their own world order.

The x86 maintainer team is chosing the lesser of two evils and lets
those who created the problem and refused to resolve it deal with the
outcome.

Just to clarify. This is not preventing affected guests from booting.
The worst consequence is a slight performance regression because the
firmware provided topology information is not matching reality and
therefore the scheduler placement vs. L3 affinity sucks. That's clearly
not a kernel problem.

I'm happy to aid accelerating this thought process by elevating the
existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
 
 	apicid = apic->cpu_present_to_apicid(cpu);
 
-	if (apicid != c->topo.apicid) {
+	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
 		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
 		       cpu, apicid, c->topo.initial_apicid);
 	}

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-08-01 22:25                   ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
@ 2023-08-01 22:35                     ` Andrew Cooper
  2023-08-02 14:43                     ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 3+ messages in thread
From: Andrew Cooper @ 2023-08-01 22:35 UTC (permalink / raw)
  To: Thomas Gleixner, Michael Kelley (LINUX), Peter Zijlstra
  Cc: LKML, x86@kernel.org, Tom Lendacky, Arjan van de Ven,
	James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi@vger.kernel.org, Guenter Roeck,
	linux-hwmon@vger.kernel.org, Jean Delvare, Huang Rui,
	Juergen Gross, Steve Wahl, Mike Travis, Dimitri Sivanich,
	Russ Anderson, linux-hyperv, Linus Torvalds, Greg Kroah-Hartman

On 01/08/2023 11:25 pm, Thomas Gleixner wrote:
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
>  
>  	apicid = apic->cpu_present_to_apicid(cpu);
>  
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",

While you're fixing this, care to remove the chaotic-evil path of mixing
%u and %x with no 0x prefix?

~Andrew

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology()
  2023-08-01 22:25                   ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
  2023-08-01 22:35                     ` Andrew Cooper
@ 2023-08-02 14:43                     ` Michael Kelley (LINUX)
  1 sibling, 0 replies; 3+ messages in thread
From: Michael Kelley (LINUX) @ 2023-08-02 14:43 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra
  Cc: LKML, x86@kernel.org, Tom Lendacky, Andrew Cooper,
	Arjan van de Ven, James E.J. Bottomley, Dick Kennedy, James Smart,
	Martin K. Petersen, linux-scsi@vger.kernel.org, Guenter Roeck,
	linux-hwmon@vger.kernel.org, Jean Delvare, Huang Rui,
	Juergen Gross, Steve Wahl, Mike Travis, Dimitri Sivanich,
	Russ Anderson, linux-hyperv@vger.kernel.org, Linus Torvalds,
	Greg Kroah-Hartman

From: Thomas Gleixner <tglx@linutronix.de> Sent: Tuesday, August 1, 2023 3:25 PM
> 
> Michael!
> 
> On Tue, Aug 01 2023 at 00:12, Thomas Gleixner wrote:
> > On Mon, Jul 31 2023 at 21:27, Michael Kelley wrote:
> > Clearly the hyper-v BIOS people put a lot of thoughts into this:
> >
> >>    x2APIC features / processor topology (0xb):
> >>       extended APIC ID                      = 0
> >>       --- level 0 ---
> >>       level number                          = 0x0 (0)
> >>       level type                            = thread (1)
> >>       bit width of level                    = 0x1 (1)
> >>       number of logical processors at level = 0x2 (2)
> >>       --- level 1 ---
> >>       level number                          = 0x1 (1)
> >>       level type                            = core (2)
> >>       bit width of level                    = 0x6 (6)
> >>       number of logical processors at level = 0x40 (64)
> >
> > FAIL:                                           ^^^^^
> >
> > While that field is not meant for topology evaluation it is at least
> > expected to tell the actual number of logical processors at that level
> > which are actually available.
> >
> > The CPUID APIC ID aka initial_apicid clearly tells that the topology has
> > 36 logical CPUs in package 0 and 36 in package 1 according to your
> > table.
> >
> > On real hardware this looks like this:
> >
> >       --- level 1 ---
> >       level number                          = 0x1 (1)
> >       level type                            = core (2)
> >       bit width of level                    = 0x6 (6)
> >       number of logical processors at level = 0x38 (56)
> >
> > Which corresponds to reality and is consistent. But sure, consistency is
> > overrated.
> 
> So I looked really hard to find some hint how to detect this situation
> on the boot CPU, which allows us to mitigate it, but there is none at
> all.
> 
> So we are caught between a rock and a hard place, which provides us two
> mutually exclusive options to chose from:
> 
>   1) Have a sane topology evaluation mechanism which solves the known
>      problems of hybrid systems, wrong sizing estimates and other
>      unpleasantries.
> 
>   2) Support the Hyper-V BIOS trainwreck forever.
> 
> Unsurprisingly #2 is not really an option as #1 is a crucial issue for
> the kernel and we need it resolved urgently as of yesterday.
> 
> So while I'm definitely a strong supporter of no-regression policy, I
> have to make an argument here why this particular issue is _not_
> covered:
> 
>  1) Hyper-V BIOS/firmware violates the firmware specification and
>     requirements which are clearly spelled out in the SDM.
> 
>  2) This violatation is reported on every boot with one promiment
>     message per brought up AP where the initial APIC ID as provided by
>     CPUID leaf 0xB deviates from the APIC ID read from "hardware", which is
>     also provided by MADT starting with CPU 36 in the provided example:
> 
>     "[FIRMWARE BUG] CPU36: APIC id mismatch. Firmware: 40 APIC: 24"
> 
>     repeating itself up to CPU71 with the relevant diverging APIC IDs.
> 
>     At least that's what the upstream kernel produces according to
>     validate_apic_and_package_id() in such an situation.
> 
>  3) This is known for years and the Hyper-V Linux team tried to get this
>     resolved, but obviously their arguments fell on deaf ears.
> 
>     IOW, the firmware BUG message has been ignored willfully for years
>     due to "works for me, why should I care?" attitude.
> 
> Seriously, kernel development cannot be held hostage forever by the
> wilful ignorance of a BIOS team, which refuses to adhere to
> specifications and defines their own world order.
> 
> The x86 maintainer team is chosing the lesser of two evils and lets
> those who created the problem and refused to resolve it deal with the
> outcome.

Fair enough.  I don't have any basis to argue otherwise.  I'm in
discussions with the Hyper-V team about getting it fully fixed in
Hyper-V, and it looks like there's some movement to make it happen.

> 
> Just to clarify. This is not preventing affected guests from booting.
> The worst consequence is a slight performance regression because the
> firmware provided topology information is not matching reality and
> therefore the scheduler placement vs. L3 affinity sucks. That's clearly
> not a kernel problem.

Yes, if Linux will still boots and runs, that helps.  Then it really is up the
(virtual) firmware in Hyper-V to provide the correct topology information
so performance is as expected.

> 
> I'm happy to aid accelerating this thought process by elevating the
> existing pr_err(FW_BUG....) to a solid WARN_ON_ONCE(). See below.

Your choice.  In this particular case, it won't make a difference either
way.

Michael

> 
> Thanks,
> 
>         tglx
> ---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1688,7 +1688,7 @@ static void validate_apic_and_package_id
> 
>  	apicid = apic->cpu_present_to_apicid(cpu);
> 
> -	if (apicid != c->topo.apicid) {
> +	if (WARN_ON_ONCE(apicid != c->topo.apicid)) {
>  		pr_err(FW_BUG "CPU%u: APIC id mismatch. Firmware: %x APIC: %x\n",
>  		       cpu, apicid, c->topo.initial_apicid);
>  	}

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-08-02 14:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20230728105650.565799744@linutronix.de>
     [not found] ` <20230728120930.839913695@linutronix.de>
     [not found]   ` <BYAPR21MB16889FD224344B1B28BE22A1D705A@BYAPR21MB1688.namprd21.prod.outlook.com>
     [not found]     ` <871qgop8dc.ffs@tglx>
     [not found]       ` <20230731132714.GH29590@hirez.programming.kicks-ass.net>
     [not found]         ` <87sf94nlaq.ffs@tglx>
     [not found]           ` <BYAPR21MB16885EA9B2A7F382D2F9C5A2D705A@BYAPR21MB1688.namprd21.prod.outlook.com>
     [not found]             ` <87fs53n6xd.ffs@tglx>
     [not found]               ` <BYAPR21MB1688CE738E6DB857031B829DD705A@BYAPR21MB1688.namprd21.prod.outlook.com>
     [not found]                 ` <873513n31m.ffs@tglx>
2023-08-01 22:25                   ` [patch v2 21/38] x86/cpu: Provide cpu_init/parse_topology() Thomas Gleixner
2023-08-01 22:35                     ` Andrew Cooper
2023-08-02 14:43                     ` Michael Kelley (LINUX)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).