linux-arch.vger.kernel.org archive mirror
* Re: [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9
       [not found] ` <878qv8ypkl.ffs@tglx>
@ 2024-10-04 10:05   ` Niels Dettenbach
  2024-10-04 10:29     ` Jürgen Groß
  0 siblings, 1 reply; 5+ messages in thread
From: Niels Dettenbach @ 2024-10-04 10:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, H. Peter Anvin, Peter Zijlstra (Intel), Ingo Molnar,
	Juergen Gross, Thomas Gleixner, linux-kernel, linux-arch

Virtual machines (DomU) running under the Xen hypervisor in PV mode use a
special, nonstandard, synthesized CPU topology which "just works" under
kernels 6.9.x. Newer kernels wrongly assume a "crash kernel" and disable
SMP (reducing the guest to one CPU core), because the new topology checks,
which are not appropriate for the Xen PV platform, report the spurious
error "[Firmware Bug]: APIC enumeration order not specification
compliant". As a result, the kernel activates just one CPU core within
the PV DomU.

The patch disables the relevant checks only when running in Xen PV mode
and brings back SMP (all CPUs) for such DomU VMs, as in the past. The Xen
subsystem takes care of the proper interaction between the "guest" (DomU)
and the "host" (Dom0).

Signed-off-by: Niels Dettenbach <nd@syndicat.com>

---


The current behaviour renders all of our production Xen host platforms
(amd64, HPE ProLiant) unusable after updating to newer Linux kernels
(just one CPU is available/activated per VM), while older kernels and
other operating systems (current NetBSD PV DomU) still work fully (and
have been stable on the platform for many years).

Xen PV mode is still provided by current Xen and widely used, even
if less widely than the newer Xen PVH mode today. So a solution will
probably be required, and the other exemptions in topology.c seem to
support that as well.

So we assume that this bug is relevant for stable@vger.kernel.org as well.


dmesg from affected kernel on affected DomU PV:

-- snip --
[    0.640364] CPU topo: Enumerated BSP APIC 0 is not marked in APICBASE MSR
[    0.640367] CPU topo: Assuming crash kernel. Limiting to one CPU to prevent machine INIT
[    0.640368] CPU topo: [Firmware Bug]: APIC enumeration order not specification compliant
[    0.640376] CPU topo: Max. logical packages:   1
[    0.640378] CPU topo: Max. logical dies:       1
[    0.640379] CPU topo: Max. dies per package:   1
[    0.640386] CPU topo: Max. threads per core:   1
[    0.640388] CPU topo: Num. cores per package:     1
[    0.640389] CPU topo: Num. threads per package:   1
[    0.640390] CPU topo: Allowing 1 present CPUs plus 0 hotplug CPUs
[    0.640402] Cannot find an available gap in the 32-bit address range
-- snap --
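
For readers following the log above, the decision behind the "Assuming
crash kernel" message can be modelled in a few lines of userspace C. This
is only a simplified sketch under assumptions: struct bsp_check and
check_boot_apic() are hypothetical names that do not exist in the kernel,
and the real check_for_real_bsp() covers more cases than shown here. The
only hardware fact relied on is that bit 8 of the APICBASE MSR marks the
boot CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of IA32_APIC_BASE flags the bootstrap processor (BSP). */
#define APICBASE_BSP (UINT64_C(1) << 8)

/* Hypothetical result type for this sketch; not a kernel structure. */
struct bsp_check {
	bool fw_bug;           /* enumeration treated as non-compliant */
	unsigned int max_cpus; /* 0 means "no limit imposed" */
};

/*
 * Simplified model of the decision taken for the first enumerated APIC
 * when it belongs to the CPU the kernel is booting on.
 */
struct bsp_check check_boot_apic(uint64_t apicbase_msr, bool has_apic_base)
{
	struct bsp_check res = { .fw_bug = false, .max_cpus = 0 };
	bool is_bsp = has_apic_base && (apicbase_msr & APICBASE_BSP) != 0;

	/* Assumption: without the MSR the enumeration order is trusted. */
	if (is_bsp || !has_apic_base)
		return res;

	/* BSP bit missing: assume a crash kernel and restrict to one CPU. */
	res.fw_bug = true;
	res.max_cpus = 1;
	return res;
}
```

With the BSP bit clear, as the log above reports for this PV guest,
check_boot_apic(0, true) returns fw_bug = true and max_cpus = 1, which
corresponds to the single-CPU result shown.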


after patch applied:
-- snip --
[    0.369439] CPU topo: Max. logical packages:   1
[    0.369441] CPU topo: Max. logical dies:       1
[    0.369442] CPU topo: Max. dies per package:   1
[    0.369450] CPU topo: Max. threads per core:   2
[    0.369452] CPU topo: Num. cores per package:     3
[    0.369453] CPU topo: Num. threads per package:   6
[    0.369453] CPU topo: Allowing 6 present CPUs plus 0 hotplug CPUs
-- snap --

We have tested the patch intensively under production / high load for two
weeks now with no issues (no crashes observed).


references:

arch/x86/kernel/cpu/topology.c
[line 448]
-- snip --
        /*
         * XEN PV is special as it does not advertise the local APIC
         * properly, but provides a fake topology for it so that the
         * infrastructure works. So don't apply the restrictions vs. APIC
         * here.
         */
--snap --


On Tuesday, 1 October 2024, 14:28:58, you wrote:
> Please Cc LKML on such things as documented...
> 
> On Mon, Sep 30 2024 at 09:43, Niels Dettenbach wrote:
> > Virtual machines under Xen Hypervisor (DomU) running in Xen PV mode use a
> > special, nonstandard synthetized CPU topology which "just works" under
> > kernels 6.9.x while newer kernels wrongly assuming a "crash kernel" and
> > disable SMP (reducing to one CPU core) because the newer topology
> > implementation produces a wrong error "[Firmware Bug]: APIC enumeration
> > order not specification compliant" after new topology checks which are
> > improper for Xen PV platform.
> 
> Why are they incorrect for XENPV? If the hypervisor exposes a topology
> via CPUID then that topology wants to be correct. Everything else is a
> firmware bug. And no, we are not papering over that.
> 
> > As a result, the kernel disables SMT and activates just one CPU core
> > within the VM (DomU).
> 
> I'm not seeing how that happens due to the firmware bug issue. That's a
> different problem and has nothing to do with
> topology_register[_boot]_apic().
> 
> > The patch disables the regarding checks if it is running in Xen PV
> > mode (only) and bring back SMP / all CPUs as in the past to such DomU
> > VMs.
> 
> So what enumerates the APICs on your systems? Is the topology exposed
> via ACPI? If not, then this can't happen at all.
> 
> And please report this with LKML cc'ed with the above questions
> answered.

Hi Thomas,

my patch just extends the existing exceptions for Xen in topology.c.

I took the exact inline explanation in my patch from existing (so I assume
approved) code in topology.c; please have a look (line 449 of
arch/x86/kernel/cpu/topology.c).

Xen is "part" of the Linux kernel, and for a PV mode DomU (guest) the APIC
topology of the Xen Dom0 ("host") is "emulated" by Xen itself in a
"non-standard" but working structure, while Xen in the PV DomU takes care
of the frontend side. The patch only affects this very special
constellation / setup.


In production environments, different kernel and Xen versions are typically
combined. This means that even if the Xen project "fixes" this or adapts to
the new behaviour of the Linux kernel, compatibility with affected existing
Xen (PV) production installations stays broken for at least several years.


The >6.9 kernel runs SMP fine if I boot it on the bare metal of those same
HP ProLiant machines (that is, instead of booting it as a Xen PV guest
on a Linux Xen Dom0 on that hardware), so the hardware's firmware cannot
be the issue here. The problem only shows up in Xen DomU guests in PV mode,
on different hardware with different (but older) Xen versions / binaries.
So from my perspective, and as far as I have read the whole of topology.c
(I'm not a skilled kernel developer), which holds several such exemptions
from different tests / checks / functions, this is a (known) Xen
specialty.

I know that Xen PV mode is increasingly being superseded by the newer Xen
PVHM mode these days, but PV mode is officially supported by both Xen and
the Linux kernel and is still used in production scenarios.

See e.g.:
https://xen-orchestra.com/blog/xen-virtualization-modes/
https://wiki.xenproject.org/wiki/Understanding_the_Virtualization_Spectrum



many thanks for your time,


niels.




--- linux/arch/x86/kernel/cpu/topology.c.orig   2024-09-11 09:53:16.194095250 +0200
+++ linux/arch/x86/kernel/cpu/topology.c        2024-09-30 09:39:11.041326786 +0200
@@ -131,6 +131,18 @@ static __init bool check_for_real_bsp(u3
        bool is_bsp = false, has_apic_base = boot_cpu_data.x86 >= 6;
        u64 msr;

+
+        /*
+         * XEN PV is special as it does not advertise the local APIC
+         * properly, but provides a fake topology for it so that the
+         * infrastructure works. So don't apply the restrictions vs. APIC
+         * here.
+         */
+       if (xen_pv_domain()) {
+               topo_info.real_bsp_apic_id = topo_info.boot_cpu_apic_id;
+               return false;
+       }
+
        /*
         * There is no real good way to detect whether this a kdump()
         * kernel, but except on the Voyager SMP monstrosity which is not











* Re: [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9
  2024-10-04 10:05   ` [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9 Niels Dettenbach
@ 2024-10-04 10:29     ` Jürgen Groß
  2024-10-04 10:36       ` Niels Dettenbach
                         ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Jürgen Groß @ 2024-10-04 10:29 UTC (permalink / raw)
  To: x86, Borislav Petkov
  Cc: Dave Hansen, H. Peter Anvin, Peter Zijlstra (Intel), Ingo Molnar,
	Thomas Gleixner, linux-kernel, linux-arch


[-- Attachment #1.1.1: Type: text/plain, Size: 1041 bytes --]

On 04.10.24 12:05, Niels Dettenbach wrote:
> Virtual machines under Xen Hypervisor (DomU) running in Xen PV mode use a
> special, nonstandard synthetized CPU topology which "just works" under
> kernels 6.9.x while newer kernels wrongly assuming a "crash kernel" and
> disable SMP (reducing to one CPU core) because the newer topology
> implementation produces a wrong error "[Firmware Bug]: APIC enumeration
> order not specification compliant" after new topology checks which are
> improper for Xen PV platform. As a result, the kernel disables SMP and
> activates just one CPU core within the PV DomU "VM" (DomU in PV mode).
> 
> The patch disables the regarding checks if it is running in Xen PV
> mode (only) and bring back SMP / all CPUs as in the past to such DomU
> VMs. The Xen subsystem takes care of the proper interaction between "guest"
> (DomU) and the "host" (Dom0).
> 
> Signed-off-by: Niels Dettenbach <nd@syndicat.com>

Does the attached patch instead of yours help?

Compile tested only.


Juergen


[-- Attachment #1.1.2: 0001-x86-xen-mark-boot-CPU-of-PV-guest-in-MSR_IA32_APICBA.patch --]
[-- Type: text/x-patch, Size: 1066 bytes --]

From 2d48fb9ddca0aa6510f4f18966112222d405aedc Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@suse.com>
Date: Fri, 4 Oct 2024 12:22:12 +0200
Subject: [PATCH] x86/xen: mark boot CPU of PV guest in MSR_IA32_APICBASE

Recent topology checks of the x86 boot code uncovered the need for
PV guests to have the boot cpu marked in the APICBASE MSR.

Fixes: 9d22c96316ac ("x86/topology: Handle bogus ACPI tables correctly")
Reported-by: Niels Dettenbach <nd@syndicat.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/xen/enlighten_pv.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 2c12ae42dc8b..d6818c6cafda 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1032,6 +1032,10 @@ static u64 xen_do_read_msr(unsigned int msr, int *err)
 	switch (msr) {
 	case MSR_IA32_APICBASE:
 		val &= ~X2APIC_ENABLE;
+		if (smp_processor_id() == 0)
+			val |= MSR_IA32_APICBASE_BSP;
+		else
+			val &= ~MSR_IA32_APICBASE_BSP;
 		break;
 	}
 	return val;
-- 
2.43.0
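
The effect of the hunk above on the value a PV guest reads from the
APICBASE MSR can be tried out in userspace. The following is a minimal
sketch, not the kernel code: filter_apicbase() is a hypothetical helper,
its cpu parameter stands in for smp_processor_id(), and the two bit
positions (8 for the BSP flag, 10 for x2APIC enable) follow the
architectural layout of IA32_APIC_BASE.

```c
#include <stdint.h>

#define APICBASE_BSP    (UINT64_C(1) << 8)   /* bootstrap processor flag */
#define APICBASE_X2APIC (UINT64_C(1) << 10)  /* x2APIC mode enable */

/* Hypothetical helper mirroring the MSR_IA32_APICBASE case of the patch. */
uint64_t filter_apicbase(uint64_t val, unsigned int cpu)
{
	val &= ~APICBASE_X2APIC;      /* hide x2APIC mode from the guest */
	if (cpu == 0)
		val |= APICBASE_BSP;  /* CPU 0 is presented as the BSP */
	else
		val &= ~APICBASE_BSP; /* every other vCPU is not the BSP */
	return val;
}
```

For example, filter_apicbase(0, 0) returns a value with only bit 8 set,
so a check of the BSP bit on the boot CPU would now succeed, while the
same call for any other CPU number leaves the bit clear.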



* Re: [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9
  2024-10-04 10:29     ` Jürgen Groß
@ 2024-10-04 10:36       ` Niels Dettenbach
  2024-10-07 10:29       ` Niels Dettenbach
  2024-10-07 12:12       ` Thomas Gleixner
  2 siblings, 0 replies; 5+ messages in thread
From: Niels Dettenbach @ 2024-10-04 10:36 UTC (permalink / raw)
  To: x86, Borislav Petkov, Jürgen Groß
  Cc: Dave Hansen, H. Peter Anvin, Peter Zijlstra (Intel), Ingo Molnar,
	Thomas Gleixner, linux-kernel, linux-arch

[-- Attachment #1: Type: text/plain, Size: 1188 bytes --]

On Friday, 4 October 2024, 12:29:57, you wrote:
> On 04.10.24 12:05, Niels Dettenbach wrote:
> > Virtual machines under Xen Hypervisor (DomU) running in Xen PV mode use a
> > special, nonstandard synthetized CPU topology which "just works" under
> > kernels 6.9.x while newer kernels wrongly assuming a "crash kernel" and
> > disable SMP (reducing to one CPU core) because the newer topology
> > implementation produces a wrong error "[Firmware Bug]: APIC enumeration
> > order not specification compliant" after new topology checks which are
> > improper for Xen PV platform. As a result, the kernel disables SMP and
> > activates just one CPU core within the PV DomU "VM" (DomU in PV mode).
> > 
> > The patch disables the regarding checks if it is running in Xen PV
> > mode (only) and bring back SMP / all CPUs as in the past to such DomU
> > VMs. The Xen subsystem takes care of the proper interaction between
> > "guest" (DomU) and the "host" (Dom0).
> > 
> > Signed-off-by: Niels Dettenbach <nd@syndicat.com>
> 
> Does the attached patch instead of yours help?
> 
> Compile tested only.

Thanks Jürgen, I will try it by Monday...


niels.







* Re: [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9
  2024-10-04 10:29     ` Jürgen Groß
  2024-10-04 10:36       ` Niels Dettenbach
@ 2024-10-07 10:29       ` Niels Dettenbach
  2024-10-07 12:12       ` Thomas Gleixner
  2 siblings, 0 replies; 5+ messages in thread
From: Niels Dettenbach @ 2024-10-07 10:29 UTC (permalink / raw)
  To: x86, H. Peter Anvin
  Cc: Borislav Petkov, Dave Hansen, Peter Zijlstra (Intel), Ingo Molnar,
	Thomas Gleixner, linux-kernel, linux-arch, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2277 bytes --]

On Friday, 4 October 2024, 12:29:57, Jürgen Groß wrote:
> On 04.10.24 12:05, Niels Dettenbach wrote:
> 
> > Virtual machines under Xen Hypervisor (DomU) running in Xen PV mode use a
> > special, nonstandard synthetized CPU topology which "just works" under
> > kernels 6.9.x while newer kernels wrongly assuming a "crash kernel" and
> > disable SMP (reducing to one CPU core) because the newer topology
> > implementation produces a wrong error "[Firmware Bug]: APIC enumeration
> > order not specification compliant" after new topology checks which are
> > improper for Xen PV platform. As a result, the kernel disables SMP and
> > activates just one CPU core within the PV DomU "VM" (DomU in PV mode).
> > 
> > The patch disables the regarding checks if it is running in Xen PV
> > mode (only) and bring back SMP / all CPUs as in the past to such DomU
> > VMs. The Xen subsystem takes care of the proper interaction between
> > "guest" (DomU) and the "host" (Dom0).
> > 
> > Signed-off-by: Niels Dettenbach <nd@syndicat.com>
> 
> 
> Does the attached patch instead of yours help?
> 
> Compile tested only.


It does :)))


domU:
-- snip --
vcpus=6
cpu="12,13,14,15,23,24"
-- snap --


-- snip --
[    0.500458] cpu 0 spinlock event irq 1
[    0.500485] VPMU disabled by hypervisor.
[    0.501273] Performance Events: unsupported p6 CPU model 62 no PMU driver, software events only.
[    0.501304] signal: max sigframe size: 1776
[    0.501410] rcu: Hierarchical SRCU implementation.
[    0.501428] rcu:     Max phase no-delay instances is 400.
[    0.502032] NMI watchdog: Perf NMI watchdog permanently disabled
[    0.502309] smp: Bringing up secondary CPUs ...
[    0.502759] installing Xen timer for CPU 2
[    0.503384] installing Xen timer for CPU 4
[    0.503838] cpu 2 spinlock event irq 16
[    0.503870] cpu 4 spinlock event irq 17
[    0.504867] installing Xen timer for CPU 1
[    0.505495] installing Xen timer for CPU 3
[    0.506125] installing Xen timer for CPU 5
[    0.506363] cpu 1 spinlock event irq 33
[    0.507869] cpu 3 spinlock event irq 34
[    0.507901] cpu 5 spinlock event irq 35
[    0.507923] smp: Brought up 1 node, 6 CPUs
-- snap --
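
Besides reading dmesg, the number of CPUs the guest actually brought up
can be checked from userspace. A minimal sketch, assuming a Linux guest
with glibc; online_cpus() is a hypothetical wrapper name, while
sysconf(_SC_NPROCESSORS_ONLN) is the existing libc call it relies on.

```c
#include <unistd.h>

/* Number of CPUs currently online as seen from inside this guest. */
long online_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```

On a DomU configured with vcpus=6 as above, the return value would be
expected to match the configured vCPU count once SMP bring-up works.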



thank you very much!


niels.



* Re: [PATCH 1/1] x86: SMP broken on Xen PV DomU since 6.9
  2024-10-04 10:29     ` Jürgen Groß
  2024-10-04 10:36       ` Niels Dettenbach
  2024-10-07 10:29       ` Niels Dettenbach
@ 2024-10-07 12:12       ` Thomas Gleixner
  2 siblings, 0 replies; 5+ messages in thread
From: Thomas Gleixner @ 2024-10-07 12:12 UTC (permalink / raw)
  To: Jürgen Groß, x86, Borislav Petkov
  Cc: Dave Hansen, H. Peter Anvin, Peter Zijlstra (Intel), Ingo Molnar,
	linux-kernel, linux-arch

On Fri, Oct 04 2024 at 12:29, Jürgen Groß wrote:
> From: Juergen Gross <jgross@suse.com>
> Date: Fri, 4 Oct 2024 12:22:12 +0200
> Subject: [PATCH] x86/xen: mark boot CPU of PV guest in MSR_IA32_APICBASE
>
> Recent topology checks of the x86 boot code uncovered the need for
> PV guests to have the boot cpu marked in the APICBASE MSR.
>
> Fixes: 9d22c96316ac ("x86/topology: Handle bogus ACPI tables correctly")
> Reported-by: Niels Dettenbach <nd@syndicat.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

