qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric DeVolder <eric.devolder@oracle.com>
To: Igor Mammedov <imammedo@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Cc: shannon.zhaosl@gmail.com, ani@anisinha.ca,
	peter.maydell@linaro.org, qemu-arm@nongnu.org,
	qemu-devel@nongnu.org, marcel.apfelbaum@gmail.com,
	pbonzini@redhat.com, richard.henderson@linaro.org,
	eduardo@habkost.net, boris.ostrovsky@oracle.com
Subject: Re: [PATCH 0/2] hw/acpi: bump MADT to revision 5
Date: Tue, 4 Apr 2023 09:52:09 -0500	[thread overview]
Message-ID: <bae30149-0f0a-750a-a050-88e40fc1e4c1@oracle.com> (raw)
In-Reply-To: <20230331182538.15980cc9@imammedo.users.ipa.redhat.com>

I'm back from travel and catching up. More info below.
eric


On 3/31/23 11:25, Igor Mammedov wrote:
> On Wed, 29 Mar 2023 12:47:05 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>> On Wed, Mar 29, 2023 at 08:14:37AM -0500, Eric DeVolder wrote:
>>>
>>>
>>> On 3/29/23 00:19, Michael S. Tsirkin wrote:
>>>> Hmm I don't think we can reasonably make such a change for 8.0.
>>>> Seems too risky.
>>>> Also, I feel we want to have an internal (with "x-" prefix") flag to
>>>> revert to old behaviour, in case of breakage on some guests.  and maybe
>>>> we want to keep old revision for old machine types.
>>> Ok, what option name, for keeping old behavior, would you like?
>>
>> Don't much care. x-madt-rev?
> 
> if it works fine (cold & hot-plug) with older linux/windows guests
> I'd rather avoid adding compat knob (we typically do that in ACPI tables
> only when change breaks something).
> 
> (as old guest I'd define WinXP sp3 (/me wonders if we  still care about
> dead EOLed OS) perhaps WS2008 would be a better minimum target these days
> and RHEL6 (or some older ACPI enabled kernel with hotplug support))

Thus far I've tested this patch series with the following guests. In both cases below,
the guest started with -smp 8,maxcpus=16, and I simply attempted to hot plug and unplug
cpu8, with: device_add id=cpu8 driver=host-x86_64-cpu socket-id=0 core-id=8 thread-id=0.

1) Windows Server 2008.
  In Windows, after logging-in, I close the first full-screen window
  and then "Server Manager" window pops up. I navigate to left hand
  pane and choose Diagnostics -> Device Manager -> Processors.
  I count them before and after.
  When hotplugging a cpu, you can see the new processor show up in the
  processor list. That is, it goes from 8 to 9.
  When hot unplugging the CPU, Windows refuses:

   The 'Intel(R) Xeon(R) CPU D-1533 @ 2.10GHz' device is not
   removable and cannot be ejected or unplugged.

2) RHEL 6.9
  From dmesg:
  ACPI: APIC 000000007ffe32f0 000F0 (v05 BOCHS  BXPC    00000001 BXPC 00000001)
  SMP: Allowing 16 CPUs, 8 hotplug CPUs

  # cat /sys/devices/system/cpu/online
  0-7

  (QEMU) device_add ...

  CPU 8 got hotplugged
  Booting Node 0 Processor 8 APIC 0x8
  kvm-clock: cpu 8, msr 2830ed00
  Will online and init hotplugged CPU: 8
  microcode: CPU8 sig=0x50663, pf=0x1, revision=0x700001c
  platform microcode: firmware: requesting intel-ucode/06-56-03

  # cat /sys/devices/system/cpu/online
  0-8

  (QEMU) device_del ...

  Broke affinity for irq 24
  CPU 8 is now offline

  # cat /sys/devices/system/cpu/online
  0-7

  RHEL 6.9
  kernel 2.6.32-696.el6.x86_64
  build Feb 21 2017


So with these two older guest OS, it appears to still work. If there are others to be tested, let me 
know.


> 
>>
>>>>
>>>>
>>>> On Tue, Mar 28, 2023 at 11:59:24AM -0400, Eric DeVolder wrote:
>>>>> The following Linux kernel change broke CPU hotplug for MADT revision
>>>>> less than 5.
>>>>>
>>>>>    commit e2869bd7af60 ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC")
>>>>
>>>> Presumably it's being fixed? Link to discussion? Patch fixing that in
>>>> Linux?
>>>
>>> https://lore.kernel.org/linux-acpi/20230327191026.3454-1-eric.devolder@oracle.com/T/#t
>>
>> Great! Maybe stick a Link: tag in the commit log.

Sure, will include that with v2.

> 
> So it's guest bug which is in process of being fixed.
> (i.e. QEMU technically correct as long as MADT revision < 5)

Iiuc, if QEMU generates x2apic tables with .revision = 1, that is not correct
(as x2APIC shows up in .revision=3).
But if QEMU generates, apic, sapic, or x2apic tables with .revision = 5, that
is correct (as all are valid options thru .revision 5).

> 
> In this case I'd not touch x86 MADT at all (It should be upto
> downstream distros to fix guest kernel).

Fwiw, this has been fixed and should show up in 6.3-rc6 this weekend.

> 
> Probably the same applies to ARM variant
> i.e. we should bump rev only when current one gets in the way
> (aka we are pulling in new fields/definitions from new version)
> 
>     
>>>>> As part of the investigation into resolving this breakage, I learned
>>>>> that i386 QEMU reports revision 1, while technically it is at revision 3.
>>>>> (Arm QEMU reports revision 4, and that is valid/correct.)
>>>>>
>>>>> ACPI 6.3 bumps MADT revision to 5 as it introduces an Online Capable
>>>>> flag that the above Linux patch utilizes to denote hot pluggable CPUs.
>>>>>
>>>>> So in order to bump MADT to the current revision of 5, need to
>>>>> validate that all MADT table changes between 1 and 5 are present
>>>>> in QEMU.
>>>>>
>>>>> Below is a table summarizing the changes to the MADT. This information
>>>>> gleamed from the ACPI specs on uefi.org.
>>>>>
>>>>> ACPI    MADT    What
>>>>> Version Version
>>>>> 1.0             MADT not present
>>>>> 2.0     1       Section 5.2.10.4
>>>>> 3.0     2       Section 5.2.11.4
>>>>>                    5.2.11.13 Local SAPIC Structure added two new fields:
>>>>>                     ACPI Processor UID Value
>>>>>                     ACPI Processor UID String
>>>>>                    5.2.10.14 Platform Interrupt Sources Structure:
>>>>>                     Reserved changed to Platform Interrupt Sources Flags
>>>>> 3.0b    2       Section 5.2.11.4
>>>>>                    Added a section describing guidelines for the ordering of
>>>>>                    processors in the MADT to support proper boot processor
>>>>>                    and multi-threaded logical processor operation.
>>>>> 4.0     3       Section 5.2.12
>>>>>                    Adds Processor Local x2APIC structure type 9
>>>>>                    Adds Local x2APIC NMI structure type 0xA
>>>>> 5.0     3       Section 5.2.12
>>>>> 6.0     3       Section 5.2.12
>>>>> 6.0a    4       Section 5.2.12
>>>>>                    Adds ARM GIC structure types 0xB-0xF
>>>>> 6.2a    45      Section 5.2.12   <--- yep it says version 45!
>>>>> 6.2b    5       Section 5.2.12
>>>>>                    GIC ITS last Reserved offset changed to 16 from 20 (typo)
>>>>> 6.3     5       Section 5.2.12
>>>>>                    Adds Local APIC Flags Online Capable!
>>>>>                    Adds GICC SPE Overflow Interrupt field
>>>>> 6.4     5       Section 5.2.12
>>>>>                    Adds Multiprocessor Wakeup Structure type 0x10
>>>>>                    (change notes says structure previously misplaced?)
>>>>> 6.5     5       Section 5.2.12
>>>>>
>>>>> For the MADT revision change 1 -> 2, the spec has a change to the
>>>>> SAPIC structure. In general, QEMU does not generate/support SAPIC.
>>>>> So the QEMU i386 MADT revision can safely be moved to 2.
>>>>>
>>>>> For the MADT revision change 2 -> 3, the spec adds Local x2APIC
>>>>> structures. QEMU has long supported x2apic ACPI structures. A simple
>>>>> search of x2apic within QEMU source and hw/i386/acpi-common.c
>>>>> specifically reveals this.
>>>>
>>>> But not unconditionally.
>>>
>>> I don't think that reporting revision 3 requires that generation of x2apic;
>>> one could still see apic, x2apic, or sapic in theory. I realize qemu doesn't
>>> do sapic...
>>>    
>>>>    
>>>>> So the QEMU i386 MADT revision can safely
>>>>> be moved to 3.
>>>>>
>>>>> For the MADT revision change 3 -> 4, the spec adds support for the ARM
>>>>> GIC structures. QEMU ARM does in fact generate and report revision 4.
>>>>> As these will not be used by i386 QEMU, so then the QEMU i386 MADT
>>>>> revision can safely be moved to 4 as well.
>>>>>
>>>>> Now for the MADT revision change 4 -> 5, the spec adds the Online
>>>>> Capable flag to the Local APIC structure, and the ARM GICC SPE
>>>>> Overflow Interrupt field.
>>>>>
>>>>> For the ARM SPE, an existing 3-byte Reserved field is broken into a 1-
>>>>> byte Reserved field and a 2-byte SPE field.  The spec says that is SPE
>>>>> Overflow is not supported, it should be zero.
>>>>>
>>>>> For the i386 Local APIC flag Online Capable, the spec has certain rules
>>>>> about this value. And in particuar setting this value now explicitly
>>>>> indicates a hotpluggable CPU.
>>>>>
>>>>> So this patch makes the needed changes to move both ARM and i386 MADT
>>>>> to revision 5. These are not complicated, thankfully.
>>>>>
>>>>> Without these changes, the information below shows "how" CPU hotplug
>>>>> breaks with the current upstream Linux kernel 6.3.  For example, a Linux
>>>>> guest started with:
>>>>>
>>>>>    qemu-system-x86_64 -smp 30,maxcpus=32 ...
>>>>>
>>>>> and then attempting to hotplug a CPU:
>>>>>
>>>>>     (QEMU) device_add id=cpu30 driver=host-x86_64-cpu socket-id=0 core-id=30 thread-id=0
>>>>>
>>>>> fails with the following:
>>>>>
>>>>>     APIC: NR_CPUS/possible_cpus limit of 30 reached. Processor 30/0x.
>>>>>     ACPI: Unable to map lapic to logical cpu number
>>>>>     acpi LNXCPU:1e: Enumeration failure
>>>>>
>>>>>     # dmesg | grep smpboot
>>>>>     smpboot: Allowing 30 CPUs, 0 hotplug CPUs
>>>>>     smpboot: CPU0: Intel(R) Xeon(R) CPU D-1533 @ 2.10GHz (family: 0x)
>>>>>     smpboot: Max logical packages: 1
>>>>>     smpboot: Total of 30 processors activated (125708.76 BogoMIPS)
>>>>>
>>>>>     # iasl -d /sys/firmware/tables/acpi/APIC
>>>>>     [000h 0000   4]                    Signature : "APIC"    [Multiple APIC Descript
>>>>>     [004h 0004   4]                 Table Length : 00000170
>>>>>     [008h 0008   1]                     Revision : 01          <=====
>>>>>     [009h 0009   1]                     Checksum : 9C
>>>>>     [00Ah 0010   6]                       Oem ID : "BOCHS "
>>>>>     [010h 0016   8]                 Oem Table ID : "BXPC    "
>>>>>     [018h 0024   4]                 Oem Revision : 00000001
>>>>>     [01Ch 0028   4]              Asl Compiler ID : "BXPC"
>>>>>     [020h 0032   4]        Asl Compiler Revision : 00000001
>>>>>
>>>>>     ...
>>>>>
>>>>>     [114h 0276   1]                Subtable Type : 00 [Processor Local APIC]
>>>>>     [115h 0277   1]                       Length : 08
>>>>>     [116h 0278   1]                 Processor ID : 1D
>>>>>     [117h 0279   1]                Local Apic ID : 1D
>>>>>     [118h 0280   4]        Flags (decoded below) : 00000001
>>>>>                                Processor Enabled : 1          <=====
>>>>>
>>>>>     [11Ch 0284   1]                Subtable Type : 00 [Processor Local APIC]
>>>>>     [11Dh 0285   1]                       Length : 08
>>>>>     [11Eh 0286   1]                 Processor ID : 1E
>>>>>     [11Fh 0287   1]                Local Apic ID : 1E
>>>>>     [120h 0288   4]        Flags (decoded below) : 00000000
>>>>>                                Processor Enabled : 0          <=====
>>>>>
>>>>>     [124h 0292   1]                Subtable Type : 00 [Processor Local APIC]
>>>>>     [125h 0293   1]                       Length : 08
>>>>>     [126h 0294   1]                 Processor ID : 1F
>>>>>     [127h 0295   1]                Local Apic ID : 1F
>>>>>     [128h 0296   4]        Flags (decoded below) : 00000000
>>>>>                                Processor Enabled : 0          <=====
>>>>>
>>>>> The (latest upstream) Linux kernel sees 30 Enabled processors, and
>>>>> does not consider processors 31 and 32 to be hotpluggable.
>>>>>
>>>>> With this patch series applied, by bumping MADT to revision 5, the
>>>>> latest upstream Linux kernel correctly identifies 30 CPUs plus 2
>>>>> hotpluggable CPUS.
>>>>>
>>>>>     CPU30 has been hot-added
>>>>>     smpboot: Booting Node 0 Processor 30 APIC 0x1e
>>>>>     Will online and init hotplugged CPU: 30
>>>>>
>>>>>     # dmesg | grep smpboot
>>>>>     smpboot: Allowing 32 CPUs, 2 hotplug CPUs
>>>>>     smpboot: CPU0: Intel(R) Xeon(R) CPU D-1533 @ 2.10GHz (family: 0x6, model: 0x56, stepping: 0x3)
>>>>>     smpboot: Max logical packages: 2
>>>>>     smpboot: Total of 30 processors activated (125708.76 BogoMIPS)
>>>>>
>>>>>     # iasl -d /sys/firmware/tables/acpi/APIC
>>>>>     [000h 0000 004h]                   Signature : "APIC"    [Multiple APIC Descript
>>>>>     [004h 0004 004h]                Table Length : 00000170
>>>>>     [008h 0008 001h]                    Revision : 05          <=====
>>>>>     [009h 0009 001h]                    Checksum : 94
>>>>>     [00Ah 0010 006h]                      Oem ID : "BOCHS "
>>>>>     [010h 0016 008h]                Oem Table ID : "BXPC    "
>>>>>     [018h 0024 004h]                Oem Revision : 00000001
>>>>>     [01Ch 0028 004h]             Asl Compiler ID : "BXPC"
>>>>>     [020h 0032 004h]       Asl Compiler Revision : 00000001
>>>>>
>>>>>     ...
>>>>>
>>>>>     [114h 0276 001h]               Subtable Type : 00 [Processor Local APIC]
>>>>>     [115h 0277 001h]                      Length : 08
>>>>>     [116h 0278 001h]                Processor ID : 1D
>>>>>     [117h 0279 001h]               Local Apic ID : 1D
>>>>>     [118h 0280 004h]       Flags (decoded below) : 00000001
>>>>>                                Processor Enabled : 1          <=====
>>>>>                           Runtime Online Capable : 0          <=====
>>>>>
>>>>>     [11Ch 0284 001h]               Subtable Type : 00 [Processor Local APIC]
>>>>>     [11Dh 0285 001h]                      Length : 08
>>>>>     [11Eh 0286 001h]                Processor ID : 1E
>>>>>     [11Fh 0287 001h]               Local Apic ID : 1E
>>>>>     [120h 0288 004h]       Flags (decoded below) : 00000002
>>>>>                                Processor Enabled : 0          <=====
>>>>>                           Runtime Online Capable : 1          <=====
>>>>>
>>>>>     [124h 0292 001h]               Subtable Type : 00 [Processor Local APIC]
>>>>>     [125h 0293 001h]                      Length : 08
>>>>>     [126h 0294 001h]                Processor ID : 1F
>>>>>     [127h 0295 001h]               Local Apic ID : 1F
>>>>>     [128h 0296 004h]       Flags (decoded below) : 00000002
>>>>>                                Processor Enabled : 0          <=====
>>>>>                           Runtime Online Capable : 1          <=====
>>>>>
>>>>> Regards,
>>>>> Eric
>>>>
>>>> Can you please report which guests were tested?
>>>
>>> I've been using primarily upstream Linux. Kernels at and before 6.2.0 didn't
>>> have the "broken" patch mentioned above, and worked (for the reasons cited
>>> in the patch discussion to "fix" that patch). Any kernel since has the
>>> "broken" patch and will exhibit the issue.
>>>
>>> I've been using q35.
>>>
>>> If there are other samples you'd like to see, let me know and I'll try.
>>>
>>> Also, my responses will be delayed as I'm traveling the remainder of the week.
>>>
>>> Thanks!
>>> eric
>>
>> As a minimum some windows versions. The older the better.
>>
>>
>>>    
>>>>
>>>>    
>>>>>
>>>>> Eric DeVolder (2):
>>>>>     hw/acpi: arm: bump MADT to revision 5
>>>>>     hw/acpi: i386: bump MADT to revision 5
>>>>>
>>>>>    hw/arm/virt-acpi-build.c |  6 ++++--
>>>>>    hw/i386/acpi-common.c    | 13 ++++++++++---
>>>>>    2 files changed, 14 insertions(+), 5 deletions(-)
>>>>>
>>>>> -- 
>>>>> 2.31.1
>>>>>
>>>>>
>>>>>    
>>>>    
>>
> 


      reply	other threads:[~2023-04-04 14:53 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28 15:59 [PATCH 0/2] hw/acpi: bump MADT to revision 5 Eric DeVolder
2023-03-28 15:59 ` [PATCH 1/2] hw/acpi: arm: " Eric DeVolder
2023-04-11 14:49   ` Igor Mammedov
2023-03-28 15:59 ` [PATCH 2/2] hw/acpi: i386: " Eric DeVolder
2023-03-29  5:03   ` Michael S. Tsirkin
2023-03-29 13:16     ` Eric DeVolder
2023-03-29 13:19       ` Eric DeVolder
2023-03-29 16:55         ` Michael S. Tsirkin
2023-03-31 16:29       ` Igor Mammedov
2023-04-11 16:00   ` Igor Mammedov
2023-04-12  7:58     ` Igor Mammedov
2023-04-18 16:58       ` Eric DeVolder
2023-03-28 16:37 ` [PATCH 0/2] hw/acpi: " Eric DeVolder
2023-03-29  5:19 ` Michael S. Tsirkin
2023-03-29 13:14   ` Eric DeVolder
2023-03-29 16:47     ` Michael S. Tsirkin
2023-03-30  7:36       ` Ani Sinha
2023-03-30 13:44         ` Michael S. Tsirkin
2023-03-30 14:02           ` Ani Sinha
2023-03-30 14:11             ` Michael S. Tsirkin
2023-03-31 16:25       ` Igor Mammedov
2023-04-04 14:52         ` Eric DeVolder [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bae30149-0f0a-750a-a050-88e40fc1e4c1@oracle.com \
    --to=eric.devolder@oracle.com \
    --cc=ani@anisinha.ca \
    --cc=boris.ostrovsky@oracle.com \
    --cc=eduardo@habkost.net \
    --cc=imammedo@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=shannon.zhaosl@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).