From: Laszlo Ersek <lersek@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>,
Ankur Arora <ankur.a.arora@oracle.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [RFC] ich9:cpuhp: add support for cpu hot-unplug with SMI broadcast enabled
Date: Fri, 27 Nov 2020 16:02:28 +0100 [thread overview]
Message-ID: <3bad9b8e-6de1-37d4-8ac4-a8432885eb14@redhat.com> (raw)
In-Reply-To: <20201127123356.7b1349d8@redhat.com>
On 11/27/20 12:33, Igor Mammedov wrote:
> On Thu, 26 Nov 2020 19:35:30 -0800
> Ankur Arora <ankur.a.arora@oracle.com> wrote:
>
>> On 2020-11-26 4:46 a.m., Laszlo Ersek wrote:
>>> On 11/26/20 11:24, Ankur Arora wrote:
>>>> On 2020-11-24 4:25 a.m., Igor Mammedov wrote:
>>>>> If firmware negotiates ICH9_LPC_SMI_F_CPU_HOT_UNPLUG_BIT feature,
>>>>> OSPM on CPU eject will set bit #4 in CPU hotplug block for to be
>>>>> ejected CPU to mark it for removal by firmware and trigger SMI
>>>>> upcall to let firmware do actual eject.
>>>>>
>>>>> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>>>>> ---
>>>>> PS:
>>>>> - abuse 5.1 machine type for now to turn off unplug feature
>>>>> (it will be moved to 5.2 machine type once new merge window is open)
>>>>> ---
>>>>> include/hw/acpi/cpu.h | 2 ++
>>>>> docs/specs/acpi_cpu_hotplug.txt | 11 +++++++++--
>>>>> hw/acpi/cpu.c | 18 ++++++++++++++++--
>>>>> hw/i386/acpi-build.c | 5 +++++
>>>>> hw/i386/pc.c | 1 +
>>>>> hw/isa/lpc_ich9.c | 2 +-
>>>>> 6 files changed, 34 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/include/hw/acpi/cpu.h b/include/hw/acpi/cpu.h
>>>>> index 0eeedaa491..999caaf510 100644
>>>>> --- a/include/hw/acpi/cpu.h
>>>>> +++ b/include/hw/acpi/cpu.h
>>>>> @@ -22,6 +22,7 @@ typedef struct AcpiCpuStatus {
>>>>> uint64_t arch_id;
>>>>> bool is_inserting;
>>>>> bool is_removing;
>>>>> + bool fw_remove;
>>>>> uint32_t ost_event;
>>>>> uint32_t ost_status;
>>>>> } AcpiCpuStatus;
>>>>> @@ -50,6 +51,7 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object
>>>>> *owner,
>>>>> typedef struct CPUHotplugFeatures {
>>>>> bool acpi_1_compatible;
>>>>> bool has_legacy_cphp;
>>>>> + bool fw_unplugs_cpu;
>>>>> const char *smi_path;
>>>>> } CPUHotplugFeatures;
>>>>> diff --git a/docs/specs/acpi_cpu_hotplug.txt
>>>>> b/docs/specs/acpi_cpu_hotplug.txt
>>>>> index 9bb22d1270..f68ef6e06c 100644
>>>>> --- a/docs/specs/acpi_cpu_hotplug.txt
>>>>> +++ b/docs/specs/acpi_cpu_hotplug.txt
>>>>> @@ -57,7 +57,11 @@ read access:
>>>>> It's valid only when bit 0 is set.
>>>>> 2: Device remove event, used to distinguish device for which
>>>>> no device eject request to OSPM was issued.
>>>>> - 3-7: reserved and should be ignored by OSPM
>>>>> + 3: reserved and should be ignored by OSPM
>>>>> + 4: if set to 1, OSPM requests firmware to perform device
>>>>> eject,
>>>>> + firmware shall clear this event by writing 1 into it
>>>>> before
>>>>> + performing device eject> + 5-7: reserved and
>>>>> should be ignored by OSPM
>>>>> [0x5-0x7] reserved
>>>>> [0x8] Command data: (DWORD access)
>>>>> contains 0 unless value last stored in 'Command field' is
>>>>> one of:
>>>>> @@ -82,7 +86,10 @@ write access:
>>>>> selected CPU device
>>>>> 3: if set to 1 initiates device eject, set by OSPM when it
>>>>> triggers CPU device removal and calls _EJ0 method
>>>>> - 4-7: reserved, OSPM must clear them before writing to
>>>>> register
>>>>> + 4: if set to 1 OSPM hands over device eject to firmware,
>>>>> + Firmware shall issue device eject request as described
>>>>> above
>>>>> + (bit #3) and OSPM should not touch device eject bit (#3),
>>>>> + 5-7: reserved, OSPM must clear them before writing to
>>>>> register
>>>>> [0x5] Command field: (1 byte access)
>>>>> value:
>>>>> 0: selects a CPU device with inserting/removing events and
>>>>> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
>>>>> index f099b50927..09d2f20dae 100644
>>>>> --- a/hw/acpi/cpu.c
>>>>> +++ b/hw/acpi/cpu.c
>>>>> @@ -71,6 +71,7 @@ static uint64_t cpu_hotplug_rd(void *opaque, hwaddr
>>>>> addr, unsigned size)
>>>>> val |= cdev->cpu ? 1 : 0;
>>>>> val |= cdev->is_inserting ? 2 : 0;
>>>>> val |= cdev->is_removing ? 4 : 0;
>>>>> + val |= cdev->fw_remove ? 16 : 0;
>>>>
>>>> I might be missing something but I don't see where cdev->fw_remove is being
>>>> set.
>>>
>>> See just below, in the cpu_hotplug_wr() hunk. When bit#4 is written --
>>> which happens through the ACPI code change --, fw_remove is inverted.
>> Thanks that makes sense. I was reading the AML building code all wrong.
>>
>>>
>>>
>>>> We do set cdev->is_removing in acpi_cpu_unplug_request_cb() so AFAICS
>>>> we would always end up setting this bit:
>>>>> val |= cdev->is_removing ? 4 : 0;
>>>>
>>>> Also, if cdev->fw_remove and cdev->is_removing are both true, val would be
>>>> (4 | 16). I'm guessing that in that case the AML determines which case gets
>>>> handled but it might make sense to set just one of these?
>>>
>>> "is_removing" is set directly in response to the device_del QMP command.
>>> That QMP command is asynchronous to the execution of the guest OS.
>>> j
>>> "fw_remove" is set (by virtue of inverting) by ACPI CEJ0, which is
>>> executed by the guest OS's ACPI interpreter, after the guest OS has
>>> de-scheduled all processes from the CPU being removed (= basically after
>>> the OS has willfully forgotten about the CPU).
>>>
>>> Therefore, considering the bitmask (is_removing, fw_remove), three
>>> variations make sense:
>>
>> Just annotating these with the corresponding ACPI code to make sure
>> I have it straight. Please correct if my interpretation is wrong. Also,
>> a few questions inline:
>>
>>>
>>> #1 (is_removing=0, fw_remove=0) -- normal status; no unplug requested
>>>
>>> #2 (is_removing=1, fw_remove=0) -- unplug requested via QMP, guest OS
>>> is processing the request
>>
>> Guest executes the CSCN method and reads rm_evt (bit 2) (thus noticing
>> the is_removing=1), and then notifies the CPU to be removed via the
>> CTFY method.
>>
>> ifctx = aml_if(aml_equal(rm_evt, one));
>> {
>> aml_append(ifctx,
>> aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
>> aml_append(ifctx, aml_store(one, rm_evt));
>> aml_append(ifctx, aml_store(one, has_event));
>> }
>>
>> Then it does a store to rm_evt (bit 2). That would result in clearing
>> of is_removing. (Igor mentions that in a separate mail.)
>>
>> 1. Do we need to clear is_removing at all? AFAICS, it's only useful as
>> an ack to QEMU and I can't think of why that's useful. OTOH it
>> doesn't serve any useful purpose once the guest OS has seen the request.
> no firmware doesn't need to care about it, it's consumed by OSPM only
>
>> 2. Would it make sense to clear it first and then call CPU_NOTIFY_METHOD?
>> CPU_NOTIFY_METHOD (or _EJ0, COST) don't depend on is_removing but
>> that might change in the future.
>
> all methods are protected by be same mutex, so if _EJ0 is called while CSCN
> in progress it will wait till CSCN is finished.
> But clearing bit #2 before Notify should work too.
I'd suggest not reordering existent stuff unless we really have to; the
firmware will have to deal with "is_removing" being the *only* status
flag set anway, as QMP "device_del" command(s) may set that bit for
another CPU (or multiple other CPUs) while the SMI handler is running,
and the "get pending" method will return such CPUs as well.
I wouldn't complicate the patches just in order to "hide" is_removing --
that's not a goal, so let's just keep as much AML untouched as we can.
(BTW I now understand why "is_removing" is clear when the eject method
runs for the same CPU -- because Notify (and so the eject method) is not
entered synchronously, it's only queued asynchronously. So it's actually
dispatched after the is_removing flag has been cleared.)
>
>
>> The notify would end up in calling acpi_hotplug_schedule() which would be
>> responsible for queuing work (on CPU0) to detach+unplug the CPU.
>>
>> Once the OS level detach succeeds, the worker evaluates the "_EJ0" method
>> which would do the actual CPU_EJECT_METHOD work.
>>
>> If the detach fails then it evaluates the CPU_OST_METHOD which updates
>> the status for the event and the status.
>>
>> At this point the state is back to:
>>
>> (is_removing=0, fw_remove=0)
> if OSPM fails to release CPU for whatever reasons, it's valid
> state, we just notify user using OST event that requested unplug wasn't successful.
>
>>
>>> #3 (is_removing=1, fw_remove=1) -- guest OS removed all references from
>>> the CPU, firmware is permitted /
>>> required to forget about the CPU as
>>> well, and then unplug the CPU
>>
>> CPU_EJECT_METHOD will do a store to bit 4, which would invert (and
>> thus set) fw_remove and then do the SMI.
>>
>> So, this would be
>>> #3 (is_removing=0, fw_remove=1)
>>
>> At this point the firmware calls QemuCPUhpCollectApicIds() which
>> (after changes) notices CPU(s) with fw_remove set.
>>
>> Collects them and does a store to bit 4, which would clear fw_remove.
I don't think we should clear fw_remove as soon as we collect the CPU,
in the firmware. Status flag modifications should be kept out of
QemuCpuhpCollectApicIds().
>
> I'd skip this step on firmware side and make QEMU clear it
> when CPU is ejected.
>
>>
>>>
>>> #4 (is_removing=1, fw_remove=0) -- fimware is about to unplug the CPU
>>>
>>> #5 (is_removing=0, fw_remove=0) -- firmware performing unplug
>> Firmware does an unplug and writes to bit 3, thus clearing is_removing.
>>
>> On return from the firmware the guest evaluates the COST again.
> it's optional and depends on OSPM implementation (some do not call it on success)
>
>
>> And, eventually goes back to the CSCN where it processes more
>> hotplug or unplug events.
> CSCN in case of unplug finishes first, and only after that EJ0 calls
> are processed
>
>>> The variation (is_removing=0, fw_remove=1) is invalid / unused.
>>
>> /nods
>>>
>>>
>>> The firmware may be investigating the CPU register block between steps
>>> #2 and #3 -- in other words, the firmware may see a CPU for which
>>> is_remove is set (unplug requested via QMP), but the OS has not vacated
>>> yet (fw_remove=0). In that case, the firmware must just skip the CPU --
>>> once the OS is done, it will set fw_remove too, and raise another SMI.
>> Yeah, it makes sense for the firmware to only care about a CPU once it
>> sees fw_remove=1. (And as currently situated, the firmware would never
>> see is_removing=1 at all.)
The firmware may well see is_removing=1, as the QMP device_del command
may set that bit on some other CPU, asynchronously to the firmware's
execution. The firmware needs to recognize if the "get pending" command
returns a CPU because of this, and just continue scanning.
Thanks
Laszlo
next prev parent reply other threads:[~2020-11-27 15:24 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-24 12:25 [RFC] ich9:cpuhp: add support for cpu hot-unplug with SMI broadcast enabled Igor Mammedov
2020-11-24 22:58 ` Laszlo Ersek
2020-11-26 10:24 ` Ankur Arora
2020-11-26 12:46 ` Laszlo Ersek
2020-11-26 19:50 ` Igor Mammedov
2020-11-27 3:39 ` Ankur Arora
2020-11-27 3:35 ` Ankur Arora
2020-11-27 11:33 ` Igor Mammedov
2020-11-27 15:02 ` Laszlo Ersek [this message]
2020-11-27 23:48 ` Ankur Arora
2020-11-30 16:58 ` Laszlo Ersek
2020-11-30 19:45 ` Ankur Arora
2020-11-26 20:45 ` Igor Mammedov
2020-11-26 11:17 ` Laszlo Ersek
2020-11-26 20:38 ` Igor Mammedov
2020-11-27 4:10 ` Ankur Arora
2020-11-27 11:47 ` Igor Mammedov
2020-11-27 23:49 ` Ankur Arora
2020-11-27 15:19 ` Laszlo Ersek
2020-11-28 0:43 ` Ankur Arora
2020-11-30 17:00 ` Laszlo Ersek
2020-11-27 14:48 ` Laszlo Ersek
2020-11-27 15:07 ` Igor Mammedov
2020-11-27 16:52 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3bad9b8e-6de1-37d4-8ac4-a8432885eb14@redhat.com \
--to=lersek@redhat.com \
--cc=ankur.a.arora@oracle.com \
--cc=imammedo@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).