From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Igor Mammedov <imammedo@redhat.com>,
Eric Mackay <eric.mackay@oracle.com>,
Ani Sinha <anisinha@redhat.com>
Subject: [PULL 04/48] cpuhp: make sure that remove events are handled within the same SCI
Date: Wed, 15 Jan 2025 13:08:37 -0500 [thread overview]
Message-ID: <8aa35bebeeaed19ae57afbc3e110b8e7fe8587d0.1736964488.git.mst@redhat.com> (raw)
In-Reply-To: <cover.1736964487.git.mst@redhat.com>
From: Igor Mammedov <imammedo@redhat.com>
CPU_SCAN_METHOD was processing insert events first and only if insert event was
not present then it would check remove event.
Normally it's not an issue as it doesn't make much sense tho hotplug and
immediately unplug it. In this corner case, which can be reproduced with:
qemu -smp 1,maxcpus=2 -cpu host -monitor stdio \
-drive if=pflash,format=raw,readonly,file=edk2-x86_64-code.fd
* boot till GRUB prompt and pause guest (either via monitor or stop GRUB
from automatic boot)
* at monitor prompt add CPU:
device_add host-x86_64-cpu,socket-id=0,core-id=1,thread-id=0,id=foo
* let guest OS boot completely, and unplug CPU from monitor prompt:
device_del foo
which triggers GPE event that leads to CPU_SCAN_METHOD on guest side
as result of above cpu 'foo' will not be hotunplugged, since QEMU sees
insert event and ignores remove event (leaving it in pending state) for
the GPE event.
Any follow up CPU hotplug/unplug action from QEMU side will handle
previously ignored event, so as workaround user can repeat device_del.
Fix this corner-case by queuing remove events independently from insert
events, aka the same way as we do with insert events. And then go over remove
queue to send eject notify events to OSPM within the same GPE event.
PS:
Process remove queue after the cpu add queue has been processed 1st
to ensure that OSPM gets hotadd evets after hotremove ones.
PS2:
Case where it's still borken happens when guest OS is Linux and
device_del happens before guest OS initializes ACPI subsystem.
Culprit in this case though is the guest kernel, which mangles GPE.sts
(by clearing them up) and thus pending SCI turns to NOP leaving
insert/remove events in pending state.
That is the guest bug and should be fixed there.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Eric Mackay <eric.mackay@oracle.com>
Message-Id: <20241210163945.3422623-3-imammedo@redhat.com>
Tested-by: Eric Mackay <eric.mackay@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/acpi/cpu.c | 43 ++++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 9d530a24da..f70a2c045e 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -327,6 +327,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
#define CPU_EJECT_METHOD "CEJ0"
#define CPU_OST_METHOD "COST"
#define CPU_ADDED_LIST "CNEW"
+#define CPU_EJ_LIST "CEJL"
#define CPU_ENABLED "CPEN"
#define CPU_SELECTOR "CSEL"
@@ -488,7 +489,6 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
method = aml_method(CPU_SCAN_METHOD, 0, AML_SERIALIZED);
{
const uint8_t max_cpus_per_pass = 255;
- Aml *else_ctx;
Aml *while_ctx, *while_ctx2;
Aml *has_event = aml_local(0);
Aml *dev_chk = aml_int(1);
@@ -499,6 +499,8 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
Aml *uid = aml_local(3);
Aml *has_job = aml_local(4);
Aml *new_cpus = aml_name(CPU_ADDED_LIST);
+ Aml *ej_cpus = aml_name(CPU_EJ_LIST);
+ Aml *num_ej_cpus = aml_local(5);
aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
@@ -513,6 +515,8 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
*/
aml_append(method, aml_name_decl(CPU_ADDED_LIST,
aml_package(max_cpus_per_pass)));
+ aml_append(method, aml_name_decl(CPU_EJ_LIST,
+ aml_package(max_cpus_per_pass)));
aml_append(method, aml_store(zero, uid));
aml_append(method, aml_store(one, has_job));
@@ -527,6 +531,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
aml_append(while_ctx2, aml_store(one, has_event));
aml_append(while_ctx2, aml_store(zero, num_added_cpus));
+ aml_append(while_ctx2, aml_store(zero, num_ej_cpus));
/*
* Scan CPUs, till there are CPUs with events or
@@ -559,8 +564,10 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
* if CPU_ADDED_LIST is full, exit inner loop and process
* collected CPUs
*/
- ifctx = aml_if(
- aml_equal(num_added_cpus, aml_int(max_cpus_per_pass)));
+ ifctx = aml_if(aml_lor(
+ aml_equal(num_added_cpus, aml_int(max_cpus_per_pass)),
+ aml_equal(num_ej_cpus, aml_int(max_cpus_per_pass))
+ ));
{
aml_append(ifctx, aml_store(one, has_job));
aml_append(ifctx, aml_break());
@@ -577,16 +584,16 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
aml_append(ifctx, aml_store(one, has_event));
}
aml_append(while_ctx, ifctx);
- else_ctx = aml_else();
+
ifctx = aml_if(aml_equal(rm_evt, one));
{
- aml_append(ifctx,
- aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
- aml_append(ifctx, aml_store(one, rm_evt));
+ /* cache to be removed CPUs to Notify later */
+ aml_append(ifctx, aml_store(uid,
+ aml_index(ej_cpus, num_ej_cpus)));
+ aml_append(ifctx, aml_increment(num_ej_cpus));
aml_append(ifctx, aml_store(one, has_event));
}
- aml_append(else_ctx, ifctx);
- aml_append(while_ctx, else_ctx);
+ aml_append(while_ctx, ifctx);
aml_append(while_ctx, aml_increment(uid));
}
aml_append(while_ctx2, while_ctx);
@@ -620,6 +627,24 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
aml_append(while_ctx, aml_increment(cpu_idx));
}
aml_append(while_ctx2, while_ctx);
+
+ /*
+ * Notify OSPM about to be removed CPUs and clear remove flag
+ */
+ aml_append(while_ctx2, aml_store(zero, cpu_idx));
+ while_ctx = aml_while(aml_lless(cpu_idx, num_ej_cpus));
+ {
+ aml_append(while_ctx,
+ aml_store(aml_derefof(aml_index(ej_cpus, cpu_idx)),
+ uid));
+ aml_append(while_ctx,
+ aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
+ aml_append(while_ctx, aml_store(uid, cpu_selector));
+ aml_append(while_ctx, aml_store(one, rm_evt));
+ aml_append(while_ctx, aml_increment(cpu_idx));
+ }
+ aml_append(while_ctx2, while_ctx);
+
/*
* If another batch is needed, then it will resume scanning
* exactly at -- and not after -- the last CPU that's currently
--
MST
next prev parent reply other threads:[~2025-01-15 18:09 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-15 18:08 [PULL 00/48] virtio,pc,pci: features, fixes, cleanups Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 01/48] virtio-gpu: Add definition for resource_uuid feature Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 02/48] pci: ensure valid link status bits for downstream ports Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 03/48] tests: acpi: whitelist expected blobs Michael S. Tsirkin
2025-01-15 18:08 ` Michael S. Tsirkin [this message]
2025-01-15 18:08 ` [PULL 05/48] tests: acpi: update " Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 06/48] intel_iommu: Use the latest fault reasons defined by spec Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 07/48] intel_iommu: Make pasid entry type check accurate Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 08/48] intel_iommu: Add a placeholder variable for scalable mode stage-1 translation Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 09/48] intel_iommu: Flush stage-2 cache in PASID-selective PASID-based iotlb invalidation Michael S. Tsirkin
2025-01-15 18:08 ` [PULL 10/48] intel_iommu: Rename slpte to pte Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 11/48] intel_iommu: Implement stage-1 translation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 12/48] intel_iommu: Check if the input address is canonical Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 13/48] intel_iommu: Check stage-1 translation result with interrupt range Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 14/48] intel_iommu: Set accessed and dirty bits during stage-1 translation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 15/48] intel_iommu: Flush stage-1 cache in iotlb invalidation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 16/48] intel_iommu: Process PASID-based " Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 17/48] intel_iommu: Add an internal API to find an address space with PASID Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 18/48] intel_iommu: Add support for PASID-based device IOTLB invalidation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 19/48] intel_iommu: piotlb invalidation should notify unmap Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 20/48] tests/acpi: q35: allow DMAR acpi table changes Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 21/48] intel_iommu: Set default aw_bits to 48 starting from QEMU 9.2 Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 22/48] tests/acpi: q35: Update host address width in DMAR Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 23/48] intel_iommu: Introduce a property x-flts for stage-1 translation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 24/48] intel_iommu: Introduce a property to control FS1GP cap bit setting Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 25/48] tests/qtest: Add intel-iommu test Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 26/48] pci/msix: Fix msix pba read vector poll end calculation Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 27/48] acpi/ghes: get rid of ACPI_HEST_SRC_ID_RESERVED Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 28/48] acpi/ghes: simplify acpi_ghes_record_errors() code Michael S. Tsirkin
2025-01-15 18:09 ` [PULL 29/48] acpi/ghes: simplify the per-arch caller to build HEST table Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 30/48] acpi/ghes: better handle source_id and notification Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 31/48] acpi/ghes: Fix acpi_ghes_record_errors() argument Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 32/48] acpi/ghes: Remove a duplicated out of bounds check Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 33/48] acpi/ghes: Change the type for source_id Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 34/48] acpi/ghes: don't check if physical_address is not zero Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 35/48] acpi/ghes: make the GHES record generation more generic Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 36/48] acpi/ghes: better name GHES memory error function Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 37/48] acpi/ghes: don't crash QEMU if ghes GED is not found Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 38/48] acpi/ghes: rename etc/hardware_error file macros Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 39/48] acpi/ghes: better name the offset of the hardware error firmware Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 40/48] acpi/ghes: move offset calculus to a separate function Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 41/48] acpi/ghes: Change ghes fill logic to work with only one source Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 42/48] docs: acpi_hest_ghes: fix documentation for CPER size Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 43/48] tests: acpi: whitelist expected blobs Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 44/48] pci: acpi: Windows 'PCI Label Id' bug workaround Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 45/48] tests: acpi: update expected blobs Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 46/48] hw/cxl: Fix msix_notify: Assertion `vector < dev->msix_entries_nr` Michael S. Tsirkin
2025-01-15 18:10 ` [PULL 47/48] vhost: Add stubs for the migration state transfer interface Michael S. Tsirkin
2025-01-15 18:11 ` [PULL 48/48] virtio-net: vhost-user: Implement internal migration Michael S. Tsirkin
2025-01-15 18:15 ` [PULL 00/48] virtio,pc,pci: features, fixes, cleanups David Woodhouse
2025-01-15 22:42 ` Michael S. Tsirkin
2025-01-15 23:05 ` David Woodhouse
2025-01-16 7:05 ` Michael S. Tsirkin
2025-01-16 14:06 ` David Woodhouse
2025-01-16 7:06 ` Michael S. Tsirkin
2025-01-16 22:10 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8aa35bebeeaed19ae57afbc3e110b8e7fe8587d0.1736964488.git.mst@redhat.com \
--to=mst@redhat.com \
--cc=anisinha@redhat.com \
--cc=eric.mackay@oracle.com \
--cc=imammedo@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).