From: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, thilo.fromm@profitbricks.com,
seabios@seabios.org,
Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>,
kevin@koconnor.net, imammedo@redhat.com
Subject: [Qemu-devel] [RFC PATCH] i386: Add _PXM method to ACPI CPU objects
Date: Thu, 7 Nov 2013 13:41:59 +0100
Message-ID: <1383828119-2181-1-git-send-email-vasilis.liaskovitis@profitbricks.com>
This patch adds a _PXM method to ACPI CPU objects for the pc machine. The _PXM
value is derived from the passed-in guest info, the same way as the CPU SRAT
entries.
The motivation for this patch is a CPU hot-unplug/hot-plug bug observed when
using a 3.11 Linux guest kernel on a multi-NUMA-node qemu/kvm VM. The Linux
guest kernel parses the SRAT CPU entries at boot time and stores them in the
array __apicid_to_node. When a CPU is hot-removed, the guest kernel resets the
removed CPU's __apicid_to_node entry to NUMA_NO_NODE (kernel commit c4c60524).
When the removed CPU is hot-added again, the kernel looks up the hot-added CPU
object's _PXM method instead of re-discovering the SRAT entry info. With
current qemu/seabios, the _PXM method is not found, and the CPU is thus
hot-plugged into the default NUMA node 0. (The problem does not show up on the
initial hotplug of a CPU; the _PXM method is still not found in that case, but
the kernel still has the correct proximity value from the CPU's SRAT entry
stored in __apicid_to_node.)
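
The sequence above can be illustrated with a small stand-alone model. This is
only a sketch of the behaviour described in this mail, not kernel code: the
table, the NO_NODE marker and all function names below are hypothetical
stand-ins, and the fallback to node 0 is modelled exactly as observed above.

```c
#include <assert.h>

#define MAX_CPUS 4
#define NO_NODE  (-1)   /* stand-in for the kernel's "no node" marker */

/* Toy stand-in for the guest's __apicid_to_node table. */
static int apicid_to_node[MAX_CPUS];

/* Hypothetical result of evaluating each CPU object's _PXM;
 * NO_NODE means the firmware provides no _PXM for that CPU. */
static int cpu_pxm[MAX_CPUS];

/* Boot: copy the SRAT-provided node of every CPU into the table. */
static void boot_parse_srat(const int srat_node[MAX_CPUS])
{
    for (int i = 0; i < MAX_CPUS; i++)
        apicid_to_node[i] = srat_node[i];
}

/* Hot-remove: the table entry of the removed CPU is reset. */
static void hot_remove(int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    apicid_to_node[cpu] = NO_NODE;
}

/* Hot-add: reuse a still-valid table entry, otherwise ask _PXM,
 * otherwise fall back to the default node 0. */
static int hot_add(int cpu)
{
    assert(cpu >= 0 && cpu < MAX_CPUS);
    if (apicid_to_node[cpu] == NO_NODE) {
        int pxm = cpu_pxm[cpu];
        apicid_to_node[cpu] = (pxm == NO_NODE) ? 0 : pxm;
    }
    return apicid_to_node[cpu];
}

/* Boot with the given SRAT node, then hot-remove and hot-add the CPU.
 * Returns the node the CPU ends up in. */
static int replug_node(int cpu, int srat_node, int pxm)
{
    int srat[MAX_CPUS] = { 0 };

    assert(cpu >= 0 && cpu < MAX_CPUS);
    srat[cpu] = srat_node;
    cpu_pxm[cpu] = pxm;
    boot_parse_srat(srat);
    hot_remove(cpu);
    return hot_add(cpu);
}
```

In this model, replug_node(2, 1, NO_NODE) puts the CPU in node 0 (the bug
described above), while replug_node(2, 1, 1) keeps it in node 1, which is what
providing _PXM is meant to achieve.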
The ACPI spec mentions that the _PXM method is the correct way to determine
proximity information at hot-add time. So far, qemu/seabios do not provide this
method for CPUs. So regardless of kernel behaviour, it is a good idea to add
this _PXM method. Since ACPI table generation has recently been moved from
seabios to qemu, we do this in qemu.
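
As a rough illustration of the approach taken in the diff below (a compiled
template with a placeholder byte that is overwritten per CPU), here is a
minimal stand-alone sketch. The template bytes, OFFSET_PXM and
build_cpu_blob() are made up for this example; in the actual patch the offset
comes from the ssdt_proc_pxm extraction and the bytes are real AML.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PXM_PLACEHOLDER 0xAA   /* same placeholder value as Name(_PXM, 0xAA) */

/* Hypothetical miniature template; real AML bytes look different. */
static const uint8_t proc_template[] = { 0x10, 0x20, PXM_PLACEHOLDER, 0x30 };

/* Hypothetical offset of the placeholder byte inside the template. */
#define OFFSET_PXM 2

/* Copy the template into out and patch in the CPU's node number.
 * out must hold at least sizeof(proc_template) bytes. */
static void build_cpu_blob(uint8_t *out, uint8_t node)
{
    memcpy(out, proc_template, sizeof(proc_template));
    /* Catch a stale offset: the byte being patched must be the placeholder. */
    assert(out[OFFSET_PXM] == PXM_PLACEHOLDER);
    out[OFFSET_PXM] = node;
}
```

Because the placeholder is a single byte constant, a scheme like this can only
encode node numbers that fit in one byte.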
Note that the above hot-remove/hot-add scenario has been tested on an older
qemu + non-upstreamed patches for CPU hot-removal support, and not on qemu
master (since cpu-del support is still not on master). The only testing done
with qemu/seabios master and this patch is successful boots of multi-node
Linux and Windows 8 guests.
For the initial discussion on the seabios and linux-acpi lists, see
http://www.spinics.net/lists/linux-acpi/msg47058.html
Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Reviewed-by: Thilo Fromm <t-lo@thilo-fromm.de>
---
hw/i386/acpi-build.c | 2 ++
hw/i386/ssdt-proc.dsl | 2 ++
2 files changed, 4 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6cfa044..9373f5e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -603,6 +603,7 @@ static inline char acpi_get_hex(uint32_t val)
#define ACPI_PROC_OFFSET_CPUHEX (*ssdt_proc_name - *ssdt_proc_start + 2)
#define ACPI_PROC_OFFSET_CPUID1 (*ssdt_proc_name - *ssdt_proc_start + 4)
#define ACPI_PROC_OFFSET_CPUID2 (*ssdt_proc_id - *ssdt_proc_start)
+#define ACPI_PROC_OFFSET_CPUPXM (*ssdt_proc_pxm - *ssdt_proc_start)
#define ACPI_PROC_SIZEOF (*ssdt_proc_end - *ssdt_proc_start)
#define ACPI_PROC_AML (ssdp_proc_aml + *ssdt_proc_start)
@@ -724,6 +725,7 @@ build_ssdt(GArray *table_data, GArray *linker,
proc[ACPI_PROC_OFFSET_CPUHEX+1] = acpi_get_hex(i);
proc[ACPI_PROC_OFFSET_CPUID1] = i;
proc[ACPI_PROC_OFFSET_CPUID2] = i;
+ proc[ACPI_PROC_OFFSET_CPUPXM] = guest_info->node_cpu[i];
}
/* build this code:
diff --git a/hw/i386/ssdt-proc.dsl b/hw/i386/ssdt-proc.dsl
index 8229bfd..7eef8b2 100644
--- a/hw/i386/ssdt-proc.dsl
+++ b/hw/i386/ssdt-proc.dsl
@@ -47,6 +47,8 @@ DefinitionBlock ("ssdt-proc.aml", "SSDT", 0x01, "BXPC", "BXSSDT", 0x1)
* also updating the C code.
*/
Name(_HID, "ACPI0007")
+ ACPI_EXTRACT_NAME_BYTE_CONST ssdt_proc_pxm
+ Name(_PXM, 0xAA)
External(CPMA, MethodObj)
External(CPST, MethodObj)
External(CPEJ, MethodObj)
--
1.7.10.4