From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48255) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VePCf-0000JF-Gv for qemu-devel@nongnu.org; Thu, 07 Nov 2013 08:00:57 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VePCa-00021I-8c for qemu-devel@nongnu.org; Thu, 07 Nov 2013 08:00:53 -0500 Received: from mx1.redhat.com ([209.132.183.28]:1317) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VePCa-000211-08 for qemu-devel@nongnu.org; Thu, 07 Nov 2013 08:00:48 -0500 Date: Thu, 7 Nov 2013 15:03:42 +0200 From: "Michael S. Tsirkin" Message-ID: <20131107130342.GA2212@redhat.com> References: <1383828119-2181-1-git-send-email-vasilis.liaskovitis@profitbricks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1383828119-2181-1-git-send-email-vasilis.liaskovitis@profitbricks.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [RFC PATCH] i386: Add _PXM method to ACPI CPU objects List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vasilis Liaskovitis Cc: thilo.fromm@profitbricks.com, imammedo@redhat.com, kevin@koconnor.net, seabios@seabios.org, qemu-devel@nongnu.org On Thu, Nov 07, 2013 at 01:41:59PM +0100, Vasilis Liaskovitis wrote: > This patch adds a _PXM method to ACPI CPU objects for the pc machine. T= he _PXM > value is derived from the passed in guest info, same way as CPU SRAT en= tries. >=20 > The motivation for this patch is a CPU hot-unplug/hot-plug bug observed= when > using a 3.11 linux guest kernel on a multi-NUMA node qemu/kvm VM. The l= inux > guest kernel parses the SRAT CPU entries at boot time and stores them i= n the > array __apicid_to_node. When a CPU is hot-removed, the linux guest kern= el > resets the removed CPU's __apicid_to_node entry to NO_NUMA_NODE (kernel= commit > c4c60524). When the removed cpu is hot-added again, the linux kernel lo= oks up > the hot-added cpu object's _PXM method instead of somehow re-discoverin= g the > SRAT entry info. With current qemu/seabios, the _PXM method is not foun= d, and > the CPU is thus hot-plugged in the default NUMA node 0. (The problem do= es not > show up on initial hotplug of a cpu; the PXM method is still not found = in this > case, but the kernel still has the correct proximity value from the CPU= 's SRAT > entry stored in __apicid_to_node) >=20 > ACPI spec mentions that the _PXM method is the correct way to determine > proximity information at hot-add time. Where does it say this? I found this: If the Local APIC ID / Local SAPIC ID / Local x2APIC ID of a dynamically added processor is not present in the System Resource Affinity Table (SRAT), a _PXM object must exist for the processor=E2=80=99s device or on= e of its ancestors in the ACPI Namespace. Does this mean that linux is buggy, and should be fixed up to look up the apic ID in SRAT? > So far, qemu/seabios do not provide this > method for CPUs. So regardless of kernel behaviour, it is a good idea t= o add > this _PXM method. Since ACPI table generation has recently been moved f= rom > seabios to qemu, we do this in qemu. >=20 > Note that the above hot-remove/hot-add scenario has been tested on an o= lder > qemu + non-upstreamed patches for cpu hot-removal support, and not on q= emu > master (since cpu-del support is still not on master). The only testing= done > with qemu/seabios master and this patch, are successful boots of multi-= node > linux and windows8 guests. >=20 > For the initial discussion on seabios and linux-acpi lists see > http://www.spinics.net/lists/linux-acpi/msg47058.html >=20 > Signed-off-by: Vasilis Liaskovitis > Reviewed-by: Thilo Fromm Even if this is a linux bug, I have no issue with working around it in qemu. But I think proper testing needs to be done with rebased upport for cpu-d= el. > --- > hw/i386/acpi-build.c | 2 ++ > hw/i386/ssdt-proc.dsl | 2 ++ > 2 files changed, 4 insertions(+) >=20 > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index 6cfa044..9373f5e 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -603,6 +603,7 @@ static inline char acpi_get_hex(uint32_t val) > #define ACPI_PROC_OFFSET_CPUHEX (*ssdt_proc_name - *ssdt_proc_start + = 2) > #define ACPI_PROC_OFFSET_CPUID1 (*ssdt_proc_name - *ssdt_proc_start + = 4) > #define ACPI_PROC_OFFSET_CPUID2 (*ssdt_proc_id - *ssdt_proc_start) > +#define ACPI_PROC_OFFSET_CPUPXM (*ssdt_proc_pxm - *ssdt_proc_start) > #define ACPI_PROC_SIZEOF (*ssdt_proc_end - *ssdt_proc_start) > #define ACPI_PROC_AML (ssdp_proc_aml + *ssdt_proc_start) > =20 > @@ -724,6 +725,7 @@ build_ssdt(GArray *table_data, GArray *linker, > proc[ACPI_PROC_OFFSET_CPUHEX+1] =3D acpi_get_hex(i); > proc[ACPI_PROC_OFFSET_CPUID1] =3D i; > proc[ACPI_PROC_OFFSET_CPUID2] =3D i; > + proc[ACPI_PROC_OFFSET_CPUPXM] =3D guest_info->node_cpu[i]; > } > =20 > /* build this code: > diff --git a/hw/i386/ssdt-proc.dsl b/hw/i386/ssdt-proc.dsl > index 8229bfd..7eef8b2 100644 > --- a/hw/i386/ssdt-proc.dsl > +++ b/hw/i386/ssdt-proc.dsl > @@ -47,6 +47,8 @@ DefinitionBlock ("ssdt-proc.aml", "SSDT", 0x01, "BXPC= ", "BXSSDT", 0x1) > * also updating the C code. > */ > Name(_HID, "ACPI0007") > + ACPI_EXTRACT_NAME_BYTE_CONST ssdt_proc_pxm > + Name(_PXM, 0xAA) The ACPI spec says this should be a DWORD value: Return Value: An Integer (DWORD) containing a proximity domain identifier. > External(CPMA, MethodObj) > External(CPST, MethodObj) > External(CPEJ, MethodObj) > --=20 > 1.7.10.4