From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53164)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1YYg5j-00086A-Di
	for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1YYg5e-0005dQ-G8
	for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:51 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57105)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1YYg5e-0005dG-8v
	for qemu-devel@nongnu.org; Thu, 19 Mar 2015 15:26:46 -0400
From: Eduardo Habkost
Date: Thu, 19 Mar 2015 16:26:12 -0300
Message-Id: <1426793174-19012-5-git-send-email-ehabkost@redhat.com>
In-Reply-To: <1426793174-19012-1-git-send-email-ehabkost@redhat.com>
References: <1426793174-19012-1-git-send-email-ehabkost@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: [Qemu-devel] [PULL 4/6] numa: introduce machine callback for VCPU to node mapping
To: Peter Maydell
Cc: qemu-devel@nongnu.org, Paolo Bonzini, Igor Mammedov, Andreas Färber, "Michael S. Tsirkin"

From: Igor Mammedov

The current default round-robin distribution of VCPUs among NUMA nodes
might be wrong for multi-core/multi-thread CPUs: it can confuse guests
about the topology when cores from the same socket end up on different
nodes.

Allow a machine to override the default mapping by providing a
MachineClass::cpu_index_to_socket_id() callback, which allows it to
group VCPUs from a socket on the same NUMA node.

Signed-off-by: Igor Mammedov
Reviewed-by: Andreas Färber
Signed-off-by: Eduardo Habkost
---
 include/hw/boards.h   |  5 +++++
 include/sysemu/numa.h |  3 ++-
 numa.c                | 18 +++++++++++++-----
 vl.c                  |  2 +-
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 1feea2b..78838d1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -82,6 +82,10 @@ bool machine_mem_merge(MachineState *machine);
  *    of HotplugHandler object, which handles hotplug operation
  *    for a given @dev. It may return NULL if @dev doesn't require
  *    any actions to be performed by hotplug handler.
+ * @cpu_index_to_socket_id:
+ *    used to provide @cpu_index to socket number mapping, allowing
+ *    a machine to group CPU threads belonging to the same socket/package
+ *    Returns: socket number given cpu_index belongs to.
  */
 struct MachineClass {
     /*< private >*/
@@ -118,6 +122,7 @@ struct MachineClass {
 
     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
+    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
 };
 
 /**
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 5633b85..6523b4d 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -6,6 +6,7 @@
 #include "qemu/option.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/hostmem.h"
+#include "hw/boards.h"
 
 extern int nb_numa_nodes;   /* Number of NUMA nodes */
 
@@ -16,7 +17,7 @@ typedef struct node_info {
     bool present;
 } NodeInfo;
 extern NodeInfo numa_info[MAX_NODES];
-void parse_numa_opts(void);
+void parse_numa_opts(MachineClass *mc);
 void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
diff --git a/numa.c b/numa.c
index 518aedd..fe74e1e 100644
--- a/numa.c
+++ b/numa.c
@@ -202,7 +202,7 @@ static void validate_numa_cpus(void)
     }
 }
 
-void parse_numa_opts(void)
+void parse_numa_opts(MachineClass *mc)
 {
     int i;
 
@@ -270,13 +270,21 @@ void parse_numa_opts(void)
                 break;
             }
         }
-        /* assigning the VCPUs round-robin is easier to implement, guest OSes
-         * must cope with this anyway, because there are BIOSes out there in
-         * real machines which also use this scheme.
+        /* Historically VCPUs were assigned in round-robin order to NUMA
+         * nodes. However it causes issues with guest not handling it nice
+         * in case where cores/threads from a multicore CPU appear on
+         * different nodes. So allow boards to override default distribution
+         * rule grouping VCPUs by socket so that VCPUs from the same socket
+         * would be on the same node.
          */
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
-                set_bit(i, numa_info[i % nb_numa_nodes].node_cpu);
+                unsigned node_id = i % nb_numa_nodes;
+                if (mc->cpu_index_to_socket_id) {
+                    node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
+                }
+
+                set_bit(i, numa_info[node_id].node_cpu);
             }
         }
 
diff --git a/vl.c b/vl.c
index 69617d6..75ec292 100644
--- a/vl.c
+++ b/vl.c
@@ -4170,7 +4170,7 @@ int main(int argc, char **argv, char **envp)
     default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
     default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
 
-    parse_numa_opts();
+    parse_numa_opts(machine_class);
 
     if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
         exit(1);
-- 
2.1.0
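
Illustration only, not part of the patch: a minimal standalone sketch of
how a board might implement the new hook and how the default assignment
in parse_numa_opts() then picks a node. The names prefixed with
example_/EXAMPLE_ and the fixed 2 cores x 2 threads per socket are
assumptions made up for this sketch; a real board would presumably
derive the divisor from its configured CPU topology.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology, chosen only for illustration. */
enum { EXAMPLE_CORES_PER_SOCKET = 2, EXAMPLE_THREADS_PER_CORE = 2 };

/* Same signature as the MachineClass::cpu_index_to_socket_id hook. */
typedef unsigned (*example_socket_id_fn)(unsigned cpu_index);

/* Hypothetical board callback: consecutive cpu_index values fill one
 * socket before moving on to the next. */
static unsigned example_cpu_index_to_socket_id(unsigned cpu_index)
{
    return cpu_index / (EXAMPLE_CORES_PER_SOCKET * EXAMPLE_THREADS_PER_CORE);
}

/* Mirrors the per-VCPU node selection added to parse_numa_opts():
 * round-robin by default, socket-based when a callback is provided. */
static unsigned example_node_for_cpu(unsigned cpu_index,
                                     unsigned nb_numa_nodes,
                                     example_socket_id_fn cb)
{
    unsigned node_id;

    assert(nb_numa_nodes > 0);
    node_id = cpu_index % nb_numa_nodes;
    if (cb) {
        node_id = cb(cpu_index) % nb_numa_nodes;
    }
    return node_id;
}
```

Under these assumptions, with 8 VCPUs and 2 nodes, passing NULL as the
callback spreads VCPUs 0,2,4,6 onto node 0 and 1,3,5,7 onto node 1,
whereas passing example_cpu_index_to_socket_id keeps VCPUs 0-3 on
node 0 and VCPUs 4-7 on node 1.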