From: Igor Mammedov <imammedo@redhat.com>
To: qemu-devel@nongnu.org
Cc: ehabkost@redhat.com, afaerber@suse.de
Subject: [Qemu-devel] [PATCH v3 for-2.3 1/2] numa: introduce machine callback for VCPU to node mapping
Date: Thu, 19 Mar 2015 17:09:21 +0000
Message-ID: <1426784962-7541-2-git-send-email-imammedo@redhat.com>
In-Reply-To: <1426784962-7541-1-git-send-email-imammedo@redhat.com>
The current default round-robin distribution of VCPUs among
NUMA nodes can be wrong for multi-core/multi-thread CPUs:
it confuses guests about the topology, because cores from
the same socket end up on different nodes.
Allow a machine to override the default mapping by providing
a MachineClass->cpu_index_to_socket_id()
callback, which lets it group VCPUs from a socket
on the same NUMA node.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
v3:
- split out numa/machine change into a separate patch
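
For illustration only (not part of the patch): a minimal standalone
sketch of the default assignment rule that parse_numa_opts() uses after
this change. The 2 cores x 2 threads topology and the names
example_cpu_index_to_socket_id() and default_node_for_cpu() are
assumptions made up for this example; the real callback for a board is
supplied by the machine code, not by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology, for illustration: 2 cores x 2 threads per socket. */
enum { CORES_PER_SOCKET = 2, THREADS_PER_CORE = 2 };

/* Same signature as the MachineClass callback added by this patch. */
typedef unsigned (*cpu_index_to_socket_id_fn)(unsigned cpu_index);

/* Hypothetical board callback: consecutive cpu_index values fill a socket. */
static unsigned example_cpu_index_to_socket_id(unsigned cpu_index)
{
    return cpu_index / (CORES_PER_SOCKET * THREADS_PER_CORE);
}

/*
 * Mirrors the loop body in parse_numa_opts(): round-robin by cpu_index,
 * unless a callback is provided, in which case the socket id is used.
 */
static unsigned default_node_for_cpu(unsigned cpu_index, unsigned nb_nodes,
                                     cpu_index_to_socket_id_fn cb)
{
    unsigned node_id;

    assert(nb_nodes > 0);
    node_id = cpu_index % nb_nodes;
    if (cb) {
        node_id = cb(cpu_index) % nb_nodes;
    }
    return node_id;
}
```

With 8 VCPUs and 2 nodes under these assumptions, plain round-robin puts
VCPUs 0 and 1 (two threads of the same core) on nodes 0 and 1, while the
callback variant keeps VCPUs 0-3 on node 0 and VCPUs 4-7 on node 1.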
---
include/hw/boards.h | 5 +++++
include/sysemu/numa.h | 3 ++-
numa.c | 18 +++++++++++++-----
vl.c | 2 +-
4 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 1feea2b..78838d1 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -82,6 +82,10 @@ bool machine_mem_merge(MachineState *machine);
* of HotplugHandler object, which handles hotplug operation
* for a given @dev. It may return NULL if @dev doesn't require
* any actions to be performed by hotplug handler.
+ * @cpu_index_to_socket_id:
+ * used to provide @cpu_index to socket number mapping, allowing
+ * a machine to group CPU threads belonging to the same socket/package.
+ * Returns: the socket number that the given cpu_index belongs to.
*/
struct MachineClass {
/*< private >*/
@@ -118,6 +122,7 @@ struct MachineClass {
HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
DeviceState *dev);
+ unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
};
/**
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 5633b85..6523b4d 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -6,6 +6,7 @@
#include "qemu/option.h"
#include "sysemu/sysemu.h"
#include "sysemu/hostmem.h"
+#include "hw/boards.h"
extern int nb_numa_nodes; /* Number of NUMA nodes */
@@ -16,7 +17,7 @@ typedef struct node_info {
bool present;
} NodeInfo;
extern NodeInfo numa_info[MAX_NODES];
-void parse_numa_opts(void);
+void parse_numa_opts(MachineClass *mc);
void numa_post_machine_init(void);
void query_numa_node_mem(uint64_t node_mem[]);
extern QemuOptsList qemu_numa_opts;
diff --git a/numa.c b/numa.c
index ffbec68..f1f571a 100644
--- a/numa.c
+++ b/numa.c
@@ -165,7 +165,7 @@ error:
return -1;
}
-void parse_numa_opts(void)
+void parse_numa_opts(MachineClass *mc)
{
int i;
@@ -233,13 +233,21 @@ void parse_numa_opts(void)
break;
}
}
- /* assigning the VCPUs round-robin is easier to implement, guest OSes
- * must cope with this anyway, because there are BIOSes out there in
- * real machines which also use this scheme.
+ /* Historically VCPUs were assigned in round-robin order to NUMA
+ * nodes. However, guests do not handle this well when
+ * cores/threads from a multicore CPU appear on different nodes.
+ * So allow boards to override the default distribution rule and
+ * group VCPUs by socket, so that VCPUs from the same socket
+ * end up on the same node.
*/
if (i == nb_numa_nodes) {
for (i = 0; i < max_cpus; i++) {
- set_bit(i, numa_info[i % nb_numa_nodes].node_cpu);
+ unsigned node_id = i % nb_numa_nodes;
+ if (mc->cpu_index_to_socket_id) {
+ node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
+ }
+
+ set_bit(i, numa_info[node_id].node_cpu);
}
}
}
diff --git a/vl.c b/vl.c
index 69617d6..75ec292 100644
--- a/vl.c
+++ b/vl.c
@@ -4170,7 +4170,7 @@ int main(int argc, char **argv, char **envp)
default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
- parse_numa_opts();
+ parse_numa_opts(machine_class);
if (qemu_opts_foreach(qemu_find_opts("mon"), mon_init_func, NULL, 1) != 0) {
exit(1);
--
1.8.3.1