Message-ID: <550AF3B9.3090104@suse.de>
Date: Thu, 19 Mar 2015 17:05:13 +0100
From: Andreas Färber
References: <1426696705-32472-1-git-send-email-imammedo@redhat.com>
In-Reply-To: <1426696705-32472-1-git-send-email-imammedo@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 for-2.3] numa: pc: fix default VCPU to node mapping
To: Igor Mammedov, qemu-devel@nongnu.org
Cc: ehabkost@redhat.com

On 18.03.2015 at 17:38, Igor Mammedov wrote:
> since commit
>   dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
> Linux kernel actually tries to use CPU to Node mapping from
> QEMU provided SRAT table instead of discarding it, and that
> in some cases breaks build_sched_domains() which expects
> sane mapping where cores/threads belonging to the same socket
> are on the same NUMA node.
>
> With current default round-robin mapping of VCPUs to nodes
> guest ends-up with cores/threads belonging to the same socket
> being on different NUMA nodes.
>
> For example with following CLI:
>   qemu-kvm -m 4G -smp 5,sockets=2,cores=4,threads=1,maxcpus=8 \
>            -numa node,nodeid=0 -numa node,nodeid=1
> 2.6.32 based kernels will hang on boot due to incorrectly build
> sched_group-s list in update_sd_lb_stats()
> so comment in QEMU justifying dumb default mapping:
> "
>  guest OSes must cope with this anyway, because there are BIOSes
>  out there in real machines which also use this scheme.
> "
> isn't really valid.
>
> Replacing default mapping with a manual, where VCPUs belonging to
> the same socket are on the same NUMA node, fixes issue for
> guests which can't handle nonsense topology i.e. changing CLI to:
>   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
>
> So instead of simply scattering VCPUs around nodes, map
> the same socket VCPUs to the same NUMA node, which is what
> guest would expect from a sane hardware/BIOS.
>
> Signed-off-by: Igor Mammedov
> ---
> v2:
>   - add machine callback cpu_index_to_socket_id() and use it
>     instead of stub approach
> ---
>  hw/i386/pc.c          |  9 +++++++++
>  include/hw/boards.h   |  5 +++++
>  include/sysemu/numa.h |  3 ++-
>  numa.c                | 18 +++++++++++++-----
>  vl.c                  |  2 +-
>  5 files changed, 30 insertions(+), 7 deletions(-)

Looks great to me now, the hook name with _socket_id is perfect,

Reviewed-by: Andreas Färber

but can we do that in three steps please? "machine:" adding callback and
default implementation, "numa:" switching to use it and "pc:" overriding
the new callback - not only nicer subjects but easier to cherry-pick and
bisect then.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)
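
For illustration, the two default mappings discussed in the quoted commit
message can be contrasted with a small standalone sketch. This is not the
code from the patch: the helper names socket_id_of(), node_round_robin()
and node_by_socket() are hypothetical, and the sketch assumes that VCPU
indexes are assigned socket by socket (cores * threads VCPUs per socket)
and that sockets are spread over nodes modulo the node count.

```c
#include <assert.h>

/* Hypothetical helper: socket that a VCPU index belongs to, assuming
 * VCPU indexes are filled socket by socket. */
static inline unsigned socket_id_of(unsigned cpu_index,
                                    unsigned cores, unsigned threads)
{
    assert(cores * threads > 0); /* guard against division by zero */
    return cpu_index / (cores * threads);
}

/* Round-robin default: consecutive VCPUs alternate between nodes, so
 * VCPUs of one socket end up on different NUMA nodes. */
static inline unsigned node_round_robin(unsigned cpu_index,
                                        unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return cpu_index % nb_nodes;
}

/* Socket-based default (assumed form): every VCPU of a socket gets the
 * same node; sockets are distributed over nodes modulo nb_nodes. */
static inline unsigned node_by_socket(unsigned cpu_index,
                                      unsigned cores, unsigned threads,
                                      unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return socket_id_of(cpu_index, cores, threads) % nb_nodes;
}
```

With cores=4, threads=1 and two nodes, as in the quoted CLI, the
socket-based variant puts VCPUs 0-3 on node 0 and VCPUs 4-7 on node 1,
which matches the manual cpus=0-3 / cpus=4-7 assignment, while the
round-robin variant alternates nodes within each socket.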