Message-ID: <55085D84.7000701@suse.de>
Date: Tue, 17 Mar 2015 17:59:48 +0100
From: Andreas Färber
References: <1426607318-22728-1-git-send-email-imammedo@redhat.com> <20150317164236.GM3513@thinpad.lan.raisama.net>
In-Reply-To: <20150317164236.GM3513@thinpad.lan.raisama.net>
Subject: Re: [Qemu-devel] [PATCH for-2.3] numa: pc: fix default VCPU to node mapping
To: Eduardo Habkost, Igor Mammedov
Cc: qemu-devel@nongnu.org

On 17.03.2015 at 17:42, Eduardo Habkost wrote:
> On Tue, Mar 17, 2015 at 03:48:38PM +0000, Igor Mammedov wrote:
>> since commit
>>    dd0247e0  pc: acpi: mark all possible CPUs as enabled in SRAT
>> Linux kernel actually tries to use CPU to Node mapping from
>> QEMU provided SRAT table instead of discarding it, and that
>> in some cases breaks build_sched_domains() which expects
>> sane mapping where cores/threads belonging to the same socket
>> are on the same NUMA node.
>>
>> With current default round-robin mapping of VCPUs to nodes
>> guest ends up with cores/threads belonging to the same socket
>> being on different NUMA nodes.
>>
>> For example with following CLI:
>>     qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \
>>              -numa node,nodeid=0 -numa node,nodeid=1
>> 2.6.32 based kernels will hang on boot due to incorrectly built
>> sched_group-s list in update_sd_lb_stats()
>> so comment in QEMU justifying dumb default mapping:
>> "
>>  guest OSes must cope with this anyway, because there are BIOSes
>>  out there in real machines which also use this scheme.
>> "
>> isn't really valid.
>>
>> Replacing default mapping with a manual one, where VCPUs belonging to
>> the same socket are on the same NUMA node, fixes issue for
>> guests which can't handle nonsense topology, i.e. changing CLI to:
>>   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
>>
>> So instead of simply scattering VCPUs around nodes, map
>> the same socket VCPUs to the same NUMA node, which is what
>> guest would expect from a sane hardware/BIOS.
>>
>> Signed-off-by: Igor Mammedov
>
> I believe the proposed behavior is much better. But if we are going to
> break compatibility, shouldn't we at least do that before the first -rc
> so we get feedback in case it breaks existing configurations?
>
> About qemu_cpu_socket_id_from_index(): all qemu-system-* binaries have
> smp_cores and smp_threads available (even if machines ignore it), but
> the default stub can return values that are larger than the number of
> sockets if smp_cores*smp_threads > 1, which would be obviously
> incorrect. Isn't it easier to simply make
> "cpu_index/(smp_cores*smp_sockets)" be the default cpu_index->socket
> mapping function, and allow machine-specific (not arch-specific)
> overrides if necessary?

Agreed that the proposed stub solution is not so nice. Can you propose a
MachineClass-based solution instead? The example I keep bringing up for
x86 is that the Galileo boards or even the Minnow boards don't really
have sockets, being SoCs.

Thanks,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)
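
Editorial illustration (not part of the original message): the two default
mappings discussed in this thread can be compared with a small sketch. It is
Python rather than QEMU's C, all function names are hypothetical, and it
assumes that a socket id is cpu_index // (cores * threads) and that sockets
are assigned to nodes modulo the node count. The thread does not confirm
either detail, so treat this as a rough model of the idea, not the patch.

```python
# Sketch only: hypothetical helpers, not QEMU code.

def round_robin_node(cpu_index, nb_nodes):
    """Old default described in the thread: scatter VCPUs across nodes."""
    return cpu_index % nb_nodes

def socket_of(cpu_index, cores, threads):
    """Assumed socket id: VCPUs are numbered socket by socket."""
    return cpu_index // (cores * threads)

def socket_based_node(cpu_index, cores, threads, nb_nodes):
    """Proposed idea: all VCPUs of one socket share a node (assumed modulo)."""
    return socket_of(cpu_index, cores, threads) % nb_nodes

def node_map(max_cpus, node_fn):
    """Group VCPU indexes by the node that node_fn assigns to them."""
    nodes = {}
    for cpu in range(max_cpus):
        nodes.setdefault(node_fn(cpu), []).append(cpu)
    return nodes

# Values from the example CLI: cores=4, threads=1, maxcpus=8, two NUMA nodes.
MAX_CPUS, CORES, THREADS, NB_NODES = 8, 4, 1, 2

rr = node_map(MAX_CPUS, lambda c: round_robin_node(c, NB_NODES))
by_socket = node_map(
    MAX_CPUS, lambda c: socket_based_node(c, CORES, THREADS, NB_NODES)
)

print("round-robin:  ", rr)
print("socket-based: ", by_socket)
```

Under these assumptions the round-robin mapping splits the four cores of the
first socket across both nodes, while the socket-based mapping yields the same
layout as the manual cpus=0-3 / cpus=4-7 configuration quoted above.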