From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 18 Nov 2011 11:57:27 +0530
From: Bharata B Rao
Message-ID: <20111118062727.GD4993@in.ibm.com>
References: <20111117085751.16490.43574.malonedeb@chaenomeles.canonical.com>
In-Reply-To: <20111117085751.16490.43574.malonedeb@chaenomeles.canonical.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: Re: [Qemu-devel] [Bug 891525] [NEW] Guest kernel crashes when booting a NUMA guest without explicitly specifying cpus= in -numa option
Reply-To: bharata@linux.vnet.ibm.com
To: Bharata B Rao <891525@bugs.launchpad.net>
Cc: qemu-devel@nongnu.org

The guest kernel crashes because qemu enumerates the VCPUs in a round-robin fashion between the nodes.
As per this comment from vl.c, the guest kernel is supposed to handle this:

        /* assigning the VCPUs round-robin is easier to implement, guest OSes
         * must cope with this anyway, because there are BIOSes out there in
         * real machines which also use this scheme.
         */

I am not sure whether this would be considered a bug in the guest kernel,
but I have verified that enumerating the VCPUs serially between nodes fixes
the problem for me. I am not aware of the history behind the round-robin
assignment, nor do I understand the full implications of changing it, but
here is a potential patch that fixes the problem for me.
---

Fix VCPU enumeration between nodes

From: Bharata B Rao

Currently VCPUs are assigned to nodes in a round-robin manner. This is seen
to break the guest kernel for x86_64-softmmu. Hence assign VCPUs serially
between nodes.

Signed-off-by: Bharata B Rao
---

 vl.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/vl.c b/vl.c
index f5afed4..75348d0 100644
--- a/vl.c
+++ b/vl.c
@@ -3307,13 +3307,18 @@ int main(int argc, char **argv, char **envp)
             if (node_cpumask[i] != 0)
                 break;
         }
-        /* assigning the VCPUs round-robin is easier to implement, guest OSes
-         * must cope with this anyway, because there are BIOSes out there in
-         * real machines which also use this scheme.
-         */
+        /* Assign VCPUs to nodes in serial manner */
         if (i == nb_numa_nodes) {
+            int cpus_per_node = smp_cpus / nb_numa_nodes;
+
             for (i = 0; i < smp_cpus; i++) {
-                node_cpumask[i % nb_numa_nodes] |= 1 << i;
+                int nodeid = i / cpus_per_node;
+
+                /* Extra VCPUs goto Node 0 */
+                if (nodeid >= nb_numa_nodes) {
+                    nodeid = 0;
+                }
+                node_cpumask[nodeid] |= 1 << i;
             }
         }
     }
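
To make the difference between the two schemes concrete, here is a small standalone sketch that computes the per-node CPU masks both ways. It is not part of the patch and does not touch vl.c: the function names, the MAX_CPUS bound, and the use of a 64-bit mask with a 64-bit shift are assumptions made for illustration only. The serial variant follows the arithmetic in the patch above, including sending leftover VCPUs to node 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUS 64 /* hypothetical bound: one bit per VCPU in a 64-bit mask */

/* Round-robin: VCPU i lands on node (i % nb_nodes), as in the code the patch removes. */
void assign_round_robin(uint64_t *mask, int nb_nodes, int smp_cpus)
{
    assert(nb_nodes > 0 && smp_cpus > 0 && smp_cpus <= MAX_CPUS);
    for (int n = 0; n < nb_nodes; n++) {
        mask[n] = 0;
    }
    for (int i = 0; i < smp_cpus; i++) {
        mask[i % nb_nodes] |= UINT64_C(1) << i;
    }
}

/* Serial: consecutive VCPUs share a node; leftover VCPUs go to node 0, as in the patch. */
void assign_serial(uint64_t *mask, int nb_nodes, int smp_cpus)
{
    /* nb_nodes <= smp_cpus keeps cpus_per_node from being zero (assumed precondition). */
    assert(nb_nodes > 0 && nb_nodes <= smp_cpus && smp_cpus <= MAX_CPUS);
    int cpus_per_node = smp_cpus / nb_nodes;

    for (int n = 0; n < nb_nodes; n++) {
        mask[n] = 0;
    }
    for (int i = 0; i < smp_cpus; i++) {
        int nodeid = i / cpus_per_node;

        if (nodeid >= nb_nodes) {
            nodeid = 0;
        }
        mask[nodeid] |= UINT64_C(1) << i;
    }
}

/* Print one mask per node in hex, e.g. "serial: node0=0x3 node1=0xc". */
void print_masks(const char *label, const uint64_t *mask, int nb_nodes)
{
    printf("%s:", label);
    for (int n = 0; n < nb_nodes; n++) {
        printf(" node%d=0x%llx", n, (unsigned long long)mask[n]);
    }
    printf("\n");
}
```

For 4 VCPUs on 2 nodes, round-robin gives node 0 the VCPUs {0, 2} (mask 0x5) and node 1 the VCPUs {1, 3} (mask 0xa), whereas the serial scheme gives {0, 1} (0x3) and {2, 3} (0xc). With 5 VCPUs on 2 nodes, the serial scheme puts the extra VCPU 4 on node 0, so node 0 gets mask 0x13. Note that the sketch shifts a 64-bit constant, while the patch keeps the original `1 << i`; that choice is mine and would only matter for VCPU numbers of 31 and above.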