From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47543)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XvNb5-00030E-Mc
	for qemu-devel@nongnu.org; Mon, 01 Dec 2014 04:48:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1XvNaz-0000dz-Bk
	for qemu-devel@nongnu.org; Mon, 01 Dec 2014 04:48:47 -0500
Received: from mx1.redhat.com ([209.132.183.28]:36589)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XvNaz-0000dV-4U
	for qemu-devel@nongnu.org; Mon, 01 Dec 2014 04:48:41 -0500
Message-ID: <547C396A.4080001@redhat.com>
Date: Mon, 01 Dec 2014 10:48:26 +0100
From: Paolo Bonzini
MIME-Version: 1.0
References: <33183CC9F5247A488A2544077AF1902086E041A5@SZXEMA503-MBS.china.huawei.com>
	<54775D4A.8080709@redhat.com> <5477E019.6090408@huawei.com>
In-Reply-To: <5477E019.6090408@huawei.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [BUG] Redhat-6.4_64bit-guest kernel panic with
	cpu-passthrough and guest numa
To: Gonglei
Cc: "Huangweidong (C)", "benoit@irqsave.net", "wangxin (U)",
	"qemu-devel@nongnu.org", "Huangpeng (Peter)", "Herongguang (Stephen)"

On 28/11/2014 03:38, Gonglei wrote:
>> Can you find what line of kernel/sched.c it is?
> Yes, of course. Please see below:
> "sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power; "
> in update_sg_lb_stats(), file sched.c, line 4094
> I can also share the cause we found. After commit 787aaf57 (target-i386:
> forward CPUID cache leaves when -cpu host is used), the guest gets its CPU
> cache information from the host when -cpu host is used. But if we configure
> guest numa:
> node 0 cpus 0~7
> node 1 cpus 8~15
> then the numa nodes lie in the same host cpu cache (cpus 0~16).
> When the guest OS boots, it calculates group->cpu_power, but the guest finds
> that those two different nodes own the same cache, so node1's
> group->cpu_power is never set and keeps its initial value of 0. When a vcpu
> is then scheduled, the division by 0 causes the kernel panic.

Thanks.  Please open a Red Hat bugzilla with the information, and Cc
Larry Woodman who fixed a few instances of this in the past.

Paolo
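
For illustration only, here is a minimal user-space sketch of the failure mode
described in the quoted report: the same arithmetic as the quoted line from
update_sg_lb_stats(), with a check for a cpu_power that was never initialized.
This is not the guest kernel's code and not an actual fix. struct group_model
and avg_load_checked() are hypothetical names invented for this sketch, and
the value 1024 for SCHED_LOAD_SCALE is an assumption about the guest kernel.

```c
#include <stdbool.h>

/* Assumption: fixed-point load scale of the guest kernel (1 << 10). */
#define SCHED_LOAD_SCALE 1024UL

/* Hypothetical, cut-down stand-in for the kernel's scheduling group. */
struct group_model {
    unsigned long cpu_power;   /* 0 models "never initialized" */
};

/*
 * Same arithmetic as the quoted line, but refuses to divide when
 * cpu_power is 0. On success, stores the result in *avg_load and
 * returns true; otherwise leaves *avg_load untouched and returns false.
 */
static bool avg_load_checked(const struct group_model *group,
                             unsigned long group_load,
                             unsigned long *avg_load)
{
    if (group->cpu_power == 0)
        return false;   /* an unguarded division would fault here */

    *avg_load = (group_load * SCHED_LOAD_SCALE) / group->cpu_power;
    return true;
}
```

With cpu_power set to 1024 and a group_load of 2048 the helper stores 2048.
With cpu_power left at 0, as reported for node 1, it returns false instead of
dividing.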