Message-ID: <4C172CC1.6030905@redhat.com>
Date: Tue, 15 Jun 2010 10:33:21 +0300
From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH] stop cpus before forking.
References: <1276543644-32689-1-git-send-email-glommer@redhat.com> <4C1683EC.3010609@codemonkey.ws>
In-Reply-To: <4C1683EC.3010609@codemonkey.ws>
To: Anthony Liguori
Cc: Glauber Costa, qemu-devel@nongnu.org, aliguori@us.ibm.com

On 06/14/2010 10:33 PM, Anthony Liguori wrote:
> On 06/14/2010 02:27 PM, Glauber Costa wrote:
>> This patch fixes a bug that happens with kvm, irqchip-in-kernel,
>> while adding a netdev. Despite the situations of reproduction being
>> specific to kvm, I believe this fix is pretty generic, and fits here.
>> Especially if we ever want to have our own irqchip in kernel too.
>>
>> The problem happens after the fork system call, and although it is not
>> 100% reproducible, happens pretty often. After fork, the memory where
>> the apic is mapped is present in both processes. It ends up confusing
>> the vcpus somewhere in the irq <-> ack path, and qemu hangs, with no
>> irqs being delivered at all from that point on.
>>
>> Making sure the vcpus are stopped before forking makes the problem go
>> away. Besides, this is a pretty infrequent operation, which already
>> hangs the io-thread for a while. So it should not hurt performance.
>
> This doesn't make very much sense to me but smells like a kernel bug
> to me.

It is, and the fix would be to create the APIC memory slot as sharable
across forks (should be easy to fix in the kernel).

> Even if it isn't, I can't rationalize why stopping the vm like this is
> enough to fix such a problem. Is the problem that the KVM VCPU
> threads get duplicated while potentially running or something like that?

I think it's COW triggering a copy on one vcpu while the other reads
from the old page.

-- 
error compiling committee.c: too many arguments to function
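
To make the COW remark concrete, here is a minimal userspace sketch of the general fork semantics involved. It is not the KVM code path and it does not reproduce the hang; the function name write_in_child_visible_to_parent is made up for illustration. It only shows that a private mapping splits into two copies once one side writes after fork, while a shared mapping does not:

```python
import mmap
import os


def write_in_child_visible_to_parent(flags: int) -> bool:
    """Map one anonymous page, fork, write in the child, and report
    whether the parent observes that write (hypothetical helper)."""
    page = mmap.mmap(-1, mmap.PAGESIZE, flags=flags | mmap.MAP_ANONYMOUS)
    page[0] = 0
    pid = os.fork()
    if pid == 0:
        # Child: with MAP_PRIVATE this write triggers copy-on-write, so
        # the child gets its own copy and the parent keeps the old page.
        page[0] = 1
        os._exit(0)
    os.waitpid(pid, 0)
    seen = page[0] == 1
    page.close()
    return seen


if __name__ == "__main__":
    print("MAP_PRIVATE:", write_in_child_visible_to_parent(mmap.MAP_PRIVATE))
    print("MAP_SHARED: ", write_in_child_visible_to_parent(mmap.MAP_SHARED))
```

With a private mapping the parent keeps reading the old page after the write, which is the kind of split view suspected above; with a shared mapping both processes stay on the same page, which is roughly what making the APIC slot sharable across forks would aim for.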