From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Lieven
Subject: Re: qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
Date: Thu, 28 Jun 2012 17:02:44 +0200
Message-ID: <4FEC7214.2020900@dlhnet.de>
References: <4FEC56B2.6050502@dlhnet.de> <4FEC5B5A.4060302@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "qemu-devel@nongnu.org", "kvm@vger.kernel.org"
To: Jan Kiszka
Return-path:
Received: from ssl.dlhnet.de ([91.198.192.8]:56137 "EHLO ssl.dlh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752164Ab2F1PCp (ORCPT ); Thu, 28 Jun 2012 11:02:45 -0400
In-Reply-To: <4FEC5B5A.4060302@siemens.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 28.06.2012 15:25, Jan Kiszka wrote:
> On 2012-06-28 15:05, Peter Lieven wrote:
>> Hi,
>>
>> I debugged my initial problem further and found that the main thread
>> is stuck in pause_all_vcpus() on reset or quit commands in the monitor
>> if one cpu is stuck in the do-while loop in kvm_cpu_exec. If I modify
>> the condition from while (ret == 0) to
>> while ((ret == 0) && !env->stop); it works, but is this the right fix?
>> The "quit" command seems to work, but on "reset" the VM enters the
>> pause state.
> Before entering the wait loop in pause_all_vcpus, there are kicks sent
> to all vcpus. Now we need to find out why some of those kicks apparently
> don't reach the destination.

Can you briefly explain what exactly these kicks do? Do they cause the
vcpu to leave kernel mode and return to userspace?

> Again:
>  - on which host kernels does this occur, and which change may have
>    changed it?

I do not see it on 3.0.0 and have also not seen it on 2.6.38, both the
mainline 64-bit ubuntu-server kernels (for oneiric and natty
respectively). If I compile a more recent kvm-kmod (3.3 or 3.4) on these
machines, it no longer works.

>  - with which qemu-kvm version is it reproducible, and which commit
>    introduced or fixed it?

qemu-kvm-1.0.1 from sourceforge.

To get into this scenario it is not sufficient to boot from an empty
hard disk. To reproduce it I have to use a live CD such as ubuntu-server
12.04 and choose to boot from the first hard disk. I think the isolinux
loader does not check for a valid boot sector and just executes whatever
is found in sector 0. This leads to the MMIO reads I posted and 100% CPU
load (most of it spent in the kernel). At that point the monitor/QMP is
still responsive. If I send a command that pauses all vcpus, the first
cpu keeps looping in kvm_cpu_exec while the main thread waits. From then
on the monitor stops responding.

I have also seen this issue on very old Windows 2000 servers where the
system fails to power off and is just halted. Maybe this is also a busy
loop.

I will try to bisect this asap and let you know. Maybe the above info
already helps you to reproduce.

Thanks,
Peter

> I failed reproducing so far.
>
> Jan
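
For readers following the thread, the effect of the loop condition quoted above can be illustrated in isolation. The following is only a toy model in C, not QEMU code: struct toy_vcpu, toy_run_once() and toy_cpu_exec() are hypothetical names, and the max_iters cap exists solely so that the unmodified condition terminates in a demonstration. It assumes that every pass through the loop ends with ret == 0, as a guest stuck in MMIO reads would produce, and that the stop request arrives while the loop is running. It does not settle whether checking the stop flag is the right fix, which the thread leaves open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vcpu state; only the stop flag is modelled. */
struct toy_vcpu {
    bool stop;        /* set when the main thread wants this vcpu to pause */
    int  stop_after;  /* demo hook: iteration at which the stop request arrives */
};

/*
 * Stand-in for one pass through the body of the do-while loop.
 * It always returns 0, modelling an exit (such as an MMIO read) that is
 * handled completely and gives the loop no reason to end by itself.
 */
static int toy_run_once(struct toy_vcpu *cpu, int iteration)
{
    if (iteration == cpu->stop_after) {
        cpu->stop = true;  /* the stop request arrives during this pass */
    }
    return 0;
}

/*
 * Runs the loop with either the original condition (check_stop == false)
 * or the modified one (check_stop == true). Returns the number of passes
 * made; a result equal to max_iters means the safety cap ended the loop.
 */
static int toy_cpu_exec(struct toy_vcpu *cpu, bool check_stop, int max_iters)
{
    int ret;
    int iterations = 0;

    assert(cpu != NULL);
    do {
        ret = toy_run_once(cpu, iterations);
        iterations++;
        if (iterations >= max_iters) {
            break;  /* demo-only cap; the real loop has no such limit */
        }
    } while (ret == 0 && (!check_stop || !cpu->stop));

    return iterations;
}
```

With stop_after set to 3 and a cap of 1000, the modified condition leaves the loop on the pass in which the flag is set, while the original condition keeps spinning until the cap even though the flag has long been set. That matches the reported symptom of the main thread waiting in pause_all_vcpus() on a vcpu that never returns.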