From: Stefan Hajnoczi
Date: Tue, 21 Aug 2012 09:23:47 +0100
To: Avi Kivity
Cc: Paolo Bonzini, Peter Lieven, "qemu-devel@nongnu.org", "kvm@vger.kernel.org", Jan Kiszka
Subject: Re: [Qemu-devel] qemu-kvm-1.0.1 - unable to exit if vcpu is in infinite loop
In-Reply-To: <50333717.6050207@siemens.com>
References: <4FEC56B2.6050502@dlhnet.de> <502E42E9.2020402@siemens.com> <502E56D3.6060607@siemens.com> <502E5800.5060609@siemens.com> <502E5D66.1060003@siemens.com> <5030B51E.3010704@redhat.com> <50333717.6050207@siemens.com>

On Tue, Aug 21, 2012 at 8:21 AM, Jan Kiszka wrote:
> On 2012-08-19 11:42, Avi Kivity wrote:
>> On 08/17/2012 06:04 PM, Jan Kiszka wrote:
>>>
>>>>> Can anyone imagine that such a barrier may actually be required? If it
>>>>> is currently possible that env->stop is evaluated before we called into
>>>>> sigtimedwait in qemu_kvm_eat_signals, then we could actually eat the
>>>>> signal without properly processing its reason (stop).
>>>
>>> Should not be required (TM): Both signal eating / stop checking and stop
>>> setting / signal generation happens under the BQL, thus the ordering
>>> must not make a difference here.
>>
>> Agree.
>>
>>
>>> Don't see where we could lose a signal. Maybe due to a subtle memory
>>> corruption that sets thread_kicked to non-zero, preventing the kicking
>>> this way.
>>
>> Cannot be ruled out, yet too much of a coincidence.
>>
>> Could be a kernel bug (either in kvm or elsewhere), we've had several
>> before in this area.
>>
>> Is this reproducible?
>
> Not for me. Peter only hit it very rarely, Peter obviously more easily.

I have only hit this once and was not able to reproduce it.

Stefan