Message-ID: <4BBF5A3C.10005@redhat.com>
Date: Fri, 09 Apr 2010 12:47:56 -0400
From: Laine Stump
Subject: Re: [Qemu-devel] race condition when exec'ing "qemu -incoming" followed by monitor "cont"
References: <4BBF4FEA.1070306@redhat.com>
In-Reply-To: <4BBF4FEA.1070306@redhat.com>
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

On 04/09/2010 12:03 PM, Laine Stump wrote:
> (Please forgive (and correct!)
> any inaccuracies in my description of
> qemu's workings - I've only recently started looking at it directly,
> rather than through the lens of libvirt)
>
> libvirt implements a "domain restore" operation by:
>
> 0) start with a previously saved domain image in a file
>
> 1) open the domain image, and connect it to a pipe
>
> 2) fork, connect the pipe to stdin, and exec qemu with "-incoming
> exec:cat"
>
> 3) execute "cont" in that qemu's monitor.

I realized after sending that I was too hasty in my reading of the code,
and interpreted it incorrectly. The file containing the domain image is
not sent into a pipe, but merely dup2'ed to be stdin in the child
process. So the correct sequence of events is:

0) start with a previously saved domain image in a file

1) open the domain image

2) fork, dup2 the domain image's fd as stdin, and exec qemu with
"-incoming exec:cat"

3) execute "cont" in that qemu's monitor.

On 04/09/2010 12:29 PM, Paolo Bonzini wrote:
> On 04/09/2010 06:03 PM, Laine Stump wrote:
>>
>> Can someone provide any insight on why it is possible to start the CPUs
>> in the domain before the incoming migration is complete, and what we can
>> do (other than blindly sleeping) to prevent this?
>
> I would say it's a bug in QEMU, and it has to be fixed there.

My assumption would have been that, if -incoming is specified on the
command line, the monitor should not even be started up until the
incoming migration is complete. Being new to qemu, though, I'm hoping
there's something I'm missing.
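
For illustration only, steps 1 and 2 of the corrected sequence (open the image, fork, dup2 the fd onto stdin, exec) can be sketched roughly as below. This is a Python sketch of the general technique, not libvirt's actual code (which is C); the function name spawn_with_file_as_stdin and the file name in the usage comment are made up for the example, and a real qemu command line would carry many more arguments. Step 3, sending "cont" to the monitor, is not shown.

```python
import os


def spawn_with_file_as_stdin(image_path, argv):
    """Fork a child whose stdin is the opened file, then exec argv in it.

    Hypothetical helper for illustration. Returns the child's pid; the
    caller is responsible for waiting on it.
    """
    fd = os.open(image_path, os.O_RDONLY)  # step 1: open the domain image
    pid = os.fork()                        # step 2: fork ...
    if pid == 0:
        # Child process.
        try:
            os.dup2(fd, 0)                 # ... make the image fd the child's stdin
            if fd != 0:
                os.close(fd)               # the original descriptor is no longer needed
            os.execvp(argv[0], argv)       # ... and exec; does not return on success
        finally:
            os._exit(127)                  # only reached if dup2 or exec failed
    # Parent process: the child holds its own copy of the descriptor.
    os.close(fd)
    return pid


# Hypothetical usage (file name is an assumption, arguments abbreviated):
#   pid = spawn_with_file_as_stdin("saved-domain.img",
#                                  ["qemu", "-incoming", "exec:cat"])
#   os.waitpid(pid, 0)
```

Note that nothing in this sequence tells the parent when the child has finished reading the image from stdin, which is why a "cont" issued right afterwards can race with the incoming migration.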