From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1Ky6PZ-0003Yb-Rr for qemu-devel@nongnu.org;
	Thu, 06 Nov 2008 10:04:41 -0500
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1Ky6PY-0003YN-Ai for qemu-devel@nongnu.org;
	Thu, 06 Nov 2008 10:04:41 -0500
Received: from [199.232.76.173] (port=46676 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1Ky6PY-0003YK-7I for qemu-devel@nongnu.org;
	Thu, 06 Nov 2008 10:04:40 -0500
Received: from mx2.redhat.com ([66.187.237.31]:59834)
	by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
	id 1Ky6PX-00016a-Uj for qemu-devel@nongnu.org;
	Thu, 06 Nov 2008 10:04:40 -0500
Message-ID: <49130772.2040508@redhat.com>
Date: Thu, 06 Nov 2008 16:04:18 +0100
From: Chris Lalancette
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Possible bug in Qemu tcp migration
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: Anthony Liguori
Cc: qemu-devel@nongnu.org

Anthony,

I was just finishing implementing Qemu/KVM live migration support in
libvirt (posting here:
https://www.redhat.com/archives/libvir-list/2008-November/msg00087.html).
It works for the most part, except for one bug which I *think* is a bug
in the Qemu live migration support. Here's the scenario:

1. start up the guest on the source side (virsh start myguest)
2. migrate the guest to the destination
   (virsh migrate qemu+tcp://remote/system)
   a. The virsh process on the source side sends a command to the
      destination libvirtd, and basically tells it to start the qemu
      container with -incoming.
   b. The source side performs the migrate via the
      "migrate tcp:remote:4444" monitor command
   c. Once the "migrate" monitor command completes without error, the
      source side kills the source qemu process.

The problem seems to happen between steps b. and c. above. If I let
that sequence run as-is, then all of the memory from the source side is
transferred over to the destination, and the "migrate" monitor command
returns without error. However, the guest on the destination never
starts; it's there, and all of the memory is there, but it just won't
execute at all.

If, instead, I add a 5 second sleep between steps b. and c. on the
source side, then the migration completes as expected. It seems that
the "migrate" monitor command is actually returning before everything
is complete, so killing off the guest on the source side makes the
destination wait around forever.

Unfortunately, I haven't yet had time to look at it in any detail to
see what's going on on the Qemu side, but I thought I would give you a
heads up, and maybe you have an idea of where to look.

Thanks,
--
Chris Lalancette
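
For illustration only: the fixed 5 second sleep between steps b. and c.
could in principle be replaced by polling for a terminal migration
state before killing the source qemu process. The sketch below is a
minimal Python example under stated assumptions, not libvirt or Qemu
code. The query_status callable and the status strings "active",
"completed" and "failed" are hypothetical stand-ins for whatever the
monitor actually reports, and whether a source-side status reflects the
destination being ready is exactly the open question raised above.

```python
import time


def wait_for_migration(query_status, timeout=30.0, interval=0.5,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until the migration reaches a terminal state.

    query_status is a hypothetical callable returning one of the
    assumed strings "active", "completed" or "failed".
    Returns True on "completed", False on "failed" or on timeout.
    """
    deadline = clock() + timeout
    while True:
        status = query_status()
        if status == "completed":
            return True
        if status == "failed":
            return False
        if clock() >= deadline:
            # Give up rather than wait forever.
            return False
        sleep(interval)
```

A caller would kill the source process only when wait_for_migration()
returns True, and otherwise leave the source guest in place.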