From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:39268)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1R64vA-0003Qe-Fd
    for qemu-devel@nongnu.org; Tue, 20 Sep 2011 14:19:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1R64v9-0004hi-3g
    for qemu-devel@nongnu.org; Tue, 20 Sep 2011 14:19:52 -0400
Received: from mx1.redhat.com ([209.132.183.28]:57439)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1R64v8-0004ha-Sj
    for qemu-devel@nongnu.org; Tue, 20 Sep 2011 14:19:51 -0400
Received: from int-mx01.intmail.prod.int.phx2.redhat.com
    (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])
    by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p8KIJnIw032183
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
    for ; Tue, 20 Sep 2011 14:19:49 -0400
Message-ID: <4E78D939.4090902@redhat.com>
Date: Tue, 20 Sep 2011 12:19:37 -0600
From: Eric Blake
MIME-Version: 1.0
References: <20110920180649.GD4121@redhat.com>
In-Reply-To: <20110920180649.GD4121@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [libvirt] [PATCH] qemu: Fix shutdown regression
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Dave Allan
Cc: libvir-list@redhat.com, QEMU Developers

On 09/20/2011 12:06 PM, Dave Allan wrote:
> On Tue, Sep 20, 2011 at 07:39:15PM +0200, Jiri Denemark wrote:
>> The commit that prevents disk corruption on domain shutdown
>> (96fc4784177ecb70357518fa863442455e45ad0e) causes regression with QEMU
>> 0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
>> only recently in QEMU git. With affected QEMU binaries, domains cannot
>> be shutdown properly and stay in a paused state. This patch tries to
>> avoid this by sending SIGKILL to 0.1[45].* QEMU processes.
>> Though we wait a bit more between sending SIGTERM and SIGKILL to reduce
>> the possibility of virtual disk corruption.
>
> IMO, SIGKILL should only be sent at the explicit direction of the
> user, saying in effect, I'm ok with possible data corruption, I want
> the VM killed unconditionally. I would rather leave VMs paused than
> risk corrupting data. Let's get as much input as we can from the qemu
> folks before we go down this path.

That echoes my sentiment that qemu needs to tell us whether the bug is
fixed. We know that if version < 0.14, the bug is not present, and if
version > 0.15, the bug is fixed; it is only in the 0.1[45] window that
we cannot tell whether the vendor has back-ported the fix into the
version of qemu that we are targeting, unless we get some help from qemu.

I also wonder if we should make it so:

virDomainDestroy(dom)

fails with a reasonable message, rather than leaving the domain paused,
if we think qemu has the bug, and require the user to do

virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_FORCE)

as the means of the user explicitly requesting that they work around the
qemu bug.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
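
To make the version window and the proposed destroy behaviour concrete, here is a minimal C sketch of the decision logic only. It does not link against libvirt, and every name in it (classify_qemu, choose_destroy, both enums) is hypothetical. VIR_DOMAIN_DESTROY_FORCE is just the flag suggested in this message, so it is modelled as a plain int argument. The sketch also assumes that a force request on an unaffected qemu simply takes the normal path, which the message does not spell out.

```c
#include <assert.h>

/* Hypothetical names; nothing here is taken from libvirt or QEMU. */
enum bug_status { BUG_ABSENT, BUG_UNKNOWN, BUG_FIXED };
enum destroy_action { DESTROY_GRACEFUL, DESTROY_REFUSE, DESTROY_SIGKILL };

/* Classify a QEMU release by its upstream version number alone. */
static enum bug_status classify_qemu(int major, int minor)
{
    assert(major >= 0 && minor >= 0);
    if (major == 0 && minor < 14)
        return BUG_ABSENT;    /* older than 0.14: bug not present */
    if (major == 0 && minor <= 15)
        return BUG_UNKNOWN;   /* 0.14/0.15: vendor may have back-ported the fix */
    return BUG_FIXED;         /* newer than 0.15: fixed upstream */
}

/* Decide what a destroy request should do.
 * 'force' stands in for the proposed VIR_DOMAIN_DESTROY_FORCE flag. */
static enum destroy_action choose_destroy(int major, int minor, int force)
{
    if (classify_qemu(major, minor) != BUG_UNKNOWN)
        return DESTROY_GRACEFUL;  /* assumption: normal path when not suspect */
    /* Suspect version: fail with a message unless the user opted in. */
    return force ? DESTROY_SIGKILL : DESTROY_REFUSE;
}
```

With this shape, choose_destroy(0, 14, 0) refuses and choose_destroy(0, 14, 1) escalates to SIGKILL, so the risk of disk corruption is only taken when the user has asked for it. The BUG_UNKNOWN case is where a capability hint from qemu itself would replace the version guess.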