From: Greg Kurz
Date: Mon, 1 Sep 2014 14:41:02 +0200
Subject: Re: [Qemu-devel] qcow2, lazy_refcounts and killing qemu
To: "Richard W.M. Jones"
Cc: kwolf@redhat.com, qemu-devel@nongnu.org
Message-ID: <20140901144102.0aaae712@bahia.local>
In-Reply-To: <20140830145313.GN14001@redhat.com>
References: <20140830145313.GN14001@redhat.com>

On Sat, 30 Aug 2014 15:53:13 +0100
"Richard W.M. Jones" wrote:

> I found out a few days ago that if you:
>
> (1) Open a qcow2 file that has lazy_refcounts = on and a backing file, and
>
> (2) Write lots of stuff, and
>
> (3) Kill qemu with SIGTERM [which I believed, maybe incorrectly, is a
> "nice" way to kill qemu]
>
> .. then you can end up with a corrupt qcow2 file. In particular the
> qcow2 file sometimes forgot that it had a backing file, but I suspect
> this was just a symptom and in fact the qcow2 file header wasn't being
> written to disk correctly.
>

Hi Rich,

Someone at IBM hit a very similar issue with PowerKVM a few months ago.
The symptom was a corrupted filesystem in a qcow2 file. The steps involved
killing the QEMU process while the guest OS was shutting down.
Unfortunately, no easy reproducer could be found and the investigation
stalled.

> Is it correct that sending SIGTERM to qemu should kill it cleanly, or
> is that no longer the case, or is lazy_refcounts a special case, or
> have I found a bug?
>

QEMU catches SIGTERM and calls bdrv_close(), so I would lean towards this
being a bug or an undocumented limitation (hence a documentation bug :)

> I can reproduce this easily, although of course the reproducer will
> involve libguestfs.
>
> Rich.
>

Can you share this reproducer?

Cheers.

--
Greg