From: Anthony Liguori
Date: Tue, 12 Oct 2010 11:16:17 -0500
Subject: [Qemu-devel] Re: [PATCH v2 6/7] qed: Read/write support
To: Stefan Hajnoczi
Cc: Kevin Wolf, Avi Kivity, qemu-devel@nongnu.org, Christoph Hellwig
Message-ID: <4CB489D1.3050204@linux.vnet.ibm.com>
In-Reply-To: <20101012155953.GA13872@stefan-thinkpad.transitives.com>

On 10/12/2010 10:59 AM, Stefan Hajnoczi wrote:
> On Tue, Oct 12, 2010 at 05:39:48PM +0200, Kevin Wolf wrote:
>> On 12.10.2010 17:22, Anthony Liguori wrote:
>>> On 10/12/2010 10:08 AM, Kevin Wolf wrote:
>>>> Otherwise we might destroy data that isn't even touched by the
>>>> guest request in case of a crash.
>>>
>>> The failure scenarios are either that the cluster is leaked, in
>>> which case the old version of the data is still present, or the
>>> cluster is orphaned because the L2 entry is written, in which case
>>> the old version of the data is present.
>>
>> Hm, how does the latter case work? Or rather, what do you mean by
>> "orphaned"?
>>
>>> Are you referring to a scenario where the cluster is partially
>>> written because the data is present in the write cache and the
>>> write cache isn't flushed on power failure?
>>
>> The case I'm referring to is a COW. So let's assume a partial write
>> to an unallocated cluster; we then need to do a COW in the
>> pre/postfill. Then we do a normal write and link the new cluster in
>> the L2 table.
>>
>> Assume that the write to the L2 table is already on the disk, but
>> the pre/postfill data isn't yet. At this point we have a bad state,
>> because if we crash now we have lost the data that should have been
>> copied from the backing file.
>
> In this case QED_F_NEED_CHECK is set and the invalid cluster offset
> should be reset to zero on open.
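
To make that concrete, here's a minimal sketch of what such an
open-time check does (hypothetical names and cluster size, not the
actual QED consistency-check code): an L2 offset is judged valid
purely by whether the cluster it points at lies inside the current
image file.

/* Illustrative sketch of an open-time consistency pass, run when
 * QED_F_NEED_CHECK is set.  Hypothetical names; not the real QED
 * code. */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define CLUSTER_SIZE 65536ULL   /* assumed cluster size for the sketch */

/* A nonzero L2 offset is accepted iff it is cluster-aligned and the
 * whole cluster fits inside the current image file. */
static bool cluster_offset_valid(uint64_t offset, uint64_t file_size)
{
    return offset % CLUSTER_SIZE == 0 &&
           offset + CLUSTER_SIZE <= file_size;
}

/* Reset dangling offsets to zero (unallocated) so that reads fall
 * through to the backing file again. */
static void check_l2_table(uint64_t *l2, size_t nentries,
                           uint64_t file_size)
{
    for (size_t i = 0; i < nentries; i++) {
        if (l2[i] != 0 && !cluster_offset_valid(l2[i], file_size)) {
            l2[i] = 0;
        }
    }
}

Which is exactly why the scenario you describe below is possible: once
a later allocation has grown the file, an offset whose data was never
actually written still passes the size test, so nothing gets reset.
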
> However, I think we can get into a state where the pre/postfill data
> isn't on the disk yet but another allocation has increased the file
> size, making the unwritten cluster "valid". This fools the
> consistency check into thinking the data cluster (which was never
> written to on disk) is valid.
>
> Will think about this more tonight.

It's fairly simple to add a sync to this path. It's probably worth
checking the prefill/postfill for zeros and avoiding the write/sync
when the data is all zeros (rough sketch at the end of this mail).
That should optimize the common case of allocating new space within a
file.

My intuition is that we can avoid the sync entirely but we'll need to
think about it further.

Regards,

Anthony Liguori

> Stefan
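
The zero check mentioned above, as a rough self-contained sketch. All
of the helpers here are stand-ins rather than the real QED write path,
and it assumes the new cluster was allocated by growing the image
file:

/* Skip the pre/postfill write+sync when the backing data is all
 * zeros.  Stand-in helpers; not the actual QED code. */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

static bool buf_is_zero(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Stand-ins for the image-file write and flush paths. */
static void write_cluster_padding(const uint8_t *buf, size_t len)
{
    (void)buf; (void)len;    /* image-file write would go here */
}

static void sync_image_file(void)
{
    /* flush would go here */
}

/* Copy the pre/postfill region from the backing file into the newly
 * allocated cluster.  Returns true if a write+sync was issued. */
static bool cow_fill(const uint8_t *backing_data, size_t len)
{
    /* Space newly allocated at end-of-file reads back as zeros, so if
     * the backing data is all zeros there is nothing to copy and no
     * sync is needed to order the padding before the L2 update. */
    if (buf_is_zero(backing_data, len)) {
        return false;
    }
    write_cluster_padding(backing_data, len);
    sync_image_file();   /* make the padding durable before linking L2 */
    return true;
}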