From: Stefan Hajnoczi
Date: Mon, 14 Mar 2011 15:05:38 +0000
Subject: Re: [Qemu-devel] Re: Strategic decision: COW format
To: Anthony Liguori
Cc: Kevin Wolf, Chunqiang Tang, Markus Armbruster, Aurelien Jarno,
 qemu-devel@nongnu.org
In-Reply-To: <4D7E2B09.7060002@codemonkey.ws>
References: <4D5BC467.4070804@redhat.com> <4D5E4271.80501@redhat.com>
 <20110220221357.GO4580@hall.aurel32.net> <4D62295E.1030504@redhat.com>
 <4D7D036B.4050706@codemonkey.ws> <4D7E167A.1020509@codemonkey.ws>
 <4D7E2001.8080201@codemonkey.ws> <4D7E2465.9040208@redhat.com>
 <4D7E2B09.7060002@codemonkey.ws>

On Mon, Mar 14, 2011 at 2:49 PM, Anthony Liguori wrote:
> On 03/14/2011 09:21 AM, Kevin Wolf wrote:
>>
>> On 14.03.2011 at 15:02, Anthony Liguori wrote:
>>>
>>> On 03/14/2011 08:53 AM, Chunqiang Tang wrote:
>>>>>
>>>>> No, because the copy-on-write is another layer on top of the snapshot
>>>>> and AFAICT, they don't persist when moving between snapshots.
>>>>>
>>>>> The equivalent for external snapshots would be:
>>>>>
>>>>> base0 <- base1 <- base2 <- image
>>>>>
>>>>> And then if I wanted to move to base1 without destroying base2 and
>>>>> image, I could do:
>>>>>
>>>>> qemu-img create -f qcow2 -b base1 base1-overlay.img
>>>>>
>>>>> The file system can keep a lot of these things around pretty easily
>>>>> but with your proposal, it seems like there can only be one.  If you
>>>>> support many of them, I think you'll degenerate to something as
>>>>> complex as a reference count table.
>>>>>
>>>>> On the other hand, I think it's reasonable to just avoid the CoW
>>>>> overlay entirely and say that moving to a previous snapshot destroys
>>>>> any of its children.  I think this ends up being a simplifying
>>>>> assumption that is worth investigating further.
>>>>
>>>> No, both VMware and FVD have the same semantics as QCOW2. Moving to a
>>>> previous snapshot does not destroy any of its children. In the example
>>>> I gave (copied below), it goes from
>>>>
>>>> Image: s1->s2->s3->s4->(current-state)
>>>>
>>>> back to snapshot s2, and now the state is
>>>>
>>>> Image: s1->s2->s3->s4
>>>>             |->(current-state)
>>>>
>>>> where all snapshots s1-s4 are kept. From there, it can take another
>>>> snapshot s5, and then further go back to snapshot s4, ending up with
>>>>
>>>> Image: s1->s2->s3->s4
>>>>             |->s5   |
>>>>                     |->   (current-state)
>>>
>>> Your use of "current-state" is confusing me because AFAICT,
>>> current-state is just semantically another snapshot.
>>>
>>> It's writable because it has no children.  You only keep around one
>>> writable snapshot and to make another snapshot writable, you have to
>>> discard the former.
>>>
>>> This is not the semantics of qcow2.  Every time you create a snapshot,
>>> it's essentially a new image.  You can write directly to it.
>>>
>>> While we don't do this today and I don't think we ever should, it's
>>> entirely possible to have two disks served simultaneously out of the
>>> same qcow2 file using snapshots.
>>
>> No, CQ is describing the semantics of internal snapshots in qcow2
>> correctly. You have all the snapshots that are stored in the snapshot
>> table (all read-only) plus one current state described by the image
>> header (read-write).
>
> But is there any problem (in the format) with writing to the non-current
> state?  I can't think of one.

Here is a problem: there is a single global refcount table in QCOW2.
You need to synchronize updates of the refcounts between multiple
writers to avoid introducing incorrect refcounts.

Stefan
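
To illustrate the refcount point above, here is a toy sketch in Python.
It is not QEMU code and does not model the real qcow2 on-disk layout:
RefcountTable, unsynchronized_interleaving and synchronized_writers are
hypothetical names made up for this example. It only assumes what the
message states, namely that all writers share one refcount table. The
first function replays an interleaving in which two writers both read a
cluster's refcount before either writes it back, so one increment is
lost. The second serializes the read-modify-write with a lock.

```python
import threading


class RefcountTable:
    """Toy stand-in for one shared refcount table (hypothetical, not qcow2's layout)."""

    def __init__(self, n_clusters):
        self.counts = [0] * n_clusters
        self.lock = threading.Lock()

    def read(self, cluster):
        return self.counts[cluster]

    def write(self, cluster, value):
        self.counts[cluster] = value

    def increment_locked(self, cluster):
        # The whole read-modify-write happens under the lock.
        with self.lock:
            self.counts[cluster] += 1


def unsynchronized_interleaving():
    """Replay one bad interleaving of two writers referencing cluster 0."""
    table = RefcountTable(4)
    seen_by_a = table.read(0)      # writer A reads 0
    seen_by_b = table.read(0)      # writer B reads 0 before A writes back
    table.write(0, seen_by_a + 1)  # A stores 1
    table.write(0, seen_by_b + 1)  # B stores 1, overwriting A's update
    return table.read(0)           # 1, although two references were taken


def synchronized_writers(n_writers=2, updates_each=1000):
    """Run several writer threads that update the table only under the lock."""
    table = RefcountTable(4)

    def worker():
        for _ in range(updates_each):
            table.increment_locked(0)

    threads = [threading.Thread(target=worker) for _ in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.read(0)
```

In the first case the stored refcount is one lower than the number of
references actually taken, which is the kind of incorrect refcount that
unsynchronized writers could introduce. How real writers to a shared
image would coordinate is left open here.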