From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=56659 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1P5FKS-00038d-CK for qemu-devel@nongnu.org; Mon, 11 Oct 2010 06:10:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P5FKR-0002UM-7m for qemu-devel@nongnu.org; Mon, 11 Oct 2010 06:10:00 -0400 Received: from mtagate1.de.ibm.com ([195.212.17.161]:42993) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P5FKR-0002U9-0C for qemu-devel@nongnu.org; Mon, 11 Oct 2010 06:09:59 -0400 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate1.de.ibm.com (8.13.1/8.13.1) with ESMTP id o9BA9vLl007956 for ; Mon, 11 Oct 2010 10:09:57 GMT Received: from d12av04.megacenter.de.ibm.com (d12av04.megacenter.de.ibm.com [9.149.165.229]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o9BA9tmP4128938 for ; Mon, 11 Oct 2010 12:09:57 +0200 Received: from d12av04.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id o9BA9tFR018383 for ; Mon, 11 Oct 2010 12:09:55 +0200 Date: Mon, 11 Oct 2010 11:09:54 +0100 From: Stefan Hajnoczi Message-ID: <20101011100954.GA4078@stefan-thinkpad.transitives.com> References: <1286552914-27014-1-git-send-email-stefanha@linux.vnet.ibm.com> <1286552914-27014-4-git-send-email-stefanha@linux.vnet.ibm.com> <4CB18549.3020206@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CB18549.3020206@redhat.com> Subject: [Qemu-devel] Re: [PATCH v2 3/7] docs: Add QED image format specification List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Avi Kivity Cc: Kevin Wolf , Anthony Liguori , qemu-devel@nongnu.org, Christoph Hellwig On Sun, Oct 10, 2010 at 11:20:09AM +0200, Avi Kivity wrote: > On 10/08/2010 05:48 PM, Stefan Hajnoczi wrote: > >Signed-off-by: Stefan Hajnoczi > >--- > > docs/specs/qed_spec.txt | 94 +++++++++++++++++++++++++++++++++++++++++++++++ > > 1 files changed, 94 insertions(+), 0 deletions(-) > > > >+Feature bits: > >+* QED_F_BACKING_FILE = 0x01. The image uses a backing file. > >+* QED_F_NEED_CHECK = 0x02. The image needs a consistency check before use. > > Great that QED_F_NEED_CHECK is now non-optional. However, suppose > we add a new structure (e.g. persistent freelist); it's now > impossible to tell whether the structure is updated or not: > > 1 new qemu opens image > 2 writes persistent freelist > 3 clears need_check > 4 shuts down > 5 old qemu opens image > 6 doesn't update persistent freelist > 7 clears need_check > 8 shuts down > > The image is now inconsistent, but has need_check cleared. > > We can address this by having a third feature bitmask that is > autocleared by guests that don't recognize various bits; so the > sequence becomes: > > 1 new qemu opens image > 2 writes persistent freelist > 3 clears need_check > 4 sets persistent_freelist > 5 shuts down > 6 old qemu opens image > 7 clears persistent_freelist (and any other bits it doesn't recognize) > 8 doesn't update persistent freelist > 9 clears need_check > 10 shuts down > > The image is now consistent, since the persistent freelist has disappeared. It is more complicated than just the feature bit. The freelist requires space in the image file. Clearing the persistent_freelist bit leaks the freelist. We can solve this by using a compat feature bit and an autoclear feature bit. The autoclear bit specifies whether or not the freelist is valid and the compat feature bit specifices whether or not the freelist exists. When the new qemu opens the image again it notices that the autoclear bit is unset but the compat bit is set. This means the freelist is out-of-date and its space can be reclaimed. I don't like the idea of doing these feature bit acrobatics because they add complexity. On the other hand your proposal ensures backward compatibility in the case of an optional data structure that needs to stay in sync with the image. I'm just not 100% convinced it's worth it. > >+* QED_CF_BACKING_FORMAT = 0x01. The image has a specific backing file format stored. > >+ > > It was suggested to have just a bit saying whether the backing > format is raw or not. This way you don't need to store the format. Will fix in v3. Stefan