Date: Tue, 11 Dec 2012 12:28:57 +0100
From: Stefan Hajnoczi
To: Benoît Canet
Cc: kwolf@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [RFC V3 01/24] qcow2: Add deduplication to the qcow2 specification.
Message-ID: <20121211112857.GG796@stefanha-thinkpad.muc.redhat.com>
In-Reply-To: <1353935123-24199-2-git-send-email-benoit@irqsave.net>
References: <1353935123-24199-1-git-send-email-benoit@irqsave.net> <1353935123-24199-2-git-send-email-benoit@irqsave.net>

On Mon, Nov 26, 2012 at 02:05:00PM +0100, Benoît Canet wrote:
> Signed-off-by: Benoit Canet
> ---
>  docs/specs/qcow2.txt |   33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
> index 36a559d..16eafd7 100644
> --- a/docs/specs/qcow2.txt
> +++ b/docs/specs/qcow2.txt
> @@ -80,7 +80,10 @@ in the description of a field.
>                                  tables to repair refcounts before accessing the
>                                  image.
>
> -                    Bits 1-63:  Reserved (set to 0)
> +                    Bit 1:      Deduplication bit.  If this bit is set then
> +                                deduplication is used on this image.
> +
> +                    Bits 2-63:  Reserved (set to 0)
>
>           80 -  87:   compatible_features
>                       Bitmask of compatible features. An implementation can

This bit prevents programs that don't support dedup from opening the
image file.  What are the restrictions really - can a program without
dedup support read the file?  Can it write to the file (invalidating the
dedup table)?

> @@ -116,6 +119,7 @@ be stored. Each extension has a structure like the following:
>                          0x00000000 - End of the header extension area
>                          0xE2792ACA - Backing file format name
>                          0x6803f857 - Feature name table
> +                        0xCD8E819B - Deduplication
>                          other      - Unknown header extension, can be safely
>                                       ignored
>
> @@ -159,6 +163,33 @@ the header extension data. Each entry look like this:
>                      terminated if it has full length)
>
>
> +== Deduplication ==
> +
> +The deduplication extension contains the offset and size of the deduplication
> +table.
> +
> +          Byte  0 -  7:   Offset
> +
> +                8 - 11:   Size

Units?

> +== Deduplication table ==

Before going into the layout please summarize the point of this table:

The deduplication table maps a physical offset to a data hash and
logical offset.  ...

> +The deduplication table contains 64 bits offsets to the level 2 deduplication
> +table clusters.
> +Each entry of these clusters contains a 32 bytes SHA256 hash followed by the
> +64 bits logical offset of the first encountered block having this hash.

At this point a diagram showing L1, L2, and dedup table entry would
help.

Or perhaps the entry structure can be presented like other structures in
this spec to reduce the amount of English description and use a more
formal reference:

Each L2 deduplication table entry has the following structure:

    Byte  0 - 31:   SHA256 hash of data cluster
         32 - 39:   Logical offset of first encountered block having
                    this hash

> +Entries in the deduplication table are orderered by physical cluster index.
> +
> +The number of entries in an l2 deduplication table cluster is :
> +l2_dedup_cluster_entries = cluster_size / (32 + 8)
> +
> +The index in the level 1 deduplication table is :
> +l1_dedup_index = physical_cluster_index / l2_dedup_cluster_entries
> +
> +The index in the level 2 deduplication table is:
> +l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries
> +
>  == Host cluster management ==
>
>  qcow2 manages the allocation of host clusters by maintaining a reference count
> --
> 1.7.10.4
>
>
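
To make the index arithmetic and the 40-byte entry layout concrete, here
is a rough Python sketch of how I read the proposal.  It is not taken
from the patch series: the function names are made up for illustration,
and the big-endian encoding of the logical offset is an assumption based
on how the rest of the qcow2 format stores integers, since the patch
does not state it for the dedup table.

```python
import struct

HASH_SIZE = 32                        # SHA256 digest size, per the patch
OFFSET_SIZE = 8                       # 64-bit logical offset, per the patch
ENTRY_SIZE = HASH_SIZE + OFFSET_SIZE  # the (32 + 8) in the patch formulas


def dedup_indexes(physical_cluster_index, cluster_size):
    """Return (l1_dedup_index, l2_dedup_index) using the patch formulas.

    Hypothetical helper name; only the arithmetic comes from the patch.
    """
    l2_dedup_cluster_entries = cluster_size // ENTRY_SIZE
    l1_dedup_index = physical_cluster_index // l2_dedup_cluster_entries
    l2_dedup_index = physical_cluster_index % l2_dedup_cluster_entries
    return l1_dedup_index, l2_dedup_index


def unpack_l2_dedup_entry(buf):
    """Split one 40-byte L2 dedup entry into (hash, logical_offset).

    Assumption: the offset is big-endian, like other qcow2 integers.
    """
    return struct.unpack(">32sQ", buf)


if __name__ == "__main__":
    cluster_size = 64 * 1024
    print(cluster_size // ENTRY_SIZE)         # entries per L2 dedup cluster
    print(dedup_indexes(5000, cluster_size))  # (L1 index, L2 index)
```

With 64 KiB clusters this gives 1638 entries per L2 dedup cluster, so
16 bytes at the end of each cluster are unused because 40 does not
divide the cluster size.  It may be worth saying in the spec what those
trailing bytes must contain.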