From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=53533 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ov7J4-00063z-Ir
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 07:34:43 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <kwolf@redhat.com>) id 1Ov7DA-00013K-2C
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 07:28:37 -0400
Received: from mx1.redhat.com ([209.132.183.28]:60550)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <kwolf@redhat.com>) id 1Ov7D9-00013E-Qc
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 07:28:36 -0400
Message-ID: <4C8E0AF2.2090107@redhat.com>
Date: Mon, 13 Sep 2010 13:28:50 +0200
From: Kevin Wolf <kwolf@redhat.com>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
References: <1283767478-16740-1-git-send-email-stefanha@linux.vnet.ibm.com>
	<4C84E738.3020802@codemonkey.ws> <4C865187.6090508@redhat.com>
	<4C865CFE.7010508@codemonkey.ws> <4C8663C4.1090508@redhat.com>
	<4C866773.2030103@codemonkey.ws>
	<4C86BC6B.5010809@codemonkey.ws> <4C874812.9090807@redhat.com>
	<4C87860A.3060904@codemonkey.ws> <4C888287.8020209@redhat.com>
	<4C88D7CC.5000806@codemonkey.ws> <4C8A1311.8070903@redhat.com>
	<4C8A2F40.7000509@codemonkey.ws> <4C8A36D4.5050001@redhat.com>
	<4C8A4707.7080705@codemonkey.ws> <4C8A5391.2030601@redhat.com>
	<4C8A65BB.9010602@codemonkey.ws> <4C8CD47E.4060309@redhat.com>
	<4C8CEE14.4020501@codemonkey.ws> <4C8CF812.4020203@redhat.com>
	<4C8D094E.4060507@codemonkey.ws>
In-Reply-To: <4C8D094E.4060507@codemonkey.ws>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

On 12.09.2010 19:09, Anthony Liguori wrote:
> For a 1PB disk image with qcow2, the reference count table is 128GB.  
> For a 1TB image, the reference count table is 128MB.   For a 128GB 
> image, the reference table is 16MB which is why we get away with it today.

This is about physical size. If you have a 1 PB disk, you're probably okay
with using 128 GB of it for metadata (and I think it's less than that,
see below).

> Anytime you grow the freelist with qcow2, you have to write a brand new 
> freelist table and update the metadata synchronously to point to a new 
> version of it.  That means for a 1TB image, you're potentially writing 
> out 128MB of data just to allocate a new cluster.

No. qcow2 uses a two-level structure for refcounts: a refcount table
whose entries point to refcount blocks.

File size: 1 TB
Number of clusters: 1 TB / 64 kB = 16 M
Number of refcount blocks: (16 M * 2 B) / 64 kB = 512
Total size of all refcount blocks: 512 * 64 kB = 32 MB
Size of refcount table: 512 * 8 B = 4 kB

When we grow an image file, the refcount blocks can stay where they are;
only the refcount table needs to be rewritten. So we have to copy a
total of 4 kB for growing the image file when it's 1 TB in size (all
assuming 64 kB clusters).
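
For anyone who wants to check the numbers, here is a minimal sketch of the
arithmetic above. The function and parameter names are made up for
illustration and are not taken from the qcow2 code; it assumes 64 kB
clusters, 2-byte refcount entries and 8-byte refcount table entries, as in
the calculation.

```python
# Illustrative only: reproduces the refcount metadata arithmetic for a
# given physical file size. Names here are hypothetical, not qcow2 APIs.
KB = 1024
MB = 1024 ** 2
TB = 1024 ** 4


def refcount_sizes(file_size, cluster_size=64 * KB,
                   refcount_bytes=2, table_entry_bytes=8):
    """Return cluster count and refcount metadata sizes in bytes."""
    clusters = file_size // cluster_size
    # Each refcount block is one cluster full of refcount entries.
    entries_per_block = cluster_size // refcount_bytes
    blocks = -(-clusters // entries_per_block)  # ceiling division
    return {
        "clusters": clusters,
        "refcount_blocks": blocks,
        "refcount_blocks_bytes": blocks * cluster_size,
        # One table entry per refcount block.
        "refcount_table_bytes": blocks * table_entry_bytes,
    }


sizes = refcount_sizes(1 * TB)
print(sizes)
```

For a 1 TB file this gives 512 refcount blocks (32 MB in total) and a
4 kB refcount table, so the table is the only part that would need to be
copied.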

The other result of this calculation is that we only need to grow the
refcount table each time we cross a 16 TB boundary. So in addition to
being a small amount of data, it doesn't happen in practice anyway.
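
A rough sketch of where the 16 TB figure comes from, assuming the refcount
table is allocated in whole clusters (so it grows one 64 kB cluster at a
time), with 2-byte refcount entries and 8-byte table entries. The names
are hypothetical and only serve the illustration.

```python
# Illustrative only: how much physical file size one cluster of refcount
# table can describe, under the assumptions stated above.
KB = 1024
GB = 1024 ** 3
TB = 1024 ** 4


def coverage_per_table_cluster(cluster_size=64 * KB,
                               refcount_bytes=2, table_entry_bytes=8):
    """Bytes of image file covered by one cluster of refcount table."""
    table_entries = cluster_size // table_entry_bytes    # refcount blocks per table cluster
    clusters_per_block = cluster_size // refcount_bytes  # clusters tracked per refcount block
    bytes_per_block = clusters_per_block * cluster_size  # file size tracked per refcount block
    return table_entries * bytes_per_block


coverage = coverage_per_table_cluster()
print(coverage // TB, "TB per refcount table cluster")
```

With these assumptions each refcount block covers 2 GB of file, and one
table cluster holds 8192 entries, which gives 16 TB.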

Kevin