From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from [140.186.70.92] (port=55021 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Ov8q5-0003IJ-BS
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 09:12:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from <anthony@codemonkey.ws>) id 1Ov8q1-0000V1-5N
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 09:12:53 -0400
Received: from mail-gw0-f45.google.com ([74.125.83.45]:63066)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <anthony@codemonkey.ws>) id 1Ov8q1-0000Uv-37
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 09:12:49 -0400
Received: by gwb11 with SMTP id 11so2381824gwb.4
	for <qemu-devel@nongnu.org>; Mon, 13 Sep 2010 06:12:48 -0700 (PDT)
Message-ID: <4C8E2348.7020100@codemonkey.ws>
Date: Mon, 13 Sep 2010 08:12:40 -0500
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
References: <1283767478-16740-1-git-send-email-stefanha@linux.vnet.ibm.com>
	<4C84E738.3020802@codemonkey.ws> <4C865187.6090508@redhat.com>
	<4C865CFE.7010508@codemonkey.ws> <4C8663C4.1090508@redhat.com>
	<4C866773.2030103@codemonkey.ws>
	<4C86BC6B.5010809@codemonkey.ws> <4C874812.9090807@redhat.com>
	<4C87860A.3060904@codemonkey.ws> <4C888287.8020209@redhat.com>
	<4C88D7CC.5000806@codemonkey.ws> <4C8A1311.8070903@redhat.com>
	<4C8A2F40.7000509@codemonkey.ws> <4C8A36D4.5050001@redhat.com>
	<4C8A4707.7080705@codemonkey.ws> <4C8A5391.2030601@redhat.com>
	<4C8A65BB.9010602@codemonkey.ws> <4C8CD47E.4060309@redhat.com>
	<4C8CEE14.4020501@codemonkey.ws> <4C8CF812.4020203@redhat.com>
	<4C8D094E.4060507@codemonkey.ws> <4C8E0AF2.2090107@redhat.com>
In-Reply-To: <4C8E0AF2.2090107@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, Avi Kivity <avi@redhat.com>, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

On 09/13/2010 06:28 AM, Kevin Wolf wrote:
>> Anytime you grow the freelist with qcow2, you have to write a brand new
>> freelist table and update the metadata synchronously to point to a new
>> version of it.  That means for a 1TB image, you're potentially writing
>> out 128MB of data just to allocate a new cluster.
>>      
> No. qcow2 has two-level tables.
>
> File size: 1 TB
> Number of clusters: 1 TB / 64 kB = 16 M
> Number of refcount blocks: (16 M * 2 B) / 64kB = 512
> Total size of all refcount blocks: 512 * 64kB = 32 MB
> Size of refcount table: 512 * 8 B = 4 kB
>
> When we grow an image file, the refcount blocks can stay where they are,
> only the refcount table needs to be rewritten. So we have to copy a
> total of 4 kB for growing the image file when it's 1 TB in size (all
> assuming 64k clusters).
>    

Yes, I misread the code.  It is a two-level table.
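
For reference, the quoted arithmetic can be reproduced with a short
Python sketch.  The function and constant names are illustrative and are
not taken from the qcow2 source; the 64 kB cluster size, 2-byte refcount
entries, and 8-byte table entries are the values used in the quoted
calculation.

```python
KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB

CLUSTER_SIZE = 64 * KIB        # assumed: 64k clusters, as in the quoted numbers
REFCOUNT_ENTRY_SIZE = 2        # bytes per refcount entry (from the quote)
TABLE_ENTRY_SIZE = 8           # bytes per refcount table entry (from the quote)


def ceil_div(a, b):
    """Integer division rounding up."""
    return (a + b - 1) // b


def refcount_layout(file_size):
    """Return the sizes of the two refcount levels for a given file size."""
    clusters = ceil_div(file_size, CLUSTER_SIZE)
    # Each refcount block is one cluster full of 2-byte entries.
    refcount_blocks = ceil_div(clusters * REFCOUNT_ENTRY_SIZE, CLUSTER_SIZE)
    return {
        "clusters": clusters,
        "refcount_blocks": refcount_blocks,
        "refcount_blocks_bytes": refcount_blocks * CLUSTER_SIZE,
        "refcount_table_bytes": refcount_blocks * TABLE_ENTRY_SIZE,
    }


layout = refcount_layout(1 * TIB)
print(layout)
```

For a 1 TB file this gives 16 M clusters, 512 refcount blocks (32 MB in
total), and a 4 kB refcount table, matching the figures quoted above.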

Even though the refcount data is 4x smaller than I previously stated
(32 MB rather than 128 MB), it's still quite large, and finding a free
block is an O(n) operation where n is the physical file size.  An fsck()
on qed is also O(n) in the physical file size, so I still contend the
two are similar in cost.
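
To illustrate what I mean by O(n), here is a deliberately simplified
Python sketch of a linear scan for a free cluster.  It is not how qcow2
implements its allocator; `find_free_cluster` and the flat `refcounts`
list are hypothetical stand-ins for walking the refcount blocks.

```python
def find_free_cluster(refcounts):
    """Return the index of the first cluster with refcount 0, or None.

    In the worst case every entry is visited, so the cost grows linearly
    with the number of clusters, i.e. with the physical file size.
    """
    for index, count in enumerate(refcounts):
        if count == 0:
            return index
    return None


# Hypothetical example: clusters 0, 1 and 3 are in use, cluster 2 is free.
print(find_free_cluster([1, 1, 0, 2]))
```

When every cluster is in use the scan touches all n entries before
returning None, which is the case the O(n) bound describes.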

Regards,

Anthony Liguori

> The other result of this calculation is that we need to grow the
> refcount table each time we cross a 16 TB boundary. So additionally to
> being a small amount of data, it doesn't happen in practice anyway.
>
> Kevin
>
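
The 16 TB boundary quoted above can be checked the same way.  This
sketch assumes the refcount table grows in whole-cluster units and
reuses the 64 kB cluster, 2-byte refcount entry, and 8-byte table entry
sizes from the quoted calculation; the names are illustrative only.

```python
KIB = 1024
TIB = 1024 ** 4

CLUSTER_SIZE = 64 * KIB        # assumed: 64k clusters
REFCOUNT_ENTRY_SIZE = 2        # bytes per refcount entry
TABLE_ENTRY_SIZE = 8           # bytes per refcount table entry


def bytes_covered_per_table_cluster():
    """File size that one cluster of refcount table can describe."""
    # Clusters tracked by a single refcount block.
    clusters_per_block = CLUSTER_SIZE // REFCOUNT_ENTRY_SIZE
    # File bytes tracked by a single refcount block.
    bytes_per_block = clusters_per_block * CLUSTER_SIZE
    # Refcount blocks addressable from one cluster of the refcount table.
    blocks_per_table_cluster = CLUSTER_SIZE // TABLE_ENTRY_SIZE
    return blocks_per_table_cluster * bytes_per_block


print(bytes_covered_per_table_cluster() // TIB, "TB")
```

Under those assumptions each refcount block covers 2 GB of file, and one
cluster of table entries covers 8192 such blocks, which is 16 TB.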