From: Anthony Liguori
Date: Wed, 08 Sep 2010 07:55:18 -0500
Message-ID: <4C8787B6.2070907@codemonkey.ws>
In-Reply-To: <4C874F22.6060802@redhat.com>
Subject: Re: [Qemu-devel] [RFC] qed: Add QEMU Enhanced Disk format
To: Avi Kivity
Cc: Kevin Wolf, qemu-devel@nongnu.org, Alexander Graf, Stefan Hajnoczi

On 09/08/2010 03:53 AM, Avi Kivity wrote:
> On 09/08/2010 11:41 AM, Alexander Graf wrote:
>> On 08.09.2010, at 10:23, Avi Kivity wrote:
>>
>>> On 09/08/2010 01:27 AM, Anthony Liguori wrote:
>>>>> FWIW, L2s are 256K at the moment and with a two level table, it
>>>>> can support 5PB of data.
>>>>
>>>> I clearly suck at basic math today.  The image supports 64TB
>>>> today.
>>>> Dropping to 128K tables would reduce it to 16TB and 64K
>>>> tables would be 4TB.
>>>
>>> Maybe we should do three levels then.  Some users are bound to
>>> complain about 64TB.
>>
>> Why 3 levels? Can't the L2 size be dynamic? Then big images get a big
>> L2 map while small images get a smaller one.
>
> Dunno, just seems more regular to me.  Image resize doesn't need to
> relocate the L2 table in case it overflows.
>
> The overhead from three levels is an extra table, which is negligible.

It means an extra I/O request in the degenerate case, whereas increasing
the table size only affects the amount of metadata.

A 10GB image currently has 1.2MB of metadata in QED; a 1TB image uses
128MB.  The ratio of metadata to image size is about 0.01%.

A three-level table trades an additional I/O request for smaller
metadata, but the metadata is already small enough today that I don't
see the point.

Regards,

Anthony Liguori

> With 64K tables, the maximum image size is 32PiB, which is 14 bits
> away from a 2TB disk, giving us about 30 years.
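[Editorial note: the capacity and metadata figures traded in this thread
all fall out of the same geometry.  A minimal sketch of the arithmetic,
assuming QED's 64 KiB clusters and 8-byte table entries as described in
the RFC; the helper names are illustrative, not from QEMU:]

```python
# Back-of-the-envelope arithmetic behind the numbers in this thread.
# Assumptions: 64 KiB data clusters, 8-byte table entries (per the QED RFC).

CLUSTER_SIZE = 64 * 1024   # bytes per data cluster
ENTRY_SIZE = 8             # bytes per table entry

def max_image_size(table_size, levels=2):
    """Maximum image size (bytes) for a given per-table size and depth."""
    entries = table_size // ENTRY_SIZE  # entries per table
    return entries ** levels * CLUSTER_SIZE

def metadata_size(image_size):
    """Approximate L2 metadata: one entry per allocated cluster."""
    return (image_size // CLUSTER_SIZE) * ENTRY_SIZE

TiB = 1024 ** 4
PiB = 1024 ** 5

# Two-level tables, varying table size:
assert max_image_size(256 * 1024) == 64 * TiB   # 256K tables -> 64TB
assert max_image_size(128 * 1024) == 16 * TiB   # 128K tables -> 16TB
assert max_image_size(64 * 1024) == 4 * TiB     # 64K tables  -> 4TB

# Three levels with 64K tables:
assert max_image_size(64 * 1024, levels=3) == 32 * PiB  # 32PiB

# Metadata for a 1TB image: 128MB, i.e. 8/65536 ~ 0.01% of the data.
print(metadata_size(TiB) // (1024 * 1024), "MiB")  # -> 128 MiB
```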