From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:57949)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stefanha@gmail.com>) id 1S0Uix-0002QI-JC
	for qemu-devel@nongnu.org; Thu, 23 Feb 2012 04:12:33 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stefanha@gmail.com>) id 1S0Uir-0007fo-LR
	for qemu-devel@nongnu.org; Thu, 23 Feb 2012 04:12:27 -0500
Received: from mail-ey0-f173.google.com ([209.85.215.173]:49668)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stefanha@gmail.com>) id 1S0Uir-0007eQ-H0
	for qemu-devel@nongnu.org; Thu, 23 Feb 2012 04:12:21 -0500
Received: by mail-ey0-f173.google.com with SMTP id c1so409614eaa.4
	for <qemu-devel@nongnu.org>; Thu, 23 Feb 2012 01:12:21 -0800 (PST)
Date: Thu, 23 Feb 2012 08:32:04 +0000
From: Stefan Hajnoczi <stefanha@gmail.com>
Message-ID: <20120223083204.GA10698@stefanha-thinkpad.localdomain>
References: <CABZruFAg81ayd3P6RWKnbhLMW9jmN00hcp=LRsbMHZ3NgupA2A@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CABZruFAg81ayd3P6RWKnbhLMW9jmN00hcp=LRsbMHZ3NgupA2A@mail.gmail.com>
Subject: Re: [Qemu-devel] Cluster_size parameter issue on qcow2 image format
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: PANKAJ RAWAT <pankajr141@gmail.com>
Cc: qemu-devel@nongnu.org

On Thu, Feb 23, 2012 at 11:01:46AM +0530, PANKAJ RAWAT wrote:
> In theory, regarding cluster size, it is written that as the cluster size
> increases, performance should increase.
> 
> But something surprising happens: the performance degrades as the cluster
> size is increased from 64K to 1M (during expansion of a qcow2 image).

It's not true that raising the cluster size always improves performance;
otherwise the default would be infinity.  It's an algorithms/data
structure trade-off.

Most important is the relative latency between a small guest I/O
request (e.g. 4 KB) and a cluster-sized I/O (e.g. 64 KB).  If the
latency of a cluster-sized operation is orders of magnitude larger than
that of a small guest I/O request, then be prepared to see the extreme
effects described below:

 * Bigger clusters decrease the frequency of metadata operations and
   increase metadata cache hit rates.  Bigger clusters mean less
   metadata, so qcow2 performs fewer metadata operations overall.

   Performance boost.

 * Bigger clusters increase the cost of allocating a new cluster.  For
   example, an 8 KB write to a new cluster will incur a 1 MB write to the
   image file (the untouched regions are filled with zeros).  This can
   be optimized in some cases but not everywhere (e.g. reallocating a
   data cluster versus extending the image file size and relying on the
   file system to provide zeroed space).  This is especially expensive
   when a backing file is in use because up to 1 MB of the backing file
   needs to be read to populate the newly allocated cluster!

   Performance loss.

 * Bigger clusters can reduce fragmentation of data on the physical
   disk.  The file system sees fewer, bigger allocating writes and is
   therefore able to allocate more contiguous data - less fragmentation.

   Performance boost.

 * Bigger clusters reduce the compactness of sparse files.  You use more
   image file space on the host file system when the cluster size is
   large.

   Space efficiency loss.
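
To put rough numbers on the "less metadata" point in the first bullet,
here is a minimal Python sketch.  It assumes each cluster is mapped by
one 8-byte L2 table entry (that figure comes from the qcow2 format
description, not from this thread) and it ignores L1 and refcount
tables.  The function name and the 100 GB disk size are made up for
illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# Assumption: one 8-byte L2 entry per cluster; L1/refcount tables ignored.
L2_ENTRY_BYTES = 8


def l2_metadata_bytes(virtual_size, cluster_size):
    """Approximate bytes of L2 entries needed to map a fully allocated image."""
    clusters = -(-virtual_size // cluster_size)  # ceiling division
    return clusters * L2_ENTRY_BYTES


for cluster_size in (64 * KB, 1 * MB):
    meta = l2_metadata_bytes(100 * GB, cluster_size)
    print(f"{cluster_size // KB:5d} KB clusters: "
          f"{meta // KB} KB of L2 entries")
```

Under these assumptions the mapping metadata shrinks by the same
factor that the cluster size grows, which is why it fits in a metadata
cache more easily.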

Here's a scenario where a 1 MB cluster size is great compared to a small
cluster size:

You have a fully allocated qcow2 image, so you will never need to do any
allocating writes.

Here's a scenario where a 1 MB cluster size is terrible compared to a
small cluster size:

You have an empty qcow2 file and perform 4 KB writes to the first sector
of each 1 MB chunk, and there is a backing file.
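
A back-of-the-envelope model of this worst case, as a Python sketch.
It assumes the simplest possible behaviour: every allocating write
writes one full cluster to the image file and, with a backing file,
reads the untouched remainder of the cluster from it.  None of the
optimizations mentioned above are modelled, and the function name is
hypothetical, not anything in QEMU.

```python
KB = 1024
MB = 1024 * KB


def allocating_write_cost(guest_write, cluster_size, has_backing_file):
    """Return (bytes read from backing file, bytes written to image)
    for one guest write that lands in a not-yet-allocated cluster.

    Simplified model: the whole cluster is written out, and the part
    the guest did not touch is copied from the backing file if any.
    """
    if not 0 < guest_write <= cluster_size:
        raise ValueError("guest write must fit inside one cluster")
    untouched = cluster_size - guest_write
    backing_read = untouched if has_backing_file else 0
    return backing_read, cluster_size


for cluster_size in (64 * KB, 1 * MB):
    read, written = allocating_write_cost(4 * KB, cluster_size, True)
    print(f"{cluster_size // KB:5d} KB clusters: read {read // KB} KB, "
          f"write {written // KB} KB, "
          f"amplification {written // (4 * KB)}x")
```

In this model a 4 KB guest write costs 16x its size in image writes
with 64 KB clusters and 256x with 1 MB clusters, before counting the
backing file reads.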

So it depends on the application.

Stefan