Date: Thu, 23 Feb 2012 08:32:04 +0000
From: Stefan Hajnoczi
Message-ID: <20120223083204.GA10698@stefanha-thinkpad.localdomain>
Subject: Re: [Qemu-devel] Cluster_size parameter issue on qcow2 image format
To: PANKAJ RAWAT
Cc: qemu-devel@nongnu.org

On Thu, Feb 23, 2012 at 11:01:46AM +0530, PANKAJ RAWAT wrote:
> I theory regarding cluster size it is written that as the size of cluster
> increase performance should increase.
>
> But something surprising happen The performance is degrading as the size of
> cluster increased from 64K to 1M ( during expansion of qcow2 image)

It's not true that performance should increase by raising the cluster
size, otherwise the default would be infinity.  It's an
algorithms/data structure trade-off.

Most important is the relative latency between a small guest I/O
request (e.g. 4 KB) and the cluster size (e.g. 64 KB).
If the latency of a cluster-sized operation is orders of magnitude
larger than that of a small guest I/O request, then be prepared to see
the effects described below in their extreme form:

 * Bigger clusters decrease the frequency of metadata operations and
   increase metadata cache hit rates.  Bigger clusters mean less
   metadata, so qcow2 performs fewer metadata operations overall.

   Performance boost.

 * Bigger clusters increase the cost of allocating a new cluster.  For
   example, an 8 KB write to a new cluster will incur a 1 MB write to
   the image file (the untouched regions are filled with zeros).  This
   can be optimized in some cases but not everywhere (e.g. reallocating
   a data cluster versus extending the image file size and relying on
   the file system to provide zeroed space).  This is especially
   expensive when a backing file is in use because up to 1 MB of the
   backing file needs to be read to populate the newly allocated
   cluster!

   Performance loss.

 * Bigger clusters can reduce fragmentation of data on the physical
   disk.  The file system sees fewer, bigger allocating writes and is
   therefore able to allocate more contiguous data, which means less
   fragmentation.

   Performance boost.

 * Bigger clusters reduce the compactness of sparse files.  You use
   more image file space on the host file system when the cluster size
   is large.

   Space efficiency loss.

Here's a scenario where a 1 MB cluster size is great compared to a
small cluster size: You have a fully allocated qcow2 image, so you will
never need to do any allocating writes.

Here's a scenario where a 1 MB cluster size is terrible compared to a
small cluster size: You have an empty qcow2 file and perform 4 KB
writes to the first sector of each 1 MB chunk, and there is a backing
file.

So it depends on the application.

Stefan
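
As a rough illustration of the trade-off described above, the following Python sketch puts numbers on two of the effects: the cost of an allocating write and the amount of mapping metadata. It is a simplified worst-case model, not how qcow2 is implemented. The function names, the 100 GiB image size, and the assumption that every allocating write touches the whole cluster are hypothetical. The 8-byte L2 entry size is taken from the standard qcow2 on-disk layout rather than from this email.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Assumption (from the qcow2 on-disk format, not from the email): each
# standard L2 table entry is 8 bytes and maps exactly one cluster.
L2_ENTRY_SIZE = 8


def allocating_write_cost(write_size, cluster_size, has_backing_file):
    """Return (bytes_read, bytes_written) for one write to an unallocated cluster.

    Worst-case model: the whole cluster is written to the image file, and with
    a backing file the untouched remainder is first read from the backing file.
    """
    if not 0 < write_size <= cluster_size:
        raise ValueError("write_size must be in (0, cluster_size]")
    bytes_written = cluster_size
    bytes_read = cluster_size - write_size if has_backing_file else 0
    return bytes_read, bytes_written


def amplification(write_size, cluster_size, has_backing_file):
    """Total bytes moved on the host per byte the guest asked to write."""
    bytes_read, bytes_written = allocating_write_cost(
        write_size, cluster_size, has_backing_file
    )
    return (bytes_read + bytes_written) / write_size


def l2_metadata_bytes(virtual_size, cluster_size):
    """Bytes of L2 entries needed to map a fully allocated image."""
    clusters = -(-virtual_size // cluster_size)  # ceiling division
    return clusters * L2_ENTRY_SIZE


if __name__ == "__main__":
    # Hypothetical workload: 4 KiB allocating writes with a backing file,
    # and a 100 GiB virtual disk for the metadata estimate.
    for cluster_size in (64 * KIB, MIB):
        amp = amplification(4 * KIB, cluster_size, has_backing_file=True)
        meta = l2_metadata_bytes(100 * GIB, cluster_size)
        print(f"cluster={cluster_size // KIB} KiB: "
              f"amplification={amp:.0f}x, L2 entries={meta // KIB} KiB")
```

Under these assumptions, a 4 KiB allocating write with a backing file moves 31 times its own size with 64 KiB clusters and 511 times with 1 MiB clusters, while the L2 entries for the hypothetical 100 GiB image shrink from 12800 KiB to 800 KiB. Real behaviour will differ wherever qcow2 can skip the zero fill or the backing file read.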