Message-ID: <496B1256.1060609@suse.de>
Date: Mon, 12 Jan 2009 10:50:14 +0100
From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 0/3] info blockstats (block-qcow2): show highest allocated offset (bytes)
References: <49664ACA.9050807@redhat.com> <49671463.8050402@suse.de> <496A0899.7010707@redhat.com>
In-Reply-To: <496A0899.7010707@redhat.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

Shahar Frank wrote:
> Kevin Wolf wrote:
>> Uri Lublin wrote:
>>> Although there may be many free blocks below that number (allocated and
>>> freed), the file system cannot deallocate those blocks, and they have to
>>> be reused by qemu. Also note that due to fragmentation those free blocks
>>> may not be used on next allocations.
>>
>> Any idea what it would mean for performance if we changed the behaviour
>> so that s->free_cluster_index always points to the lowest free cluster?
>> Then most of the fragmentation should be gone.
>>
> free_cluster_index is already pointing to the lowest known free space. The

Right, that's what I thought, too. Until I looked at the code again.
alloc_clusters_noref() moves free_cluster_index forward if the needed
number of clusters doesn't fit right there. It's only set back to the
lowest free cluster when that cluster is freed later.

> problem is that its update logic is very simplistic, so an allocation
> of multiple clusters may cause this pointer to skip many single clusters
> (in fact it will skip all cluster sequences that are shorter than the
> requested number), so the next allocation may miss them. This will
> increase the fragmentation. Note that it wasn't so important until
> Laurent Vivier implemented his optimizations that allocate cluster
> sequences.

Yes, it wouldn't have been a problem before these patches. But now what
is the right thing to do if we have some one-cluster holes and want to
write a larger block? There are only two options: try to find a place
where the clusters are physically contiguous for better performance but
at the cost of fragmentation (that's what we do today), or fill up all
the holes first at the cost of performance (we could do that with a few
lines of code).

> In fact, having two or even
> several free pointers is probably a step in the right direction, but we
> may need some better allocation mechanism to really solve the problem
> (btree+ structure, or something else). The target should be a decent
> extent-based allocation. This would improve qcow2 performance and handle
> the fragmentation problem. The problem is that it will probably change
> the qcow2 internals, so we may be better off implementing a simple
> approach for qcow2 and starting to design qcow3...

Maybe you're right. But actually I don't feel like starting qcow3 now...
And it would be a long-term thing anyway.

Kevin
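
To make the two options from this thread concrete, here is a minimal toy
model in C. It is only a sketch under stated assumptions: ToyAllocator,
NB_CLUSTERS, run_is_free(), alloc_contiguous(), alloc_fill_holes() and
free_cluster() are hypothetical names invented for illustration, and the
boolean array stands in for the real refcount structures. It is not the
qcow2 code; it only mimics the behaviour described above, where the free
index moves forward past holes that are too short and is only set back
when a lower cluster is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_CLUSTERS 16 /* hypothetical image size, in clusters */

/* Toy allocator state; an assumption for illustration, not qcow2's. */
typedef struct {
    bool used[NB_CLUSTERS];    /* true if the cluster is allocated */
    size_t free_cluster_index; /* where the next contiguous search starts */
} ToyAllocator;

/* True if clusters [start, start + count) are in range and all free. */
static bool run_is_free(const ToyAllocator *a, size_t start, size_t count)
{
    if (start + count > NB_CLUSTERS) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        if (a->used[start + i]) {
            return false;
        }
    }
    return true;
}

/*
 * Option 1: keep the allocation physically contiguous. The search starts
 * at free_cluster_index and leaves the index just past the allocated run,
 * so every hole shorter than count that lies before the run is skipped.
 * Returns the first cluster of the run, or -1 if no run fits.
 */
static long alloc_contiguous(ToyAllocator *a, size_t count)
{
    assert(count > 0);
    for (size_t start = a->free_cluster_index;
         start + count <= NB_CLUSTERS; start++) {
        if (run_is_free(a, start, count)) {
            for (size_t i = 0; i < count; i++) {
                a->used[start + i] = true;
            }
            a->free_cluster_index = start + count;
            return (long)start;
        }
    }
    return -1;
}

/*
 * Option 2: fill the lowest holes first, accepting that the clusters may
 * not be contiguous. On success the chosen clusters are stored in out[]
 * in ascending order. On failure nothing is allocated, although out[] may
 * have been partially written.
 */
static bool alloc_fill_holes(ToyAllocator *a, size_t count, size_t *out)
{
    assert(count > 0 && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < NB_CLUSTERS && found < count; i++) {
        if (!a->used[i]) {
            out[found++] = i;
        }
    }
    if (found < count) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        a->used[out[i]] = true;
    }
    return true;
}

/* Freeing a cluster below the index sets the index back to it. */
static void free_cluster(ToyAllocator *a, size_t cluster)
{
    assert(cluster < NB_CLUSTERS);
    a->used[cluster] = false;
    if (cluster < a->free_cluster_index) {
        a->free_cluster_index = cluster;
    }
}
```

With clusters 0 to 5 in use and clusters 1 and 3 freed afterwards, a
two-cluster request through alloc_contiguous() lands behind the used
region and leaves both holes open, while alloc_fill_holes() would hand
out clusters 1 and 3 instead. That is the trade-off between contiguity
and fragmentation in a nutshell.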