From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43034) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X1wvH-0001iA-PQ for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:12:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1X1wv7-0001sS-8u for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:12:31 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41572) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X1wv7-0001sC-0H for qemu-devel@nongnu.org; Tue, 01 Jul 2014 08:12:21 -0400 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s61CCKZG001844 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 1 Jul 2014 08:12:20 -0400 Message-ID: <53B2A59C.7070802@redhat.com> Date: Tue, 01 Jul 2014 14:12:12 +0200 From: Max Reitz MIME-Version: 1.0 References: <1402167080-20316-1-git-send-email-mreitz@redhat.com> <1402167080-20316-4-git-send-email-mreitz@redhat.com> <20140630113339.GE4334@noname.str.redhat.com> In-Reply-To: <20140630113339.GE4334@noname.str.redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v8 03/14] qcow2: Optimize bdrv_make_empty() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: qemu-devel@nongnu.org, Stefan Hajnoczi On 30.06.2014 13:33, Kevin Wolf wrote: > Am 07.06.2014 um 20:51 hat Max Reitz geschrieben: >> bdrv_make_empty() is currently only called if the current image >> represents an external snapshot that has been committed to its base >> image; it is therefore unlikely to have internal snapshots. In this >> case, bdrv_make_empty() can be greatly sped up by creating an empty L1 >> table and dropping all data clusters at once by recreating the refcount >> structure accordingly instead of normally discarding all clusters. >> >> If there are snapshots, fall back to the simple implementation (discard >> all clusters). >> >> Signed-off-by: Max Reitz >> Reviewed-by: Eric Blake > This approach looks a bit too complicated to me, and calulating the > required metadata size seems error-prone. > > How about this: > > 1. Set the dirty flag in the header so we can mess with the L1 table > without keeping the refcounts consistent Hm, I didn't think about this. *g* > 2. Overwrite the L1 table with zeros > > 3. Overwrite the first n clusters after the header with zeros > (n = 2 + l1_clusters). > > 4. Update the header: > refcount_table_offset = cluster_size > refcount_table_clusters = 1 > l1_table_offset = 3 * cluster_size > > 6. bdrv_truncate to n + 1 clusters > > 7. Now update the first 8 bytes at cluster_size (the first new refcount > table entry) to point to 2 * cluster_size (new refcount block) > > 8. Reset refcount block and L2 cache > > 9. Allocate n + 1 clusters (the header, too) and make sure you get > offset 0 > > 10. Remove the dirty flag > > Surprisingly (or not) this is much like an ordinary image creation. The > main difference is that we keep the full size of the L1 table so the > image stays always valid (the spec would even allow us to temporarily > set l1_size = 0, but qcow2_open() doesn't seem to like that) Yes, I noticed. ;-) > and all > areas where the L1 table could be are zeroed (this includes the new > refcount table/block until the header is updated). > > > I wanted to check whether this would still give the preallocation=full > series what it needs, but a v11 doesn't seem to be on the list yet and > v10 doesn't have the dependency on this series yet. Well, as far as I see it, the preallocation=full series will need a function to calculate the required image size (if it doesn't, preallocation=thin will). I don't really care whether this series introduces such a function or whether preallocation=full does. Max PS: I personally am reluctant to drop/change this patch, if only because I spent about a week getting it right. ;-) I guess I'll just take a look into marking the image dirty and see how it goes.