From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:38838)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Xo6j7-0002XM-Qd
    for qemu-devel@nongnu.org; Tue, 11 Nov 2014 03:23:07 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1Xo6j0-0006aN-Li
    for qemu-devel@nongnu.org; Tue, 11 Nov 2014 03:23:01 -0500
Received: from mx1.redhat.com ([209.132.183.28]:42130)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Xo6j0-0006a9-EL
    for qemu-devel@nongnu.org; Tue, 11 Nov 2014 03:22:54 -0500
Message-ID: <5461C751.3080607@redhat.com>
Date: Tue, 11 Nov 2014 09:22:41 +0100
From: Max Reitz
MIME-Version: 1.0
References: <1415627159-15941-1-git-send-email-mreitz@redhat.com>
    <1415627159-15941-6-git-send-email-mreitz@redhat.com>
    <54612A27.7000801@redhat.com>
In-Reply-To: <54612A27.7000801@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 05/21] qcow2: Refcount overflow and qcow2_alloc_bytes()
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Eric Blake , qemu-devel@nongnu.org
Cc: Kevin Wolf , Peter Lieven , Stefan Hajnoczi

On 2014-11-10 at 22:12, Eric Blake wrote:
> On 11/10/2014 06:45 AM, Max Reitz wrote:
>> qcow2_alloc_bytes() may reuse a cluster multiple times, in which case
>> the refcount is increased accordingly. However, if this would lead to an
>> overflow the function should instead just not reuse this cluster and
>> allocate a new one.
> So if refcount_order is 1 (2 bits per refcount, max refcount of 4

*max refcount of 3 (0b11)

> ), and
> we encounter the same cluster 6 times (say by 5 back-to-back internal
> snapshots), does this code optimize to only 2 clusters (both with
> refcount 3) or does it result in each of the last 3 clusters spilling to
> its own 1-ref cluster for a total of 4 clusters?
> Short of Benoit's work
> on deduplication, is there even a way to avoid inefficient use of
> spilled clusters?

I'm not sure what you're referring to; maybe I should add that
qcow2_alloc_bytes() is used for allocating compressed clusters (which
ideally don't take up a full host cluster), so "reuse" in this context
just means that several compressed clusters share one host cluster.

Maybe you're referring to the following situation: We have the default
cluster size of 64k. Now we're trying to allocate 16k for each of the
compressed clusters A, B, C and D. D won't fit into that cluster because
the maximum refcount is three, so it will be put into a newly allocated
host cluster. Finally, we're trying to allocate 32k for a compressed
cluster E, which will then be put into the same cluster as D. We
therefore have the following allocation (each sub-box representing 16k):

+---+---+---+---+ +---+---+---+---+
| A | B | C |   | | D |   E   |   |
+---+---+---+---+ +---+---+---+---+

whereas the ideal allocation would be:

+---+---+---+---+ +---+---+---+---+
| A | B |   E   | | C | D |   |   |
+---+---+---+---+ +---+---+---+---+

This is a problem, but I think first it's a minor one (just use a
sufficiently large refcount width if you're going to use compressed
clusters) and second it's about compressed clusters, whose performance I
could hardly care less about, frankly.

Max

> But I guess answering that can be a separate patch;
> inefficiency is annoying, but not technically wrong and therefore not a
> reason to reject this one.
>
>> Signed-off-by: Max Reitz
>> ---
>> block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++--
>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>
> Reviewed-by: Eric Blake
>
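
The allocation example in the reply (A to D at 16k each, then E at 32k, with a 64k host cluster and a maximum refcount of 3) can be checked with a small model. This is only a sketch under stated assumptions, not the code from block/qcow2-refcount.c: the names max_refcount() and alloc_compressed() are hypothetical, and the model assumes that a request either goes into the most recently opened host cluster (if it fits and the refcount has room) or opens a new one.

```python
CLUSTER_SIZE = 64 * 1024  # default qcow2 cluster size mentioned in the mail


def max_refcount(refcount_order):
    """Largest value a refcount entry can hold (hypothetical helper).

    A refcount entry is (1 << refcount_order) bits wide, so order 1
    gives 2 bits and a maximum of 3 (0b11).
    """
    bits = 1 << refcount_order
    return (1 << bits) - 1


def alloc_compressed(requests, refcount_order, cluster_size=CLUSTER_SIZE):
    """Simplified model of sequential compressed-cluster allocation.

    requests is a list of (name, size_in_bytes). Returns a list of host
    clusters, each a dict with the names placed in it and its refcount.
    """
    limit = max_refcount(refcount_order)
    clusters = []
    for name, size in requests:
        if size > cluster_size:
            raise ValueError("request larger than one host cluster")
        cur = clusters[-1] if clusters else None
        fits = cur is not None and cur["used"] + size <= cluster_size
        has_ref_room = cur is not None and cur["refcount"] < limit
        if not (fits and has_ref_room):
            # Either no space left or one more reference would overflow
            # the refcount: open a fresh host cluster instead.
            cur = {"used": 0, "refcount": 0, "names": []}
            clusters.append(cur)
        cur["used"] += size
        cur["refcount"] += 1
        cur["names"].append(name)
    return clusters


K = 1024
requests = [("A", 16 * K), ("B", 16 * K), ("C", 16 * K),
            ("D", 16 * K), ("E", 32 * K)]
for cluster in alloc_compressed(requests, refcount_order=1):
    print(cluster["names"], "refcount", cluster["refcount"])
```

In this model, refcount_order=1 gives the first layout drawn above (A, B, C in one host cluster, D and E in the next), while a wider refcount such as refcount_order=4 lets D share the first cluster, which is the workaround suggested in the reply.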