From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33531) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XqnzN-0007Sd-DG for qemu-devel@nongnu.org; Tue, 18 Nov 2014 13:59:03 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1XqnzE-0006tA-6j for qemu-devel@nongnu.org; Tue, 18 Nov 2014 13:58:57 -0500 Received: from mx1.redhat.com ([209.132.183.28]:7302) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1XqnzD-0006si-UV for qemu-devel@nongnu.org; Tue, 18 Nov 2014 13:58:48 -0500 Message-ID: <546B96DC.1090007@redhat.com> Date: Tue, 18 Nov 2014 19:58:36 +0100 From: Max Reitz MIME-Version: 1.0 References: <1415970374-16811-1-git-send-email-mreitz@redhat.com> <1415970374-16811-19-git-send-email-mreitz@redhat.com> <546B87F8.5090600@redhat.com> In-Reply-To: <546B87F8.5090600@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v2 18/21] qcow2: Add function for refcount order amendment List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eric Blake , qemu-devel@nongnu.org Cc: Kevin Wolf , Peter Lieven , Stefan Hajnoczi On 18.11.2014 18:55, Eric Blake wrote: > On 11/14/2014 06:06 AM, Max Reitz wrote: >> Add a function qcow2_change_refcount_order() which allows changing the >> refcount order of a qcow2 image. >> >> Signed-off-by: Max Reitz >> --- >> block/qcow2-refcount.c | 457 +++++++++++++++++++++++++++++++++++++++++++++++++ >> block/qcow2.h | 4 + >> 2 files changed, 461 insertions(+) >> >> +static int walk_over_reftable(BlockDriverState *bs, uint64_t **new_reftable, >> + >> + status_cb(bs, (uint64_t)index * s->refcount_table_size + reftable_index, >> + (uint64_t)total * s->refcount_table_size, cb_opaque); > Not sure if the casts are needed (isn't s->refcount_table_size already > uint64_t, Surprise, it isn't. I thought otherwise, too, but then got told by clang_complete (it's uint32_t). > and 'int * uint64_t' does the right thing); but I guess it > doesn't hurt to leave them. > >> +int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order, >> + BlockDriverAmendStatusCB *status_cb, >> + void *cb_opaque, Error **errp) >> +{ >> + do { >> + int total_walks; >> + >> + new_allocation = false; >> + >> + /* At least we have to do this walk and the one which writes the >> + * refblocks; also, at least we have to do this loop here at least >> + * twice (normally), first to do the allocations, and second to >> + * determine that everything is correctly allocated, this then makes >> + * three walks in total */ >> + total_walks = MIN(walk_index + 2, 3); > This feels wrong... Yes, I noticed already when preparing v3 and it's already fixed in my local v3 branch. *cough* >> + >> + /* First, allocate the structures so they are present in the refcount >> + * structures */ >> + ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index, >> + &new_reftable_size, NULL, new_refblock_size, >> + new_refcount_bits, &alloc_refblock, >> + &new_allocation, NULL, status_cb, cb_opaque, >> + walk_index++, total_walks, errp); > ...In the common case of just two iterations of the do loop (second > iteration confirms no allocations needed), you call with index 0/2, 1/3, > and then the later non-allocation walk is index 2/3. > > In the rare case of three iterations of the do loop, you call with index > 0/2, 1/3, 2/3, and then the later non-allocation walk is 3/4. > > I highly doubt that it is possible to trigger four iterations of the do > loop, but if it were, you would call with 0/2, 1/3, 2/3, 3/3, and then 4/5. > > I think you instead want to have: > > total_walks = MAX(walk_index + 2, 3) > > then the common case will call with 0/3, 1/3, and the later walk as 2/3 > > the three-iteration loop will call with 0/3, 1/3, 2/4, and the later > walk as 3/4 > > the unlikely four-iteration loop will call with 0/3, 1/3, 2/4, 3/5, and > the later walk as 4/5. > >> + >> + new_reftable_index = 0; >> + >> + if (new_allocation) { >> + if (new_reftable_offset) { >> + qcow2_free_clusters(bs, new_reftable_offset, >> + allocated_reftable_size * sizeof(uint64_t), >> + QCOW2_DISCARD_NEVER); > Any reason you picked QCOW2_DISCARD_NEVER instead of some other policy? Ah, discarding is always interesting... Last year I used QCOW2_DISCARD_ALWAYS, then asked Kevin and he basically said never to use ALWAYS unless one is really sure about it. I could have used QCOW2_DISCARD_OTHER... But the idea behind using NEVER in cases like this is that the clusters may get picked up by the following allocation, in which case having discarded them is not a good idea (there are some other places in the qcow2 code which use NEVER for the same reason). So, in this case, I think NEVER is good. > Why not punch holes in the file when throwing out a failed too-small > new table, or when cleaning up the old table once the new table is good? > >> + } >> + >> + new_reftable_offset = qcow2_alloc_clusters(bs, new_reftable_size * >> + sizeof(uint64_t)); >> + if (new_reftable_offset < 0) { >> + error_setg_errno(errp, -new_reftable_offset, >> + "Failed to allocate the new reftable"); >> + ret = new_reftable_offset; >> + goto done; >> + } >> + allocated_reftable_size = new_reftable_size; >> + >> + new_allocation = true; > This assignment is dead code (it already occurs inside an 'if > (new_allocation)' condition). Right. Though I somehow like its explicitness... I'll remove it. >> + } >> + } while (new_allocation); >> + >> + /* Second, write the new refblocks */ >> + new_allocation = false; > This assignment is dead code (it can only be reached if the earlier do > loop ended, which is only possible when no allocations are recorded). Right again. >> + ret = walk_over_reftable(bs, &new_reftable, &new_reftable_index, >> + &new_reftable_size, new_refblock, >> + new_refblock_size, new_refcount_bits, >> + &flush_refblock, &new_allocation, new_set_refcount, >> + status_cb, cb_opaque, walk_index, walk_index + 1, >> + errp); >> + if (ret < 0) { >> + goto done; >> + } >> + assert(!new_allocation); >> + > Correct. > >> +done: >> + if (new_reftable) { >> + /* On success, new_reftable actually points to the old reftable (and >> + * new_reftable_size is the old reftable's size); but that is just >> + * fine */ >> + for (i = 0; i < new_reftable_size; i++) { >> + uint64_t offset = new_reftable[i] & REFT_OFFSET_MASK; >> + if (offset) { >> + qcow2_free_clusters(bs, offset, s->cluster_size, >> + QCOW2_DISCARD_NEVER); > Again, why the QCOW2_DISCARD_NEVER policy? Here, I have nothing to justify it. I'll use QCOW2_DISCARD_OTHER in v3. > Fix the MIN vs. MAX bug, and the two dead assignment statements, and you > can have: > > Reviewed-by: Eric Blake I'll also use QCOW2_DISCARD_OTHER for freeing the refblocks and the reftable after the "done" label, if you're fine with that. Once again, thanks a lot! Max