From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59492) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVv29-0004Gw-D7 for qemu-devel@nongnu.org; Fri, 14 Jul 2017 03:29:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVv28-0003Jq-EJ for qemu-devel@nongnu.org; Fri, 14 Jul 2017 03:29:05 -0400 Date: Fri, 14 Jul 2017 09:28:52 +0200 From: Kevin Wolf Message-ID: <20170714072852.GC4762@noname.redhat.com> References: <20170712114700.20008-1-pbutsykin@virtuozzo.com> <20170712114700.20008-4-pbutsykin@virtuozzo.com> <20170712145251.GE4917@noname.str.redhat.com> <20170713084126.GA4139@noname.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Subject: Re: [Qemu-devel] [PATCH v5 3/4] qcow2: add shrink image support List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Pavel Butsykin Cc: Max Reitz , qemu-block@nongnu.org, qemu-devel@nongnu.org, eblake@redhat.com, armbru@redhat.com, den@openvz.org Am 13.07.2017 um 19:28 hat Pavel Butsykin geschrieben: > On 13.07.2017 17:36, Max Reitz wrote: > >On 2017-07-13 10:41, Kevin Wolf wrote: > >>Am 12.07.2017 um 18:58 hat Max Reitz geschrieben: > >>>On 2017-07-12 16:52, Kevin Wolf wrote: > >>>>Am 12.07.2017 um 13:46 hat Pavel Butsykin geschrieben: > >>>>>This patch add shrinking of the image file for qcow2. As a result, this allows > >>>>>us to reduce the virtual image size and free up space on the disk without > >>>>>copying the image. Image can be fragmented and shrink is done by punching holes > >>>>>in the image file. > >>>>> > >>>>>Signed-off-by: Pavel Butsykin > >>>>>Reviewed-by: Max Reitz > >>>>>--- > >>>>> block/qcow2-cluster.c | 40 ++++++++++++++++++ > >>>>> block/qcow2-refcount.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++++ > >>>>> block/qcow2.c | 43 +++++++++++++++---- > >>>>> block/qcow2.h | 14 +++++++ > >>>>> qapi/block-core.json | 3 +- > >>>>> 5 files changed, 200 insertions(+), 10 deletions(-) > >>>>> > >>>>>diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c > >>>>>index f06c08f64c..518429c64b 100644 > >>>>>--- a/block/qcow2-cluster.c > >>>>>+++ b/block/qcow2-cluster.c > >>>>>@@ -32,6 +32,46 @@ > >>>>> #include "qemu/bswap.h" > >>>>> #include "trace.h" > >>>>>+int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t exact_size) > >>>>>+{ > >>>>>+ BDRVQcow2State *s = bs->opaque; > >>>>>+ int new_l1_size, i, ret; > >>>>>+ > >>>>>+ if (exact_size >= s->l1_size) { > >>>>>+ return 0; > >>>>>+ } > >>>>>+ > >>>>>+ new_l1_size = exact_size; > >>>>>+ > >>>>>+#ifdef DEBUG_ALLOC2 > >>>>>+ fprintf(stderr, "shrink l1_table from %d to %d\n", s->l1_size, new_l1_size); > >>>>>+#endif > >>>>>+ > >>>>>+ BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_WRITE_TABLE); > >>>>>+ ret = bdrv_pwrite_zeroes(bs->file, s->l1_table_offset + > >>>>>+ sizeof(uint64_t) * new_l1_size, > >>>>>+ (s->l1_size - new_l1_size) * sizeof(uint64_t), 0); > >>>>>+ if (ret < 0) { > >>>>>+ return ret; > >>>>>+ } > >>>>>+ > >>>>>+ ret = bdrv_flush(bs->file->bs); > >>>>>+ if (ret < 0) { > >>>>>+ return ret; > >>>>>+ } > >>>> > >>>>If we have an error here (or after a partial bdrv_pwrite_zeroes()), we > >>>>have entries zeroed out on disk, but in memory we still have the > >>>>original L1 table. > >>>> > >>>>Should the in-memory L1 table be zeroed first? Then we can't > >>>>accidentally reuse stale entries, but would have to allocate new ones, > >>>>which would get on-disk state and in-memory state back in sync again. > >>> > >>>Yes, I thought of the same. But this implies that the allocation is > >>>able to modify the L1 table, and I find that unlikely if that > >>>bdrv_flush() failed already... > >>> > >>>So I concluded not to have an opinion on which order is better. > >> > >>Well, let me ask the other way round: Is there any disadvantage in first > >>zeroing the in-memory table and then writing to the image? > > > >I was informed that the code would be harder to write. :-) > > > >>If I have a choice between "always safe" and "not completely safe, but I > >>think it's unlikely to happen", especially in image formats, then I will > >>certainly take the "always safe". > >> > >>>>>+ BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_FREE_L2_CLUSTERS); > >>>>>+ for (i = s->l1_size - 1; i > new_l1_size - 1; i--) { > >>>>>+ if ((s->l1_table[i] & L1E_OFFSET_MASK) == 0) { > >>>>>+ continue; > >>>>>+ } > >>>>>+ qcow2_free_clusters(bs, s->l1_table[i] & L1E_OFFSET_MASK, > >>>>>+ s->cluster_size, QCOW2_DISCARD_ALWAYS); > >>>>>+ s->l1_table[i] = 0; > >>>>>+ } > >>>>>+ return 0; > >>>>>+} > >>>>>+ > >>>>> int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size, > >>>>> bool exact_size) > >>>>> { > >>>> > >>>>I haven't checked qcow2_shrink_reftable() for similar kinds of problems, > >>>>I hope Max has. > >>> > >>>Well, it's exactly the same there. > >> > >>Ok, so I'll object to this code without really having looked at it. > >I won't object to your objection. O:-) > > Kevin, > > Can you help me to reduce the number of patch-set versions? :) > And look at the rest part of the series, thanks! Patches 1 and 2 looked good to me. I didn't have a look at the test case, but if it passes and Max reviewed it, it can't be too bad. ;-) Kevin