From: Max Reitz <mreitz@redhat.com>
To: Pavel Butsykin <pbutsykin@virtuozzo.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: jsnow@redhat.com, kwolf@redhat.com, eblake@redhat.com,
	armbru@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v7 3/4] qcow2: add shrink image support
Date: Sat, 16 Sep 2017 17:29:17 +0200
Message-ID: <9205db19-e16e-55d7-a6bd-0dd7d6e7eae5@redhat.com>
In-Reply-To: <20170817091542.9403-4-pbutsykin@virtuozzo.com>

On 2017-08-17 11:15, Pavel Butsykin wrote:
> This patch adds shrinking of the image file for qcow2. As a result, this allows
> us to reduce the virtual image size and free up space on the disk without
> copying the image. The image can be fragmented, and shrinking is done by
> punching holes in the image file.
> 
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> ---
>  block/qcow2-cluster.c  |  50 +++++++++++++++++++++
>  block/qcow2-refcount.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
>  block/qcow2.c          |  43 ++++++++++++++----
>  block/qcow2.h          |  14 ++++++
>  qapi/block-core.json   |   3 +-
>  5 files changed, 220 insertions(+), 10 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index f06c08f64c..0c7a9a920c 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -32,6 +32,56 @@
>  #include "qemu/bswap.h"
>  #include "trace.h"
>  
> +int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t exact_size)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    int new_l1_size, i, ret;
> +
> +    if (exact_size >= s->l1_size) {
> +        return 0;
> +    }
> +
> +    new_l1_size = exact_size;
> +
> +#ifdef DEBUG_ALLOC2
> +    fprintf(stderr, "shrink l1_table from %d to %d\n", s->l1_size, new_l1_size);
> +#endif
> +
> +    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_WRITE_TABLE);
> +    ret = bdrv_pwrite_zeroes(bs->file, s->l1_table_offset +
> +                                       new_l1_size * sizeof(uint64_t),
> +                             (s->l1_size - new_l1_size) * sizeof(uint64_t), 0);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +
> +    ret = bdrv_flush(bs->file->bs);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +
> +    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_FREE_L2_CLUSTERS);
> +    for (i = s->l1_size - 1; i > new_l1_size - 1; i--) {
> +        if ((s->l1_table[i] & L1E_OFFSET_MASK) == 0) {
> +            continue;
> +        }
> +        qcow2_free_clusters(bs, s->l1_table[i] & L1E_OFFSET_MASK,
> +                            s->cluster_size, QCOW2_DISCARD_ALWAYS);
> +        s->l1_table[i] = 0;
> +    }
> +    return 0;
> +
> +fail:
> +    /*
> +     * If the write in the l1_table failed the image may contain partially
> +     * overwritten the l1_table. In this case would be better to clear the

Grammar nit: this should read "may contain a partially overwritten
l1_table" and "In this case it would be better".

> +     * l1_table in memory to avoid possible image corruption.
> +     */
> +    memset(s->l1_table + exact_size, 0,

Though it doesn't make a functional difference, I'd prefer "new_l1_size"
instead of "exact_size", because you're using new_l1_size everywhere
else (including the line below).
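
To illustrate the point, here is a minimal stand-alone sketch. The helper
name clear_l1_tail() is made up for this mail and is not in the patch, which
keeps the memset() inline in the fail path. It only shows the start pointer
and the length being derived from the same variable:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): zero the in-memory L1
 * entries in [new_l1_size, l1_size). Both the start pointer and the
 * length use new_l1_size, so the two cannot drift apart.
 */
static void clear_l1_tail(uint64_t *l1_table, int l1_size, int new_l1_size)
{
    /* Shrinking only: the new size must lie within the current table. */
    assert(new_l1_size >= 0 && new_l1_size <= l1_size);

    memset(l1_table + new_l1_size, 0,
           (size_t)(l1_size - new_l1_size) * sizeof(uint64_t));
}
```

With a four-entry table and new_l1_size == 2, this would leave entries 0 and
1 untouched and zero entries 2 and 3.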

> +           (s->l1_size - new_l1_size) * sizeof(uint64_t));
> +    return ret;
> +}
> +
>  int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
>                          bool exact_size)
>  {
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 8c17c0e3aa..15af9a795f 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -29,6 +29,7 @@
>  #include "block/qcow2.h"
>  #include "qemu/range.h"
>  #include "qemu/bswap.h"
> +#include "qemu/cutils.h"
>  
>  static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
>  static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
> @@ -3061,3 +3062,122 @@ done:
>      qemu_vfree(new_refblock);
>      return ret;
>  }
> +
> +static int qcow2_discard_refcount_block(BlockDriverState *bs,
> +                                        uint64_t discard_block_offs)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    uint64_t refblock_offs = get_refblock_offset(s, discard_block_offs);
> +    uint64_t cluster_index = discard_block_offs >> s->cluster_bits;
> +    uint32_t block_index = cluster_index & (s->refcount_block_size - 1);
> +    void *refblock;
> +    int ret;
> +
> +    assert(discard_block_offs != 0);
> +
> +    ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
> +                          &refblock);
> +    if (ret < 0) {
> +        return ret;
> +    }
> +
> +    if (s->get_refcount(refblock, block_index) != 1) {
> +        qcow2_signal_corruption(bs, true, -1, -1, "Invalid refcount:"
> +                                " refblock offset %#" PRIx64
> +                                ", reftable index %u"
> +                                ", block offset %#" PRIx64
> +                                ", refcount %#" PRIx64,
> +                                refblock_offs,
> +                                offset_to_reftable_index(s, discard_block_offs),
> +                                discard_block_offs,
> +                                s->get_refcount(refblock, block_index));
> +        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
> +        return -EINVAL;
> +    }
> +    s->set_refcount(refblock, block_index, 0);
> +
> +    qcow2_cache_entry_mark_dirty(bs, s->refcount_block_cache, refblock);
> +
> +    qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
> +
> +    if (cluster_index < s->free_cluster_index) {
> +        s->free_cluster_index = cluster_index;
> +    }
> +
> +    refblock = qcow2_cache_is_table_offset(bs, s->refcount_block_cache,
> +                                           discard_block_offs);
> +    if (refblock) {
> +        /* discard refblock from the cache if refblock is cached */
> +        qcow2_cache_discard(bs, s->refcount_block_cache, refblock);
> +    }
> +    update_refcount_discard(bs, discard_block_offs, s->cluster_size);
> +
> +    return 0;
> +}
> +
> +int qcow2_shrink_reftable(BlockDriverState *bs)
> +{
> +    BDRVQcow2State *s = bs->opaque;
> +    uint64_t *reftable_tmp =
> +        g_malloc(s->refcount_table_size * sizeof(uint64_t));
> +    int i, ret;
> +
> +    for (i = 0; i < s->refcount_table_size; i++) {
> +        int64_t refblock_offs = s->refcount_table[i] & REFT_OFFSET_MASK;
> +        void *refblock;
> +        bool unused_block;
> +
> +        if (refblock_offs == 0) {
> +            reftable_tmp[i] = 0;
> +            continue;
> +        }
> +        ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
> +                              &refblock);
> +        if (ret < 0) {
> +            goto out;
> +        }
> +
> +        /* the refblock has own reference */
> +        if (i == offset_to_reftable_index(s, refblock_offs)) {
> +            uint64_t block_index = (refblock_offs >> s->cluster_bits) &
> +                                   (s->refcount_block_size - 1);
> +            uint64_t refcount = s->get_refcount(refblock, block_index);
> +
> +            s->set_refcount(refblock, block_index, 0);
> +
> +            unused_block = buffer_is_zero(refblock, s->cluster_size);
> +
> +            s->set_refcount(refblock, block_index, refcount);
> +        } else {
> +            unused_block = buffer_is_zero(refblock, s->cluster_size);
> +        }
> +        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
> +
> +        reftable_tmp[i] = unused_block ? 0 : cpu_to_be64(s->refcount_table[i]);
> +    }
> +
> +    ret = bdrv_pwrite_sync(bs->file, s->refcount_table_offset, reftable_tmp,
> +                           s->refcount_table_size * sizeof(uint64_t));
> +    /*
> +     * If the write in the reftable failed the image may contain partially
> +     * overwritten the reftable. In this case would be better to clear the

Same nit as above: "may contain a partially overwritten reftable" and
"In this case it would be better".

With these changes:

Reviewed-by: Max Reitz <mreitz@redhat.com>

> +     * reftable in memory to avoid possible image corruption.
> +     */
> +    for (i = 0; i < s->refcount_table_size; i++) {
> +        if (s->refcount_table[i] && !reftable_tmp[i]) {
> +            if (ret == 0) {
> +                ret = qcow2_discard_refcount_block(bs, s->refcount_table[i] &
> +                                                       REFT_OFFSET_MASK);
> +            }
> +            s->refcount_table[i] = 0;
> +        }
> +    }
> +
> +    if (!s->cache_discards) {
> +        qcow2_process_discards(bs, ret);
> +    }
> +
> +out:
> +    g_free(reftable_tmp);
> +    return ret;
> +}
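
For readers following the thread, here is a rough stand-alone sketch of the
"is this refblock unused apart from its own reference?" check done in
qcow2_shrink_reftable(). It is not the patch's code. It assumes 16-bit
refcount entries, the function names are invented for illustration, and
instead of temporarily clearing the self-reference and calling
buffer_is_zero() as the patch does, it simply skips that entry while
scanning:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: index of a cluster's refcount entry inside its
 * refblock. refblock_entries must be a power of two, mirroring the
 * "& (s->refcount_block_size - 1)" masking in the patch.
 */
static uint64_t refblock_entry_index(uint64_t offset, unsigned cluster_bits,
                                     uint64_t refblock_entries)
{
    assert(refblock_entries != 0 &&
           (refblock_entries & (refblock_entries - 1)) == 0);
    return (offset >> cluster_bits) & (refblock_entries - 1);
}

/*
 * Hypothetical helper: true if every entry except the one at skip_index
 * is zero. Pass skip_index >= nb_entries to skip nothing (the case where
 * the refblock does not hold its own reference).
 */
static bool refblock_is_unused(const uint16_t *refblock, size_t nb_entries,
                               size_t skip_index)
{
    for (size_t i = 0; i < nb_entries; i++) {
        if (i != skip_index && refblock[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A refblock whose only non-zero entry is its own refcount counts as unused
here, which is the condition under which the patch drops the corresponding
reftable entry.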



