qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2
@ 2017-09-18 12:42 Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize Pavel Butsykin
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Pavel Butsykin @ 2017-09-18 12:42 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbutsykin, jsnow, kwolf, mreitz, eblake, armbru, den

This series adds support for shrinking qcow2 image files. This lets us reduce
the virtual image size and free up space on the disk without copying the image.
The image may be fragmented, and shrinking is done by punching holes in the
image file.

# ./qemu-img create -f qcow2 image.qcow2 4G
Formatting 'image.qcow2', fmt=qcow2 size=4294967296 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# ./qemu-io -c "write -P 0x22 0 1G" image.qcow2
wrote 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0:00:04.59 (222.886 MiB/sec and 0.2177 ops/sec)

# ./qemu-img resize image.qcow2 512M
warning: qemu-img: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
error: qemu-img: Use the --shrink option to perform a shrink operation.

# ./qemu-img resize --shrink image.qcow2 128M
Image resized.

# ./qemu-img info image.qcow2
image: image.qcow2
file format: qcow2
virtual size: 128M (134217728 bytes)
disk size: 128M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

# du -h image.qcow2
129M    image.qcow2

Changes from v1:
- add --shrink flag for qemu-img resize
- add qcow2_cache_discard
- simplify qcow2_shrink_l1_table() to reduce the likelihood of image corruption
- add new qemu-iotests for shrinking images

Changes from v2:
- replace qprintf() with error_report() (1)
- rewrite warning messages (1)
- enforce --shrink flag for all formats except raw (1)
- split qcow2_cache_discard() (2)
- minor fixes according to comments (3)
- rewrite the last part of qcow2_shrink_reftable() to avoid
  qcow2_free_clusters() calls inside (3)
- improve test for shrinking image (4)

Changes from v3:
- rebase on "Implement a warning_report function" Alistair's patch-set (1)
- spelling fixes (1)
- fix the man page according to the discussion (1)
- add a call to qcow2_signal_corruption() in case of image corruption (3)

Changes from v4:
- rebase on https://github.com/XanClic/qemu/commits/block Max's block branch

Changes from v5:
- the condition refcount == 0 should be enough to evict the l2/refcount cluster
  from the cache (2)
- overwrite the l1/refcount table in memory with zeros, even if overwriting the
  l1/refcount table on disk has failed (3)
- replace g_try_malloc() with g_malloc() for the reftable_tmp allocation (3)

Changes from v6:
- rebase on master 1f29673387

Changes from v7:
- fix iotest 106 (1)
- minor fixes according to comments (2, 3)
- add documentation of the new enum members (3)
- add r-b's by Max and John

Pavel Butsykin (4):
  qemu-img: add --shrink flag for resize
  qcow2: add qcow2_cache_discard
  qcow2: add shrink image support
  qemu-iotests: add shrinking image test

 block/qcow2-cache.c        |  26 +++++++
 block/qcow2-cluster.c      |  50 +++++++++++++
 block/qcow2-refcount.c     | 140 ++++++++++++++++++++++++++++++++++++-
 block/qcow2.c              |  43 +++++++++---
 block/qcow2.h              |  17 +++++
 qapi/block-core.json       |   8 ++-
 qemu-img-cmds.hx           |   4 +-
 qemu-img.c                 |  23 ++++++
 qemu-img.texi              |   6 +-
 tests/qemu-iotests/102     |   4 +-
 tests/qemu-iotests/106     |   2 +-
 tests/qemu-iotests/163     | 170 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/163.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 14 files changed, 481 insertions(+), 18 deletions(-)
 create mode 100644 tests/qemu-iotests/163
 create mode 100644 tests/qemu-iotests/163.out

-- 
2.14.1


* [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize
  2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
@ 2017-09-18 12:42 ` Pavel Butsykin
  2017-09-18 18:12   ` Max Reitz
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 2/4] qcow2: add qcow2_cache_discard Pavel Butsykin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Pavel Butsykin @ 2017-09-18 12:42 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbutsykin, jsnow, kwolf, mreitz, eblake, armbru, den

The flag is an additional precaution against data loss. Perhaps in the future
shrinking without this flag will be blocked for all formats, but for now we
need to maintain compatibility with raw.
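
As a rough illustration of this policy (a sketch, not the qemu-img code
itself), the decision can be written out in Python. The helper name
check_shrink() and its return value are hypothetical; only the raw/non-raw
distinction and the gist of the messages are taken from this patch.

```python
def check_shrink(fmt, current_size, new_size, shrink_flag):
    """Return (allowed, messages) for a resize request.

    Hypothetical helper mirroring the policy of this patch: shrinking
    without --shrink is an error for every format except raw, where it
    only produces warnings.
    """
    messages = []
    if new_size >= current_size or shrink_flag:
        # Growing, keeping the size, or the user acknowledged the data loss.
        return True, messages

    messages.append("warning: shrinking deletes all data beyond the new end")
    if fmt != "raw":
        messages.append("error: use the --shrink option to shrink")
        return False, messages

    # raw keeps working for compatibility, but the user is told that this
    # may change in a future version.
    messages.append("warning: future versions may require --shrink")
    return True, messages
```

With these assumptions, a qcow2 shrink without the flag is refused, while the
same request on a raw image goes through with two warnings.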

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 qemu-img-cmds.hx       |  4 ++--
 qemu-img.c             | 23 +++++++++++++++++++++++
 qemu-img.texi          |  6 +++++-
 tests/qemu-iotests/102 |  4 ++--
 tests/qemu-iotests/106 |  2 +-
 5 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index b47d409665..2fe31893cf 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -89,9 +89,9 @@ STEXI
 ETEXI
 
 DEF("resize", img_resize,
-    "resize [--object objectdef] [--image-opts] [-q] filename [+ | -]size")
+    "resize [--object objectdef] [--image-opts] [-q] [--shrink] filename [+ | -]size")
 STEXI
-@item resize [--object @var{objectdef}] [--image-opts] [-q] @var{filename} [+ | -]@var{size}
+@item resize [--object @var{objectdef}] [--image-opts] [-q] [--shrink] @var{filename} [+ | -]@var{size}
 ETEXI
 
 STEXI
diff --git a/qemu-img.c b/qemu-img.c
index 56ef49e214..b7b2386cbd 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -65,6 +65,7 @@ enum {
     OPTION_TARGET_IMAGE_OPTS = 263,
     OPTION_SIZE = 264,
     OPTION_PREALLOCATION = 265,
+    OPTION_SHRINK = 266,
 };
 
 typedef enum OutputFormat {
@@ -3437,6 +3438,7 @@ static int img_resize(int argc, char **argv)
         },
     };
     bool image_opts = false;
+    bool shrink = false;
 
     /* Remove size from argv manually so that negative numbers are not treated
      * as options by getopt. */
@@ -3455,6 +3457,7 @@ static int img_resize(int argc, char **argv)
             {"object", required_argument, 0, OPTION_OBJECT},
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {"preallocation", required_argument, 0, OPTION_PREALLOCATION},
+            {"shrink", no_argument, 0, OPTION_SHRINK},
             {0, 0, 0, 0}
         };
         c = getopt_long(argc, argv, ":f:hq",
@@ -3498,6 +3501,9 @@ static int img_resize(int argc, char **argv)
                 return 1;
             }
             break;
+        case OPTION_SHRINK:
+            shrink = true;
+            break;
         }
     }
     if (optind != argc - 1) {
@@ -3571,6 +3577,23 @@ static int img_resize(int argc, char **argv)
         goto out;
     }
 
+    if (total_size < current_size && !shrink) {
+        warn_report("Shrinking an image will delete all data beyond the "
+                    "shrunken image's end. Before performing such an "
+                    "operation, make sure there is no important data there.");
+
+        if (g_strcmp0(bdrv_get_format_name(blk_bs(blk)), "raw") != 0) {
+            error_report(
+              "Use the --shrink option to perform a shrink operation.");
+            ret = -1;
+            goto out;
+        } else {
+            warn_report("Using the --shrink option will suppress this message. "
+                        "Note that future versions of qemu-img may refuse to "
+                        "shrink images without this option.");
+        }
+    }
+
     ret = blk_truncate(blk, total_size, prealloc, &err);
     if (!ret) {
         qprintf(quiet, "Image resized.\n");
diff --git a/qemu-img.texi b/qemu-img.texi
index 72dabd6b3e..ea5d04b873 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -536,7 +536,7 @@ qemu-img rebase -b base.img diff.qcow2
 At this point, @code{modified.img} can be discarded, since
 @code{base.img + diff.qcow2} contains the same information.
 
-@item resize [--preallocation=@var{prealloc}] @var{filename} [+ | -]@var{size}
+@item resize [--shrink] [--preallocation=@var{prealloc}] @var{filename} [+ | -]@var{size}
 
 Change the disk image as if it had been created with @var{size}.
 
@@ -544,6 +544,10 @@ Before using this command to shrink a disk image, you MUST use file system and
 partitioning tools inside the VM to reduce allocated file systems and partition
 sizes accordingly.  Failure to do so will result in data loss!
 
+When shrinking images, the @code{--shrink} option must be given. This informs
+qemu-img that the user acknowledges all loss of data beyond the truncated
+image's end.
+
 After using this command to grow a disk image, you must use file system and
 partitioning tools inside the VM to actually begin using the new space on the
 device.
diff --git a/tests/qemu-iotests/102 b/tests/qemu-iotests/102
index 87db1bb1bf..d7ad8d9840 100755
--- a/tests/qemu-iotests/102
+++ b/tests/qemu-iotests/102
@@ -54,7 +54,7 @@ _make_test_img $IMG_SIZE
 $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
 # Remove data cluster from image (first cluster: image header, second: reftable,
 # third: refblock, fourth: L1 table, fifth: L2 table)
-$QEMU_IMG resize -f raw "$TEST_IMG" $((5 * 64 * 1024))
+$QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
 
 $QEMU_IO -c map "$TEST_IMG"
 $QEMU_IMG map "$TEST_IMG"
@@ -69,7 +69,7 @@ $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
 
 qemu_comm_method=monitor _launch_qemu -drive if=none,file="$TEST_IMG",id=drv0
 
-$QEMU_IMG resize -f raw "$TEST_IMG" $((5 * 64 * 1024))
+$QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
 
 _send_qemu_cmd $QEMU_HANDLE 'qemu-io drv0 map' 'allocated' \
     | sed -e 's/^(qemu).*qemu-io drv0 map...$/(qemu) qemu-io drv0 map/'
diff --git a/tests/qemu-iotests/106 b/tests/qemu-iotests/106
index 32649578fb..bfe71f4e60 100755
--- a/tests/qemu-iotests/106
+++ b/tests/qemu-iotests/106
@@ -83,7 +83,7 @@ echo '=== Testing image shrinking ==='
 for growth_mode in falloc full off; do
     echo
     echo "--- growth_mode=$growth_mode ---"
-    $QEMU_IMG resize -f "$IMGFMT" --preallocation=$growth_mode "$TEST_IMG" -${GROWTH_SIZE}K
+    $QEMU_IMG resize -f "$IMGFMT" --shrink --preallocation=$growth_mode "$TEST_IMG" -${GROWTH_SIZE}K
 done
 
 # success, all done
-- 
2.14.1


* [Qemu-devel] [PATCH v8 2/4] qcow2: add qcow2_cache_discard
  2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize Pavel Butsykin
@ 2017-09-18 12:42 ` Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 3/4] qcow2: add shrink image support Pavel Butsykin
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Butsykin @ 2017-09-18 12:42 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbutsykin, jsnow, kwolf, mreitz, eblake, armbru, den

Whenever l2/refcount table clusters are discarded from the file, we can
automatically drop the now-unnecessary contents of the cache tables. This
reduces the chance of evicting useful cache data and eliminates inconsistencies
between the data in the cache and the data in the file.
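
The idea can be sketched with a toy cache in Python. This is only a model
under simplifying assumptions: ToyTableCache and on_refcount_zero() are
hypothetical names, and the real Qcow2Cache uses a fixed array of entries
with LRU counters rather than a dictionary.

```python
class ToyTableCache:
    """Toy stand-in for a metadata cache keyed by host cluster offset."""

    def __init__(self):
        # offset -> {"table": bytearray, "ref": int, "dirty": bool}
        self.entries = {}

    def put(self, offset, table, dirty=False):
        self.entries[offset] = {"table": table, "ref": 0, "dirty": dirty}

    def lookup(self, offset):
        """Return the cached table stored at this offset, or None."""
        entry = self.entries.get(offset)
        return entry["table"] if entry else None

    def discard(self, offset):
        """Drop the entry without writing it back; it must not be in use."""
        entry = self.entries[offset]
        assert entry["ref"] == 0
        del self.entries[offset]


def on_refcount_zero(offset, refblock_cache, l2_cache):
    """When a cluster is freed, forget any cached table that lived in it."""
    for cache in (refblock_cache, l2_cache):
        if cache.lookup(offset) is not None:
            cache.discard(offset)
```

Dropping the entry instead of flushing it means a dirty table for a freed
cluster is never written back over space that may since have been reused.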

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block/qcow2-cache.c    | 26 ++++++++++++++++++++++++++
 block/qcow2-refcount.c | 20 ++++++++++++++++++--
 block/qcow2.h          |  3 +++
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 1d25147392..75746a7f43 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -411,3 +411,29 @@ void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
     assert(c->entries[i].offset != 0);
     c->entries[i].dirty = true;
 }
+
+void *qcow2_cache_is_table_offset(BlockDriverState *bs, Qcow2Cache *c,
+                                  uint64_t offset)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].offset == offset) {
+            return qcow2_cache_get_table_addr(bs, c, i);
+        }
+    }
+    return NULL;
+}
+
+void qcow2_cache_discard(BlockDriverState *bs, Qcow2Cache *c, void *table)
+{
+    int i = qcow2_cache_get_table_idx(bs, c, table);
+
+    assert(c->entries[i].ref == 0);
+
+    c->entries[i].offset = 0;
+    c->entries[i].lru_counter = 0;
+    c->entries[i].dirty = false;
+
+    qcow2_cache_table_release(bs, c, i, 1);
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 168fc32e7b..8c17c0e3aa 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -861,8 +861,24 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         }
         s->set_refcount(refcount_block, block_index, refcount);
 
-        if (refcount == 0 && s->discard_passthrough[type]) {
-            update_refcount_discard(bs, cluster_offset, s->cluster_size);
+        if (refcount == 0) {
+            void *table;
+
+            table = qcow2_cache_is_table_offset(bs, s->refcount_block_cache,
+                                                offset);
+            if (table != NULL) {
+                qcow2_cache_put(bs, s->refcount_block_cache, &refcount_block);
+                qcow2_cache_discard(bs, s->refcount_block_cache, table);
+            }
+
+            table = qcow2_cache_is_table_offset(bs, s->l2_table_cache, offset);
+            if (table != NULL) {
+                qcow2_cache_discard(bs, s->l2_table_cache, table);
+            }
+
+            if (s->discard_passthrough[type]) {
+                update_refcount_discard(bs, cluster_offset, s->cluster_size);
+            }
         }
     }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 96a8d43c17..52c374e9ed 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -649,6 +649,9 @@ int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
 int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
     void **table);
 void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
+void *qcow2_cache_is_table_offset(BlockDriverState *bs, Qcow2Cache *c,
+                                  uint64_t offset);
+void qcow2_cache_discard(BlockDriverState *bs, Qcow2Cache *c, void *table);
 
 /* qcow2-bitmap.c functions */
 int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
-- 
2.14.1


* [Qemu-devel] [PATCH v8 3/4] qcow2: add shrink image support
  2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 2/4] qcow2: add qcow2_cache_discard Pavel Butsykin
@ 2017-09-18 12:42 ` Pavel Butsykin
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 4/4] qemu-iotests: add shrinking image test Pavel Butsykin
  2017-09-18 18:22 ` [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Max Reitz
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Butsykin @ 2017-09-18 12:42 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbutsykin, jsnow, kwolf, mreitz, eblake, armbru, den

This patch adds support for shrinking qcow2 image files. This lets us reduce
the virtual image size and free up space on the disk without copying the image.
The image may be fragmented, and shrinking is done by punching holes in the
image file.
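
To get a feel for how many L1 entries a shrink can release, the arithmetic
can be sketched in Python. This assumes the usual qcow2 layout with 8-byte
table entries and, by default, 64 KiB clusters as in the cover letter
example; l1_entries_for() is a hypothetical helper, not a QEMU function.

```python
ENTRY_BITS = 3  # each L1/L2 entry is 8 bytes


def l1_entries_for(virtual_size, cluster_bits=16):
    """Number of L1 entries needed to map virtual_size bytes."""
    l2_bits = cluster_bits - ENTRY_BITS   # log2(entries per L2 table)
    shift = cluster_bits + l2_bits        # log2(bytes mapped by one L1 entry)
    return (virtual_size + (1 << shift) - 1) >> shift


GiB = 1 << 30
MiB = 1 << 20

old_l1 = l1_entries_for(4 * GiB)     # size before the shrink
new_l1 = l1_entries_for(128 * MiB)   # size after the shrink
print(old_l1, new_l1, old_l1 - new_l1)
```

Under these assumptions one L1 entry maps 512 MiB, so going from 4G to 128M
leaves one used entry and seven entries whose L2 tables can be freed.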

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block/qcow2-cluster.c  |  50 +++++++++++++++++++++
 block/qcow2-refcount.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c          |  43 ++++++++++++++----
 block/qcow2.h          |  14 ++++++
 qapi/block-core.json   |   8 +++-
 5 files changed, 225 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 0d4824993c..d2518d1893 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -32,6 +32,56 @@
 #include "qemu/bswap.h"
 #include "trace.h"
 
+int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t exact_size)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int new_l1_size, i, ret;
+
+    if (exact_size >= s->l1_size) {
+        return 0;
+    }
+
+    new_l1_size = exact_size;
+
+#ifdef DEBUG_ALLOC2
+    fprintf(stderr, "shrink l1_table from %d to %d\n", s->l1_size, new_l1_size);
+#endif
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_WRITE_TABLE);
+    ret = bdrv_pwrite_zeroes(bs->file, s->l1_table_offset +
+                                       new_l1_size * sizeof(uint64_t),
+                             (s->l1_size - new_l1_size) * sizeof(uint64_t), 0);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    ret = bdrv_flush(bs->file->bs);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_FREE_L2_CLUSTERS);
+    for (i = s->l1_size - 1; i > new_l1_size - 1; i--) {
+        if ((s->l1_table[i] & L1E_OFFSET_MASK) == 0) {
+            continue;
+        }
+        qcow2_free_clusters(bs, s->l1_table[i] & L1E_OFFSET_MASK,
+                            s->cluster_size, QCOW2_DISCARD_ALWAYS);
+        s->l1_table[i] = 0;
+    }
+    return 0;
+
+fail:
+    /*
+     * If the write in the l1_table failed the image may contain a partially
+     * overwritten l1_table. In this case it would be better to clear the
+     * l1_table in memory to avoid possible image corruption.
+     */
+    memset(s->l1_table + new_l1_size, 0,
+           (s->l1_size - new_l1_size) * sizeof(uint64_t));
+    return ret;
+}
+
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size)
 {
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 8c17c0e3aa..88d5a3f1ad 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -29,6 +29,7 @@
 #include "block/qcow2.h"
 #include "qemu/range.h"
 #include "qemu/bswap.h"
+#include "qemu/cutils.h"
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -3061,3 +3062,122 @@ done:
     qemu_vfree(new_refblock);
     return ret;
 }
+
+static int qcow2_discard_refcount_block(BlockDriverState *bs,
+                                        uint64_t discard_block_offs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t refblock_offs = get_refblock_offset(s, discard_block_offs);
+    uint64_t cluster_index = discard_block_offs >> s->cluster_bits;
+    uint32_t block_index = cluster_index & (s->refcount_block_size - 1);
+    void *refblock;
+    int ret;
+
+    assert(discard_block_offs != 0);
+
+    ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
+                          &refblock);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (s->get_refcount(refblock, block_index) != 1) {
+        qcow2_signal_corruption(bs, true, -1, -1, "Invalid refcount:"
+                                " refblock offset %#" PRIx64
+                                ", reftable index %u"
+                                ", block offset %#" PRIx64
+                                ", refcount %#" PRIx64,
+                                refblock_offs,
+                                offset_to_reftable_index(s, discard_block_offs),
+                                discard_block_offs,
+                                s->get_refcount(refblock, block_index));
+        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+        return -EINVAL;
+    }
+    s->set_refcount(refblock, block_index, 0);
+
+    qcow2_cache_entry_mark_dirty(bs, s->refcount_block_cache, refblock);
+
+    qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+
+    if (cluster_index < s->free_cluster_index) {
+        s->free_cluster_index = cluster_index;
+    }
+
+    refblock = qcow2_cache_is_table_offset(bs, s->refcount_block_cache,
+                                           discard_block_offs);
+    if (refblock) {
+        /* discard refblock from the cache if refblock is cached */
+        qcow2_cache_discard(bs, s->refcount_block_cache, refblock);
+    }
+    update_refcount_discard(bs, discard_block_offs, s->cluster_size);
+
+    return 0;
+}
+
+int qcow2_shrink_reftable(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t *reftable_tmp =
+        g_malloc(s->refcount_table_size * sizeof(uint64_t));
+    int i, ret;
+
+    for (i = 0; i < s->refcount_table_size; i++) {
+        int64_t refblock_offs = s->refcount_table[i] & REFT_OFFSET_MASK;
+        void *refblock;
+        bool unused_block;
+
+        if (refblock_offs == 0) {
+            reftable_tmp[i] = 0;
+            continue;
+        }
+        ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
+                              &refblock);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* the refblock has own reference */
+        if (i == offset_to_reftable_index(s, refblock_offs)) {
+            uint64_t block_index = (refblock_offs >> s->cluster_bits) &
+                                   (s->refcount_block_size - 1);
+            uint64_t refcount = s->get_refcount(refblock, block_index);
+
+            s->set_refcount(refblock, block_index, 0);
+
+            unused_block = buffer_is_zero(refblock, s->cluster_size);
+
+            s->set_refcount(refblock, block_index, refcount);
+        } else {
+            unused_block = buffer_is_zero(refblock, s->cluster_size);
+        }
+        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+
+        reftable_tmp[i] = unused_block ? 0 : cpu_to_be64(s->refcount_table[i]);
+    }
+
+    ret = bdrv_pwrite_sync(bs->file, s->refcount_table_offset, reftable_tmp,
+                           s->refcount_table_size * sizeof(uint64_t));
+    /*
+     * If the write in the reftable failed the image may contain a partially
+     * overwritten reftable. In this case it would be better to clear the
+     * reftable in memory to avoid possible image corruption.
+     */
+    for (i = 0; i < s->refcount_table_size; i++) {
+        if (s->refcount_table[i] && !reftable_tmp[i]) {
+            if (ret == 0) {
+                ret = qcow2_discard_refcount_block(bs, s->refcount_table[i] &
+                                                       REFT_OFFSET_MASK);
+            }
+            s->refcount_table[i] = 0;
+        }
+    }
+
+    if (!s->cache_discards) {
+        qcow2_process_discards(bs, ret);
+    }
+
+out:
+    g_free(reftable_tmp);
+    return ret;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index a3679c69e8..1fa9492499 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3105,18 +3105,43 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
     }
 
     old_length = bs->total_sectors * 512;
+    new_l1_size = size_to_l1(s, offset);
 
-    /* shrinking is currently not supported */
     if (offset < old_length) {
-        error_setg(errp, "qcow2 doesn't support shrinking images yet");
-        return -ENOTSUP;
-    }
+        if (prealloc != PREALLOC_MODE_OFF) {
+            error_setg(errp,
+                       "Preallocation can't be used for shrinking an image");
+            return -EINVAL;
+        }
 
-    new_l1_size = size_to_l1(s, offset);
-    ret = qcow2_grow_l1_table(bs, new_l1_size, true);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "Failed to grow the L1 table");
-        return ret;
+        ret = qcow2_cluster_discard(bs, ROUND_UP(offset, s->cluster_size),
+                                    old_length - ROUND_UP(offset,
+                                                          s->cluster_size),
+                                    QCOW2_DISCARD_ALWAYS, true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to discard cropped clusters");
+            return ret;
+        }
+
+        ret = qcow2_shrink_l1_table(bs, new_l1_size);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "Failed to reduce the number of L2 tables");
+            return ret;
+        }
+
+        ret = qcow2_shrink_reftable(bs);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "Failed to discard unused refblocks");
+            return ret;
+        }
+    } else {
+        ret = qcow2_grow_l1_table(bs, new_l1_size, true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to grow the L1 table");
+            return ret;
+        }
     }
 
     switch (prealloc) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 52c374e9ed..5a289a81e2 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -521,6 +521,18 @@ static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
     return r1 > r2 ? r1 - r2 : r2 - r1;
 }
 
+static inline
+uint32_t offset_to_reftable_index(BDRVQcow2State *s, uint64_t offset)
+{
+    return offset >> (s->refcount_block_bits + s->cluster_bits);
+}
+
+static inline uint64_t get_refblock_offset(BDRVQcow2State *s, uint64_t offset)
+{
+    uint32_t index = offset_to_reftable_index(s, offset);
+    return s->refcount_table[index] & REFT_OFFSET_MASK;
+}
+
 /* qcow2.c functions */
 int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
                   int64_t sector_num, int nb_sectors);
@@ -584,10 +596,12 @@ int qcow2_inc_refcounts_imrt(BlockDriverState *bs, BdrvCheckResult *res,
 int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
                                 BlockDriverAmendStatusCB *status_cb,
                                 void *cb_opaque, Error **errp);
+int qcow2_shrink_reftable(BlockDriverState *bs);
 
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
+int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t max_size);
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 833c602150..c55cd0c8db 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2479,6 +2479,11 @@
 #
 # Trigger events supported by blkdebug.
 #
+# @l1_shrink_write_table:      write zeros to the l1 table to shrink image.
+#                              (since 2.11)
+#
+# @l1_shrink_free_l2_clusters: discard the l2 tables. (since 2.11)
+#
 # Since: 2.9
 ##
 { 'enum': 'BlkdebugEvent', 'prefix': 'BLKDBG',
@@ -2495,7 +2500,8 @@
             'cluster_alloc_bytes', 'cluster_free', 'flush_to_os',
             'flush_to_disk', 'pwritev_rmw_head', 'pwritev_rmw_after_head',
             'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
-            'pwritev_zero', 'pwritev_done', 'empty_image_prepare' ] }
+            'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
+            'l1_shrink_write_table', 'l1_shrink_free_l2_clusters' ] }
 
 ##
 # @BlkdebugInjectErrorOptions:
-- 
2.14.1


* [Qemu-devel] [PATCH v8 4/4] qemu-iotests: add shrinking image test
  2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
                   ` (2 preceding siblings ...)
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 3/4] qcow2: add shrink image support Pavel Butsykin
@ 2017-09-18 12:42 ` Pavel Butsykin
  2017-09-18 18:22 ` [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Max Reitz
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Butsykin @ 2017-09-18 12:42 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: pbutsykin, jsnow, kwolf, mreitz, eblake, armbru, den

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 tests/qemu-iotests/163     | 170 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/163.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 176 insertions(+)
 create mode 100644 tests/qemu-iotests/163
 create mode 100644 tests/qemu-iotests/163.out

diff --git a/tests/qemu-iotests/163 b/tests/qemu-iotests/163
new file mode 100644
index 0000000000..403842354e
--- /dev/null
+++ b/tests/qemu-iotests/163
@@ -0,0 +1,170 @@
+#!/usr/bin/env python
+#
+# Tests for shrinking images
+#
+# Copyright (c) 2016-2017 Parallels International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os, random, iotests, struct, qcow2
+from iotests import qemu_img, qemu_io, image_size
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+check_img = os.path.join(iotests.test_dir, 'check.img')
+
+def size_to_int(str):
+    suff = ['B', 'K', 'M', 'G', 'T']
+    return int(str[:-1]) * 1024**suff.index(str[-1:])
+
+class ShrinkBaseClass(iotests.QMPTestCase):
+    image_len = '128M'
+    shrink_size = '10M'
+    chunk_size = '16M'
+    refcount_bits = '16'
+
+    def __qcow2_check(self, filename):
+        entry_bits = 3
+        entry_size = 1 << entry_bits
+        l1_mask = 0x00fffffffffffe00
+        div_roundup = lambda n, d: (n + d - 1) / d
+
+        def split_by_n(data, n):
+            for x in xrange(0, len(data), n):
+                yield struct.unpack('>Q', data[x:x + n])[0] & l1_mask
+
+        def check_l1_table(h, l1_data):
+            l1_list = list(split_by_n(l1_data, entry_size))
+            real_l1_size = div_roundup(h.size,
+                                       1 << (h.cluster_bits*2 - entry_bits))
+            used, unused = l1_list[:real_l1_size], l1_list[real_l1_size:]
+
+            self.assertTrue(len(used) != 0, "Verifying l1 table content")
+            self.assertFalse(any(unused), "Verifying l1 table content")
+
+        def check_reftable(fd, h, reftable):
+            for offset in split_by_n(reftable, entry_size):
+                if offset != 0:
+                    fd.seek(offset)
+                    cluster = fd.read(1 << h.cluster_bits)
+                    self.assertTrue(any(cluster), "Verifying reftable content")
+
+        with open(filename, "rb") as fd:
+            h = qcow2.QcowHeader(fd)
+
+            fd.seek(h.l1_table_offset)
+            l1_table = fd.read(h.l1_size << entry_bits)
+
+            fd.seek(h.refcount_table_offset)
+            reftable = fd.read(h.refcount_table_clusters << h.cluster_bits)
+
+            check_l1_table(h, l1_table)
+            check_reftable(fd, h, reftable)
+
+    def __raw_check(self, filename):
+        pass
+
+    image_check = {
+        'qcow2' : __qcow2_check,
+        'raw' : __raw_check
+    }
+
+    def setUp(self):
+        if iotests.imgfmt == 'raw':
+            qemu_img('create', '-f', iotests.imgfmt, test_img, self.image_len)
+            qemu_img('create', '-f', iotests.imgfmt, check_img,
+                     self.shrink_size)
+        else:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     '-o', 'cluster_size=' + self.cluster_size +
+                     ',refcount_bits=' + self.refcount_bits,
+                     test_img, self.image_len)
+            qemu_img('create', '-f', iotests.imgfmt,
+                     '-o', 'cluster_size=%s'% self.cluster_size,
+                     check_img, self.shrink_size)
+        qemu_io('-c', 'write -P 0xff 0 ' + self.shrink_size, check_img)
+
+    def tearDown(self):
+        os.remove(test_img)
+        os.remove(check_img)
+
+    def image_verify(self):
+        self.assertEqual(image_size(test_img), image_size(check_img),
+                         "Verifying image size")
+        self.image_check[iotests.imgfmt](self, test_img)
+
+        if iotests.imgfmt == 'raw':
+            return
+        self.assertEqual(qemu_img('check', test_img), 0,
+                         "Verifying image corruption")
+
+    def test_empty_image(self):
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(
+            qemu_io('-c', 'read -P 0x00 %s'%self.shrink_size, test_img),
+            qemu_io('-c', 'read -P 0x00 %s'%self.shrink_size, check_img),
+            "Verifying image content")
+
+        self.image_verify()
+
+    def test_sequential_write(self):
+        for offs in range(0, size_to_int(self.image_len),
+                          size_to_int(self.chunk_size)):
+            qemu_io('-c', 'write -P 0xff %d %s' % (offs, self.chunk_size),
+                    test_img)
+
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(qemu_img("compare", test_img, check_img), 0,
+                         "Verifying image content")
+
+        self.image_verify()
+
+    def test_random_write(self):
+        offs_list = range(0, size_to_int(self.image_len),
+                          size_to_int(self.chunk_size))
+        random.shuffle(offs_list)
+        for offs in offs_list:
+            qemu_io('-c', 'write -P 0xff %d %s' % (offs, self.chunk_size),
+                    test_img)
+
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(qemu_img("compare", test_img, check_img), 0,
+                         "Verifying image content")
+
+        self.image_verify()
+
+class TestShrink512(ShrinkBaseClass):
+    image_len = '3M'
+    shrink_size = '1M'
+    chunk_size = '256K'
+    cluster_size = '512'
+    refcount_bits = '64'
+
+class TestShrink64K(ShrinkBaseClass):
+    cluster_size = '64K'
+
+class TestShrink1M(ShrinkBaseClass):
+    cluster_size = '1M'
+    refcount_bits = '1'
+
+ShrinkBaseClass = None
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/163.out b/tests/qemu-iotests/163.out
new file mode 100644
index 0000000000..dae404e278
--- /dev/null
+++ b/tests/qemu-iotests/163.out
@@ -0,0 +1,5 @@
+.........
+----------------------------------------------------------------------
+Ran 9 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 4bd5017008..82c38b403b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -166,6 +166,7 @@
 159 rw auto quick
 160 rw auto quick
 162 auto quick
+163 rw auto quick
 165 rw auto quick
 170 rw auto quick
 171 rw auto quick
-- 
2.14.1
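
For reference, the L1 check in test 163 depends on how much guest data a single qcow2 L1 entry maps: an L2 table fills one cluster and holds cluster_size / 8 entries, and each of those entries maps one cluster. Below is a minimal sketch of that arithmetic, assuming 8-byte table entries as in the test. The helper names `l1_entry_coverage` and `l1_entries_needed` are hypothetical and not part of the patch.

```python
ENTRY_BITS = 3  # table entries are 8 bytes wide (1 << 3)


def l1_entry_coverage(cluster_bits):
    """Return the number of guest bytes mapped by one L1 entry."""
    # One L2 table occupies one cluster and therefore holds
    # 2**(cluster_bits - ENTRY_BITS) entries; each entry maps one
    # cluster of 2**cluster_bits bytes.
    return 1 << (2 * cluster_bits - ENTRY_BITS)


def l1_entries_needed(virtual_size, cluster_bits):
    """Return how many L1 entries are needed to map virtual_size bytes."""
    coverage = l1_entry_coverage(cluster_bits)
    # Integer round-up division.
    return (virtual_size + coverage - 1) // coverage


if __name__ == '__main__':
    # Cluster sizes used by the three test classes: 512, 64K and 1M.
    for cluster_bits in (9, 16, 20):
        print(cluster_bits, l1_entry_coverage(cluster_bits))
```

Note that the test in the patch subtracts `entry_size` (8) rather than `entry_bits` (3) in the shift. If the reading above is right, that gives a smaller per-entry coverage and so a longer `used` slice, which makes the `unused` assertion less strict without making it fail.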


* Re: [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize Pavel Butsykin
@ 2017-09-18 18:12   ` Max Reitz
  0 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2017-09-18 18:12 UTC (permalink / raw)
  To: Pavel Butsykin, qemu-block, qemu-devel; +Cc: jsnow, kwolf, eblake, armbru, den

On 2017-09-18 14:42, Pavel Butsykin wrote:
> The flag is additional precaution against data loss. Perhaps in the future the
> operation shrink without this flag will be blocked for all formats, but for now
> we need to maintain compatibility with raw.
> 
> Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
> Reviewed-by: Max Reitz <mreitz@redhat.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> ---
>  qemu-img-cmds.hx       |  4 ++--
>  qemu-img.c             | 23 +++++++++++++++++++++++
>  qemu-img.texi          |  6 +++++-
>  tests/qemu-iotests/102 |  4 ++--
>  tests/qemu-iotests/106 |  2 +-
>  5 files changed, 33 insertions(+), 6 deletions(-)

[...]

> diff --git a/qemu-img.c b/qemu-img.c
> index 56ef49e214..b7b2386cbd 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c

[...]

> @@ -3571,6 +3577,23 @@ static int img_resize(int argc, char **argv)
>          goto out;
>      }
>  
> +    if (total_size < current_size && !shrink) {
> +        warn_report("Shrinking an image will delete all data beyond the "
> +                    "shrunken image's end. Before performing such an "
> +                    "operation, make sure there is no important data there.");
> +
> +        if (g_strcmp0(bdrv_get_format_name(blk_bs(blk)), "raw") != 0) {
> +            error_report(
> +              "Use the --shrink option to perform a shrink operation.");
> +            ret = -1;
> +            goto out;
> +        } else {
> +            warn_report("Using the --shrink option will suppress this message."

Still missing a space here.

Max

> +                        "Note that future versions of qemu-img may refuse to "
> +                        "shrink images without this option.");
> +        }
> +    }
> +
>      ret = blk_truncate(blk, total_size, prealloc, &err);
>      if (!ret) {
>          qprintf(quiet, "Image resized.\n");
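
The missing space flagged in the review of this hunk is easy to overlook because adjacent string literals are joined with nothing between them, in C and in Python alike. Here is a minimal Python sketch of the effect, using the message text from the hunk. The variable names `broken` and `fixed` are illustrative only and do not appear in the patch.

```python
# Adjacent string literals are concatenated verbatim, just as in C,
# so a sentence break needs an explicit trailing space.
broken = ("Using the --shrink option will suppress this message."
          "Note that future versions of qemu-img may refuse to "
          "shrink images without this option.")

fixed = ("Using the --shrink option will suppress this message. "
         "Note that future versions of qemu-img may refuse to "
         "shrink images without this option.")

if __name__ == '__main__':
    print(broken)
    print(fixed)
```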



* Re: [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2
  2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
                   ` (3 preceding siblings ...)
  2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 4/4] qemu-iotests: add shrinking image test Pavel Butsykin
@ 2017-09-18 18:22 ` Max Reitz
  4 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2017-09-18 18:22 UTC (permalink / raw)
  To: Pavel Butsykin, qemu-block, qemu-devel; +Cc: jsnow, kwolf, eblake, armbru, den

On 2017-09-18 14:42, Pavel Butsykin wrote:
> This patch add shrinking of the image file for qcow2. As a result, this allows
> us to reduce the virtual image size and free up space on the disk without
> copying the image. Image can be fragmented and shrink is done by punching holes
> in the image file.
> 
> # ./qemu-img create -f qcow2 image.qcow2 4G
> Formatting 'image.qcow2', fmt=qcow2 size=4294967296 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
> 
> # ./qemu-io -c "write -P 0x22 0 1G" image.qcow2
> wrote 1073741824/1073741824 bytes at offset 0
> 1 GiB, 1 ops; 0:00:04.59 (222.886 MiB/sec and 0.2177 ops/sec)
> 
> # ./qemu-img resize image.qcow2 512M
> warning: qemu-img: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
> error: qemu-img: Use the --shrink option to perform a shrink operation.
> 
> # ./qemu-img resize --shrink image.qcow2 128M
> Image resized.
> 
> # ./qemu-img info image.qcow2
> image: image.qcow2
> file format: qcow2
> virtual size: 128M (134217728 bytes)
> disk size: 128M
> cluster_size: 65536
> Format specific information:
>     compat: 1.1
>     lazy refcounts: false
>     refcount bits: 16
>     corrupt: false
> 
> # du -h image.qcow2
> 129M    image.qcow2

Thanks, I've added the missing space in patch 1 and applied the series
to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



Thread overview: 7+ messages
2017-09-18 12:42 [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Pavel Butsykin
2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 1/4] qemu-img: add --shrink flag for resize Pavel Butsykin
2017-09-18 18:12   ` Max Reitz
2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 2/4] qcow2: add qcow2_cache_discard Pavel Butsykin
2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 3/4] qcow2: add shrink image support Pavel Butsykin
2017-09-18 12:42 ` [Qemu-devel] [PATCH v8 4/4] qemu-iotests: add shrinking image test Pavel Butsykin
2017-09-18 18:22 ` [Qemu-devel] [PATCH v8 0/4] Add shrink image for qcow2 Max Reitz
