qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters
@ 2013-06-19 11:44 Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This series adds options to make qcow2 discard freed clusters, in several
categories. By default, only freed clusters related to snapshots (i.e. mainly
snapshot deletion) are discarded.

v2:
- Removed leftover debug code
- Don't discard after COW (overwriting compressed clusters)
- Changed some commas into semicolons

Kevin Wolf (5):
  Revert "block: Disable driver-specific options for 1.5"
  qcow2: Add refcount update reason to all callers
  qcow2: Options to enable discard for freed clusters
  qcow2: Batch discards
  block: Always enable discard on the protocol level

 block.c                  |   2 +-
 block/qcow2-cluster.c    |  41 ++++++++++----
 block/qcow2-refcount.c   | 136 +++++++++++++++++++++++++++++++++++++++--------
 block/qcow2-snapshot.c   |   6 ++-
 block/qcow2.c            |  30 ++++++++++-
 block/qcow2.h            |  32 +++++++++--
 blockdev.c               | 118 ++--------------------------------------
 tests/qemu-iotests/group |   2 +-
 8 files changed, 214 insertions(+), 153 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH v2 1/5] Revert "block: Disable driver-specific options for 1.5"
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
@ 2013-06-19 11:44 ` Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This reverts commit 8ec7d390b0d50b5e5b4b1d8dba7ba40d64a70875.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 blockdev.c               | 118 ++---------------------------------------------
 tests/qemu-iotests/group |   2 +-
 2 files changed, 5 insertions(+), 115 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 5975dde..459a87e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1737,120 +1737,10 @@ QemuOptsList qemu_drive_opts = {
     .name = "drive",
     .head = QTAILQ_HEAD_INITIALIZER(qemu_drive_opts.head),
     .desc = {
-        {
-            .name = "bus",
-            .type = QEMU_OPT_NUMBER,
-            .help = "bus number",
-        },{
-            .name = "unit",
-            .type = QEMU_OPT_NUMBER,
-            .help = "unit number (i.e. lun for scsi)",
-        },{
-            .name = "if",
-            .type = QEMU_OPT_STRING,
-            .help = "interface (ide, scsi, sd, mtd, floppy, pflash, virtio)",
-        },{
-            .name = "index",
-            .type = QEMU_OPT_NUMBER,
-            .help = "index number",
-        },{
-            .name = "cyls",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of cylinders (ide disk geometry)",
-        },{
-            .name = "heads",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of heads (ide disk geometry)",
-        },{
-            .name = "secs",
-            .type = QEMU_OPT_NUMBER,
-            .help = "number of sectors (ide disk geometry)",
-        },{
-            .name = "trans",
-            .type = QEMU_OPT_STRING,
-            .help = "chs translation (auto, lba. none)",
-        },{
-            .name = "media",
-            .type = QEMU_OPT_STRING,
-            .help = "media type (disk, cdrom)",
-        },{
-            .name = "snapshot",
-            .type = QEMU_OPT_BOOL,
-            .help = "enable/disable snapshot mode",
-        },{
-            .name = "file",
-            .type = QEMU_OPT_STRING,
-            .help = "disk image",
-        },{
-            .name = "discard",
-            .type = QEMU_OPT_STRING,
-            .help = "discard operation (ignore/off, unmap/on)",
-        },{
-            .name = "cache",
-            .type = QEMU_OPT_STRING,
-            .help = "host cache usage (none, writeback, writethrough, "
-                    "directsync, unsafe)",
-        },{
-            .name = "aio",
-            .type = QEMU_OPT_STRING,
-            .help = "host AIO implementation (threads, native)",
-        },{
-            .name = "format",
-            .type = QEMU_OPT_STRING,
-            .help = "disk format (raw, qcow2, ...)",
-        },{
-            .name = "serial",
-            .type = QEMU_OPT_STRING,
-            .help = "disk serial number",
-        },{
-            .name = "rerror",
-            .type = QEMU_OPT_STRING,
-            .help = "read error action",
-        },{
-            .name = "werror",
-            .type = QEMU_OPT_STRING,
-            .help = "write error action",
-        },{
-            .name = "addr",
-            .type = QEMU_OPT_STRING,
-            .help = "pci address (virtio only)",
-        },{
-            .name = "readonly",
-            .type = QEMU_OPT_BOOL,
-            .help = "open drive file as read-only",
-        },{
-            .name = "iops",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit total I/O operations per second",
-        },{
-            .name = "iops_rd",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit read operations per second",
-        },{
-            .name = "iops_wr",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit write operations per second",
-        },{
-            .name = "bps",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit total bytes per second",
-        },{
-            .name = "bps_rd",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit read bytes per second",
-        },{
-            .name = "bps_wr",
-            .type = QEMU_OPT_NUMBER,
-            .help = "limit write bytes per second",
-        },{
-            .name = "copy-on-read",
-            .type = QEMU_OPT_BOOL,
-            .help = "copy read data from backing file into image file",
-        },{
-            .name = "boot",
-            .type = QEMU_OPT_BOOL,
-            .help = "(deprecated, ignored)",
-        },
+        /*
+         * no elements => accept any params
+         * validation will happen later
+         */
         { /* end of list */ }
     },
 };
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 387b050..f487a8f 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -57,7 +57,7 @@
 048 img auto quick
 049 rw auto
 050 rw auto backing quick
-#051 rw auto
+051 rw auto
 052 rw auto backing
 053 rw auto
 054 rw auto
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH v2 2/5] qcow2: Add refcount update reason to all callers
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
@ 2013-06-19 11:44 ` Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This adds a refcount update reason to all callers of update_refcounts(),
so that a follow-up patch can use this information to decide whether
clusters that reach a refcount of 0 should be discarded in the image
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c  | 19 +++++++++++------
 block/qcow2-refcount.c | 55 +++++++++++++++++++++++++++++++-------------------
 block/qcow2-snapshot.c |  6 ++++--
 block/qcow2.c          |  3 ++-
 block/qcow2.h          | 16 ++++++++++++---
 5 files changed, 66 insertions(+), 33 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 76f30e5..3191d6b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -98,14 +98,16 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
         goto fail;
     }
     g_free(s->l1_table);
-    qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, s->l1_table_offset, s->l1_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_OTHER);
     s->l1_table_offset = new_l1_table_offset;
     s->l1_table = new_l1_table;
     s->l1_size = new_l1_size;
     return 0;
  fail:
     g_free(new_l1_table);
-    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2);
+    qcow2_free_clusters(bs, new_l1_table_offset, new_l1_size2,
+                        QCOW2_DISCARD_OTHER);
     return ret;
 }
 
@@ -548,7 +550,8 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
 
         /* Then decrease the refcount of the old table */
         if (l2_offset) {
-            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t));
+            qcow2_free_clusters(bs, l2_offset, s->l2_size * sizeof(uint64_t),
+                                QCOW2_DISCARD_OTHER);
         }
     }
 
@@ -715,10 +718,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     /*
      * If this was a COW, we need to decrease the refcount of the old cluster.
      * Also flush bs->file to get the right order for L2 and refcount update.
+     *
+     * Don't discard clusters that reach a refcount of 0 (e.g. compressed
+     * clusters), the next write will reuse them anyway.
      */
     if (j != 0) {
         for (i = 0; i < j; i++) {
-            qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1);
+            qcow2_free_any_clusters(bs, be64_to_cpu(old_cluster[i]), 1,
+                                    QCOW2_DISCARD_NEVER);
         }
     }
 
@@ -1339,7 +1346,7 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
         l2_table[l2_index + i] = cpu_to_be64(0);
 
         /* Then decrease the refcount */
-        qcow2_free_any_clusters(bs, old_offset, 1);
+        qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
     }
 
     ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
@@ -1415,7 +1422,7 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
         qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
         if (old_offset & QCOW_OFLAG_COMPRESSED) {
             l2_table[l2_index + i] = cpu_to_be64(QCOW_OFLAG_ZERO);
-            qcow2_free_any_clusters(bs, old_offset, 1);
+            qcow2_free_any_clusters(bs, old_offset, 1, QCOW2_DISCARD_REQUEST);
         } else {
             l2_table[l2_index + i] |= cpu_to_be64(QCOW_OFLAG_ZERO);
         }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index b32738f..6d35e49 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -29,7 +29,7 @@
 static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
                             int64_t offset, int64_t length,
-                            int addend);
+                            int addend, enum qcow2_discard_type type);
 
 
 /*********************************************************/
@@ -235,7 +235,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
     } else {
         /* Described somewhere else. This can recurse at most twice before we
          * arrive at a block that describes itself. */
-        ret = update_refcount(bs, new_block, s->cluster_size, 1);
+        ret = update_refcount(bs, new_block, s->cluster_size, 1,
+                              QCOW2_DISCARD_NEVER);
         if (ret < 0) {
             goto fail_block;
         }
@@ -399,7 +400,8 @@ static int alloc_refcount_block(BlockDriverState *bs,
 
     /* Free old table. Remember, we must not change free_cluster_index */
     uint64_t old_free_cluster_index = s->free_cluster_index;
-    qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_OTHER);
     s->free_cluster_index = old_free_cluster_index;
 
     ret = load_refcount_block(bs, new_block, (void**) refcount_block);
@@ -420,7 +422,7 @@ fail_block:
 
 /* XXX: cache several refcount block clusters ? */
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
-    int64_t offset, int64_t length, int addend)
+    int64_t offset, int64_t length, int addend, enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int64_t start, last, cluster_offset;
@@ -506,7 +508,8 @@ fail:
      */
     if (ret < 0) {
         int dummy;
-        dummy = update_refcount(bs, offset, cluster_offset - offset, -addend);
+        dummy = update_refcount(bs, offset, cluster_offset - offset, -addend,
+                                QCOW2_DISCARD_NEVER);
         (void)dummy;
     }
 
@@ -522,12 +525,14 @@ fail:
  */
 static int update_cluster_refcount(BlockDriverState *bs,
                                    int64_t cluster_index,
-                                   int addend)
+                                   int addend,
+                                   enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
     int ret;
 
-    ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend);
+    ret = update_refcount(bs, cluster_index << s->cluster_bits, 1, addend,
+                          type);
     if (ret < 0) {
         return ret;
     }
@@ -579,7 +584,7 @@ int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size)
         return offset;
     }
 
-    ret = update_refcount(bs, offset, size, 1);
+    ret = update_refcount(bs, offset, size, 1, QCOW2_DISCARD_NEVER);
     if (ret < 0) {
         return ret;
     }
@@ -611,7 +616,8 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     old_free_cluster_index = s->free_cluster_index;
     s->free_cluster_index = cluster_index + i;
 
-    ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
+    ret = update_refcount(bs, offset, i << s->cluster_bits, 1,
+                          QCOW2_DISCARD_NEVER);
     if (ret < 0) {
         return ret;
     }
@@ -649,7 +655,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if (free_in_cluster == 0)
             s->free_byte_offset = 0;
         if ((offset & (s->cluster_size - 1)) != 0)
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                    QCOW2_DISCARD_NEVER);
     } else {
         offset = qcow2_alloc_clusters(bs, s->cluster_size);
         if (offset < 0) {
@@ -659,7 +666,8 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
         if ((cluster_offset + s->cluster_size) == offset) {
             /* we are lucky: contiguous data */
             offset = s->free_byte_offset;
-            update_cluster_refcount(bs, offset >> s->cluster_bits, 1);
+            update_cluster_refcount(bs, offset >> s->cluster_bits, 1,
+                                    QCOW2_DISCARD_NEVER);
             s->free_byte_offset += size;
         } else {
             s->free_byte_offset = offset;
@@ -676,12 +684,13 @@ int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size)
 }
 
 void qcow2_free_clusters(BlockDriverState *bs,
-                          int64_t offset, int64_t size)
+                          int64_t offset, int64_t size,
+                          enum qcow2_discard_type type)
 {
     int ret;
 
     BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_FREE);
-    ret = update_refcount(bs, offset, size, -1);
+    ret = update_refcount(bs, offset, size, -1, type);
     if (ret < 0) {
         fprintf(stderr, "qcow2_free_clusters failed: %s\n", strerror(-ret));
         /* TODO Remember the clusters to free them later and avoid leaking */
@@ -692,8 +701,8 @@ void qcow2_free_clusters(BlockDriverState *bs,
  * Free a cluster using its L2 entry (handles clusters of all types, e.g.
  * normal cluster, compressed cluster, etc.)
  */
-void qcow2_free_any_clusters(BlockDriverState *bs,
-    uint64_t l2_entry, int nb_clusters)
+void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
+                             int nb_clusters, enum qcow2_discard_type type)
 {
     BDRVQcowState *s = bs->opaque;
 
@@ -705,12 +714,12 @@ void qcow2_free_any_clusters(BlockDriverState *bs,
                            s->csize_mask) + 1;
             qcow2_free_clusters(bs,
                 (l2_entry & s->cluster_offset_mask) & ~511,
-                nb_csectors * 512);
+                nb_csectors * 512, type);
         }
         break;
     case QCOW2_CLUSTER_NORMAL:
         qcow2_free_clusters(bs, l2_entry & L2E_OFFSET_MASK,
-                            nb_clusters << s->cluster_bits);
+                            nb_clusters << s->cluster_bits, type);
         break;
     case QCOW2_CLUSTER_UNALLOCATED:
     case QCOW2_CLUSTER_ZERO:
@@ -785,7 +794,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                             int ret;
                             ret = update_refcount(bs,
                                 (offset & s->cluster_offset_mask) & ~511,
-                                nb_csectors * 512, addend);
+                                nb_csectors * 512, addend,
+                                QCOW2_DISCARD_SNAPSHOT);
                             if (ret < 0) {
                                 goto fail;
                             }
@@ -795,7 +805,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
                     } else {
                         uint64_t cluster_index = (offset & L2E_OFFSET_MASK) >> s->cluster_bits;
                         if (addend != 0) {
-                            refcount = update_cluster_refcount(bs, cluster_index, addend);
+                            refcount = update_cluster_refcount(bs, cluster_index, addend,
+                                                               QCOW2_DISCARD_SNAPSHOT);
                         } else {
                             refcount = get_refcount(bs, cluster_index);
                         }
@@ -827,7 +838,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 
 
             if (addend != 0) {
-                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend);
+                refcount = update_cluster_refcount(bs, l2_offset >> s->cluster_bits, addend,
+                                                   QCOW2_DISCARD_SNAPSHOT);
             } else {
                 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
             }
@@ -1253,7 +1265,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
 
             if (num_fixed) {
                 ret = update_refcount(bs, i << s->cluster_bits, 1,
-                                      refcount2 - refcount1);
+                                      refcount2 - refcount1,
+                                      QCOW2_DISCARD_ALWAYS);
                 if (ret >= 0) {
                     (*num_fixed)++;
                     continue;
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 992a5c8..0caac90 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -262,7 +262,8 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
     }
 
     /* free the old snapshot table */
-    qcow2_free_clusters(bs, s->snapshots_offset, s->snapshots_size);
+    qcow2_free_clusters(bs, s->snapshots_offset, s->snapshots_size,
+                        QCOW2_DISCARD_SNAPSHOT);
     s->snapshots_offset = snapshots_offset;
     s->snapshots_size = snapshots_size;
     return 0;
@@ -569,7 +570,8 @@ int qcow2_snapshot_delete(BlockDriverState *bs, const char *snapshot_id)
     if (ret < 0) {
         return ret;
     }
-    qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t));
+    qcow2_free_clusters(bs, sn.l1_table_offset, sn.l1_size * sizeof(uint64_t),
+                        QCOW2_DISCARD_SNAPSHOT);
 
     /* must update the copied flag on the current cluster offsets */
     ret = qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, 0);
diff --git a/block/qcow2.c b/block/qcow2.c
index 0fa5cb2..e28ea47 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1196,7 +1196,8 @@ static int preallocate(BlockDriverState *bs)
 
         ret = qcow2_alloc_cluster_link_l2(bs, meta);
         if (ret < 0) {
-            qcow2_free_any_clusters(bs, meta->alloc_offset, meta->nb_clusters);
+            qcow2_free_any_clusters(bs, meta->alloc_offset, meta->nb_clusters,
+                                    QCOW2_DISCARD_NEVER);
             return ret;
         }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 6959c6a..64a6479 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -129,6 +129,15 @@ enum {
     QCOW2_COMPAT_FEAT_MASK            = QCOW2_COMPAT_LAZY_REFCOUNTS,
 };
 
+enum qcow2_discard_type {
+    QCOW2_DISCARD_NEVER = 0,
+    QCOW2_DISCARD_ALWAYS,
+    QCOW2_DISCARD_REQUEST,
+    QCOW2_DISCARD_SNAPSHOT,
+    QCOW2_DISCARD_OTHER,
+    QCOW2_DISCARD_MAX
+};
+
 typedef struct Qcow2Feature {
     uint8_t type;
     uint8_t bit;
@@ -349,9 +358,10 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
     int nb_clusters);
 int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size);
 void qcow2_free_clusters(BlockDriverState *bs,
-    int64_t offset, int64_t size);
-void qcow2_free_any_clusters(BlockDriverState *bs,
-    uint64_t cluster_offset, int nb_clusters);
+                          int64_t offset, int64_t size,
+                          enum qcow2_discard_type type);
+void qcow2_free_any_clusters(BlockDriverState *bs, uint64_t l2_entry,
+                             int nb_clusters, enum qcow2_discard_type type);
 
 int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     int64_t l1_table_offset, int l1_size, int addend);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH v2 3/5] qcow2: Options to enable discard for freed clusters
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
@ 2013-06-19 11:44 ` Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Batch discards Kevin Wolf
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Deleted snapshots are discarded in the image file by default, discard
requests take their default from the -drive discard=... option and other
places that free clusters must always be enabled explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-refcount.c |  5 +++++
 block/qcow2.c          | 26 ++++++++++++++++++++++++++
 block/qcow2.h          |  5 +++++
 3 files changed, 36 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 6d35e49..7488988 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -488,6 +488,11 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             s->free_cluster_index = cluster_index;
         }
         refcount_block[block_index] = cpu_to_be16(refcount);
+        if (refcount == 0 && s->discard_passthrough[type]) {
+            /* Try discarding, ignore errors */
+            /* FIXME Doing this cluster by cluster will be painfully slow */
+            bdrv_discard(bs->file, cluster_offset, 1);
+        }
     }
 
     ret = 0;
diff --git a/block/qcow2.c b/block/qcow2.c
index e28ea47..ef8a2ca 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -295,6 +295,22 @@ static QemuOptsList qcow2_runtime_opts = {
             .type = QEMU_OPT_BOOL,
             .help = "Postpone refcount updates",
         },
+        {
+            .name = QCOW2_OPT_DISCARD_REQUEST,
+            .type = QEMU_OPT_BOOL,
+            .help = "Pass guest discard requests to the layer below",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_SNAPSHOT,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when snapshot related space "
+                    "is freed",
+        },
+        {
+            .name = QCOW2_OPT_DISCARD_OTHER,
+            .type = QEMU_OPT_BOOL,
+            .help = "Generate discard requests when other clusters are freed",
+        },
         { /* end of list */ }
     },
 };
@@ -532,6 +548,16 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     s->use_lazy_refcounts = qemu_opt_get_bool(opts, QCOW2_OPT_LAZY_REFCOUNTS,
         (s->compatible_features & QCOW2_COMPAT_LAZY_REFCOUNTS));
 
+    s->discard_passthrough[QCOW2_DISCARD_NEVER] = false;
+    s->discard_passthrough[QCOW2_DISCARD_ALWAYS] = true;
+    s->discard_passthrough[QCOW2_DISCARD_REQUEST] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_REQUEST,
+                          flags & BDRV_O_UNMAP);
+    s->discard_passthrough[QCOW2_DISCARD_SNAPSHOT] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_SNAPSHOT, true);
+    s->discard_passthrough[QCOW2_DISCARD_OTHER] =
+        qemu_opt_get_bool(opts, QCOW2_OPT_DISCARD_OTHER, false);
+
     qemu_opts_del(opts);
 
     if (s->use_lazy_refcounts && s->qcow_version < 3) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 64a6479..6f91b9a 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -60,6 +60,9 @@
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy_refcounts"
+#define QCOW2_OPT_DISCARD_REQUEST "pass_discard_request"
+#define QCOW2_OPT_DISCARD_SNAPSHOT "pass_discard_snapshot"
+#define QCOW2_OPT_DISCARD_OTHER "pass_discard_other"
 
 typedef struct QCowHeader {
     uint32_t magic;
@@ -187,6 +190,8 @@ typedef struct BDRVQcowState {
     int qcow_version;
     bool use_lazy_refcounts;
 
+    bool discard_passthrough[QCOW2_DISCARD_MAX];
+
     uint64_t incompatible_features;
     uint64_t compatible_features;
     uint64_t autoclear_features;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH v2 4/5] qcow2: Batch discards
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (2 preceding siblings ...)
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
@ 2013-06-19 11:44 ` Kevin Wolf
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 5/5] block: Always enable discard on the protocol level Kevin Wolf
  2013-06-20  7:42 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Stefan Hajnoczi
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

This optimises the discard operation for freed clusters by batching
discard requests (both snapshot deletion and bdrv_discard end up
updating the refcounts cluster by cluster).

Note that we don't discard asynchronously, but keep s->lock held. This
is to avoid that a freed cluster is reallocated and written to while the
discard is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c  | 22 +++++++++++---
 block/qcow2-refcount.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++--
 block/qcow2.c          |  1 +
 block/qcow2.h          | 11 +++++++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 3191d6b..cca76d4 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1377,18 +1377,25 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
 
     nb_clusters = size_to_clusters(s, end_offset - offset);
 
+    s->cache_discards = true;
+
     /* Each L2 table is handled by its own loop iteration */
     while (nb_clusters > 0) {
         ret = discard_single_l2(bs, offset, nb_clusters);
         if (ret < 0) {
-            return ret;
+            goto fail;
         }
 
         nb_clusters -= ret;
         offset += (ret * s->cluster_size);
     }
 
-    return 0;
+    ret = 0;
+fail:
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
+    return ret;
 }
 
 /*
@@ -1450,15 +1457,22 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors)
     /* Each L2 table is handled by its own loop iteration */
     nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS);
 
+    s->cache_discards = true;
+
     while (nb_clusters > 0) {
         ret = zero_single_l2(bs, offset, nb_clusters);
         if (ret < 0) {
-            return ret;
+            goto fail;
         }
 
         nb_clusters -= ret;
         offset += (ret * s->cluster_size);
     }
 
-    return 0;
+    ret = 0;
+fail:
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
+    return ret;
 }
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 7488988..1244693 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -420,6 +420,74 @@ fail_block:
     return ret;
 }
 
+void qcow2_process_discards(BlockDriverState *bs, int ret)
+{
+    BDRVQcowState *s = bs->opaque;
+    Qcow2DiscardRegion *d, *next;
+
+    QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
+        QTAILQ_REMOVE(&s->discards, d, next);
+
+        /* Discard is optional, ignore the return value */
+        if (ret >= 0) {
+            bdrv_discard(bs->file,
+                         d->offset >> BDRV_SECTOR_BITS,
+                         d->bytes >> BDRV_SECTOR_BITS);
+        }
+
+        g_free(d);
+    }
+}
+
+static void update_refcount_discard(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t length)
+{
+    BDRVQcowState *s = bs->opaque;
+    Qcow2DiscardRegion *d, *p, *next;
+
+    QTAILQ_FOREACH(d, &s->discards, next) {
+        uint64_t new_start = MIN(offset, d->offset);
+        uint64_t new_end = MAX(offset + length, d->offset + d->bytes);
+
+        if (new_end - new_start <= length + d->bytes) {
+            /* There can't be any overlap, areas ending up here have no
+             * references any more and therefore shouldn't get freed another
+             * time. */
+            assert(d->bytes + length == new_end - new_start);
+            d->offset = new_start;
+            d->bytes = new_end - new_start;
+            goto found;
+        }
+    }
+
+    d = g_malloc(sizeof(*d));
+    *d = (Qcow2DiscardRegion) {
+        .bs     = bs,
+        .offset = offset,
+        .bytes  = length,
+    };
+    QTAILQ_INSERT_TAIL(&s->discards, d, next);
+
+found:
+    /* Merge discard requests if they are adjacent now */
+    QTAILQ_FOREACH_SAFE(p, &s->discards, next, next) {
+        if (p == d
+            || p->offset > d->offset + d->bytes
+            || d->offset > p->offset + p->bytes)
+        {
+            continue;
+        }
+
+        /* Still no overlap possible */
+        assert(p->offset == d->offset + d->bytes
+            || d->offset == p->offset + p->bytes);
+
+        QTAILQ_REMOVE(&s->discards, p, next);
+        d->offset = MIN(d->offset, p->offset);
+        d->bytes += p->bytes;
+    }
+}
+
 /* XXX: cache several refcount block clusters ? */
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
     int64_t offset, int64_t length, int addend, enum qcow2_discard_type type)
@@ -488,15 +556,18 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
             s->free_cluster_index = cluster_index;
         }
         refcount_block[block_index] = cpu_to_be16(refcount);
+
         if (refcount == 0 && s->discard_passthrough[type]) {
-            /* Try discarding, ignore errors */
-            /* FIXME Doing this cluster by cluster will be painfully slow */
-            bdrv_discard(bs->file, cluster_offset, 1);
+            update_refcount_discard(bs, cluster_offset, s->cluster_size);
         }
     }
 
     ret = 0;
 fail:
+    if (!s->cache_discards) {
+        qcow2_process_discards(bs, ret);
+    }
+
     /* Write last changed block to disk */
     if (refcount_block) {
         int wret;
@@ -755,6 +826,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
     l1_table = NULL;
     l1_size2 = l1_size * sizeof(uint64_t);
 
+    s->cache_discards = true;
+
     /* WARNING: qcow2_snapshot_goto relies on this function not using the
      * l1_table_offset when it is the current s->l1_table_offset! Be careful
      * when changing this! */
@@ -867,6 +940,9 @@ fail:
         qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
     }
 
+    s->cache_discards = false;
+    qcow2_process_discards(bs, ret);
+
     /* Update L1 only if it isn't deleted anyway (addend = -1) */
     if (ret == 0 && addend >= 0 && l1_modified) {
         for (i = 0; i < l1_size; i++) {
diff --git a/block/qcow2.c b/block/qcow2.c
index ef8a2ca..9383990 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -486,6 +486,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags)
     }
 
     QLIST_INIT(&s->cluster_allocs);
+    QTAILQ_INIT(&s->discards);
 
     /* read qcow2 extensions */
     if (qcow2_read_extensions(bs, header.header_length, ext_end, NULL)) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 6f91b9a..3b2d5cd 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -147,6 +147,13 @@ typedef struct Qcow2Feature {
     char    name[46];
 } QEMU_PACKED Qcow2Feature;
 
+typedef struct Qcow2DiscardRegion {
+    BlockDriverState *bs;
+    uint64_t offset;
+    uint64_t bytes;
+    QTAILQ_ENTRY(Qcow2DiscardRegion) next;
+} Qcow2DiscardRegion;
+
 typedef struct BDRVQcowState {
     int cluster_bits;
     int cluster_size;
@@ -199,6 +206,8 @@ typedef struct BDRVQcowState {
     size_t unknown_header_fields_size;
     void* unknown_header_fields;
     QLIST_HEAD(, Qcow2UnknownHeaderExtension) unknown_header_ext;
+    QTAILQ_HEAD (, Qcow2DiscardRegion) discards;
+    bool cache_discards;
 } BDRVQcowState;
 
 /* XXX: use std qcow open function ? */
@@ -374,6 +383,8 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
                           BdrvCheckMode fix);
 
+void qcow2_process_discards(BlockDriverState *bs, int ret);
+
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Qemu-devel] [PATCH v2 5/5] block: Always enable discard on the protocol level
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (3 preceding siblings ...)
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Batch discards Kevin Wolf
@ 2013-06-19 11:44 ` Kevin Wolf
  2013-06-20  7:42 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Stefan Hajnoczi
  5 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2013-06-19 11:44 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Turning on discard options in qcow2 doesn't help a lot when the discard
requests that it issues are thrown away by the raw-posix layer. This
patch always enables discard functionality on the protocol level so that
it's the image format's responsibility to send (or not) discard
requests. Requests sent by the guest will be allowed or ignored by the
top level BlockDriverState, which depends on the discard=... option like
before.

In particular, this means that even without specifying options, the
qcow2 default of discarding deleted snapshots actually takes effect now,
both for qemu and qemu-img.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index b88ad2f..8e77d46 100644
--- a/block.c
+++ b/block.c
@@ -1045,7 +1045,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, QDict *options,
     extract_subqdict(options, &file_options, "file.");
 
     ret = bdrv_file_open(&file, filename, file_options,
-                         bdrv_open_flags(bs, flags));
+                         bdrv_open_flags(bs, flags | BDRV_O_UNMAP));
     if (ret < 0) {
         goto fail;
     }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters
  2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
                   ` (4 preceding siblings ...)
  2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 5/5] block: Always enable discard on the protocol level Kevin Wolf
@ 2013-06-20  7:42 ` Stefan Hajnoczi
  5 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2013-06-20  7:42 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel

On Wed, Jun 19, 2013 at 01:44:16PM +0200, Kevin Wolf wrote:
> This series adds options to make qcow2 discard freed clusters, in several
> categories. By default, only freed clusters related to snapshots (i.e. mainly
> snapshot deletion) are discarded.
> 
> v2:
> - Removed leftover debug code
> - Don't discard after COW (overwriting compressed clusters)
> - Changed some commas into semicolons
> 
> Kevin Wolf (5):
>   Revert "block: Disable driver-specific options for 1.5"
>   qcow2: Add refcount update reason to all callers
>   qcow2: Options to enable discard for freed clusters
>   qcow2: Batch discards
>   block: Always enable discard on the protocol level
> 
>  block.c                  |   2 +-
>  block/qcow2-cluster.c    |  41 ++++++++++----
>  block/qcow2-refcount.c   | 136 +++++++++++++++++++++++++++++++++++++++--------
>  block/qcow2-snapshot.c   |   6 ++-
>  block/qcow2.c            |  30 ++++++++++-
>  block/qcow2.h            |  32 +++++++++--
>  blockdev.c               | 118 ++--------------------------------------
>  tests/qemu-iotests/group |   2 +-
>  8 files changed, 214 insertions(+), 153 deletions(-)
> 
> -- 
> 1.8.1.4
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-06-20  7:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-19 11:44 [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Kevin Wolf
2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 1/5] Revert "block: Disable driver-specific options for 1.5" Kevin Wolf
2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 2/5] qcow2: Add refcount update reason to all callers Kevin Wolf
2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 3/5] qcow2: Options to enable discard for freed clusters Kevin Wolf
2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 4/5] qcow2: Batch discards Kevin Wolf
2013-06-19 11:44 ` [Qemu-devel] [PATCH v2 5/5] block: Always enable discard on the protocol level Kevin Wolf
2013-06-20  7:42 ` [Qemu-devel] [PATCH v2 0/5] qcow2: Discard freed clusters Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).