qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, stefanha@redhat.com, qemu-devel@nongnu.org
Subject: [Qemu-devel] [PULL 57/58] qcow2: Discard/zero clusters by byte count
Date: Thu, 11 May 2017 16:33:00 +0200	[thread overview]
Message-ID: <1494513181-7900-58-git-send-email-kwolf@redhat.com> (raw)
In-Reply-To: <1494513181-7900-1-git-send-email-kwolf@redhat.com>

From: Eric Blake <eblake@redhat.com>

Passing a byte offset, but sector count, when we ultimately
want to operate on cluster granularity, is madness.  Clean up
the external interfaces to take both offset and count as bytes,
while still keeping the assertion added previously that the
caller must align the values to a cluster.  Then rename things
to make sure backports don't get confused by changed units:
instead of qcow2_discard_clusters() and qcow2_zero_clusters(),
we now have qcow2_cluster_discard() and qcow2_cluster_zeroize().

The internal functions still operate on clusters at a time, and
return an int for number of cleared clusters; but on an image
with 2M clusters, a single L2 table holds 256k entries that each
represent a 2M cluster, totalling well over INT_MAX bytes if we
ever had a request for that many bytes at once.  All our callers
currently limit themselves to 32-bit bytes (and therefore fewer
clusters), but by making this function 64-bit clean, we have one
less place to clean up if we later improve the block layer to
support 64-bit bytes through all operations (with the block layer
auto-fragmenting on behalf of more-limited drivers), rather than
the current state where some interfaces are artificially limited
to INT_MAX at a time.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170507000552.20847-13-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c  | 42 ++++++++++++++++++++++--------------------
 block/qcow2-snapshot.c |  7 +++----
 block/qcow2.c          | 22 +++++++++-------------
 block/qcow2.h          |  9 +++++----
 4 files changed, 39 insertions(+), 41 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 43bde56..347d94b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1549,34 +1549,36 @@ static int discard_single_l2(BlockDriverState *bs, uint64_t offset,
     return nb_clusters;
 }
 
-int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
-    int nb_sectors, enum qcow2_discard_type type, bool full_discard)
+int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, enum qcow2_discard_type type,
+                          bool full_discard)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t end_offset;
+    uint64_t end_offset = offset + bytes;
     uint64_t nb_clusters;
+    int64_t cleared;
     int ret;
 
-    end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS);
-
     /* Caller must pass aligned values, except at image end */
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
            end_offset == bs->total_sectors << BDRV_SECTOR_BITS);
 
-    nb_clusters = size_to_clusters(s, end_offset - offset);
+    nb_clusters = size_to_clusters(s, bytes);
 
     s->cache_discards = true;
 
     /* Each L2 table is handled by its own loop iteration */
     while (nb_clusters > 0) {
-        ret = discard_single_l2(bs, offset, nb_clusters, type, full_discard);
-        if (ret < 0) {
+        cleared = discard_single_l2(bs, offset, nb_clusters, type,
+                                    full_discard);
+        if (cleared < 0) {
+            ret = cleared;
             goto fail;
         }
 
-        nb_clusters -= ret;
-        offset += (ret * s->cluster_size);
+        nb_clusters -= cleared;
+        offset += (cleared * s->cluster_size);
     }
 
     ret = 0;
@@ -1641,16 +1643,15 @@ static int zero_single_l2(BlockDriverState *bs, uint64_t offset,
     return nb_clusters;
 }
 
-int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
-                        int flags)
+int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t end_offset;
+    uint64_t end_offset = offset + bytes;
     uint64_t nb_clusters;
+    int64_t cleared;
     int ret;
 
-    end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS);
-
     /* Caller must pass aligned values, except at image end */
     assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
     assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
@@ -1662,18 +1663,19 @@ int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
     }
 
     /* Each L2 table is handled by its own loop iteration */
-    nb_clusters = size_to_clusters(s, nb_sectors << BDRV_SECTOR_BITS);
+    nb_clusters = size_to_clusters(s, bytes);
 
     s->cache_discards = true;
 
     while (nb_clusters > 0) {
-        ret = zero_single_l2(bs, offset, nb_clusters, flags);
-        if (ret < 0) {
+        cleared = zero_single_l2(bs, offset, nb_clusters, flags);
+        if (cleared < 0) {
+            ret = cleared;
             goto fail;
         }
 
-        nb_clusters -= ret;
-        offset += (ret * s->cluster_size);
+        nb_clusters -= cleared;
+        offset += (cleared * s->cluster_size);
     }
 
     ret = 0;
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 0324243..44243e0 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -440,10 +440,9 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
 
     /* The VM state isn't needed any more in the active L1 table; in fact, it
      * hurts by causing expensive COW for the next snapshot. */
-    qcow2_discard_clusters(bs, qcow2_vm_state_offset(s),
-                           align_offset(sn->vm_state_size, s->cluster_size)
-                                >> BDRV_SECTOR_BITS,
-                           QCOW2_DISCARD_NEVER, false);
+    qcow2_cluster_discard(bs, qcow2_vm_state_offset(s),
+                          align_offset(sn->vm_state_size, s->cluster_size),
+                          QCOW2_DISCARD_NEVER, false);
 
 #ifdef DEBUG_ALLOC
     {
diff --git a/block/qcow2.c b/block/qcow2.c
index 72b1650..a8d61f0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2512,7 +2512,7 @@ static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
     trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, count);
 
     /* Whatever is left can use real zero clusters */
-    ret = qcow2_zero_clusters(bs, offset, count >> BDRV_SECTOR_BITS, flags);
+    ret = qcow2_cluster_zeroize(bs, offset, count, flags);
     qemu_co_mutex_unlock(&s->lock);
 
     return ret;
@@ -2535,8 +2535,8 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs,
     }
 
     qemu_co_mutex_lock(&s->lock);
-    ret = qcow2_discard_clusters(bs, offset, count >> BDRV_SECTOR_BITS,
-                                 QCOW2_DISCARD_REQUEST, false);
+    ret = qcow2_cluster_discard(bs, offset, count, QCOW2_DISCARD_REQUEST,
+                                false);
     qemu_co_mutex_unlock(&s->lock);
     return ret;
 }
@@ -2843,9 +2843,8 @@ fail:
 static int qcow2_make_empty(BlockDriverState *bs)
 {
     BDRVQcow2State *s = bs->opaque;
-    uint64_t start_sector;
-    int sector_step = (QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size) /
-                       BDRV_SECTOR_SIZE);
+    uint64_t offset, end_offset;
+    int step = QEMU_ALIGN_DOWN(INT_MAX, s->cluster_size);
     int l1_clusters, ret = 0;
 
     l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
@@ -2862,18 +2861,15 @@ static int qcow2_make_empty(BlockDriverState *bs)
 
     /* This fallback code simply discards every active cluster; this is slow,
      * but works in all cases */
-    for (start_sector = 0; start_sector < bs->total_sectors;
-         start_sector += sector_step)
-    {
+    end_offset = bs->total_sectors * BDRV_SECTOR_SIZE;
+    for (offset = 0; offset < end_offset; offset += step) {
         /* As this function is generally used after committing an external
          * snapshot, QCOW2_DISCARD_SNAPSHOT seems appropriate. Also, the
          * default action for this kind of discard is to pass the discard,
          * which will ideally result in an actually smaller image file, as
          * is probably desired. */
-        ret = qcow2_discard_clusters(bs, start_sector * BDRV_SECTOR_SIZE,
-                                     MIN(sector_step,
-                                         bs->total_sectors - start_sector),
-                                     QCOW2_DISCARD_SNAPSHOT, true);
+        ret = qcow2_cluster_discard(bs, offset, MIN(step, end_offset - offset),
+                                    QCOW2_DISCARD_SNAPSHOT, true);
         if (ret < 0) {
             break;
         }
diff --git a/block/qcow2.h b/block/qcow2.h
index 810ceb8..1801dc3 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -551,10 +551,11 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
                                          int compressed_size);
 
 int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m);
-int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
-    int nb_sectors, enum qcow2_discard_type type, bool full_discard);
-int qcow2_zero_clusters(BlockDriverState *bs, uint64_t offset, int nb_sectors,
-                        int flags);
+int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, enum qcow2_discard_type type,
+                          bool full_discard);
+int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
+                          uint64_t bytes, int flags);
 
 int qcow2_expand_zero_clusters(BlockDriverState *bs,
                                BlockDriverAmendStatusCB *status_cb,
-- 
1.8.3.1

  parent reply	other threads:[~2017-05-11 14:35 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-11 14:32 [Qemu-devel] [PULL 00/58] Block layer patches Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 01/58] block: Make bdrv_perm_names public Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 02/58] block: Add, parse and store "force-share" option Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 03/58] block: Respect "force-share" in perm propagating Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 04/58] qemu-img: Add --force-share option to subcommands Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 05/58] qemu-img: Update documentation for -U Kevin Wolf
2017-05-12 17:37   ` [Qemu-devel] [Qemu-block] " Max Reitz
2017-05-15  9:12     ` Fam Zheng
2017-05-11 14:32 ` [Qemu-devel] [PULL 06/58] qemu-io: Add --force-share option Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 07/58] iotests: 030: Prepare for image locking Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 08/58] iotests: 046: " Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 09/58] iotests: 055: Don't attach the target image already for drive-backup Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 10/58] iotests: 085: Avoid image locking conflict Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 11/58] iotests: 087: Don't attach test image twice Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 12/58] iotests: 091: Quit QEMU before checking image Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 13/58] iotests: 172: Use separate images for multiple devices Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 14/58] tests: Use null-co:// instead of /dev/null as the dummy image Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 15/58] file-posix: Add 'locking' option Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 16/58] file-win32: Error out if locking=on Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 17/58] tests: Disable image lock in test-replication Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 18/58] block: Reuse bs as backing hd for drive-backup sync=none Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 19/58] osdep: Add qemu_lock_fd and qemu_unlock_fd Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 20/58] osdep: Fall back to posix lock when OFD lock is unavailable Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 21/58] file-posix: Add image locking to perm operations Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 22/58] qemu-iotests: Add test case 153 for image locking Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 23/58] tests: Add POSIX image locking test case 182 Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 24/58] qcow2: Fix preallocation size formula Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 25/58] qcow2: Reuse preallocated zero clusters Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 26/58] qcow2: Discard " Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 27/58] iotests: Extend test 066 Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 28/58] migration: Unify block node activation error handling Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 29/58] block: New BdrvChildRole.activate() for blk_resume_after_migration() Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 30/58] block: Drop permissions when migration completes Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 31/58] block: Inactivate parents before children Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 32/58] block: Fix write/resize permissions for inactive images Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 33/58] file-posix: Remove .bdrv_inactivate/invalidate_cache Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 34/58] qemu-img: wait for convert coroutines to complete Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 35/58] nvme: Implement Write Zeroes Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 36/58] blockdev: use drained_begin/end for qmp_block_resize Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 37/58] qemu-io: Improve alignment checks Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 38/58] qemu-io: Switch 'alloc' command to byte-based length Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 39/58] qemu-io: Switch 'map' output to byte-based reporting Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 40/58] blkdebug: Sanity check block layer guarantees Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 41/58] blkdebug: Refactor error injection Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 42/58] blkdebug: Add pass-through write_zero and discard support Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 43/58] blkdebug: Simplify override logic Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 44/58] blkdebug: Add ability to override unmap geometries Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 45/58] tests: Add coverage for recent block geometry fixes Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 46/58] qcow2: Nicer variable names in qcow2_update_snapshot_refcount() Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 47/58] qcow2: Use consistent switch indentation Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 48/58] block: Update comments on BDRV_BLOCK_* meanings Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 49/58] qcow2: Correctly report status of preallocated zero clusters Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 50/58] qcow2: Name typedef for cluster type Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 51/58] qcow2: Make distinction between zero cluster types obvious Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 52/58] qcow2: Optimize zero_single_l2() to minimize L2 churn Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 53/58] iotests: Improve _filter_qemu_img_map Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 54/58] iotests: Add test 179 to cover write zeroes with unmap Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 55/58] qcow2: Optimize write zero of unaligned tail cluster Kevin Wolf
2017-05-11 14:32 ` [Qemu-devel] [PULL 56/58] qcow2: Assert that cluster operations are aligned Kevin Wolf
2017-05-11 14:33 ` Kevin Wolf [this message]
2017-05-11 14:33 ` [Qemu-devel] [PULL 58/58] MAINTAINERS: Add qemu-progress to the block layer Kevin Wolf
2017-05-12 13:39 ` [Qemu-devel] [PULL 00/58] Block layer patches Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1494513181-7900-58-git-send-email-kwolf@redhat.com \
    --to=kwolf@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).