qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW
@ 2017-06-07 14:08 Alberto Garcia
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow() Alberto Garcia
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

Hi all,

here's a patch series that rewrites the copy-on-write code in the
qcow2 driver to reduce the number of I/O operations.

This is version v2, please refer to the original e-mail for a complete
description:

https://lists.gnu.org/archive/html/qemu-block/2017-05/msg00882.html

Regards,

Berto

Changes:

v2:
- Patch 1: Update commit message [Eric]
- Patch 7: Make sure that the number of iovs does not exceed IOV_MAX [Anton]
- Patch 7: Don't add zero-length buffers to the qiov in perform_cow()

v1: https://lists.gnu.org/archive/html/qemu-block/2017-05/msg00882.html
- Initial version

Output of git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/7:[down] 'qcow2: Remove unused Error variable in do_perform_cow()'
002/7:[----] [--] 'qcow2: Use unsigned int for both members of Qcow2COWRegion'
003/7:[----] [--] 'qcow2: Make perform_cow() call do_perform_cow() twice'
004/7:[----] [--] 'qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()'
005/7:[----] [--] 'qcow2: Allow reading both COW regions with only one request'
006/7:[----] [--] 'qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()'
007/7:[0014] [FC] 'qcow2: Merge the writing of the COW regions with the guest data'

Alberto Garcia (7):
  qcow2: Remove unused Error variable in do_perform_cow()
  qcow2: Use unsigned int for both members of Qcow2COWRegion
  qcow2: Make perform_cow() call do_perform_cow() twice
  qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
  qcow2: Allow reading both COW regions with only one request
  qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()
  qcow2: Merge the writing of the COW regions with the guest data

 block/qcow2-cluster.c | 192 +++++++++++++++++++++++++++++++++++++-------------
 block/qcow2.c         |  64 ++++++++++++++---
 block/qcow2.h         |  11 ++-
 3 files changed, 207 insertions(+), 60 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow()
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-07 15:44   ` Eric Blake
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion Alberto Garcia
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

We are using the return value of qcow2_encrypt_sectors() to detect
problems but we are throwing away the returned Error since we have no
way to report it to the user. Therefore we can simply get rid of the
local Error variable and pass NULL instead.

Alternatively we could try to figure out a way to pass the original
error instead of simply returning -EIO, but that would be more
invasive, so let's keep the current approach.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d779ea19cf..d1c419f52b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -440,16 +440,14 @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
     }
 
     if (bs->encrypted) {
-        Error *err = NULL;
         int64_t sector = (src_cluster_offset + offset_in_cluster)
                          >> BDRV_SECTOR_BITS;
         assert(s->cipher);
         assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
         assert((bytes & ~BDRV_SECTOR_MASK) == 0);
         if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
-                                  bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
+                                  bytes >> BDRV_SECTOR_BITS, true, NULL) < 0) {
             ret = -EIO;
-            error_free(err);
             goto out;
         }
     }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow() Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-07 16:02   ` Eric Blake
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice Alberto Garcia
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

Qcow2COWRegion has two attributes:

- The offset of the COW region from the start of the first cluster
  touched by the I/O request. Since it's always going to be positive
  and the maximum request size is at most INT_MAX, we can use a
  regular unsigned int to store this offset.

- The size of the COW region in bytes. This is guaranteed to be >= 0,
  so we should use an unsigned type instead.

In x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes.
It will also help keep some assertions simpler now that we know that
there are no negative numbers.

The prototype of do_perform_cow() is also updated to reflect these
changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 4 ++--
 block/qcow2.h         | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index d1c419f52b..a86c5a75a9 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -406,8 +406,8 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
 static int coroutine_fn do_perform_cow(BlockDriverState *bs,
                                        uint64_t src_cluster_offset,
                                        uint64_t cluster_offset,
-                                       int offset_in_cluster,
-                                       int bytes)
+                                       unsigned offset_in_cluster,
+                                       unsigned bytes)
 {
     BDRVQcow2State *s = bs->opaque;
     QEMUIOVector qiov;
diff --git a/block/qcow2.h b/block/qcow2.h
index 1801dc30dc..c26ee0a33d 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -301,10 +301,10 @@ typedef struct Qcow2COWRegion {
      * Offset of the COW region in bytes from the start of the first cluster
      * touched by the request.
      */
-    uint64_t    offset;
+    unsigned    offset;
 
     /** Number of bytes to copy */
-    int         nb_bytes;
+    unsigned    nb_bytes;
 } Qcow2COWRegion;
 
 /**
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow() Alberto Garcia
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-07 21:43   ` Eric Blake
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write() Alberto Garcia
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

Instead of calling perform_cow() twice with a different COW region
each time, call it just once and make perform_cow() handle both
regions.

This patch simply moves code around. The next one will do the actual
reordering of the COW operations.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index a86c5a75a9..4c03639a72 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -414,6 +414,10 @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
     struct iovec iov;
     int ret;
 
+    if (bytes == 0) {
+        return 0;
+    }
+
     iov.iov_len = bytes;
     iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
     if (iov.iov_base == NULL) {
@@ -751,31 +755,40 @@ uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
     return cluster_offset;
 }
 
-static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
+static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
 {
     BDRVQcow2State *s = bs->opaque;
+    Qcow2COWRegion *start = &m->cow_start;
+    Qcow2COWRegion *end = &m->cow_end;
     int ret;
 
-    if (r->nb_bytes == 0) {
+    if (start->nb_bytes == 0 && end->nb_bytes == 0) {
         return 0;
     }
 
     qemu_co_mutex_unlock(&s->lock);
-    ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
+    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
+                         start->offset, start->nb_bytes);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
+                         end->offset, end->nb_bytes);
+
+fail:
     qemu_co_mutex_lock(&s->lock);
 
-    if (ret < 0) {
-        return ret;
-    }
-
     /*
      * Before we update the L2 table to actually point to the new cluster, we
      * need to be sure that the refcounts have been increased and COW was
      * handled.
      */
-    qcow2_cache_depends_on_flush(s->l2_table_cache);
+    if (ret == 0) {
+        qcow2_cache_depends_on_flush(s->l2_table_cache);
+    }
 
-    return 0;
+    return ret;
 }
 
 int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
@@ -795,12 +808,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
     }
 
     /* copy content of unmodified sectors */
-    ret = perform_cow(bs, m, &m->cow_start);
-    if (ret < 0) {
-        goto err;
-    }
-
-    ret = perform_cow(bs, m, &m->cow_end);
+    ret = perform_cow(bs, m);
     if (ret < 0) {
         goto err;
     }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
                   ` (2 preceding siblings ...)
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-09 14:53   ` Eric Blake
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 5/7] qcow2: Allow reading both COW regions with only one request Alberto Garcia
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

This patch splits do_perform_cow() into three separate functions to
read, encrypt and write the COW regions.

perform_cow() can now read both regions first, then encrypt them and
finally write them to disk. The memory allocation is also done in
this function now, using one single buffer large enough to hold both
regions.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 114 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 84 insertions(+), 30 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4c03639a72..af43e6a34f 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -403,34 +403,26 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
     return 0;
 }
 
-static int coroutine_fn do_perform_cow(BlockDriverState *bs,
-                                       uint64_t src_cluster_offset,
-                                       uint64_t cluster_offset,
-                                       unsigned offset_in_cluster,
-                                       unsigned bytes)
+static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
+                                            uint64_t src_cluster_offset,
+                                            unsigned offset_in_cluster,
+                                            uint8_t *buffer,
+                                            unsigned bytes)
 {
-    BDRVQcow2State *s = bs->opaque;
     QEMUIOVector qiov;
-    struct iovec iov;
+    struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
     int ret;
 
     if (bytes == 0) {
         return 0;
     }
 
-    iov.iov_len = bytes;
-    iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
-    if (iov.iov_base == NULL) {
-        return -ENOMEM;
-    }
-
     qemu_iovec_init_external(&qiov, &iov, 1);
 
     BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
 
     if (!bs->drv) {
-        ret = -ENOMEDIUM;
-        goto out;
+        return -ENOMEDIUM;
     }
 
     /* Call .bdrv_co_readv() directly instead of using the public block-layer
@@ -440,39 +432,63 @@ static int coroutine_fn do_perform_cow(BlockDriverState *bs,
     ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
                                   bytes, &qiov, 0);
     if (ret < 0) {
-        goto out;
+        return ret;
     }
 
-    if (bs->encrypted) {
+    return 0;
+}
+
+static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
+                                                uint64_t src_cluster_offset,
+                                                unsigned offset_in_cluster,
+                                                uint8_t *buffer,
+                                                unsigned bytes)
+{
+    if (bytes && bs->encrypted) {
+        BDRVQcow2State *s = bs->opaque;
         int64_t sector = (src_cluster_offset + offset_in_cluster)
                          >> BDRV_SECTOR_BITS;
         assert(s->cipher);
         assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
         assert((bytes & ~BDRV_SECTOR_MASK) == 0);
-        if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
+        if (qcow2_encrypt_sectors(s, sector, buffer, buffer,
                                   bytes >> BDRV_SECTOR_BITS, true, NULL) < 0) {
-            ret = -EIO;
-            goto out;
+            return false;
         }
     }
+    return true;
+}
+
+static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
+                                             uint64_t cluster_offset,
+                                             unsigned offset_in_cluster,
+                                             uint8_t *buffer,
+                                             unsigned bytes)
+{
+    QEMUIOVector qiov;
+    struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
+    int ret;
+
+    if (bytes == 0) {
+        return 0;
+    }
+
+    qemu_iovec_init_external(&qiov, &iov, 1);
 
     ret = qcow2_pre_write_overlap_check(bs, 0,
             cluster_offset + offset_in_cluster, bytes);
     if (ret < 0) {
-        goto out;
+        return ret;
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
     ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
                           bytes, &qiov, 0);
     if (ret < 0) {
-        goto out;
+        return ret;
     }
 
-    ret = 0;
-out:
-    qemu_vfree(iov.iov_base);
-    return ret;
+    return 0;
 }
 
 
@@ -760,22 +776,59 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     BDRVQcow2State *s = bs->opaque;
     Qcow2COWRegion *start = &m->cow_start;
     Qcow2COWRegion *end = &m->cow_end;
+    unsigned buffer_size = start->nb_bytes + end->nb_bytes;
+    uint8_t *start_buffer, *end_buffer;
     int ret;
 
+    assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
+
     if (start->nb_bytes == 0 && end->nb_bytes == 0) {
         return 0;
     }
 
+    /* Reserve a buffer large enough to store the data from both the
+     * start and end COW regions */
+    start_buffer = qemu_try_blockalign(bs, buffer_size);
+    if (start_buffer == NULL) {
+        return -ENOMEM;
+    }
+    /* The part of the buffer where the end region is located */
+    end_buffer = start_buffer + start->nb_bytes;
+
     qemu_co_mutex_unlock(&s->lock);
-    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
-                         start->offset, start->nb_bytes);
+    /* First we read the existing data from both COW regions */
+    ret = do_perform_cow_read(bs, m->offset, start->offset,
+                              start_buffer, start->nb_bytes);
     if (ret < 0) {
         goto fail;
     }
 
-    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
-                         end->offset, end->nb_bytes);
+    ret = do_perform_cow_read(bs, m->offset, end->offset,
+                              end_buffer, end->nb_bytes);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    /* Encrypt the data if necessary before writing it */
+    if (bs->encrypted) {
+        if (!do_perform_cow_encrypt(bs, m->offset, start->offset,
+                                    start_buffer, start->nb_bytes) ||
+            !do_perform_cow_encrypt(bs, m->offset, end->offset,
+                                    end_buffer, end->nb_bytes)) {
+            ret = -EIO;
+            goto fail;
+        }
+    }
+
+    /* And now we can write everything */
+    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
+                               start_buffer, start->nb_bytes);
+    if (ret < 0) {
+        goto fail;
+    }
 
+    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
+                               end_buffer, end->nb_bytes);
 fail:
     qemu_co_mutex_lock(&s->lock);
 
@@ -788,6 +841,7 @@ fail:
         qcow2_cache_depends_on_flush(s->l2_table_cache);
     }
 
+    qemu_vfree(start_buffer);
     return ret;
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 5/7] qcow2: Allow reading both COW regions with only one request
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
                   ` (3 preceding siblings ...)
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write() Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}() Alberto Garcia
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

Reading both COW regions requires two separate requests, but it's
perfectly possible to merge them and perform only one. This generally
improves performance, particularly on rotating disk drives. The
downside is that the data in the middle region is read but discarded.

This patch takes a conservative approach and only merges reads when
the size of the middle region is <= 16KB.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index af43e6a34f..8f6bc3d0b9 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -777,34 +777,53 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     Qcow2COWRegion *start = &m->cow_start;
     Qcow2COWRegion *end = &m->cow_end;
     unsigned buffer_size = start->nb_bytes + end->nb_bytes;
+    unsigned data_bytes = end->offset - (start->offset + start->nb_bytes);
+    bool merge_reads = false;
     uint8_t *start_buffer, *end_buffer;
     int ret;
 
     assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
+    assert(buffer_size <= UINT_MAX - data_bytes);
+    assert(start->offset + start->nb_bytes <= end->offset);
 
     if (start->nb_bytes == 0 && end->nb_bytes == 0) {
         return 0;
     }
 
-    /* Reserve a buffer large enough to store the data from both the
-     * start and end COW regions */
+    /* If we have to read both the start and end COW regions and the
+     * middle region is not too large then perform just one read
+     * operation */
+    if (start->nb_bytes && end->nb_bytes && data_bytes <= 16384) {
+        merge_reads = true;
+        buffer_size += data_bytes;
+    }
+
+    /* Reserve a buffer large enough to store all the data that we're
+     * going to read */
     start_buffer = qemu_try_blockalign(bs, buffer_size);
     if (start_buffer == NULL) {
         return -ENOMEM;
     }
     /* The part of the buffer where the end region is located */
-    end_buffer = start_buffer + start->nb_bytes;
+    end_buffer = start_buffer + buffer_size - end->nb_bytes;
 
     qemu_co_mutex_unlock(&s->lock);
-    /* First we read the existing data from both COW regions */
-    ret = do_perform_cow_read(bs, m->offset, start->offset,
-                              start_buffer, start->nb_bytes);
-    if (ret < 0) {
-        goto fail;
-    }
+    /* First we read the existing data from both COW regions. We
+     * either read the whole region in one go, or the start and end
+     * regions separately. */
+    if (merge_reads) {
+        ret = do_perform_cow_read(bs, m->offset, start->offset,
+                                  start_buffer, buffer_size);
+    } else {
+        ret = do_perform_cow_read(bs, m->offset, start->offset,
+                                  start_buffer, start->nb_bytes);
+        if (ret < 0) {
+            goto fail;
+        }
 
-    ret = do_perform_cow_read(bs, m->offset, end->offset,
-                              end_buffer, end->nb_bytes);
+        ret = do_perform_cow_read(bs, m->offset, end->offset,
+                                  end_buffer, end->nb_bytes);
+    }
     if (ret < 0) {
         goto fail;
     }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}()
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
                   ` (4 preceding siblings ...)
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 5/7] qcow2: Allow reading both COW regions with only one request Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-07 16:20   ` Manos Pitsidianakis
  2017-06-16 15:31   ` Kevin Wolf
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data Alberto Garcia
  2017-06-16 15:31 ` [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Kevin Wolf
  7 siblings, 2 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

Instead of passing a single buffer pointer to do_perform_cow_write(),
pass a QEMUIOVector. This will allow us to merge the write requests
for the COW regions and the actual data into a single one.

Although do_perform_cow_read() does not strictly need to change its
API, we're doing it here as well for consistency.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 51 ++++++++++++++++++++++++---------------------------
 1 file changed, 24 insertions(+), 27 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8f6bc3d0b9..71609ff7a2 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -406,19 +406,14 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
 static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
                                             uint64_t src_cluster_offset,
                                             unsigned offset_in_cluster,
-                                            uint8_t *buffer,
-                                            unsigned bytes)
+                                            QEMUIOVector *qiov)
 {
-    QEMUIOVector qiov;
-    struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
     int ret;
 
-    if (bytes == 0) {
+    if (qiov->size == 0) {
         return 0;
     }
 
-    qemu_iovec_init_external(&qiov, &iov, 1);
-
     BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
 
     if (!bs->drv) {
@@ -430,7 +425,7 @@ static int coroutine_fn do_perform_cow_read(BlockDriverState *bs,
      * which can lead to deadlock when block layer copy-on-read is enabled.
      */
     ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
-                                  bytes, &qiov, 0);
+                                  qiov->size, qiov, 0);
     if (ret < 0) {
         return ret;
     }
@@ -462,28 +457,23 @@ static bool coroutine_fn do_perform_cow_encrypt(BlockDriverState *bs,
 static int coroutine_fn do_perform_cow_write(BlockDriverState *bs,
                                              uint64_t cluster_offset,
                                              unsigned offset_in_cluster,
-                                             uint8_t *buffer,
-                                             unsigned bytes)
+                                             QEMUIOVector *qiov)
 {
-    QEMUIOVector qiov;
-    struct iovec iov = { .iov_base = buffer, .iov_len = bytes };
     int ret;
 
-    if (bytes == 0) {
+    if (qiov->size == 0) {
         return 0;
     }
 
-    qemu_iovec_init_external(&qiov, &iov, 1);
-
     ret = qcow2_pre_write_overlap_check(bs, 0,
-            cluster_offset + offset_in_cluster, bytes);
+            cluster_offset + offset_in_cluster, qiov->size);
     if (ret < 0) {
         return ret;
     }
 
     BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
     ret = bdrv_co_pwritev(bs->file, cluster_offset + offset_in_cluster,
-                          bytes, &qiov, 0);
+                          qiov->size, qiov, 0);
     if (ret < 0) {
         return ret;
     }
@@ -780,6 +770,7 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     unsigned data_bytes = end->offset - (start->offset + start->nb_bytes);
     bool merge_reads = false;
     uint8_t *start_buffer, *end_buffer;
+    QEMUIOVector qiov;
     int ret;
 
     assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
@@ -807,22 +798,25 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     /* The part of the buffer where the end region is located */
     end_buffer = start_buffer + buffer_size - end->nb_bytes;
 
+    qemu_iovec_init(&qiov, 3);
+
     qemu_co_mutex_unlock(&s->lock);
     /* First we read the existing data from both COW regions. We
      * either read the whole region in one go, or the start and end
      * regions separately. */
     if (merge_reads) {
-        ret = do_perform_cow_read(bs, m->offset, start->offset,
-                                  start_buffer, buffer_size);
+        qemu_iovec_add(&qiov, start_buffer, buffer_size);
+        ret = do_perform_cow_read(bs, m->offset, start->offset, &qiov);
     } else {
-        ret = do_perform_cow_read(bs, m->offset, start->offset,
-                                  start_buffer, start->nb_bytes);
+        qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
+        ret = do_perform_cow_read(bs, m->offset, start->offset, &qiov);
         if (ret < 0) {
             goto fail;
         }
 
-        ret = do_perform_cow_read(bs, m->offset, end->offset,
-                                  end_buffer, end->nb_bytes);
+        qemu_iovec_reset(&qiov);
+        qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
+        ret = do_perform_cow_read(bs, m->offset, end->offset, &qiov);
     }
     if (ret < 0) {
         goto fail;
@@ -840,14 +834,16 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     }
 
     /* And now we can write everything */
-    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
-                               start_buffer, start->nb_bytes);
+    qemu_iovec_reset(&qiov);
+    qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
+    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
     if (ret < 0) {
         goto fail;
     }
 
-    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
-                               end_buffer, end->nb_bytes);
+    qemu_iovec_reset(&qiov);
+    qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
+    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
 fail:
     qemu_co_mutex_lock(&s->lock);
 
@@ -861,6 +857,7 @@ fail:
     }
 
     qemu_vfree(start_buffer);
+    qemu_iovec_destroy(&qiov);
     return ret;
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
                   ` (5 preceding siblings ...)
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}() Alberto Garcia
@ 2017-06-07 14:08 ` Alberto Garcia
  2017-06-16 15:31   ` Kevin Wolf
  2017-06-16 15:31 ` [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Kevin Wolf
  7 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-07 14:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi, Alberto Garcia

If the guest tries to write data that results on the allocation of a
new cluster, instead of writing the guest data first and then the data
from the COW regions, write everything together using one single I/O
operation.

This can improve the write performance by 25% or more, depending on
several factors such as the media type, the cluster size and the I/O
request size.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2-cluster.c | 38 ++++++++++++++++++++++--------
 block/qcow2.c         | 64 +++++++++++++++++++++++++++++++++++++++++++--------
 block/qcow2.h         |  7 ++++++
 3 files changed, 90 insertions(+), 19 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 71609ff7a2..8e789e790b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -776,6 +776,7 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
     assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
     assert(buffer_size <= UINT_MAX - data_bytes);
     assert(start->offset + start->nb_bytes <= end->offset);
+    assert(!m->data_qiov || m->data_qiov->size == data_bytes);
 
     if (start->nb_bytes == 0 && end->nb_bytes == 0) {
         return 0;
@@ -833,17 +834,36 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
         }
     }
 
-    /* And now we can write everything */
-    qemu_iovec_reset(&qiov);
-    qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
-    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
-    if (ret < 0) {
-        goto fail;
+    /* And now we can write everything. If we have the guest data we
+     * can write everything in one single operation */
+    if (m->data_qiov) {
+        qemu_iovec_reset(&qiov);
+        if (start->nb_bytes) {
+            qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
+        }
+        qemu_iovec_concat(&qiov, m->data_qiov, 0, data_bytes);
+        if (end->nb_bytes) {
+            qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
+        }
+        /* NOTE: we have a write_aio blkdebug event here followed by
+         * a cow_write one in do_perform_cow_write(), but there's only
+         * one single I/O operation */
+        BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
+        ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
+    } else {
+        /* If there's no guest data then write both COW regions separately */
+        qemu_iovec_reset(&qiov);
+        qemu_iovec_add(&qiov, start_buffer, start->nb_bytes);
+        ret = do_perform_cow_write(bs, m->alloc_offset, start->offset, &qiov);
+        if (ret < 0) {
+            goto fail;
+        }
+
+        qemu_iovec_reset(&qiov);
+        qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
+        ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
     }
 
-    qemu_iovec_reset(&qiov);
-    qemu_iovec_add(&qiov, end_buffer, end->nb_bytes);
-    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset, &qiov);
 fail:
     qemu_co_mutex_lock(&s->lock);
 
diff --git a/block/qcow2.c b/block/qcow2.c
index b3ba5daa93..89be2083d4 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1575,6 +1575,44 @@ fail:
     return ret;
 }
 
+/* Check if it's possible to merge a write request with the writing of
+ * the data from the COW regions */
+static bool can_merge_cow(uint64_t offset, unsigned bytes,
+                          QEMUIOVector *hd_qiov, QCowL2Meta *l2meta)
+{
+    QCowL2Meta *m;
+
+    for (m = l2meta; m != NULL; m = m->next) {
+        /* If both COW regions are empty then there's nothing to merge */
+        if (m->cow_start.nb_bytes == 0 && m->cow_end.nb_bytes == 0) {
+            continue;
+        }
+
+        /* The data (middle) region must be immediately after the
+         * start region */
+        if (l2meta_cow_start(m) + m->cow_start.nb_bytes != offset) {
+            continue;
+        }
+
+        /* The end region must be immediately after the data (middle)
+         * region */
+        if (m->offset + m->cow_end.offset != offset + bytes) {
+            continue;
+        }
+
+        /* Make sure that adding both COW regions to the QEMUIOVector
+         * does not exceed IOV_MAX */
+        if (hd_qiov->niov > IOV_MAX - 2) {
+            continue;
+        }
+
+        m->data_qiov = hd_qiov;
+        return true;
+    }
+
+    return false;
+}
+
 static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                          uint64_t bytes, QEMUIOVector *qiov,
                                          int flags)
@@ -1657,16 +1695,22 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
             goto fail;
         }
 
-        qemu_co_mutex_unlock(&s->lock);
-        BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-        trace_qcow2_writev_data(qemu_coroutine_self(),
-                                cluster_offset + offset_in_cluster);
-        ret = bdrv_co_pwritev(bs->file,
-                              cluster_offset + offset_in_cluster,
-                              cur_bytes, &hd_qiov, 0);
-        qemu_co_mutex_lock(&s->lock);
-        if (ret < 0) {
-            goto fail;
+        /* If we need to do COW, check if it's possible to merge the
+         * writing of the guest data together with that of the COW regions.
+         * If it's not possible (or not necessary) then write the
+         * guest data now. */
+        if (!can_merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
+            qemu_co_mutex_unlock(&s->lock);
+            BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
+            trace_qcow2_writev_data(qemu_coroutine_self(),
+                                    cluster_offset + offset_in_cluster);
+            ret = bdrv_co_pwritev(bs->file,
+                                  cluster_offset + offset_in_cluster,
+                                  cur_bytes, &hd_qiov, 0);
+            qemu_co_mutex_lock(&s->lock);
+            if (ret < 0) {
+                goto fail;
+            }
         }
 
         while (l2meta != NULL) {
diff --git a/block/qcow2.h b/block/qcow2.h
index c26ee0a33d..87b15eb4aa 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -343,6 +343,13 @@ typedef struct QCowL2Meta
      */
     Qcow2COWRegion cow_end;
 
+    /**
+     * The I/O vector with the data from the actual guest write request.
+     * If non-NULL, this is meant to be merged together with the data
+     * from @cow_start and @cow_end into one single write operation.
+     */
+    QEMUIOVector *data_qiov;
+
     /** Pointer to next L2Meta of the same write request */
     struct QCowL2Meta *next;
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow()
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow() Alberto Garcia
@ 2017-06-07 15:44   ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2017-06-07 15:44 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 827 bytes --]

On 06/07/2017 09:08 AM, Alberto Garcia wrote:
> We are using the return value of qcow2_encrypt_sectors() to detect
> problems but we are throwing away the returned Error since we have no
> way to report it to the user. Therefore we can simply get rid of the
> local Error variable and pass NULL instead.
> 
> Alternatively we could try to figure out a way to pass the original
> error instead of simply returning -EIO, but that would be more
> invasive, so let's keep the current approach.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion Alberto Garcia
@ 2017-06-07 16:02   ` Eric Blake
  2017-06-08 13:06     ` Alberto Garcia
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-06-07 16:02 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2511 bytes --]

On 06/07/2017 09:08 AM, Alberto Garcia wrote:
> Qcow2COWRegion has two attributes:
> 
> - The offset of the COW region from the start of the first cluster
>   touched by the I/O request. Since it's always going to be positive
>   and the maximum request size is at most INT_MAX, we can use a
>   regular unsigned int to store this offset.

I don't know if we will ever get to the point that we allow a 64-bit
request at the block layer (and then the block layer guarantees it is
split down to the driver's limits, which works when a driver is still
bound by 32-bit limits).  But we ALSO know that a cluster is at most 2M
(in our current implementation of qcow2), so the offset of where the COW
region starts in relation to the start of a cluster is < 2M.

> 
> - The size of the COW region in bytes. This is guaranteed to be >= 0,
>   so we should use an unsigned type instead.

And likewise, since a COW region is a sub-cluster, and clusters are
bounded at 2M, we also have a sub-int upper bound on the size of the region.

> 
> In x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes.
> It will also help keep some assertions simpler now that we know that
> there are no negative numbers.
> 
> The prototype of do_perform_cow() is also updated to reflect these
> changes.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 4 ++--
>  block/qcow2.h         | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index d1c419f52b..a86c5a75a9 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -406,8 +406,8 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
>  static int coroutine_fn do_perform_cow(BlockDriverState *bs,
>                                         uint64_t src_cluster_offset,
>                                         uint64_t cluster_offset,
> -                                       int offset_in_cluster,
> -                                       int bytes)
> +                                       unsigned offset_in_cluster,
> +                                       unsigned bytes)

I don't know if the code base has a strong preference for 'unsigned int'
over 'unsigned', but it doesn't bother me.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}()
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}() Alberto Garcia
@ 2017-06-07 16:20   ` Manos Pitsidianakis
  2017-06-16 15:31   ` Kevin Wolf
  1 sibling, 0 replies; 21+ messages in thread
From: Manos Pitsidianakis @ 2017-06-07 16:20 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: qemu-devel, Kevin Wolf, qemu-block, Max Reitz, Stefan Hajnoczi,
	Denis V . Lunev

[-- Attachment #1: Type: text/plain, Size: 495 bytes --]

On Wed, Jun 07, 2017 at 05:08:27PM +0300, Alberto Garcia wrote:
>Instead of passing a single buffer pointer to do_perform_cow_write(),
>pass a QEMUIOVector. This will allow us to merge the write requests
>for the COW regions and the actual data into a single one.
>
>Although do_perform_cow_read() does not strictly need to change its
>API, we're doing it here as well for consistency.
>
>Signed-off-by: Alberto Garcia <berto@igalia.com>

Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice Alberto Garcia
@ 2017-06-07 21:43   ` Eric Blake
  2017-06-08  7:09     ` Alberto Garcia
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-06-07 21:43 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

On 06/07/2017 09:08 AM, Alberto Garcia wrote:
> Instead of calling perform_cow() twice with a different COW region
> each time, call it just once and make perform_cow() handle both
> regions.
> 
> This patch simply moves code around. The next one will do the actual
> reordering of the COW operations.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 38 +++++++++++++++++++++++---------------
>  1 file changed, 23 insertions(+), 15 deletions(-)

>      qemu_co_mutex_unlock(&s->lock);
> -    ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
> +    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
> +                         start->offset, start->nb_bytes);
> +    if (ret < 0) {
> +        goto fail;
> +    }
> +
> +    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
> +                         end->offset, end->nb_bytes);
> +
> +fail:

Since you reach this label even on success, it feels a bit awkward. I
probably would have avoided the goto and used fewer lines by refactoring
the logic as:

ret = do_perform_cow(..., start->...);
if (ret >= 0) {
    ret = do_perform_cow(..., end->...);
}

But that doesn't affect correctness, so whether or not you redo the
logic in the shorter fashion:

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice
  2017-06-07 21:43   ` Eric Blake
@ 2017-06-08  7:09     ` Alberto Garcia
  0 siblings, 0 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-08  7:09 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

On Wed 07 Jun 2017 11:43:34 PM CEST, Eric Blake wrote:
>>  block/qcow2-cluster.c | 38 +++++++++++++++++++++++---------------
>>  1 file changed, 23 insertions(+), 15 deletions(-)
>
>>      qemu_co_mutex_unlock(&s->lock);
>> -    ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
>> +    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
>> +                         start->offset, start->nb_bytes);
>> +    if (ret < 0) {
>> +        goto fail;
>> +    }
>> +
>> +    ret = do_perform_cow(bs, m->offset, m->alloc_offset,
>> +                         end->offset, end->nb_bytes);
>> +
>> +fail:
>
> Since you reach this label even on success, it feels a bit awkward. I
> probably would have avoided the goto and used fewer lines by
> refactoring the logic as:
>
> ret = do_perform_cow(..., start->...);
> if (ret >= 0) {
>     ret = do_perform_cow(..., end->...);
> }

I actually wrote this code without any gotos the way you mention, and it
works fine in this patch, but as I was making more changes I ended up
with a quite large piece of code that needed to add "if (!ret)" to
almost every if statement, so I didn't quite like the final result.

I'm open to reconsider it, but take a look first at the code after the
last patch and you'll see that it would not be as neat as it appears
now.

Berto

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion
  2017-06-07 16:02   ` Eric Blake
@ 2017-06-08 13:06     ` Alberto Garcia
  2017-06-08 13:38       ` Eric Blake
  0 siblings, 1 reply; 21+ messages in thread
From: Alberto Garcia @ 2017-06-08 13:06 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

On Wed 07 Jun 2017 06:02:06 PM CEST, Eric Blake wrote:

>> - The offset of the COW region from the start of the first cluster
>>   touched by the I/O request. Since it's always going to be positive
>>   and the maximum request size is at most INT_MAX, we can use a
>>   regular unsigned int to store this offset.
>
> I don't know if we will ever get to the point that we allow a 64-bit
> request at the block layer (and then the block layer guarantees it is
> split down to the driver's limits, which works when a driver is still
> bound by 32-bit limits).  But we ALSO know that a cluster is at most
> 2M (in our current implementation of qcow2), so the offset of where
> the COW region starts in relation to the start of a cluster is < 2M.

That's correct for the offset of the first region (m->cow_start),
however m->cow_end is also relative to the offset of the first allocated
cluster, so it can be > 2M if the requests spans several clusters.

So I guess the maximum theoretical offset would be something like
INT_MAX + 2*cluster_size (a bit below that actually).

>> - The size of the COW region in bytes. This is guaranteed to be >= 0,
>>   so we should use an unsigned type instead.
>
> And likewise, since a COW region is a sub-cluster, and clusters are
> bounded at 2M, we also have a sub-int upper bound on the size of the
> region.

That's correct.

Berto

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion
  2017-06-08 13:06     ` Alberto Garcia
@ 2017-06-08 13:38       ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2017-06-08 13:38 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 1504 bytes --]

On 06/08/2017 08:06 AM, Alberto Garcia wrote:
> On Wed 07 Jun 2017 06:02:06 PM CEST, Eric Blake wrote:
> 
>>> - The offset of the COW region from the start of the first cluster
>>>   touched by the I/O request. Since it's always going to be positive
>>>   and the maximum request size is at most INT_MAX, we can use a
>>>   regular unsigned int to store this offset.
>>
>> I don't know if we will ever get to the point that we allow a 64-bit
>> request at the block layer (and then the block layer guarantees it is
>> split down to the driver's limits, which works when a driver is still
>> bound by 32-bit limits).  But we ALSO know that a cluster is at most
>> 2M (in our current implementation of qcow2), so the offset of where
>> the COW region starts in relation to the start of a cluster is < 2M.
> 
> That's correct for the offset of the first region (m->cow_start),
> however m->cow_end is also relative to the offset of the first allocated
> cluster, so it can be > 2M if the requests spans several clusters.
> 
> So I guess the maximum theoretical offset would be something like
> INT_MAX + 2*cluster_size (a bit below that actually).

At which point, your earlier claim that we bound requests at INT_MAX
takes over.  But thanks for correcting me that the end cow region can
indeed be more than a cluster away from the beginning region.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write() Alberto Garcia
@ 2017-06-09 14:53   ` Eric Blake
  2017-06-12 13:00     ` Alberto Garcia
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-06-09 14:53 UTC (permalink / raw)
  To: Alberto Garcia, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 3321 bytes --]

On 06/07/2017 09:08 AM, Alberto Garcia wrote:
> This patch splits do_perform_cow() into three separate functions to
> read, encrypt and write the COW regions.
> 
> perform_cow() can now read both regions first, then encrypt them and
> finally write them to disk. The memory allocation is also done in
> this function now, using one single buffer large enough to hold both
> regions.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 114 +++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 84 insertions(+), 30 deletions(-)
> 

Let's suppose we have a guest issuing 512-byte aligned requests and a
host that requires 4k alignment; and the guest does an operation that
needs a COW with one sector at both the front and end of the cluster.

> @@ -760,22 +776,59 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
>      BDRVQcow2State *s = bs->opaque;
>      Qcow2COWRegion *start = &m->cow_start;
>      Qcow2COWRegion *end = &m->cow_end;
> +    unsigned buffer_size = start->nb_bytes + end->nb_bytes;

This sets buffer_size to 1024 initially.

> +    uint8_t *start_buffer, *end_buffer;
>      int ret;
>  
> +    assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
> +
>      if (start->nb_bytes == 0 && end->nb_bytes == 0) {
>          return 0;
>      }
>  
> +    /* Reserve a buffer large enough to store the data from both the
> +     * start and end COW regions */
> +    start_buffer = qemu_try_blockalign(bs, buffer_size);

This is going to allocate a bigger buffer, namely one that is at least
4k in size (at least, that's my understanding - the block device is able
to track its preferred IO size/alignment).

> +    if (start_buffer == NULL) {
> +        return -ENOMEM;
> +    }
> +    /* The part of the buffer where the end region is located */
> +    end_buffer = start_buffer + start->nb_bytes;

But now end_buffer does not have optimal alignment.  In the old code, we
called qemu_try_blockalign() twice, so that both read()s were called on
a 4k boundary; but now, the end read() is called unaligned to a 4k
boundary.  Of course, since we're only reading 512 bytes, instead of 4k,
it MIGHT not matter, but I don't know if we are going to cause a bounce
buffer to come into play that we could otherwise avoid if we were
smarter with our alignments.  Is that something we need to analyze
further, or even possibly intentionally over-allocate our buffer to
ensure optimal read alignments?

> +    /* And now we can write everything */
> +    ret = do_perform_cow_write(bs, m->alloc_offset, start->offset,
> +                               start_buffer, start->nb_bytes);
> +    if (ret < 0) {
> +        goto fail;
> +    }
>  
> +    ret = do_perform_cow_write(bs, m->alloc_offset, end->offset,
> +                               end_buffer, end->nb_bytes);

At any rate, other than the potential of a pessimization due to poor
alignments, your split looks sane, and it makes it more obvious that if
we set up the write iov, a later patch could then call a single
do_perform_cow_write using pwritev() over both chunks in one syscall.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
  2017-06-09 14:53   ` Eric Blake
@ 2017-06-12 13:00     ` Alberto Garcia
  0 siblings, 0 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-12 13:00 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Denis V . Lunev,
	Stefan Hajnoczi

On Fri 09 Jun 2017 04:53:05 PM CEST, Eric Blake wrote:
> Let's suppose we have a guest issuing 512-byte aligned requests and a
> host that requires 4k alignment; and the guest does an operation that
> needs a COW with one sector at both the front and end of the cluster.
>
>> @@ -760,22 +776,59 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
>>      BDRVQcow2State *s = bs->opaque;
>>      Qcow2COWRegion *start = &m->cow_start;
>>      Qcow2COWRegion *end = &m->cow_end;
>> +    unsigned buffer_size = start->nb_bytes + end->nb_bytes;
>
> This sets buffer_size to 1024 initially.
>
>> +    uint8_t *start_buffer, *end_buffer;
>>      int ret;
>>  
>> +    assert(start->nb_bytes <= UINT_MAX - end->nb_bytes);
>> +
>>      if (start->nb_bytes == 0 && end->nb_bytes == 0) {
>>          return 0;
>>      }
>>  
>> +    /* Reserve a buffer large enough to store the data from both the
>> +     * start and end COW regions */
>> +    start_buffer = qemu_try_blockalign(bs, buffer_size);
>
> This is going to allocate a bigger buffer, namely one that is at least
> 4k in size (at least, that's my understanding - the block device is
> able to track its preferred IO size/alignment).
>
>> +    if (start_buffer == NULL) {
>> +        return -ENOMEM;
>> +    }
>> +    /* The part of the buffer where the end region is located */
>> +    end_buffer = start_buffer + start->nb_bytes;
>
> But now end_buffer does not have optimal alignment.  In the old code,
> we called qemu_try_blockalign() twice, so that both read()s were
> called on a 4k boundary; but now, the end read() is called unaligned
> to a 4k boundary.  Of course, since we're only reading 512 bytes,
> instead of 4k, it MIGHT not matter, but I don't know if we are going
> to cause a bounce buffer to come into play that we could otherwise
> avoid if we were smarter with our alignments.  Is that something we
> need to analyze further, or even possibly intentionally over-allocate
> our buffer to ensure optimal read alignments?

I think you're right, I actually thought about this but forgot it when
preparing the series for publishing.

If I'm not wrong it should be simply a matter of replacing

   buffer_size = start->nb_bytes + end->nb_bytes;

with

   buffer_size = QEMU_ALIGN_UP(start->nb_bytes, bdrv_opt_mem_align(bs))
                 + end->nb_bytes;

and then

    end_buffer = start_buffer + buffer_size - end->nb_bytes;

    (this one we do anyway in patch 5)

Berto

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}()
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}() Alberto Garcia
  2017-06-07 16:20   ` Manos Pitsidianakis
@ 2017-06-16 15:31   ` Kevin Wolf
  1 sibling, 0 replies; 21+ messages in thread
From: Kevin Wolf @ 2017-06-16 15:31 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: qemu-devel, qemu-block, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi

Am 07.06.2017 um 16:08 hat Alberto Garcia geschrieben:
> Instead of passing a single buffer pointer to do_perform_cow_write(),
> pass a QEMUIOVector. This will allow us to merge the write requests
> for the COW regions and the actual data into a single one.
> 
> Although do_perform_cow_read() does not strictly need to change its
> API, we're doing it here as well for consistency.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2-cluster.c | 51 ++++++++++++++++++++++++---------------------------
>  1 file changed, 24 insertions(+), 27 deletions(-)

> @@ -807,22 +798,25 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m)
>      /* The part of the buffer where the end region is located */
>      end_buffer = start_buffer + buffer_size - end->nb_bytes;
>  
> +    qemu_iovec_init(&qiov, 3);
> +

You don't actually make use of more than one iovec in this patch. And
after the last patch, the value is potentially too low - which isn't
fatal because this is not a fixed maximum, but just a hint, but it can
cause unnecessary memory allocations.

It would probably be better to use 1 here and change it into
2 + m->data_qiov->niov when you add write merging.

Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data Alberto Garcia
@ 2017-06-16 15:31   ` Kevin Wolf
  2017-06-19 11:50     ` Alberto Garcia
  0 siblings, 1 reply; 21+ messages in thread
From: Kevin Wolf @ 2017-06-16 15:31 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: qemu-devel, qemu-block, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi

Am 07.06.2017 um 16:08 hat Alberto Garcia geschrieben:
> If the guest tries to write data that results on the allocation of a
> new cluster, instead of writing the guest data first and then the data
> from the COW regions, write everything together using one single I/O
> operation.
> 
> This can improve the write performance by 25% or more, depending on
> several factors such as the media type, the cluster size and the I/O
> request size.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>

> diff --git a/block/qcow2.c b/block/qcow2.c
> index b3ba5daa93..89be2083d4 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1575,6 +1575,44 @@ fail:
>      return ret;
>  }
>  
> +/* Check if it's possible to merge a write request with the writing of
> + * the data from the COW regions */
> +static bool can_merge_cow(uint64_t offset, unsigned bytes,
> +                          QEMUIOVector *hd_qiov, QCowL2Meta *l2meta)
> +{
> +    QCowL2Meta *m;
> +
> +    for (m = l2meta; m != NULL; m = m->next) {
> +        /* If both COW regions are empty then there's nothing to merge */
> +        if (m->cow_start.nb_bytes == 0 && m->cow_end.nb_bytes == 0) {
> +            continue;
> +        }
> +
> +        /* The data (middle) region must be immediately after the
> +         * start region */
> +        if (l2meta_cow_start(m) + m->cow_start.nb_bytes != offset) {
> +            continue;
> +        }
> +
> +        /* The end region must be immediately after the data (middle)
> +         * region */
> +        if (m->offset + m->cow_end.offset != offset + bytes) {
> +            continue;
> +        }
> +
> +        /* Make sure that adding both COW regions to the QEMUIOVector
> +         * does not exceed IOV_MAX */
> +        if (hd_qiov->niov > IOV_MAX - 2) {
> +            continue;
> +        }
> +
> +        m->data_qiov = hd_qiov;

I don't think having this side effect is good for a function that is
called can_xyz(). I'd either call it merge_cow() or move the assignment
to the caller.

If we consider allowing to merge multiple L2Metas in the future, the
actual merging could become more complicated, so maybe merge_cow() is
the better option.

Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW
  2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
                   ` (6 preceding siblings ...)
  2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data Alberto Garcia
@ 2017-06-16 15:31 ` Kevin Wolf
  7 siblings, 0 replies; 21+ messages in thread
From: Kevin Wolf @ 2017-06-16 15:31 UTC (permalink / raw)
  To: Alberto Garcia
  Cc: qemu-devel, qemu-block, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi

Am 07.06.2017 um 16:08 hat Alberto Garcia geschrieben:
> Hi all,
> 
> here's a patch series that rewrites the copy-on-write code in the
> qcow2 driver to reduce the number of I/O operations.
> 
> This is version v2, please refer to the original e-mail for a complete
> description:
> 
> https://lists.gnu.org/archive/html/qemu-block/2017-05/msg00882.html

I had two minor comments, and there's the one from Eric about the buffer
alignement. Other than that, this series looks good. Feel free to add my
Reviewed-by after addressing these three comments.

Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data
  2017-06-16 15:31   ` Kevin Wolf
@ 2017-06-19 11:50     ` Alberto Garcia
  0 siblings, 0 replies; 21+ messages in thread
From: Alberto Garcia @ 2017-06-19 11:50 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: qemu-devel, qemu-block, Max Reitz, Eric Blake, Denis V . Lunev,
	Stefan Hajnoczi

On Fri 16 Jun 2017 05:31:42 PM CEST, Kevin Wolf wrote:
>> +        /* Make sure that adding both COW regions to the QEMUIOVector
>> +         * does not exceed IOV_MAX */
>> +        if (hd_qiov->niov > IOV_MAX - 2) {
>> +            continue;
>> +        }
>> +
>> +        m->data_qiov = hd_qiov;
>
> I don't think having this side effect is good for a function that is
> called can_xyz(). I'd either call it merge_cow() or move the
> assignment to the caller.

You're right, I think I prefer to move the assignment to the caller for
now.

Berto

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-06-19 11:51 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-07 14:08 [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Alberto Garcia
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 1/7] qcow2: Remove unused Error variable in do_perform_cow() Alberto Garcia
2017-06-07 15:44   ` Eric Blake
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 2/7] qcow2: Use unsigned int for both members of Qcow2COWRegion Alberto Garcia
2017-06-07 16:02   ` Eric Blake
2017-06-08 13:06     ` Alberto Garcia
2017-06-08 13:38       ` Eric Blake
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 3/7] qcow2: Make perform_cow() call do_perform_cow() twice Alberto Garcia
2017-06-07 21:43   ` Eric Blake
2017-06-08  7:09     ` Alberto Garcia
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 4/7] qcow2: Split do_perform_cow() into _read(), _encrypt() and _write() Alberto Garcia
2017-06-09 14:53   ` Eric Blake
2017-06-12 13:00     ` Alberto Garcia
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 5/7] qcow2: Allow reading both COW regions with only one request Alberto Garcia
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 6/7] qcow2: Pass a QEMUIOVector to do_perform_cow_{read, write}() Alberto Garcia
2017-06-07 16:20   ` Manos Pitsidianakis
2017-06-16 15:31   ` Kevin Wolf
2017-06-07 14:08 ` [Qemu-devel] [PATCH v2 7/7] qcow2: Merge the writing of the COW regions with the guest data Alberto Garcia
2017-06-16 15:31   ` Kevin Wolf
2017-06-19 11:50     ` Alberto Garcia
2017-06-16 15:31 ` [Qemu-devel] [PATCH v2 0/7] qcow2: Reduce the number of I/O ops when doing COW Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).