qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous
@ 2018-06-26 14:24 Kevin Wolf
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value Kevin Wolf
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

This fixes the problem that blockdev-create on a local file blocks the
main loop despite being a background job. This was caused by file-posix
preallocating the image with blocking syscalls rather than moving this
to the thread pool and yielding the coroutine meanwhile.

v2:
- Add locking to qcow2_co_discard()
- Extra qcow2 fix and cleanup related to the locking code
- Use tracked requests infrastructure for serialising I/O requests
  against truncate in newly allocated areas

Kevin Wolf (6):
  qcow2: Fix qcow2_truncate() error return value
  block: Convert .bdrv_truncate callback to coroutine_fn
  qcow2: Remove coroutine trampoline for preallocate_co()
  block: Move bdrv_truncate() implementation to io.c
  block: Use tracked request for truncate
  file-posix: Make .bdrv_co_truncate asynchronous

 include/block/block.h     |   4 +
 include/block/block_int.h |   7 +-
 include/block/raw-aio.h   |   4 +-
 block.c                   |  64 +----------
 block/copy-on-read.c      |   8 +-
 block/crypto.c            |   9 +-
 block/file-posix.c        | 277 ++++++++++++++++++++++++++--------------------
 block/file-win32.c        |   6 +-
 block/gluster.c           |  14 ++-
 block/io.c                | 134 ++++++++++++++++++++++
 block/iscsi.c             |   8 +-
 block/nfs.c               |   7 +-
 block/qcow2.c             | 124 +++++++++------------
 block/qed.c               |   8 +-
 block/raw-format.c        |   8 +-
 block/rbd.c               |   8 +-
 block/sheepdog.c          |  12 +-
 block/ssh.c               |   6 +-
 18 files changed, 405 insertions(+), 303 deletions(-)

-- 
2.13.6

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:51   ` Stefan Hajnoczi
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn Kevin Wolf
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

If qcow2_alloc_clusters_at() returns an error, we do need to negate it
to get back the positive errno code for error_setg_errno(), but we still
need to return the negative error code.

Fixes: 772d1f973f87269f6a4a4ea4b880680f3779bbdf
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a3a3aa2a97..6b2e1e1058 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3597,7 +3597,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         if (clusters_allocated < 0) {
             error_setg_errno(errp, -clusters_allocated,
                              "Failed to allocate data clusters");
-            return -clusters_allocated;
+            return clusters_allocated;
         }
 
         assert(clusters_allocated == nb_new_data_clusters);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:54   ` Stefan Hajnoczi
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co() Kevin Wolf
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

bdrv_truncate() is an operation that can block (even for a quite long
time, depending on the PreallocMode) in I/O paths that shouldn't block.
Convert it to a coroutine_fn so that we have the infrastructure for
drivers to make their .bdrv_co_truncate implementation asynchronous.

This change could potentially introduce new race conditions because
bdrv_truncate() isn't necessarily executed atomically any more. Whether
this is a problem needs to be evaluated for each block driver that
supports truncate:

* file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
  protocol drivers are trivially safe because they don't actually yield
  yet, so there is no change in behaviour.

* copy-on-read, crypto, raw-format: Essentially just filter drivers that
  pass the request to a child node, no problem.

* qcow2: The implementation modifies metadata, so it needs to hold
  s->lock to be safe with concurrent I/O requests. In order to avoid
  double locking, this requires pulling the locking out into
  preallocate_co() and using qcow2_write_caches() instead of
  bdrv_flush().

* qed: Does a single header update, this is fine without locking.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block.h     |  4 +++
 include/block/block_int.h |  4 +--
 block.c                   | 63 ++++++++++++++++++++++++++++++++++------
 block/copy-on-read.c      |  8 +++---
 block/crypto.c            |  9 +++---
 block/file-posix.c        | 12 ++++----
 block/file-win32.c        |  6 ++--
 block/gluster.c           | 14 +++++----
 block/iscsi.c             |  8 +++---
 block/nfs.c               |  7 +++--
 block/qcow2.c             | 73 ++++++++++++++++++++++++++++-------------------
 block/qed.c               |  8 ++++--
 block/raw-format.c        |  8 +++---
 block/rbd.c               |  8 ++++--
 block/sheepdog.c          | 12 ++++----
 block/ssh.c               |  6 ++--
 16 files changed, 161 insertions(+), 89 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index b1d6fdb97a..42e59ff585 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -300,8 +300,12 @@ int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild *child, int64_t offset,
 BlockDriverState *bdrv_find_backing_image(BlockDriverState *bs,
     const char *backing_file);
 void bdrv_refresh_filename(BlockDriverState *bs);
+
+int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
+                                  PreallocMode prealloc, Error **errp);
 int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
                   Error **errp);
+
 int64_t bdrv_nb_sectors(BlockDriverState *bs);
 int64_t bdrv_getlength(BlockDriverState *bs);
 int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 74646ed722..c653ee663a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -289,8 +289,8 @@ struct BlockDriver {
      * bdrv_parse_filename.
      */
     const char *protocol_name;
-    int (*bdrv_truncate)(BlockDriverState *bs, int64_t offset,
-                         PreallocMode prealloc, Error **errp);
+    int coroutine_fn (*bdrv_co_truncate)(BlockDriverState *bs, int64_t offset,
+                                         PreallocMode prealloc, Error **errp);
 
     int64_t (*bdrv_getlength)(BlockDriverState *bs);
     bool has_variable_length;
diff --git a/block.c b/block.c
index 1b8147c1b3..f761bc85d5 100644
--- a/block.c
+++ b/block.c
@@ -3788,8 +3788,8 @@ exit:
 /**
  * Truncate file to 'offset' bytes (needed only for file protocols)
  */
-int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
-                  Error **errp)
+int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
+                                  PreallocMode prealloc, Error **errp)
 {
     BlockDriverState *bs = child->bs;
     BlockDriver *drv = bs->drv;
@@ -3807,23 +3807,28 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
         return -EINVAL;
     }
 
-    if (!drv->bdrv_truncate) {
+    bdrv_inc_in_flight(bs);
+
+    if (!drv->bdrv_co_truncate) {
         if (bs->file && drv->is_filter) {
-            return bdrv_truncate(bs->file, offset, prealloc, errp);
+            ret = bdrv_co_truncate(bs->file, offset, prealloc, errp);
+            goto out;
         }
         error_setg(errp, "Image format driver does not support resize");
-        return -ENOTSUP;
+        ret = -ENOTSUP;
+        goto out;
     }
     if (bs->read_only) {
         error_setg(errp, "Image is read-only");
-        return -EACCES;
+        ret = -EACCES;
+        goto out;
     }
 
     assert(!(bs->open_flags & BDRV_O_INACTIVE));
 
-    ret = drv->bdrv_truncate(bs, offset, prealloc, errp);
+    ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
     ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
     if (ret < 0) {
@@ -3834,9 +3839,51 @@ int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
     bdrv_dirty_bitmap_truncate(bs, offset);
     bdrv_parent_cb_resize(bs);
     atomic_inc(&bs->write_gen);
+
+out:
+    bdrv_dec_in_flight(bs);
     return ret;
 }
 
+typedef struct TruncateCo {
+    BdrvChild *child;
+    int64_t offset;
+    PreallocMode prealloc;
+    Error **errp;
+    int ret;
+} TruncateCo;
+
+static void coroutine_fn bdrv_truncate_co_entry(void *opaque)
+{
+    TruncateCo *tco = opaque;
+    tco->ret = bdrv_co_truncate(tco->child, tco->offset, tco->prealloc,
+                                tco->errp);
+}
+
+int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
+                  Error **errp)
+{
+    Coroutine *co;
+    TruncateCo tco = {
+        .child      = child,
+        .offset     = offset,
+        .prealloc   = prealloc,
+        .errp       = errp,
+        .ret        = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_truncate_co_entry(&tco);
+    } else {
+        co = qemu_coroutine_create(bdrv_truncate_co_entry, &tco);
+        qemu_coroutine_enter(co);
+        BDRV_POLL_WHILE(child->bs, tco.ret == NOT_DONE);
+    }
+
+    return tco.ret;
+}
+
 /**
  * Length of a allocated file in bytes. Sparse files are counted by actual
  * allocated space. Return < 0 if error or unknown.
diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 6a97208888..1dcdaeed69 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -80,10 +80,10 @@ static int64_t cor_getlength(BlockDriverState *bs)
 }
 
 
-static int cor_truncate(BlockDriverState *bs, int64_t offset,
-                        PreallocMode prealloc, Error **errp)
+static int coroutine_fn cor_co_truncate(BlockDriverState *bs, int64_t offset,
+                                        PreallocMode prealloc, Error **errp)
 {
-    return bdrv_truncate(bs->file, offset, prealloc, errp);
+    return bdrv_co_truncate(bs->file, offset, prealloc, errp);
 }
 
 
@@ -147,7 +147,7 @@ BlockDriver bdrv_copy_on_read = {
     .bdrv_child_perm                    = cor_child_perm,
 
     .bdrv_getlength                     = cor_getlength,
-    .bdrv_truncate                      = cor_truncate,
+    .bdrv_co_truncate                   = cor_co_truncate,
 
     .bdrv_co_preadv                     = cor_co_preadv,
     .bdrv_co_pwritev                    = cor_co_pwritev,
diff --git a/block/crypto.c b/block/crypto.c
index aaa8fb7530..210c80d5c7 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -357,8 +357,9 @@ static int block_crypto_co_create_generic(BlockDriverState *bs,
     return ret;
 }
 
-static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
-                                 PreallocMode prealloc, Error **errp)
+static int coroutine_fn
+block_crypto_co_truncate(BlockDriverState *bs, int64_t offset,
+                         PreallocMode prealloc, Error **errp)
 {
     BlockCrypto *crypto = bs->opaque;
     uint64_t payload_offset =
@@ -371,7 +372,7 @@ static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
 
     offset += payload_offset;
 
-    return bdrv_truncate(bs->file, offset, prealloc, errp);
+    return bdrv_co_truncate(bs->file, offset, prealloc, errp);
 }
 
 static void block_crypto_close(BlockDriverState *bs)
@@ -700,7 +701,7 @@ BlockDriver bdrv_crypto_luks = {
     .bdrv_child_perm    = bdrv_format_default_perms,
     .bdrv_co_create     = block_crypto_co_create_luks,
     .bdrv_co_create_opts = block_crypto_co_create_opts_luks,
-    .bdrv_truncate      = block_crypto_truncate,
+    .bdrv_co_truncate   = block_crypto_co_truncate,
     .create_opts        = &block_crypto_create_opts_luks,
 
     .bdrv_reopen_prepare = block_crypto_reopen_prepare,
diff --git a/block/file-posix.c b/block/file-posix.c
index 07bb061fe4..6223a8bccc 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1856,8 +1856,8 @@ out:
     return result;
 }
 
-static int raw_truncate(BlockDriverState *bs, int64_t offset,
-                        PreallocMode prealloc, Error **errp)
+static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
+                                        PreallocMode prealloc, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
     struct stat st;
@@ -2602,7 +2602,7 @@ BlockDriver bdrv_file = {
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
 
-    .bdrv_truncate = raw_truncate,
+    .bdrv_co_truncate = raw_co_truncate,
     .bdrv_getlength = raw_getlength,
     .bdrv_get_info = raw_get_info,
     .bdrv_get_allocated_file_size
@@ -3082,7 +3082,7 @@ static BlockDriver bdrv_host_device = {
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
 
-    .bdrv_truncate      = raw_truncate,
+    .bdrv_co_truncate       = raw_co_truncate,
     .bdrv_getlength	= raw_getlength,
     .bdrv_get_info = raw_get_info,
     .bdrv_get_allocated_file_size
@@ -3204,7 +3204,7 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
 
-    .bdrv_truncate      = raw_truncate,
+    .bdrv_co_truncate    = raw_co_truncate,
     .bdrv_getlength      = raw_getlength,
     .has_variable_length = true,
     .bdrv_get_allocated_file_size
@@ -3334,7 +3334,7 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
 
-    .bdrv_truncate      = raw_truncate,
+    .bdrv_co_truncate    = raw_co_truncate,
     .bdrv_getlength      = raw_getlength,
     .has_variable_length = true,
     .bdrv_get_allocated_file_size
diff --git a/block/file-win32.c b/block/file-win32.c
index 3c67db4336..0411fe80fd 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -467,8 +467,8 @@ static void raw_close(BlockDriverState *bs)
     }
 }
 
-static int raw_truncate(BlockDriverState *bs, int64_t offset,
-                        PreallocMode prealloc, Error **errp)
+static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
+                                        PreallocMode prealloc, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
     LONG low, high;
@@ -640,7 +640,7 @@ BlockDriver bdrv_file = {
     .bdrv_aio_pwritev   = raw_aio_pwritev,
     .bdrv_aio_flush     = raw_aio_flush,
 
-    .bdrv_truncate	= raw_truncate,
+    .bdrv_co_truncate   = raw_co_truncate,
     .bdrv_getlength	= raw_getlength,
     .bdrv_get_allocated_file_size
                         = raw_get_allocated_file_size,
diff --git a/block/gluster.c b/block/gluster.c
index b5fe7f3e87..a4e1c8ecd8 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1177,8 +1177,10 @@ static coroutine_fn int qemu_gluster_co_rw(BlockDriverState *bs,
     return acb.ret;
 }
 
-static int qemu_gluster_truncate(BlockDriverState *bs, int64_t offset,
-                                 PreallocMode prealloc, Error **errp)
+static coroutine_fn int qemu_gluster_co_truncate(BlockDriverState *bs,
+                                                 int64_t offset,
+                                                 PreallocMode prealloc,
+                                                 Error **errp)
 {
     BDRVGlusterState *s = bs->opaque;
     return qemu_gluster_do_truncate(s->fd, offset, prealloc, errp);
@@ -1499,7 +1501,7 @@ static BlockDriver bdrv_gluster = {
     .bdrv_co_create_opts          = qemu_gluster_co_create_opts,
     .bdrv_getlength               = qemu_gluster_getlength,
     .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
-    .bdrv_truncate                = qemu_gluster_truncate,
+    .bdrv_co_truncate             = qemu_gluster_co_truncate,
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
@@ -1528,7 +1530,7 @@ static BlockDriver bdrv_gluster_tcp = {
     .bdrv_co_create_opts          = qemu_gluster_co_create_opts,
     .bdrv_getlength               = qemu_gluster_getlength,
     .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
-    .bdrv_truncate                = qemu_gluster_truncate,
+    .bdrv_co_truncate             = qemu_gluster_co_truncate,
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
@@ -1557,7 +1559,7 @@ static BlockDriver bdrv_gluster_unix = {
     .bdrv_co_create_opts          = qemu_gluster_co_create_opts,
     .bdrv_getlength               = qemu_gluster_getlength,
     .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
-    .bdrv_truncate                = qemu_gluster_truncate,
+    .bdrv_co_truncate             = qemu_gluster_co_truncate,
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
@@ -1592,7 +1594,7 @@ static BlockDriver bdrv_gluster_rdma = {
     .bdrv_co_create_opts          = qemu_gluster_co_create_opts,
     .bdrv_getlength               = qemu_gluster_getlength,
     .bdrv_get_allocated_file_size = qemu_gluster_allocated_file_size,
-    .bdrv_truncate                = qemu_gluster_truncate,
+    .bdrv_co_truncate             = qemu_gluster_co_truncate,
     .bdrv_co_readv                = qemu_gluster_co_readv,
     .bdrv_co_writev               = qemu_gluster_co_writev,
     .bdrv_co_flush_to_disk        = qemu_gluster_co_flush_to_disk,
diff --git a/block/iscsi.c b/block/iscsi.c
index 9f00fb47a5..bc84b14e20 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -2085,8 +2085,8 @@ static void iscsi_reopen_commit(BDRVReopenState *reopen_state)
     }
 }
 
-static int iscsi_truncate(BlockDriverState *bs, int64_t offset,
-                          PreallocMode prealloc, Error **errp)
+static int coroutine_fn iscsi_co_truncate(BlockDriverState *bs, int64_t offset,
+                                          PreallocMode prealloc, Error **errp)
 {
     IscsiLun *iscsilun = bs->opaque;
     Error *local_err = NULL;
@@ -2431,7 +2431,7 @@ static BlockDriver bdrv_iscsi = {
 
     .bdrv_getlength  = iscsi_getlength,
     .bdrv_get_info   = iscsi_get_info,
-    .bdrv_truncate   = iscsi_truncate,
+    .bdrv_co_truncate    = iscsi_co_truncate,
     .bdrv_refresh_limits = iscsi_refresh_limits,
 
     .bdrv_co_block_status  = iscsi_co_block_status,
@@ -2468,7 +2468,7 @@ static BlockDriver bdrv_iser = {
 
     .bdrv_getlength  = iscsi_getlength,
     .bdrv_get_info   = iscsi_get_info,
-    .bdrv_truncate   = iscsi_truncate,
+    .bdrv_co_truncate    = iscsi_co_truncate,
     .bdrv_refresh_limits = iscsi_refresh_limits,
 
     .bdrv_co_block_status  = iscsi_co_block_status,
diff --git a/block/nfs.c b/block/nfs.c
index 743ca0450e..eab1a2c408 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -743,8 +743,9 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
     return (task.ret < 0 ? task.ret : st.st_blocks * 512);
 }
 
-static int nfs_file_truncate(BlockDriverState *bs, int64_t offset,
-                             PreallocMode prealloc, Error **errp)
+static int coroutine_fn
+nfs_file_co_truncate(BlockDriverState *bs, int64_t offset,
+                     PreallocMode prealloc, Error **errp)
 {
     NFSClient *client = bs->opaque;
     int ret;
@@ -873,7 +874,7 @@ static BlockDriver bdrv_nfs = {
 
     .bdrv_has_zero_init             = nfs_has_zero_init,
     .bdrv_get_allocated_file_size   = nfs_get_allocated_file_size,
-    .bdrv_truncate                  = nfs_file_truncate,
+    .bdrv_co_truncate               = nfs_file_co_truncate,
 
     .bdrv_file_open                 = nfs_file_open,
     .bdrv_close                     = nfs_file_close,
diff --git a/block/qcow2.c b/block/qcow2.c
index 6b2e1e1058..cd1836bc5b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2543,15 +2543,12 @@ static void coroutine_fn preallocate_co(void *opaque)
     BlockDriverState *bs = params->bs;
     uint64_t offset = params->offset;
     uint64_t new_length = params->new_length;
-    BDRVQcow2State *s = bs->opaque;
     uint64_t bytes;
     uint64_t host_offset = 0;
     unsigned int cur_bytes;
     int ret;
     QCowL2Meta *meta;
 
-    qemu_co_mutex_lock(&s->lock);
-
     assert(offset <= new_length);
     bytes = new_length - offset;
 
@@ -2604,7 +2601,6 @@ static void coroutine_fn preallocate_co(void *opaque)
     ret = 0;
 
 done:
-    qemu_co_mutex_unlock(&s->lock);
     params->ret = ret;
 }
 
@@ -3041,7 +3037,11 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     /* And if we're supposed to preallocate metadata, do that now */
     if (qcow2_opts->preallocation != PREALLOC_MODE_OFF) {
+        BDRVQcow2State *s = blk_bs(blk)->opaque;
+        qemu_co_mutex_lock(&s->lock);
         ret = preallocate(blk_bs(blk), 0, qcow2_opts->size);
+        qemu_co_mutex_unlock(&s->lock);
+
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Could not preallocate metadata");
             goto out;
@@ -3437,8 +3437,8 @@ fail:
     return ret;
 }
 
-static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
-                          PreallocMode prealloc, Error **errp)
+static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
+                                          PreallocMode prealloc, Error **errp)
 {
     BDRVQcow2State *s = bs->opaque;
     uint64_t old_length;
@@ -3458,17 +3458,21 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         return -EINVAL;
     }
 
+    qemu_co_mutex_lock(&s->lock);
+
     /* cannot proceed if image has snapshots */
     if (s->nb_snapshots) {
         error_setg(errp, "Can't resize an image which has snapshots");
-        return -ENOTSUP;
+        ret = -ENOTSUP;
+        goto fail;
     }
 
     /* cannot proceed if image has bitmaps */
     if (s->nb_bitmaps) {
         /* TODO: resize bitmaps in the image */
         error_setg(errp, "Can't resize an image which has bitmaps");
-        return -ENOTSUP;
+        ret = -ENOTSUP;
+        goto fail;
     }
 
     old_length = bs->total_sectors * 512;
@@ -3479,7 +3483,8 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         if (prealloc != PREALLOC_MODE_OFF) {
             error_setg(errp,
                        "Preallocation can't be used for shrinking an image");
-            return -EINVAL;
+            ret = -EINVAL;
+            goto fail;
         }
 
         ret = qcow2_cluster_discard(bs, ROUND_UP(offset, s->cluster_size),
@@ -3488,40 +3493,42 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
                                     QCOW2_DISCARD_ALWAYS, true);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Failed to discard cropped clusters");
-            return ret;
+            goto fail;
         }
 
         ret = qcow2_shrink_l1_table(bs, new_l1_size);
         if (ret < 0) {
             error_setg_errno(errp, -ret,
                              "Failed to reduce the number of L2 tables");
-            return ret;
+            goto fail;
         }
 
         ret = qcow2_shrink_reftable(bs);
         if (ret < 0) {
             error_setg_errno(errp, -ret,
                              "Failed to discard unused refblocks");
-            return ret;
+            goto fail;
         }
 
         old_file_size = bdrv_getlength(bs->file->bs);
         if (old_file_size < 0) {
             error_setg_errno(errp, -old_file_size,
                              "Failed to inquire current file length");
-            return old_file_size;
+            ret = old_file_size;
+            goto fail;
         }
         last_cluster = qcow2_get_last_cluster(bs, old_file_size);
         if (last_cluster < 0) {
             error_setg_errno(errp, -last_cluster,
                              "Failed to find the last cluster");
-            return last_cluster;
+            ret = last_cluster;
+            goto fail;
         }
         if ((last_cluster + 1) * s->cluster_size < old_file_size) {
             Error *local_err = NULL;
 
-            bdrv_truncate(bs->file, (last_cluster + 1) * s->cluster_size,
-                          PREALLOC_MODE_OFF, &local_err);
+            bdrv_co_truncate(bs->file, (last_cluster + 1) * s->cluster_size,
+                             PREALLOC_MODE_OFF, &local_err);
             if (local_err) {
                 warn_reportf_err(local_err,
                                  "Failed to truncate the tail of the image: ");
@@ -3531,7 +3538,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         ret = qcow2_grow_l1_table(bs, new_l1_size, true);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Failed to grow the L1 table");
-            return ret;
+            goto fail;
         }
     }
 
@@ -3543,7 +3550,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         ret = preallocate(bs, old_length, offset);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Preallocation failed");
-            return ret;
+            goto fail;
         }
         break;
 
@@ -3559,7 +3566,8 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         if (old_file_size < 0) {
             error_setg_errno(errp, -old_file_size,
                              "Failed to inquire current file length");
-            return old_file_size;
+            ret = old_file_size;
+            goto fail;
         }
         old_file_size = ROUND_UP(old_file_size, s->cluster_size);
 
@@ -3589,7 +3597,8 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         if (allocation_start < 0) {
             error_setg_errno(errp, -allocation_start,
                              "Failed to resize refcount structures");
-            return allocation_start;
+            ret = allocation_start;
+            goto fail;
         }
 
         clusters_allocated = qcow2_alloc_clusters_at(bs, allocation_start,
@@ -3597,7 +3606,8 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         if (clusters_allocated < 0) {
             error_setg_errno(errp, -clusters_allocated,
                              "Failed to allocate data clusters");
-            return clusters_allocated;
+            ret = clusters_allocated;
+            goto fail;
         }
 
         assert(clusters_allocated == nb_new_data_clusters);
@@ -3605,13 +3615,13 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
         /* Allocate the data area */
         new_file_size = allocation_start +
                         nb_new_data_clusters * s->cluster_size;
-        ret = bdrv_truncate(bs->file, new_file_size, prealloc, errp);
+        ret = bdrv_co_truncate(bs->file, new_file_size, prealloc, errp);
         if (ret < 0) {
             error_prepend(errp, "Failed to resize underlying file: ");
             qcow2_free_clusters(bs, allocation_start,
                                 nb_new_data_clusters * s->cluster_size,
                                 QCOW2_DISCARD_OTHER);
-            return ret;
+            goto fail;
         }
 
         /* Create the necessary L2 entries */
@@ -3634,7 +3644,7 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
                 qcow2_free_clusters(bs, host_offset,
                                     nb_new_data_clusters * s->cluster_size,
                                     QCOW2_DISCARD_OTHER);
-                return ret;
+                goto fail;
             }
 
             guest_offset += nb_clusters * s->cluster_size;
@@ -3650,11 +3660,11 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
 
     if (prealloc != PREALLOC_MODE_OFF) {
         /* Flush metadata before actually changing the image size */
-        ret = bdrv_flush(bs);
+        ret = qcow2_write_caches(bs);
         if (ret < 0) {
             error_setg_errno(errp, -ret,
                              "Failed to flush the preallocated area to disk");
-            return ret;
+            goto fail;
         }
     }
 
@@ -3664,11 +3674,14 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
                            &offset, sizeof(uint64_t));
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Failed to update the image size");
-        return ret;
+        goto fail;
     }
 
     s->l1_vm_state_index = new_l1_size;
-    return 0;
+    ret = 0;
+fail:
+    qemu_co_mutex_unlock(&s->lock);
+    return ret;
 }
 
 /* XXX: put compressed sectors first, then all the cluster aligned
@@ -3692,7 +3705,7 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, uint64_t offset,
         if (cluster_offset < 0) {
             return cluster_offset;
         }
-        return bdrv_truncate(bs->file, cluster_offset, PREALLOC_MODE_OFF, NULL);
+        return bdrv_co_truncate(bs->file, cluster_offset, PREALLOC_MODE_OFF, NULL);
     }
 
     if (offset_into_cluster(s, offset)) {
@@ -4696,7 +4709,7 @@ BlockDriver bdrv_qcow2 = {
     .bdrv_co_pdiscard       = qcow2_co_pdiscard,
     .bdrv_co_copy_range_from = qcow2_co_copy_range_from,
     .bdrv_co_copy_range_to  = qcow2_co_copy_range_to,
-    .bdrv_truncate          = qcow2_truncate,
+    .bdrv_co_truncate       = qcow2_co_truncate,
     .bdrv_co_pwritev_compressed = qcow2_co_pwritev_compressed,
     .bdrv_make_empty        = qcow2_make_empty,
 
diff --git a/block/qed.c b/block/qed.c
index 2363814538..689ea9d4d5 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1467,8 +1467,10 @@ static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
                           QED_AIOCB_WRITE | QED_AIOCB_ZERO);
 }
 
-static int bdrv_qed_truncate(BlockDriverState *bs, int64_t offset,
-                             PreallocMode prealloc, Error **errp)
+static int coroutine_fn bdrv_qed_co_truncate(BlockDriverState *bs,
+                                             int64_t offset,
+                                             PreallocMode prealloc,
+                                             Error **errp)
 {
     BDRVQEDState *s = bs->opaque;
     uint64_t old_image_size;
@@ -1678,7 +1680,7 @@ static BlockDriver bdrv_qed = {
     .bdrv_co_readv            = bdrv_qed_co_readv,
     .bdrv_co_writev           = bdrv_qed_co_writev,
     .bdrv_co_pwrite_zeroes    = bdrv_qed_co_pwrite_zeroes,
-    .bdrv_truncate            = bdrv_qed_truncate,
+    .bdrv_co_truncate         = bdrv_qed_co_truncate,
     .bdrv_getlength           = bdrv_qed_getlength,
     .bdrv_get_info            = bdrv_qed_get_info,
     .bdrv_refresh_limits      = bdrv_qed_refresh_limits,
diff --git a/block/raw-format.c b/block/raw-format.c
index f2e468df6f..b78da564d4 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -366,8 +366,8 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 }
 
-static int raw_truncate(BlockDriverState *bs, int64_t offset,
-                        PreallocMode prealloc, Error **errp)
+static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
+                                        PreallocMode prealloc, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
 
@@ -383,7 +383,7 @@ static int raw_truncate(BlockDriverState *bs, int64_t offset,
 
     s->size = offset;
     offset += s->offset;
-    return bdrv_truncate(bs->file, offset, prealloc, errp);
+    return bdrv_co_truncate(bs->file, offset, prealloc, errp);
 }
 
 static void raw_eject(BlockDriverState *bs, bool eject_flag)
@@ -545,7 +545,7 @@ BlockDriver bdrv_raw = {
     .bdrv_co_block_status = &raw_co_block_status,
     .bdrv_co_copy_range_from = &raw_co_copy_range_from,
     .bdrv_co_copy_range_to  = &raw_co_copy_range_to,
-    .bdrv_truncate        = &raw_truncate,
+    .bdrv_co_truncate     = &raw_co_truncate,
     .bdrv_getlength       = &raw_getlength,
     .has_variable_length  = true,
     .bdrv_measure         = &raw_measure,
diff --git a/block/rbd.c b/block/rbd.c
index f2c6965418..ca8e5bbace 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -990,8 +990,10 @@ static int64_t qemu_rbd_getlength(BlockDriverState *bs)
     return info.size;
 }
 
-static int qemu_rbd_truncate(BlockDriverState *bs, int64_t offset,
-                             PreallocMode prealloc, Error **errp)
+static int coroutine_fn qemu_rbd_co_truncate(BlockDriverState *bs,
+                                             int64_t offset,
+                                             PreallocMode prealloc,
+                                             Error **errp)
 {
     BDRVRBDState *s = bs->opaque;
     int r;
@@ -1184,7 +1186,7 @@ static BlockDriver bdrv_rbd = {
     .bdrv_get_info          = qemu_rbd_getinfo,
     .create_opts            = &qemu_rbd_create_opts,
     .bdrv_getlength         = qemu_rbd_getlength,
-    .bdrv_truncate          = qemu_rbd_truncate,
+    .bdrv_co_truncate       = qemu_rbd_co_truncate,
     .protocol_name          = "rbd",
 
     .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
diff --git a/block/sheepdog.c b/block/sheepdog.c
index 665b1763eb..b229a664d9 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -2292,8 +2292,8 @@ static int64_t sd_getlength(BlockDriverState *bs)
     return s->inode.vdi_size;
 }
 
-static int sd_truncate(BlockDriverState *bs, int64_t offset,
-                       PreallocMode prealloc, Error **errp)
+static int coroutine_fn sd_co_truncate(BlockDriverState *bs, int64_t offset,
+                                       PreallocMode prealloc, Error **errp)
 {
     BDRVSheepdogState *s = bs->opaque;
     int ret, fd;
@@ -2609,7 +2609,7 @@ static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
 
     assert(!flags);
     if (offset > s->inode.vdi_size) {
-        ret = sd_truncate(bs, offset, PREALLOC_MODE_OFF, NULL);
+        ret = sd_co_truncate(bs, offset, PREALLOC_MODE_OFF, NULL);
         if (ret < 0) {
             return ret;
         }
@@ -3231,7 +3231,7 @@ static BlockDriver bdrv_sheepdog = {
     .bdrv_has_zero_init           = bdrv_has_zero_init_1,
     .bdrv_getlength               = sd_getlength,
     .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
-    .bdrv_truncate                = sd_truncate,
+    .bdrv_co_truncate             = sd_co_truncate,
 
     .bdrv_co_readv                = sd_co_readv,
     .bdrv_co_writev               = sd_co_writev,
@@ -3268,7 +3268,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
     .bdrv_has_zero_init           = bdrv_has_zero_init_1,
     .bdrv_getlength               = sd_getlength,
     .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
-    .bdrv_truncate                = sd_truncate,
+    .bdrv_co_truncate             = sd_co_truncate,
 
     .bdrv_co_readv                = sd_co_readv,
     .bdrv_co_writev               = sd_co_writev,
@@ -3305,7 +3305,7 @@ static BlockDriver bdrv_sheepdog_unix = {
     .bdrv_has_zero_init           = bdrv_has_zero_init_1,
     .bdrv_getlength               = sd_getlength,
     .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
-    .bdrv_truncate                = sd_truncate,
+    .bdrv_co_truncate             = sd_co_truncate,
 
     .bdrv_co_readv                = sd_co_readv,
     .bdrv_co_writev               = sd_co_writev,
diff --git a/block/ssh.c b/block/ssh.c
index da7bbf73e2..7fbc27abdf 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -1243,8 +1243,8 @@ static int64_t ssh_getlength(BlockDriverState *bs)
     return length;
 }
 
-static int ssh_truncate(BlockDriverState *bs, int64_t offset,
-                        PreallocMode prealloc, Error **errp)
+static int coroutine_fn ssh_co_truncate(BlockDriverState *bs, int64_t offset,
+                                        PreallocMode prealloc, Error **errp)
 {
     BDRVSSHState *s = bs->opaque;
 
@@ -1279,7 +1279,7 @@ static BlockDriver bdrv_ssh = {
     .bdrv_co_readv                = ssh_co_readv,
     .bdrv_co_writev               = ssh_co_writev,
     .bdrv_getlength               = ssh_getlength,
-    .bdrv_truncate                = ssh_truncate,
+    .bdrv_co_truncate             = ssh_co_truncate,
     .bdrv_co_flush_to_disk        = ssh_co_flush,
     .create_opts                  = &ssh_create_opts,
 };
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co()
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value Kevin Wolf
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:55   ` Stefan Hajnoczi
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c Kevin Wolf
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

All callers are coroutine_fns now, so we can just directly call
preallocate_co().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 51 ++++++++-------------------------------------------
 1 file changed, 8 insertions(+), 43 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cd1836bc5b..3062e34f5e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2521,15 +2521,6 @@ static int qcow2_set_up_encryption(BlockDriverState *bs,
     return ret;
 }
 
-
-typedef struct PreallocCo {
-    BlockDriverState *bs;
-    uint64_t offset;
-    uint64_t new_length;
-
-    int ret;
-} PreallocCo;
-
 /**
  * Preallocates metadata structures for data clusters between @offset (in the
  * guest disk) and @new_length (which is thus generally the new guest disk
@@ -2537,12 +2528,9 @@ typedef struct PreallocCo {
  *
  * Returns: 0 on success, -errno on failure.
  */
-static void coroutine_fn preallocate_co(void *opaque)
+static int coroutine_fn preallocate_co(BlockDriverState *bs, uint64_t offset,
+                                       uint64_t new_length)
 {
-    PreallocCo *params = opaque;
-    BlockDriverState *bs = params->bs;
-    uint64_t offset = params->offset;
-    uint64_t new_length = params->new_length;
     uint64_t bytes;
     uint64_t host_offset = 0;
     unsigned int cur_bytes;
@@ -2557,7 +2545,7 @@ static void coroutine_fn preallocate_co(void *opaque)
         ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &host_offset, &meta);
         if (ret < 0) {
-            goto done;
+            return ret;
         }
 
         while (meta) {
@@ -2567,7 +2555,7 @@ static void coroutine_fn preallocate_co(void *opaque)
             if (ret < 0) {
                 qcow2_free_any_clusters(bs, meta->alloc_offset,
                                         meta->nb_clusters, QCOW2_DISCARD_NEVER);
-                goto done;
+                return ret;
             }
 
             /* There are no dependent requests, but we need to remove our
@@ -2594,34 +2582,11 @@ static void coroutine_fn preallocate_co(void *opaque)
         ret = bdrv_pwrite(bs->file, (host_offset + cur_bytes) - 1,
                           &data, 1);
         if (ret < 0) {
-            goto done;
+            return ret;
         }
     }
 
-    ret = 0;
-
-done:
-    params->ret = ret;
-}
-
-static int preallocate(BlockDriverState *bs,
-                       uint64_t offset, uint64_t new_length)
-{
-    PreallocCo params = {
-        .bs         = bs,
-        .offset     = offset,
-        .new_length = new_length,
-        .ret        = -EINPROGRESS,
-    };
-
-    if (qemu_in_coroutine()) {
-        preallocate_co(&params);
-    } else {
-        Coroutine *co = qemu_coroutine_create(preallocate_co, &params);
-        bdrv_coroutine_enter(bs, co);
-        BDRV_POLL_WHILE(bs, params.ret == -EINPROGRESS);
-    }
-    return params.ret;
+    return 0;
 }
 
 /* qcow2_refcount_metadata_size:
@@ -3039,7 +3004,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
     if (qcow2_opts->preallocation != PREALLOC_MODE_OFF) {
         BDRVQcow2State *s = blk_bs(blk)->opaque;
         qemu_co_mutex_lock(&s->lock);
-        ret = preallocate(blk_bs(blk), 0, qcow2_opts->size);
+        ret = preallocate_co(blk_bs(blk), 0, qcow2_opts->size);
         qemu_co_mutex_unlock(&s->lock);
 
         if (ret < 0) {
@@ -3547,7 +3512,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
         break;
 
     case PREALLOC_MODE_METADATA:
-        ret = preallocate(bs, old_length, offset);
+        ret = preallocate_co(bs, old_length, offset);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Preallocation failed");
             goto fail;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
                   ` (2 preceding siblings ...)
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co() Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:56   ` Stefan Hajnoczi
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate Kevin Wolf
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

This moves the bdrv_truncate() implementation from block.c to block/io.c
so it can have access to the tracked requests infrastructure.

This involves making refresh_total_sectors() public (in block_int.h).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block_int.h |   2 +
 block.c                   | 111 +---------------------------------------------
 block/io.c                | 109 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 112 insertions(+), 110 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c653ee663a..740166a996 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1157,4 +1157,6 @@ int coroutine_fn bdrv_co_copy_range_to(BdrvChild *src, uint64_t src_offset,
                                        BdrvChild *dst, uint64_t dst_offset,
                                        uint64_t bytes, BdrvRequestFlags flags);
 
+int refresh_total_sectors(BlockDriverState *bs, int64_t hint);
+
 #endif /* BLOCK_INT_H */
diff --git a/block.c b/block.c
index f761bc85d5..70a46fdd84 100644
--- a/block.c
+++ b/block.c
@@ -725,7 +725,7 @@ static int find_image_format(BlockBackend *file, const char *filename,
  * Set the current 'total_sectors' value
  * Return 0 on success, -errno on error.
  */
-static int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
+int refresh_total_sectors(BlockDriverState *bs, int64_t hint)
 {
     BlockDriver *drv = bs->drv;
 
@@ -2226,16 +2226,6 @@ static void bdrv_parent_cb_change_media(BlockDriverState *bs, bool load)
     }
 }
 
-static void bdrv_parent_cb_resize(BlockDriverState *bs)
-{
-    BdrvChild *c;
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
-        if (c->role->resize) {
-            c->role->resize(c);
-        }
-    }
-}
-
 /*
  * Sets the backing file link of a BDS. A new reference is created; callers
  * which don't need their own reference any more must call bdrv_unref().
@@ -3786,105 +3776,6 @@ exit:
 }
 
 /**
- * Truncate file to 'offset' bytes (needed only for file protocols)
- */
-int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
-                                  PreallocMode prealloc, Error **errp)
-{
-    BlockDriverState *bs = child->bs;
-    BlockDriver *drv = bs->drv;
-    int ret;
-
-    assert(child->perm & BLK_PERM_RESIZE);
-
-    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
-    if (!drv) {
-        error_setg(errp, "No medium inserted");
-        return -ENOMEDIUM;
-    }
-    if (offset < 0) {
-        error_setg(errp, "Image size cannot be negative");
-        return -EINVAL;
-    }
-
-    bdrv_inc_in_flight(bs);
-
-    if (!drv->bdrv_co_truncate) {
-        if (bs->file && drv->is_filter) {
-            ret = bdrv_co_truncate(bs->file, offset, prealloc, errp);
-            goto out;
-        }
-        error_setg(errp, "Image format driver does not support resize");
-        ret = -ENOTSUP;
-        goto out;
-    }
-    if (bs->read_only) {
-        error_setg(errp, "Image is read-only");
-        ret = -EACCES;
-        goto out;
-    }
-
-    assert(!(bs->open_flags & BDRV_O_INACTIVE));
-
-    ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
-    if (ret < 0) {
-        goto out;
-    }
-    ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "Could not refresh total sector count");
-    } else {
-        offset = bs->total_sectors * BDRV_SECTOR_SIZE;
-    }
-    bdrv_dirty_bitmap_truncate(bs, offset);
-    bdrv_parent_cb_resize(bs);
-    atomic_inc(&bs->write_gen);
-
-out:
-    bdrv_dec_in_flight(bs);
-    return ret;
-}
-
-typedef struct TruncateCo {
-    BdrvChild *child;
-    int64_t offset;
-    PreallocMode prealloc;
-    Error **errp;
-    int ret;
-} TruncateCo;
-
-static void coroutine_fn bdrv_truncate_co_entry(void *opaque)
-{
-    TruncateCo *tco = opaque;
-    tco->ret = bdrv_co_truncate(tco->child, tco->offset, tco->prealloc,
-                                tco->errp);
-}
-
-int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
-                  Error **errp)
-{
-    Coroutine *co;
-    TruncateCo tco = {
-        .child      = child,
-        .offset     = offset,
-        .prealloc   = prealloc,
-        .errp       = errp,
-        .ret        = NOT_DONE,
-    };
-
-    if (qemu_in_coroutine()) {
-        /* Fast-path if already in coroutine context */
-        bdrv_truncate_co_entry(&tco);
-    } else {
-        co = qemu_coroutine_create(bdrv_truncate_co_entry, &tco);
-        qemu_coroutine_enter(co);
-        BDRV_POLL_WHILE(child->bs, tco.ret == NOT_DONE);
-    }
-
-    return tco.ret;
-}
-
-/**
  * Length of a allocated file in bytes. Sparse files are counted by actual
  * allocated space. Return < 0 if error or unknown.
  */
diff --git a/block/io.c b/block/io.c
index ef4fedd364..7e87a42b8e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3020,3 +3020,112 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
     bdrv_dec_in_flight(dst_bs);
     return ret;
 }
+
+static void bdrv_parent_cb_resize(BlockDriverState *bs)
+{
+    BdrvChild *c;
+    QLIST_FOREACH(c, &bs->parents, next_parent) {
+        if (c->role->resize) {
+            c->role->resize(c);
+        }
+    }
+}
+
+/**
+ * Truncate file to 'offset' bytes (needed only for file protocols)
+ */
+int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
+                                  PreallocMode prealloc, Error **errp)
+{
+    BlockDriverState *bs = child->bs;
+    BlockDriver *drv = bs->drv;
+    int ret;
+
+    assert(child->perm & BLK_PERM_RESIZE);
+
+    /* if bs->drv == NULL, bs is closed, so there's nothing to do here */
+    if (!drv) {
+        error_setg(errp, "No medium inserted");
+        return -ENOMEDIUM;
+    }
+    if (offset < 0) {
+        error_setg(errp, "Image size cannot be negative");
+        return -EINVAL;
+    }
+
+    bdrv_inc_in_flight(bs);
+
+    if (!drv->bdrv_co_truncate) {
+        if (bs->file && drv->is_filter) {
+            ret = bdrv_co_truncate(bs->file, offset, prealloc, errp);
+            goto out;
+        }
+        error_setg(errp, "Image format driver does not support resize");
+        ret = -ENOTSUP;
+        goto out;
+    }
+    if (bs->read_only) {
+        error_setg(errp, "Image is read-only");
+        ret = -EACCES;
+        goto out;
+    }
+
+    assert(!(bs->open_flags & BDRV_O_INACTIVE));
+
+    ret = drv->bdrv_co_truncate(bs, offset, prealloc, errp);
+    if (ret < 0) {
+        goto out;
+    }
+    ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not refresh total sector count");
+    } else {
+        offset = bs->total_sectors * BDRV_SECTOR_SIZE;
+    }
+    bdrv_dirty_bitmap_truncate(bs, offset);
+    bdrv_parent_cb_resize(bs);
+    atomic_inc(&bs->write_gen);
+
+out:
+    bdrv_dec_in_flight(bs);
+    return ret;
+}
+
+typedef struct TruncateCo {
+    BdrvChild *child;
+    int64_t offset;
+    PreallocMode prealloc;
+    Error **errp;
+    int ret;
+} TruncateCo;
+
+static void coroutine_fn bdrv_truncate_co_entry(void *opaque)
+{
+    TruncateCo *tco = opaque;
+    tco->ret = bdrv_co_truncate(tco->child, tco->offset, tco->prealloc,
+                                tco->errp);
+}
+
+int bdrv_truncate(BdrvChild *child, int64_t offset, PreallocMode prealloc,
+                  Error **errp)
+{
+    Coroutine *co;
+    TruncateCo tco = {
+        .child      = child,
+        .offset     = offset,
+        .prealloc   = prealloc,
+        .errp       = errp,
+        .ret        = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_truncate_co_entry(&tco);
+    } else {
+        co = qemu_coroutine_create(bdrv_truncate_co_entry, &tco);
+        qemu_coroutine_enter(co);
+        BDRV_POLL_WHILE(child->bs, tco.ret == NOT_DONE);
+    }
+
+    return tco.ret;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
                   ` (3 preceding siblings ...)
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:57   ` Stefan Hajnoczi
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous Kevin Wolf
  2018-06-27 11:58 ` [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Stefan Hajnoczi
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

When growing an image, block drivers (especially protocol drivers) may
initialise the newly added area. I/O requests to the same area need to
wait for this initialisation to be completed so that data writes don't
get overwritten and reads don't read uninitialised data.

To avoid overhead in the fast I/O path by adding new locking in the
protocol drivers and to restrict the impact to requests that actually
touch the new area, reuse the existing tracked request infrastructure in
block/io.c and mark all discard requests as serialising.

With this change, it is safe for protocol drivers to make
.bdrv_co_truncate actually asynchronous.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block_int.h |  1 +
 block/io.c                | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 740166a996..af71b414be 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -63,6 +63,7 @@ enum BdrvTrackedRequestType {
     BDRV_TRACKED_READ,
     BDRV_TRACKED_WRITE,
     BDRV_TRACKED_DISCARD,
+    BDRV_TRACKED_TRUNCATE,
 };
 
 typedef struct BdrvTrackedRequest {
diff --git a/block/io.c b/block/io.c
index 7e87a42b8e..01a3c4eac5 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3039,6 +3039,8 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
 {
     BlockDriverState *bs = child->bs;
     BlockDriver *drv = bs->drv;
+    BdrvTrackedRequest req;
+    int64_t old_size, new_bytes;
     int ret;
 
     assert(child->perm & BLK_PERM_RESIZE);
@@ -3053,7 +3055,28 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
         return -EINVAL;
     }
 
+    old_size = bdrv_getlength(bs);
+    if (old_size < 0) {
+        error_setg_errno(errp, -old_size, "Failed to get old image size");
+        return old_size;
+    }
+
+    if (offset > old_size) {
+        new_bytes = offset - old_size;
+    } else {
+        new_bytes = 0;
+    }
+
     bdrv_inc_in_flight(bs);
+    tracked_request_begin(&req, bs, offset, new_bytes, BDRV_TRACKED_TRUNCATE);
+
+    /* If we are growing the image and potentially using preallocation for the
+     * new area, we need to make sure that no write requests are made to it
+     * concurrently or they might be overwritten by preallocation. */
+    if (new_bytes) {
+        mark_request_serialising(&req, 1);
+        wait_serialising_requests(&req);
+    }
 
     if (!drv->bdrv_co_truncate) {
         if (bs->file && drv->is_filter) {
@@ -3087,7 +3110,9 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset,
     atomic_inc(&bs->write_gen);
 
 out:
+    tracked_request_end(&req);
     bdrv_dec_in_flight(bs);
+
     return ret;
 }
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
                   ` (4 preceding siblings ...)
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate Kevin Wolf
@ 2018-06-26 14:24 ` Kevin Wolf
  2018-06-27 11:57   ` Stefan Hajnoczi
  2018-06-27 11:58 ` [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Stefan Hajnoczi
  6 siblings, 1 reply; 15+ messages in thread
From: Kevin Wolf @ 2018-06-26 14:24 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, mreitz, stefanha, qemu-devel

This moves the code to resize an image file to the thread pool to avoid
blocking.

Creating large images with preallocation with blockdev-create is now
actually a background job instead of blocking the monitor (and most
other things) until the preallocation has completed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/raw-aio.h |   4 +-
 block/file-posix.c      | 265 +++++++++++++++++++++++++++---------------------
 2 files changed, 153 insertions(+), 116 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 0e717fd475..e496ae31b2 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -26,6 +26,7 @@
 #define QEMU_AIO_DISCARD      0x0010
 #define QEMU_AIO_WRITE_ZEROES 0x0020
 #define QEMU_AIO_COPY_RANGE   0x0040
+#define QEMU_AIO_TRUNCATE     0x0080
 #define QEMU_AIO_TYPE_MASK \
         (QEMU_AIO_READ | \
          QEMU_AIO_WRITE | \
@@ -33,7 +34,8 @@
          QEMU_AIO_FLUSH | \
          QEMU_AIO_DISCARD | \
          QEMU_AIO_WRITE_ZEROES | \
-         QEMU_AIO_COPY_RANGE)
+         QEMU_AIO_COPY_RANGE | \
+         QEMU_AIO_TRUNCATE)
 
 /* AIO flags */
 #define QEMU_AIO_MISALIGNED   0x1000
diff --git a/block/file-posix.c b/block/file-posix.c
index 6223a8bccc..fa918f2bdb 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -188,8 +188,16 @@ typedef struct RawPosixAIOData {
 #define aio_ioctl_cmd   aio_nbytes /* for QEMU_AIO_IOCTL */
     off_t aio_offset;
     int aio_type;
-    int aio_fd2;
-    off_t aio_offset2;
+    union {
+        struct {
+            int aio_fd2;
+            off_t aio_offset2;
+        };
+        struct {
+            PreallocMode prealloc;
+            Error **errp;
+        };
+    };
 } RawPosixAIOData;
 
 #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
@@ -1533,6 +1541,121 @@ static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
     return ret;
 }
 
+static int handle_aiocb_truncate(RawPosixAIOData *aiocb)
+{
+    int result = 0;
+    int64_t current_length = 0;
+    char *buf = NULL;
+    struct stat st;
+    int fd = aiocb->aio_fildes;
+    int64_t offset = aiocb->aio_offset;
+    Error **errp = aiocb->errp;
+
+    if (fstat(fd, &st) < 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not stat file");
+        return result;
+    }
+
+    current_length = st.st_size;
+    if (current_length > offset && aiocb->prealloc != PREALLOC_MODE_OFF) {
+        error_setg(errp, "Cannot use preallocation for shrinking files");
+        return -ENOTSUP;
+    }
+
+    switch (aiocb->prealloc) {
+#ifdef CONFIG_POSIX_FALLOCATE
+    case PREALLOC_MODE_FALLOC:
+        /*
+         * Truncating before posix_fallocate() makes it about twice slower on
+         * file systems that do not support fallocate(), trying to check if a
+         * block is allocated before allocating it, so don't do that here.
+         */
+        if (offset != current_length) {
+            result = -posix_fallocate(fd, current_length, offset - current_length);
+            if (result != 0) {
+                /* posix_fallocate() doesn't set errno. */
+                error_setg_errno(errp, -result,
+                                 "Could not preallocate new data");
+            }
+        } else {
+            result = 0;
+        }
+        goto out;
+#endif
+    case PREALLOC_MODE_FULL:
+    {
+        int64_t num = 0, left = offset - current_length;
+        off_t seek_result;
+
+        /*
+         * Knowing the final size from the beginning could allow the file
+         * system driver to do less allocations and possibly avoid
+         * fragmentation of the file.
+         */
+        if (ftruncate(fd, offset) != 0) {
+            result = -errno;
+            error_setg_errno(errp, -result, "Could not resize file");
+            goto out;
+        }
+
+        buf = g_malloc0(65536);
+
+        seek_result = lseek(fd, current_length, SEEK_SET);
+        if (seek_result < 0) {
+            result = -errno;
+            error_setg_errno(errp, -result,
+                             "Failed to seek to the old end of file");
+            goto out;
+        }
+
+        while (left > 0) {
+            num = MIN(left, 65536);
+            result = write(fd, buf, num);
+            if (result < 0) {
+                result = -errno;
+                error_setg_errno(errp, -result,
+                                 "Could not write zeros for preallocation");
+                goto out;
+            }
+            left -= result;
+        }
+        if (result >= 0) {
+            result = fsync(fd);
+            if (result < 0) {
+                result = -errno;
+                error_setg_errno(errp, -result,
+                                 "Could not flush file to disk");
+                goto out;
+            }
+        }
+        goto out;
+    }
+    case PREALLOC_MODE_OFF:
+        if (ftruncate(fd, offset) != 0) {
+            result = -errno;
+            error_setg_errno(errp, -result, "Could not resize file");
+        }
+        return result;
+    default:
+        result = -ENOTSUP;
+        error_setg(errp, "Unsupported preallocation mode: %s",
+                   PreallocMode_str(aiocb->prealloc));
+        return result;
+    }
+
+out:
+    if (result < 0) {
+        if (ftruncate(fd, current_length) < 0) {
+            error_report("Failed to restore old file length: %s",
+                         strerror(errno));
+        }
+    }
+
+    g_free(buf);
+    return result;
+}
+
 static int aio_worker(void *arg)
 {
     RawPosixAIOData *aiocb = arg;
@@ -1576,6 +1699,9 @@ static int aio_worker(void *arg)
     case QEMU_AIO_COPY_RANGE:
         ret = handle_aiocb_copy_range(aiocb);
         break;
+    case QEMU_AIO_TRUNCATE:
+        ret = handle_aiocb_truncate(aiocb);
+        break;
     default:
         fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
         ret = -EINVAL;
@@ -1743,117 +1869,25 @@ static void raw_close(BlockDriverState *bs)
  *
  * Returns: 0 on success, -errno on failure.
  */
-static int raw_regular_truncate(int fd, int64_t offset, PreallocMode prealloc,
-                                Error **errp)
+static int coroutine_fn
+raw_regular_truncate(BlockDriverState *bs, int fd, int64_t offset,
+                     PreallocMode prealloc, Error **errp)
 {
-    int result = 0;
-    int64_t current_length = 0;
-    char *buf = NULL;
-    struct stat st;
-
-    if (fstat(fd, &st) < 0) {
-        result = -errno;
-        error_setg_errno(errp, -result, "Could not stat file");
-        return result;
-    }
-
-    current_length = st.st_size;
-    if (current_length > offset && prealloc != PREALLOC_MODE_OFF) {
-        error_setg(errp, "Cannot use preallocation for shrinking files");
-        return -ENOTSUP;
-    }
-
-    switch (prealloc) {
-#ifdef CONFIG_POSIX_FALLOCATE
-    case PREALLOC_MODE_FALLOC:
-        /*
-         * Truncating before posix_fallocate() makes it about twice slower on
-         * file systems that do not support fallocate(), trying to check if a
-         * block is allocated before allocating it, so don't do that here.
-         */
-        if (offset != current_length) {
-            result = -posix_fallocate(fd, current_length, offset - current_length);
-            if (result != 0) {
-                /* posix_fallocate() doesn't set errno. */
-                error_setg_errno(errp, -result,
-                                 "Could not preallocate new data");
-            }
-        } else {
-            result = 0;
-        }
-        goto out;
-#endif
-    case PREALLOC_MODE_FULL:
-    {
-        int64_t num = 0, left = offset - current_length;
-        off_t seek_result;
-
-        /*
-         * Knowing the final size from the beginning could allow the file
-         * system driver to do less allocations and possibly avoid
-         * fragmentation of the file.
-         */
-        if (ftruncate(fd, offset) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not resize file");
-            goto out;
-        }
-
-        buf = g_malloc0(65536);
-
-        seek_result = lseek(fd, current_length, SEEK_SET);
-        if (seek_result < 0) {
-            result = -errno;
-            error_setg_errno(errp, -result,
-                             "Failed to seek to the old end of file");
-            goto out;
-        }
-
-        while (left > 0) {
-            num = MIN(left, 65536);
-            result = write(fd, buf, num);
-            if (result < 0) {
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not write zeros for preallocation");
-                goto out;
-            }
-            left -= result;
-        }
-        if (result >= 0) {
-            result = fsync(fd);
-            if (result < 0) {
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not flush file to disk");
-                goto out;
-            }
-        }
-        goto out;
-    }
-    case PREALLOC_MODE_OFF:
-        if (ftruncate(fd, offset) != 0) {
-            result = -errno;
-            error_setg_errno(errp, -result, "Could not resize file");
-        }
-        return result;
-    default:
-        result = -ENOTSUP;
-        error_setg(errp, "Unsupported preallocation mode: %s",
-                   PreallocMode_str(prealloc));
-        return result;
-    }
+    RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
+    ThreadPool *pool;
 
-out:
-    if (result < 0) {
-        if (ftruncate(fd, current_length) < 0) {
-            error_report("Failed to restore old file length: %s",
-                         strerror(errno));
-        }
-    }
+    *acb = (RawPosixAIOData) {
+        .bs             = bs,
+        .aio_fildes     = fd,
+        .aio_type       = QEMU_AIO_TRUNCATE,
+        .aio_offset     = offset,
+        .prealloc       = prealloc,
+        .errp           = errp,
+    };
 
-    g_free(buf);
-    return result;
+    /* @bs can be NULL, bdrv_get_aio_context() returns the main context then */
+    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    return thread_pool_submit_co(pool, aio_worker, acb);
 }
 
 static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
@@ -1870,7 +1904,7 @@ static int coroutine_fn raw_co_truncate(BlockDriverState *bs, int64_t offset,
     }
 
     if (S_ISREG(st.st_mode)) {
-        return raw_regular_truncate(s->fd, offset, prealloc, errp);
+        return raw_regular_truncate(bs, s->fd, offset, prealloc, errp);
     }
 
     if (prealloc != PREALLOC_MODE_OFF) {
@@ -2072,7 +2106,8 @@ static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
     return (int64_t)st.st_blocks * 512;
 }
 
-static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
+static int coroutine_fn
+raw_co_create(BlockdevCreateOptions *options, Error **errp)
 {
     BlockdevCreateOptionsFile *file_opts;
     int fd;
@@ -2124,7 +2159,7 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
     }
 
     /* Clear the file by truncating it to 0 */
-    result = raw_regular_truncate(fd, 0, PREALLOC_MODE_OFF, errp);
+    result = raw_regular_truncate(NULL, fd, 0, PREALLOC_MODE_OFF, errp);
     if (result < 0) {
         goto out_close;
     }
@@ -2146,8 +2181,8 @@ static int raw_co_create(BlockdevCreateOptions *options, Error **errp)
 
     /* Resize and potentially preallocate the file to the desired
      * final size */
-    result = raw_regular_truncate(fd, file_opts->size, file_opts->preallocation,
-                                  errp);
+    result = raw_regular_truncate(NULL, fd, file_opts->size,
+                                  file_opts->preallocation, errp);
     if (result < 0) {
         goto out_close;
     }
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value Kevin Wolf
@ 2018-06-27 11:51   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:51 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 493 bytes --]

On Tue, Jun 26, 2018 at 04:24:29PM +0200, Kevin Wolf wrote:
> If qcow2_alloc_clusters_at() returns an error, we do need to negate it
> to get back the positive errno code for error_setg_errno(), but we still
> need to return the negative error code.
> 
> Fixes: 772d1f973f87269f6a4a4ea4b880680f3779bbdf
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn Kevin Wolf
@ 2018-06-27 11:54   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:54 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2197 bytes --]

On Tue, Jun 26, 2018 at 04:24:30PM +0200, Kevin Wolf wrote:
> bdrv_truncate() is an operation that can block (even for a quite long
> time, depending on the PreallocMode) in I/O paths that shouldn't block.
> Convert it to a coroutine_fn so that we have the infrastructure for
> drivers to make their .bdrv_co_truncate implementation asynchronous.
> 
> This change could potentially introduce new race conditions because
> bdrv_truncate() isn't necessarily executed atomically any more. Whether
> this is a problem needs to be evaluated for each block driver that
> supports truncate:
> 
> * file-posix/win32, gluster, iscsi, nfs, rbd, ssh, sheepdog: The
>   protocol drivers are trivially safe because they don't actually yield
>   yet, so there is no change in behaviour.
> 
> * copy-on-read, crypto, raw-format: Essentially just filter drivers that
>   pass the request to a child node, no problem.
> 
> * qcow2: The implementation modifies metadata, so it needs to hold
>   s->lock to be safe with concurrent I/O requests. In order to avoid
>   double locking, this requires pulling the locking out into
>   preallocate_co() and using qcow2_write_caches() instead of
>   bdrv_flush().
> 
> * qed: Does a single header update, this is fine without locking.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/block.h     |  4 +++
>  include/block/block_int.h |  4 +--
>  block.c                   | 63 ++++++++++++++++++++++++++++++++++------
>  block/copy-on-read.c      |  8 +++---
>  block/crypto.c            |  9 +++---
>  block/file-posix.c        | 12 ++++----
>  block/file-win32.c        |  6 ++--
>  block/gluster.c           | 14 +++++----
>  block/iscsi.c             |  8 +++---
>  block/nfs.c               |  7 +++--
>  block/qcow2.c             | 73 ++++++++++++++++++++++++++++-------------------
>  block/qed.c               |  8 ++++--
>  block/raw-format.c        |  8 +++---
>  block/rbd.c               |  8 ++++--
>  block/sheepdog.c          | 12 ++++----
>  block/ssh.c               |  6 ++--
>  16 files changed, 161 insertions(+), 89 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co()
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co() Kevin Wolf
@ 2018-06-27 11:55   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:55 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 390 bytes --]

On Tue, Jun 26, 2018 at 04:24:31PM +0200, Kevin Wolf wrote:
> All callers are coroutine_fns now, so we can just directly call
> preallocate_co().
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 51 ++++++++-------------------------------------------
>  1 file changed, 8 insertions(+), 43 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c Kevin Wolf
@ 2018-06-27 11:56   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 652 bytes --]

On Tue, Jun 26, 2018 at 04:24:32PM +0200, Kevin Wolf wrote:
> This moves the bdrv_truncate() implementation from block.c to block/io.c
> so it can have access to the tracked requests infrastructure.
> 
> This involves making refresh_total_sectors() public (in block_int.h).
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/block_int.h |   2 +
>  block.c                   | 111 +---------------------------------------------
>  block/io.c                | 109 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 112 insertions(+), 110 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate Kevin Wolf
@ 2018-06-27 11:57   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:57 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 985 bytes --]

On Tue, Jun 26, 2018 at 04:24:33PM +0200, Kevin Wolf wrote:
> When growing an image, block drivers (especially protocol drivers) may
> initialise the newly added area. I/O requests to the same area need to
> wait for this initialisation to be completed so that data writes don't
> get overwritten and reads don't read uninitialised data.
> 
> To avoid overhead in the fast I/O path by adding new locking in the
> protocol drivers and to restrict the impact to requests that actually
> touch the new area, reuse the existing tracked request infrastructure in
> block/io.c and mark all discard requests as serialising.
> 
> With this change, it is safe for protocol drivers to make
> .bdrv_co_truncate actually asynchronous.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/block_int.h |  1 +
>  block/io.c                | 25 +++++++++++++++++++++++++
>  2 files changed, 26 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous Kevin Wolf
@ 2018-06-27 11:57   ` Stefan Hajnoczi
  0 siblings, 0 replies; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:57 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 642 bytes --]

On Tue, Jun 26, 2018 at 04:24:34PM +0200, Kevin Wolf wrote:
> This moves the code to resize an image file to the thread pool to avoid
> blocking.
> 
> Creating large images with preallocation with blockdev-create is now
> actually a background job instead of blocking the monitor (and most
> other things) until the preallocation has completed.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/raw-aio.h |   4 +-
>  block/file-posix.c      | 265 +++++++++++++++++++++++++++---------------------
>  2 files changed, 153 insertions(+), 116 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous
  2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
                   ` (5 preceding siblings ...)
  2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous Kevin Wolf
@ 2018-06-27 11:58 ` Stefan Hajnoczi
  2018-06-27 12:42   ` Kevin Wolf
  6 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2018-06-27 11:58 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 587 bytes --]

On Tue, Jun 26, 2018 at 04:24:28PM +0200, Kevin Wolf wrote:
> This fixes the problem that blockdev-create on a local file blocks the
> main loop despite being a background job. This was caused by file-posix
> preallocating the image with blocking syscalls rather than moving this
> to the thread pool and yielding the coroutine meanwhile.
> 
> v2:
> - Add locking to qcow2_co_discard()
> - Extra qcow2 fix and cleanup related to the locking code
> - Use tracked requests infrastructure for serialising I/O requests
>   against truncate in newly allocated areas

Looks good!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous
  2018-06-27 11:58 ` [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Stefan Hajnoczi
@ 2018-06-27 12:42   ` Kevin Wolf
  0 siblings, 0 replies; 15+ messages in thread
From: Kevin Wolf @ 2018-06-27 12:42 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-block, mreitz, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 733 bytes --]

Am 27.06.2018 um 13:58 hat Stefan Hajnoczi geschrieben:
> On Tue, Jun 26, 2018 at 04:24:28PM +0200, Kevin Wolf wrote:
> > This fixes the problem that blockdev-create on a local file blocks the
> > main loop despite being a background job. This was caused by file-posix
> > preallocating the image with blocking syscalls rather than moving this
> > to the thread pool and yielding the coroutine meanwhile.
> > 
> > v2:
> > - Add locking to qcow2_co_discard()
> > - Extra qcow2 fix and cleanup related to the locking code
> > - Use tracked requests infrastructure for serialising I/O requests
> >   against truncate in newly allocated areas
> 
> Looks good!

Thanks for the review! Applied to my block branch.

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-06-27 12:42 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-06-26 14:24 [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Kevin Wolf
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 1/6] qcow2: Fix qcow2_truncate() error return value Kevin Wolf
2018-06-27 11:51   ` Stefan Hajnoczi
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 2/6] block: Convert .bdrv_truncate callback to coroutine_fn Kevin Wolf
2018-06-27 11:54   ` Stefan Hajnoczi
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 3/6] qcow2: Remove coroutine trampoline for preallocate_co() Kevin Wolf
2018-06-27 11:55   ` Stefan Hajnoczi
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 4/6] block: Move bdrv_truncate() implementation to io.c Kevin Wolf
2018-06-27 11:56   ` Stefan Hajnoczi
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 5/6] block: Use tracked request for truncate Kevin Wolf
2018-06-27 11:57   ` Stefan Hajnoczi
2018-06-26 14:24 ` [Qemu-devel] [PATCH v2 6/6] file-posix: Make .bdrv_co_truncate asynchronous Kevin Wolf
2018-06-27 11:57   ` Stefan Hajnoczi
2018-06-27 11:58 ` [Qemu-devel] [PATCH v2 0/6] file-posix: Make truncate/preallocate asynchronous Stefan Hajnoczi
2018-06-27 12:42   ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).