qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series)
@ 2011-10-14  8:41 Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh Paolo Bonzini
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Thanks to Stefan's series from today, I managed to understand what
the coroutinization was all about.  So I split this part out of the
NBD series.

Applies on top of block branch + the other five patches.

Paolo Bonzini (3):
  block: rename bdrv_co_rw_bh
  block: unify flush implementations
  block: add bdrv_co_discard and bdrv_aio_discard support

Stefan Hajnoczi (1):
  block: drop redundant bdrv_flush implementation

 block.c           |  236 ++++++++++++++++++++++++++++++-----------------------
 block.h           |    3 +
 block/blkdebug.c  |    6 --
 block/blkverify.c |    9 --
 block/qcow.c      |    6 --
 block/qcow2.c     |   19 ----
 block/qed.c       |    6 --
 block/raw-posix.c |   18 ----
 block/raw.c       |   19 ++---
 block_int.h       |   10 ++-
 trace-events      |    1 +
 11 files changed, 152 insertions(+), 181 deletions(-)

-- 
1.7.6

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh
  2011-10-14  8:41 [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series) Paolo Bonzini
@ 2011-10-14  8:41 ` Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 2/4] block: unify flush implementations Paolo Bonzini
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 9873b57..7184a0f 100644
--- a/block.c
+++ b/block.c
@@ -2735,7 +2735,7 @@ static AIOPool bdrv_em_co_aio_pool = {
     .cancel             = bdrv_aio_co_cancel_em,
 };
 
-static void bdrv_co_rw_bh(void *opaque)
+static void bdrv_co_em_bh(void *opaque)
 {
     BlockDriverAIOCBCoroutine *acb = opaque;
 
@@ -2758,7 +2758,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
             acb->req.nb_sectors, acb->req.qiov);
     }
 
-    acb->bh = qemu_bh_new(bdrv_co_rw_bh, acb);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
     qemu_bh_schedule(acb->bh);
 }
 
-- 
1.7.6


* [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14  8:41 [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series) Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh Paolo Bonzini
@ 2011-10-14  8:41 ` Paolo Bonzini
  2011-10-14 11:08   ` Kevin Wolf
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 3/4] block: drop redundant bdrv_flush implementation Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support Paolo Bonzini
  3 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

Add coroutine support for flush and apply the same emulation that
we already do for read/write.  bdrv_aio_flush is simplified to always
go through a coroutine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c     |  160 ++++++++++++++++++++++++++---------------------------------
 block_int.h |    1 +
 2 files changed, 71 insertions(+), 90 deletions(-)

diff --git a/block.c b/block.c
index 7184a0f..0af9a89 100644
--- a/block.c
+++ b/block.c
@@ -53,17 +53,12 @@ static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs,
 static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockDriverCompletionFunc *cb, void *opaque);
-static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs,
-        BlockDriverCompletionFunc *cb, void *opaque);
-static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
-        BlockDriverCompletionFunc *cb, void *opaque);
 static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
                                          int64_t sector_num, int nb_sectors,
                                          QEMUIOVector *iov);
 static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
                                          int64_t sector_num, int nb_sectors,
                                          QEMUIOVector *iov);
-static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs);
 static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
     int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
 static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs,
@@ -203,9 +198,6 @@ void bdrv_register(BlockDriver *bdrv)
         }
     }
 
-    if (!bdrv->bdrv_aio_flush)
-        bdrv->bdrv_aio_flush = bdrv_aio_flush_em;
-
     QLIST_INSERT_HEAD(&bdrv_drivers, bdrv, list);
 }
 
@@ -1027,11 +1019,6 @@ static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
                                    nb_sectors * BDRV_SECTOR_SIZE);
 }
 
-static inline bool bdrv_has_async_flush(BlockDriver *drv)
-{
-    return drv->bdrv_aio_flush != bdrv_aio_flush_em;
-}
-
 typedef struct RwCo {
     BlockDriverState *bs;
     int64_t sector_num;
@@ -1759,33 +1746,6 @@ const char *bdrv_get_device_name(BlockDriverState *bs)
     return bs->device_name;
 }
 
-int bdrv_flush(BlockDriverState *bs)
-{
-    if (bs->open_flags & BDRV_O_NO_FLUSH) {
-        return 0;
-    }
-
-    if (bs->drv && bdrv_has_async_flush(bs->drv) && qemu_in_coroutine()) {
-        return bdrv_co_flush_em(bs);
-    }
-
-    if (bs->drv && bs->drv->bdrv_flush) {
-        return bs->drv->bdrv_flush(bs);
-    }
-
-    /*
-     * Some block drivers always operate in either writethrough or unsafe mode
-     * and don't support bdrv_flush therefore. Usually qemu doesn't know how
-     * the server works (because the behaviour is hardcoded or depends on
-     * server-side configuration), so we can't ensure that everything is safe
-     * on disk. Returning an error doesn't work because that would break guests
-     * even if the server operates in writethrough mode.
-     *
-     * Let's hope the user knows what he's doing.
-     */
-    return 0;
-}
-
 void bdrv_flush_all(void)
 {
     BlockDriverState *bs;
@@ -2610,22 +2570,6 @@ fail:
     return -1;
 }
 
-BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
-        BlockDriverCompletionFunc *cb, void *opaque)
-{
-    BlockDriver *drv = bs->drv;
-
-    trace_bdrv_aio_flush(bs, opaque);
-
-    if (bs->open_flags & BDRV_O_NO_FLUSH) {
-        return bdrv_aio_noop_em(bs, cb, opaque);
-    }
-
-    if (!drv)
-        return NULL;
-    return drv->bdrv_aio_flush(bs, cb, opaque);
-}
-
 void bdrv_aio_cancel(BlockDriverAIOCB *acb)
 {
     acb->pool->cancel(acb);
@@ -2785,41 +2729,28 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
     return &acb->common;
 }
 
-static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs,
-        BlockDriverCompletionFunc *cb, void *opaque)
+static void coroutine_fn bdrv_co_flush(void *opaque)
 {
-    BlockDriverAIOCBSync *acb;
-
-    acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque);
-    acb->is_write = 1; /* don't bounce in the completion hadler */
-    acb->qiov = NULL;
-    acb->bounce = NULL;
-    acb->ret = 0;
-
-    if (!acb->bh)
-        acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
+    BlockDriverAIOCBCoroutine *acb = opaque;
+    BlockDriverState *bs = acb->common.bs;
 
-    bdrv_flush(bs);
+    acb->req.error = bdrv_flush(bs);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
     qemu_bh_schedule(acb->bh);
-    return &acb->common;
 }
 
-static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs,
+BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
         BlockDriverCompletionFunc *cb, void *opaque)
 {
-    BlockDriverAIOCBSync *acb;
+    trace_bdrv_aio_flush(bs, opaque);
 
-    acb = qemu_aio_get(&bdrv_em_aio_pool, bs, cb, opaque);
-    acb->is_write = 1; /* don't bounce in the completion handler */
-    acb->qiov = NULL;
-    acb->bounce = NULL;
-    acb->ret = 0;
+    Coroutine *co;
+    BlockDriverAIOCBCoroutine *acb;
 
-    if (!acb->bh) {
-        acb->bh = qemu_bh_new(bdrv_aio_bh_cb, acb);
-    }
+    acb = qemu_aio_get(&bdrv_em_co_aio_pool, bs, cb, opaque);
+    co = qemu_coroutine_create(bdrv_co_flush);
+    qemu_coroutine_enter(co, acb);
 
-    qemu_bh_schedule(acb->bh);
     return &acb->common;
 }
 
@@ -2916,19 +2847,68 @@ static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
     return bdrv_co_io_em(bs, sector_num, nb_sectors, iov, true);
 }
 
-static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs)
+static void bdrv_flush_co_entry(void *opaque)
 {
-    CoroutineIOCompletion co = {
-        .coroutine = qemu_coroutine_self(),
+    RwCo *rwco = opaque;
+    BlockDriverState *bs = rwco->bs;
+
+    if (bs->open_flags & BDRV_O_NO_FLUSH) {
+        rwco->ret = 0;
+    } else if (!bs->drv) {
+        rwco->ret = 0;
+    } else if (bs->drv->bdrv_co_flush) {
+        rwco->ret = bs->drv->bdrv_co_flush(bs);
+    } else if (bs->drv->bdrv_aio_flush) {
+        BlockDriverAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_flush(bs, bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            rwco->ret = -EIO;
+        } else {
+            qemu_coroutine_yield();
+            rwco->ret = co.ret;
+        }
+    } else if (bs->drv->bdrv_flush) {
+        rwco->ret = bs->drv->bdrv_flush(bs);
+    } else {
+        /*
+         * Some block drivers always operate in either writethrough or unsafe
+         * mode and don't support bdrv_flush therefore. Usually qemu doesn't
+         * know how the server works (because the behaviour is hardcoded or
+         * depends on server-side configuration), so we can't ensure that
+         * everything is safe on disk. Returning an error doesn't work because
+         * that would break guests even if the server operates in writethrough
+         * mode.
+         *
+         * Let's hope the user knows what he's doing.
+         */
+        rwco->ret = 0;
+    }
+}
+
+int bdrv_flush(BlockDriverState *bs)
+{
+    Coroutine *co;
+    RwCo rwco = {
+        .bs = bs,
+        .ret = NOT_DONE,
     };
-    BlockDriverAIOCB *acb;
 
-    acb = bdrv_aio_flush(bs, bdrv_co_io_em_complete, &co);
-    if (!acb) {
-        return -EIO;
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_flush_co_entry(&rwco);
+    } else {
+        co = qemu_coroutine_create(bdrv_flush_co_entry);
+        qemu_coroutine_enter(co, &rwco);
+        while (rwco.ret == NOT_DONE) {
+            qemu_aio_wait();
+        }
     }
-    qemu_coroutine_yield();
-    return co.ret;
+
+    return rwco.ret;
 }
 
 /**************************************************************/
diff --git a/block_int.h b/block_int.h
index f2f4f2d..9cb536d 100644
--- a/block_int.h
+++ b/block_int.h
@@ -83,6 +83,7 @@ struct BlockDriver {
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
     int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
+    int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs);
 
     int (*bdrv_aio_multiwrite)(BlockDriverState *bs, BlockRequest *reqs,
         int num_reqs);
-- 
1.7.6


* [Qemu-devel] [PATCH 3/4] block: drop redundant bdrv_flush implementation
  2011-10-14  8:41 [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series) Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 2/4] block: unify flush implementations Paolo Bonzini
@ 2011-10-14  8:41 ` Paolo Bonzini
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support Paolo Bonzini
  3 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

From: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

Block drivers now need to provide only one of .bdrv_co_flush(),
.bdrv_aio_flush() or, for legacy drivers, .bdrv_flush().  Remove
the redundant .bdrv_flush() implementations and, for the raw driver,
replace the asynchronous flush with the coroutine-based one.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/blkdebug.c  |    6 ------
 block/blkverify.c |    9 ---------
 block/qcow.c      |    6 ------
 block/qcow2.c     |   19 -------------------
 block/qed.c       |    6 ------
 block/raw-posix.c |   18 ------------------
 block/raw.c       |   11 ++---------
 7 files changed, 2 insertions(+), 73 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index b3c5d42..9b88535 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -397,11 +397,6 @@ static void blkdebug_close(BlockDriverState *bs)
     }
 }
 
-static int blkdebug_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
     BlockDriverCompletionFunc *cb, void *opaque)
 {
@@ -454,7 +449,6 @@ static BlockDriver bdrv_blkdebug = {
 
     .bdrv_file_open     = blkdebug_open,
     .bdrv_close         = blkdebug_close,
-    .bdrv_flush         = blkdebug_flush,
 
     .bdrv_aio_readv     = blkdebug_aio_readv,
     .bdrv_aio_writev    = blkdebug_aio_writev,
diff --git a/block/blkverify.c b/block/blkverify.c
index c7522b4..483f3b3 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -116,14 +116,6 @@ static void blkverify_close(BlockDriverState *bs)
     s->test_file = NULL;
 }
 
-static int blkverify_flush(BlockDriverState *bs)
-{
-    BDRVBlkverifyState *s = bs->opaque;
-
-    /* Only flush test file, the raw file is not important */
-    return bdrv_flush(s->test_file);
-}
-
 static int64_t blkverify_getlength(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;
@@ -368,7 +360,6 @@ static BlockDriver bdrv_blkverify = {
 
     .bdrv_file_open     = blkverify_open,
     .bdrv_close         = blkverify_close,
-    .bdrv_flush         = blkverify_flush,
 
     .bdrv_aio_readv     = blkverify_aio_readv,
     .bdrv_aio_writev    = blkverify_aio_writev,
diff --git a/block/qcow.c b/block/qcow.c
index c8bfecc..9b71116 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -781,11 +781,6 @@ static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
-static int qcow_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *qcow_aio_flush(BlockDriverState *bs,
         BlockDriverCompletionFunc *cb, void *opaque)
 {
@@ -826,7 +821,6 @@ static BlockDriver bdrv_qcow = {
     .bdrv_open		= qcow_open,
     .bdrv_close		= qcow_close,
     .bdrv_create	= qcow_create,
-    .bdrv_flush		= qcow_flush,
     .bdrv_is_allocated	= qcow_is_allocated,
     .bdrv_set_key	= qcow_set_key,
     .bdrv_make_empty	= qcow_make_empty,
diff --git a/block/qcow2.c b/block/qcow2.c
index 510ff68..4dc980c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1092,24 +1092,6 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
-static int qcow2_flush(BlockDriverState *bs)
-{
-    BDRVQcowState *s = bs->opaque;
-    int ret;
-
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
-    if (ret < 0) {
-        return ret;
-    }
-
-    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
-    if (ret < 0) {
-        return ret;
-    }
-
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *qcow2_aio_flush(BlockDriverState *bs,
                                          BlockDriverCompletionFunc *cb,
                                          void *opaque)
@@ -1242,7 +1224,6 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_open          = qcow2_open,
     .bdrv_close         = qcow2_close,
     .bdrv_create        = qcow2_create,
-    .bdrv_flush         = qcow2_flush,
     .bdrv_is_allocated  = qcow2_is_allocated,
     .bdrv_set_key       = qcow2_set_key,
     .bdrv_make_empty    = qcow2_make_empty,
diff --git a/block/qed.c b/block/qed.c
index e87dc4d..2e06992 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -533,11 +533,6 @@ static void bdrv_qed_close(BlockDriverState *bs)
     qemu_vfree(s->l1_table);
 }
 
-static int bdrv_qed_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static int qed_create(const char *filename, uint32_t cluster_size,
                       uint64_t image_size, uint32_t table_size,
                       const char *backing_file, const char *backing_fmt)
@@ -1479,7 +1474,6 @@ static BlockDriver bdrv_qed = {
     .bdrv_open                = bdrv_qed_open,
     .bdrv_close               = bdrv_qed_close,
     .bdrv_create              = bdrv_qed_create,
-    .bdrv_flush               = bdrv_qed_flush,
     .bdrv_is_allocated        = bdrv_qed_is_allocated,
     .bdrv_make_empty          = bdrv_qed_make_empty,
     .bdrv_aio_readv           = bdrv_qed_aio_readv,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index c7f5544..afcb4c1 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -583,19 +583,6 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
     return result;
 }
 
-static int raw_flush(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-
-    ret = qemu_fdatasync(s->fd);
-    if (ret < 0) {
-        return -errno;
-    }
-
-    return 0;
-}
-
 #ifdef CONFIG_XFS
 static int xfs_discard(BDRVRawState *s, int64_t sector_num, int nb_sectors)
 {
@@ -645,7 +632,6 @@ static BlockDriver bdrv_file = {
     .bdrv_file_open = raw_open,
     .bdrv_close = raw_close,
     .bdrv_create = raw_create,
-    .bdrv_flush = raw_flush,
     .bdrv_discard = raw_discard,
 
     .bdrv_aio_readv = raw_aio_readv,
@@ -915,7 +901,6 @@ static BlockDriver bdrv_host_device = {
     .bdrv_create        = hdev_create,
     .create_options     = raw_create_options,
     .bdrv_has_zero_init = hdev_has_zero_init,
-    .bdrv_flush         = raw_flush,
 
     .bdrv_aio_readv	= raw_aio_readv,
     .bdrv_aio_writev	= raw_aio_writev,
@@ -1035,7 +1020,6 @@ static BlockDriver bdrv_host_floppy = {
     .bdrv_create        = hdev_create,
     .create_options     = raw_create_options,
     .bdrv_has_zero_init = hdev_has_zero_init,
-    .bdrv_flush         = raw_flush,
 
     .bdrv_aio_readv     = raw_aio_readv,
     .bdrv_aio_writev    = raw_aio_writev,
@@ -1135,7 +1119,6 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_create        = hdev_create,
     .create_options     = raw_create_options,
     .bdrv_has_zero_init = hdev_has_zero_init,
-    .bdrv_flush         = raw_flush,
 
     .bdrv_aio_readv     = raw_aio_readv,
     .bdrv_aio_writev    = raw_aio_writev,
@@ -1255,7 +1238,6 @@ static BlockDriver bdrv_host_cdrom = {
     .bdrv_create        = hdev_create,
     .create_options     = raw_create_options,
     .bdrv_has_zero_init = hdev_has_zero_init,
-    .bdrv_flush         = raw_flush,
 
     .bdrv_aio_readv     = raw_aio_readv,
     .bdrv_aio_writev    = raw_aio_writev,
diff --git a/block/raw.c b/block/raw.c
index 5ca606b..161c9cf 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -25,17 +25,11 @@ static void raw_close(BlockDriverState *bs)
 {
 }
 
-static int raw_flush(BlockDriverState *bs)
+static int coroutine_fn raw_co_flush(BlockDriverState *bs)
 {
     return bdrv_flush(bs->file);
 }
 
-static BlockDriverAIOCB *raw_aio_flush(BlockDriverState *bs,
-    BlockDriverCompletionFunc *cb, void *opaque)
-{
-    return bdrv_aio_flush(bs->file, cb, opaque);
-}
-
 static int64_t raw_getlength(BlockDriverState *bs)
 {
     return bdrv_getlength(bs->file);
@@ -117,12 +111,11 @@ static BlockDriver bdrv_raw = {
     .bdrv_close         = raw_close,
     .bdrv_co_readv      = raw_co_readv,
     .bdrv_co_writev     = raw_co_writev,
-    .bdrv_flush         = raw_flush,
+    .bdrv_co_flush      = raw_co_flush,
     .bdrv_probe         = raw_probe,
     .bdrv_getlength     = raw_getlength,
     .bdrv_truncate      = raw_truncate,
 
-    .bdrv_aio_flush     = raw_aio_flush,
     .bdrv_discard       = raw_discard,
 
     .bdrv_is_inserted   = raw_is_inserted,
-- 
1.7.6


* [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
  2011-10-14  8:41 [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series) Paolo Bonzini
                   ` (2 preceding siblings ...)
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 3/4] block: drop redundant bdrv_flush implementation Paolo Bonzini
@ 2011-10-14  8:41 ` Paolo Bonzini
  2011-10-14 14:23   ` Kevin Wolf
  3 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14  8:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, stefanha

As with flush, this adds support for coroutine-based and asynchronous
discard.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
	I was not sure if qcow2 could be changed to co_discard, though
	I suspected yes.

 block.c      |   72 +++++++++++++++++++++++++++++++++++++++++++++++++---------
 block.h      |    3 ++
 block/raw.c  |    8 ++++--
 block_int.h  |    9 +++++-
 trace-events |    1 +
 5 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 0af9a89..7c60361 100644
--- a/block.c
+++ b/block.c
@@ -1768,17 +1768,6 @@ int bdrv_has_zero_init(BlockDriverState *bs)
     return 1;
 }
 
-int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
-{
-    if (!bs->drv) {
-        return -ENOMEDIUM;
-    }
-    if (!bs->drv->bdrv_discard) {
-        return 0;
-    }
-    return bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
-}
-
 /*
  * Returns true iff the specified sector is present in the disk image. Drivers
  * not implementing the functionality are assumed to not support backing files,
@@ -2911,6 +2900,66 @@ int bdrv_flush(BlockDriverState *bs)
     return rwco.ret;
 }
 
+static void bdrv_discard_co_entry(void *opaque)
+{
+    RwCo *rwco = opaque;
+    BlockDriverState *bs = rwco->bs;
+    int64_t sector_num = rwco->sector_num;
+    int nb_sectors = rwco->nb_sectors;
+
+    if (!bs->drv) {
+        rwco->ret = -ENOMEDIUM;
+    } else if (bdrv_check_request(bs, sector_num, nb_sectors)) {
+        rwco->ret = -EIO;
+    } else if (bs->read_only) {
+        rwco->ret = -EROFS;
+    } else if (bs->drv->bdrv_co_discard) {
+        rwco->ret = bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
+    } else if (bs->drv->bdrv_aio_discard) {
+        BlockDriverAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
+                                        bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            rwco->ret = -EIO;
+        } else {
+            qemu_coroutine_yield();
+            rwco->ret = co.ret;
+        }
+    } else if (bs->drv->bdrv_discard) {
+        rwco->ret = bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
+    } else {
+        rwco->ret = 0;
+    }
+}
+
+int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+{
+    Coroutine *co;
+    RwCo rwco = {
+        .bs = bs,
+        .sector_num = sector_num,
+        .nb_sectors = nb_sectors,
+        .ret = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_discard_co_entry(&rwco);
+    } else {
+        co = qemu_coroutine_create(bdrv_discard_co_entry);
+        qemu_coroutine_enter(co, &rwco);
+        while (rwco.ret == NOT_DONE) {
+            qemu_aio_wait();
+        }
+    }
+
+    return rwco.ret;
+}
+
 /**************************************************************/
 /* removable device support */
 
diff --git a/block.h b/block.h
index e77988e..4f4aa94 100644
--- a/block.h
+++ b/block.h
@@ -166,6 +166,9 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                   BlockDriverCompletionFunc *cb, void *opaque);
 BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                                  BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
+                                   int64_t sector_num, int nb_sectors,
+                                   BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
 typedef struct BlockRequest {
diff --git a/block/raw.c b/block/raw.c
index 161c9cf..ff68514 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -45,7 +45,8 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
    return 1; /* everything can be opened as raw image */
 }
 
-static int raw_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+static int coroutine_fn raw_co_discard(BlockDriverState *bs,
+                                       int64_t sector_num, int nb_sectors)
 {
     return bdrv_discard(bs->file, sector_num, nb_sectors);
 }
@@ -109,15 +110,16 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_open          = raw_open,
     .bdrv_close         = raw_close,
+
     .bdrv_co_readv      = raw_co_readv,
     .bdrv_co_writev     = raw_co_writev,
     .bdrv_co_flush      = raw_co_flush,
+    .bdrv_co_discard    = raw_co_discard,
+
     .bdrv_probe         = raw_probe,
     .bdrv_getlength     = raw_getlength,
     .bdrv_truncate      = raw_truncate,
 
-    .bdrv_discard       = raw_discard,
-
     .bdrv_is_inserted   = raw_is_inserted,
     .bdrv_media_changed = raw_media_changed,
     .bdrv_eject         = raw_eject,
diff --git a/block_int.h b/block_int.h
index 9cb536d..384598f 100644
--- a/block_int.h
+++ b/block_int.h
@@ -63,6 +63,8 @@ struct BlockDriver {
     void (*bdrv_close)(BlockDriverState *bs);
     int (*bdrv_create)(const char *filename, QEMUOptionParameter *options);
     int (*bdrv_flush)(BlockDriverState *bs);
+    int (*bdrv_discard)(BlockDriverState *bs, int64_t sector_num,
+                        int nb_sectors);
     int (*bdrv_is_allocated)(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors, int *pnum);
     int (*bdrv_set_key)(BlockDriverState *bs, const char *key);
@@ -76,14 +78,17 @@ struct BlockDriver {
         BlockDriverCompletionFunc *cb, void *opaque);
     BlockDriverAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockDriverCompletionFunc *cb, void *opaque);
-    int (*bdrv_discard)(BlockDriverState *bs, int64_t sector_num,
-                        int nb_sectors);
+    BlockDriverAIOCB *(*bdrv_aio_discard)(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors,
+        BlockDriverCompletionFunc *cb, void *opaque);
 
     int coroutine_fn (*bdrv_co_readv)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
     int coroutine_fn (*bdrv_co_writev)(BlockDriverState *bs,
         int64_t sector_num, int nb_sectors, QEMUIOVector *qiov);
     int coroutine_fn (*bdrv_co_flush)(BlockDriverState *bs);
+    int coroutine_fn (*bdrv_co_discard)(BlockDriverState *bs,
+        int64_t sector_num, int nb_sectors);
 
     int (*bdrv_aio_multiwrite)(BlockDriverState *bs, BlockRequest *reqs,
         int num_reqs);
diff --git a/trace-events b/trace-events
index 63d8c8e..e6160ad 100644
--- a/trace-events
+++ b/trace-events
@@ -61,6 +61,7 @@ multiwrite_cb(void *mcb, int ret) "mcb %p ret %d"
 bdrv_aio_multiwrite(void *mcb, int num_callbacks, int num_reqs) "mcb %p num_callbacks %d num_reqs %d"
 bdrv_aio_multiwrite_earlyfail(void *mcb) "mcb %p"
 bdrv_aio_multiwrite_latefail(void *mcb, int i) "mcb %p i %d"
+bdrv_aio_discard(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
 bdrv_aio_flush(void *bs, void *opaque) "bs %p opaque %p"
 bdrv_aio_readv(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
 bdrv_aio_writev(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
-- 
1.7.6


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 2/4] block: unify flush implementations Paolo Bonzini
@ 2011-10-14 11:08   ` Kevin Wolf
  2011-10-14 11:30     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-10-14 11:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.10.2011 10:41, schrieb Paolo Bonzini:
> Add coroutine support for flush and apply the same emulation that
> we already do for read/write.  bdrv_aio_flush is simplified to always
> go through a coroutine.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

To make the implementation more consistent with read/write operations,
wouldn't it make sense to provide a bdrv_co_flush() globally instead of
using the synchronous version as the preferred public interface?

This is the semantics that I would expect of a bdrv_co_flush() anyway;
your use of the name for an AIO emulation function confused me a bit at first.

Kevin


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 11:08   ` Kevin Wolf
@ 2011-10-14 11:30     ` Paolo Bonzini
  2011-10-14 11:54       ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 11:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On 10/14/2011 01:08 PM, Kevin Wolf wrote:
> Am 14.10.2011 10:41, schrieb Paolo Bonzini:
>> Add coroutine support for flush and apply the same emulation that
>> we already do for read/write.  bdrv_aio_flush is simplified to always
>> go through a coroutine.
>>
>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>
> To make the implementation more consistent with read/write operations,
> wouldn't it make sense to provide a bdrv_co_flush() globally instead of
> using the synchronous version as the preferred public interface?

I thought about it, but then it turned out that I would have

         bdrv_flush
         -> create coroutine or just fast-path to bdrv_flush_co_entry
            -> bdrv_flush_co_entry
               -> driver

and

         bdrv_co_flush
         -> bdrv_flush_co_entry
            -> driver

In other words, the code would be exactly the same, save for an "if 
(qemu_in_coroutine())".  The reason is that, unlike read/write, neither 
flush nor discard takes a qiov.

I think that with Stefan's cleanup, having specialized coroutine 
versions brings a much smaller benefit.  The readability benefit of 
naming routines bdrv_co_* is already lost, for example, since bdrv_read 
can yield when called from coroutine context.

Let me show how this might go.  Right now you have

         bdrv_read/write
         -> bdrv_rw_co
            -> create qiov
            -> create coroutine or just fast-path to bdrv_rw_co_entry
	      -> bdrv_rw_co_entry
                  -> bdrv_co_do_readv/writev
                     -> driver

         bdrv_co_readv/writev
         -> bdrv_co_do_readv/writev
            -> driver

But starting from here, you might just as well reorganize it like this:

         bdrv_read/write
         -> bdrv_rw_co
            -> create qiov
            -> bdrv_readv/writev

         bdrv_readv/writev
         -> create coroutine or just fast-path to bdrv_rw_co_entry
            -> bdrv_rw_co_entry
               -> bdrv_co_do_readv/writev
                  -> driver

and just drop bdrv_co_readv, since it would just be hard-coding the 
fast-path of bdrv_readv.

Since some amount of synchronous I/O would likely always be there, for 
example in qemu-img, I think this unification would make more sense than 
providing two separate entrypoints for bdrv_co_flush and bdrv_flush.

Paolo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 11:30     ` Paolo Bonzini
@ 2011-10-14 11:54       ` Kevin Wolf
  2011-10-14 12:42         ` Paolo Bonzini
  2011-10-14 13:20         ` Stefan Hajnoczi
  0 siblings, 2 replies; 17+ messages in thread
From: Kevin Wolf @ 2011-10-14 11:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.10.2011 13:30, schrieb Paolo Bonzini:
> On 10/14/2011 01:08 PM, Kevin Wolf wrote:
>> Am 14.10.2011 10:41, schrieb Paolo Bonzini:
>>> Add coroutine support for flush and apply the same emulation that
>>> we already do for read/write.  bdrv_aio_flush is simplified to always
>>> go through a coroutine.
>>>
>>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>
>> To make the implementation more consistent with read/write operations,
>> wouldn't it make sense to provide a bdrv_co_flush() globally instead of
>> using the synchronous version as the preferred public interface?
> 
> I thought about it, but then it turned out that I would have
> 
>          bdrv_flush
>          -> create coroutine or just fast-path to bdrv_flush_co_entry
>             -> bdrv_flush_co_entry
>                -> driver
> 
> and
> 
>          bdrv_co_flush
>          -> bdrv_flush_co_entry
>             -> driver
> 
> In other words, the code would be exactly the same, save for an "if 
> (qemu_in_coroutine())".  The reason is that, unlike read/write, neither 
> flush nor discard take a qiov.

What I was thinking of looks a bit different:

    bdrv_flush
    -> create coroutine or just fast-path to bdrv_flush_co_entry
       -> bdrv_flush_co_entry
          -> bdrv_co_flush

and

    bdrv_co_flush
    -> driver

And the reason for this is that bdrv_co_flush would be a function that
does only little more than passing the function to the driver (just like
most bdrv_* functions do), with no emulation going on at all.

> In general, I think that with Stefan's cleanup, having specialized 
> coroutine versions has a much smaller benefit.  The code-reading 
> benefit of naming routines bdrv_co_* is already lost, for 
> example, since bdrv_read can yield when called from coroutine context.

Instead of taking a void* and working on a RwCo structure that is really
meant for emulation, bdrv_co_flush would take a BlockDriverState and
improve readability this way.

The more complicated and ugly code would be left separated and only used
for emulation. I think that would make it easier to understand the
common path without being distracted by emulation code.

> Let me show how this might go.  Right now you have
> 
>          bdrv_read/write
>          -> bdrv_rw_co
>             -> create qiov
>             -> create coroutine or just fast-path to bdrv_rw_co_entry
> 	      -> bdrv_rw_co_entry
>                   -> bdrv_co_do_readv/writev
>                      -> driver
> 
>          bdrv_co_readv/writev
>          -> bdrv_co_do_readv/writev
>             -> driver
> 
> But starting from here, you might just as well reorganize it like this:
> 
>          bdrv_read/write
>          -> bdrv_rw_co
>             -> create qiov
>             -> bdrv_readv/writev
> 
>          bdrv_readv/writev
>          -> create coroutine or just fast-path to bdrv_rw_co_entry
>             -> bdrv_rw_co_entry
>                -> bdrv_co_do_readv/writev
>                   -> driver
> 
> and just drop bdrv_co_readv, since it would just be hard-coding the 
> fast-path of bdrv_readv.

I guess it's all a matter of taste. Stefan, what do you think?

> Since some amount of synchronous I/O would likely always be there, for 
> example in qemu-img, I think this unification would make more sense than 
> providing two separate entrypoints for bdrv_co_flush and bdrv_flush.

Actually, I'm not so sure about qemu-img. I think we have thought of
scenarios where converting it to a coroutine based version with a main
loop would be helpful (can't remember the details, though).

Kevin


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 11:54       ` Kevin Wolf
@ 2011-10-14 12:42         ` Paolo Bonzini
  2011-10-14 14:02           ` Kevin Wolf
  2011-10-14 13:20         ` Stefan Hajnoczi
  1 sibling, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 12:42 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On 10/14/2011 01:54 PM, Kevin Wolf wrote:
> Am 14.10.2011 13:30, schrieb Paolo Bonzini:
>> On 10/14/2011 01:08 PM, Kevin Wolf wrote:
>>> Am 14.10.2011 10:41, schrieb Paolo Bonzini:
>>>> Add coroutine support for flush and apply the same emulation that
>>>> we already do for read/write.  bdrv_aio_flush is simplified to always
>>>> go through a coroutine.
>>>>
>>>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>>
>>> To make the implementation more consistent with read/write operations,
>>> wouldn't it make sense to provide a bdrv_co_flush() globally instead of
>>> using the synchronous version as the preferred public interface?
>>
> What I was thinking of looks a bit different:
>
>      bdrv_flush
>      ->  create coroutine or just fast-path to bdrv_flush_co_entry
>         ->  bdrv_flush_co_entry
>            ->  bdrv_co_flush
>
> and
>
>      bdrv_co_flush
>      ->  driver
>
> And the reason for this is that bdrv_co_flush would be a function that
> does only little more than passing the function to the driver (just like
> most bdrv_* functions do), with no emulation going on at all.

It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush. 
  It would be the same as bdrv_flush_co_entry is now, minus the 
marshalling in/out of the RwCo.

> Instead of taking a void* and working on a RwCo structure that is really
> meant for emulation, bdrv_co_flush would take a BlockDriverState and
> improve readability this way.

I see.  Yeah, that's doable, but I'd still need two coroutines (one for 
bdrv_flush, one for bdrv_aio_flush) and the patch would be bigger overall...

> The more complicated and ugly code would be left separated and only used
> for emulation. I think that would make it easier to understand the
> common path without being distracted by emulation code.

... and on the other hand the length of the call chain would increase.
It easily gets confusing; it already is for me in the read/write case.

Would bdrv_co_flush be static or not?  If not, you also get an 
additional entry point of dubious value, i.e. more complexity.

> Actually, I'm not so sure about qemu-img. I think we have thought of
> scenarios where converting it to a coroutine based version with a main
> loop would be helpful (can't remember the details, though).

qemu-img convert might benefit from multiple in-flight requests if one of 
the endpoints is remote or perhaps even sparse, I guess.

Paolo


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 11:54       ` Kevin Wolf
  2011-10-14 12:42         ` Paolo Bonzini
@ 2011-10-14 13:20         ` Stefan Hajnoczi
  2011-10-14 13:47           ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Hajnoczi @ 2011-10-14 13:20 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Paolo Bonzini, qemu-devel

On Fri, Oct 14, 2011 at 01:54:42PM +0200, Kevin Wolf wrote:
> Am 14.10.2011 13:30, schrieb Paolo Bonzini:
> > On 10/14/2011 01:08 PM, Kevin Wolf wrote:
> >> Am 14.10.2011 10:41, schrieb Paolo Bonzini:
> > Let me show how this might go.  Right now you have
> > 
> >          bdrv_read/write
> >          -> bdrv_rw_co
> >             -> create qiov
> >             -> create coroutine or just fast-path to bdrv_rw_co_entry
> > 	      -> bdrv_rw_co_entry
> >                   -> bdrv_co_do_readv/writev
> >                      -> driver
> > 
> >          bdrv_co_readv/writev
> >          -> bdrv_co_do_readv/writev
> >             -> driver
> > 
> > But starting from here, you might just as well reorganize it like this:
> > 
> >          bdrv_read/write
> >          -> bdrv_rw_co
> >             -> create qiov
> >             -> bdrv_readv/writev
> > 
> >          bdrv_readv/writev
> >          -> create coroutine or just fast-path to bdrv_rw_co_entry
> >             -> bdrv_rw_co_entry
> >                -> bdrv_co_do_readv/writev
> >                   -> driver
> > 
> > and just drop bdrv_co_readv, since it would just be hard-coding the 
> > fast-path of bdrv_readv.
> 
> I guess it's all a matter of taste. Stefan, what do you think?
> 
> > Since some amount of synchronous I/O would likely always be there, for 
> > example in qemu-img, I think this unification would make more sense than 
> > providing two separate entrypoints for bdrv_co_flush and bdrv_flush.
> 
> Actually, I'm not so sure about qemu-img. I think we have thought of
> scenarios where converting it to a coroutine based version with a main
> loop would be helpful (can't remember the details, though).

I'd like to completely remove synchronous bdrv_*(), including for
qemu-img.

It's just too tempting to call these functions in contexts where it is
not okay to do so.  The bdrv_co_*() functions are all tagged as
coroutine_fn and make it clear that they can yield.

We already have an event loop in qemu-img except it's the nested event
loop in synchronous bdrv_*() emulation functions.  The nested event loop
is a mini event loop and can't really do things like timers.  It would
be nicer to remove it in favor of a single top-level event loop with the
qemu-img code running in a coroutine.

So I'm in favor of keeping bdrv_co_*() explicit for now and refactoring
both hw/ and qemu-tool users of synchronous functions.

Stefan


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 13:20         ` Stefan Hajnoczi
@ 2011-10-14 13:47           ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 13:47 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Kevin Wolf, qemu-devel

On 10/14/2011 03:20 PM, Stefan Hajnoczi wrote:
> It's just too tempting to call these functions in contexts where it is
> not okay to do so.  The bdrv_co_*() functions are all tagged as
> coroutine_fn and make it clear that they can yield.

Yes, I agree.

> We already have an event loop in qemu-img except it's the nested event
> loop in synchronous bdrv_*() emulation functions.  The nested event loop
> is a mini event loop and can't really do things like timers.  It would
> be nicer to remove it in favor of a single top-level event loop with the
> qemu-img code running in a coroutine.

Note that the nested event loop cannot go away because of 
qemu_aio_flush, though. :(

Paolo


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 12:42         ` Paolo Bonzini
@ 2011-10-14 14:02           ` Kevin Wolf
  2011-10-14 14:04             ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-10-14 14:02 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.10.2011 14:42, schrieb Paolo Bonzini:
> On 10/14/2011 01:54 PM, Kevin Wolf wrote:
>> Am 14.10.2011 13:30, schrieb Paolo Bonzini:
>>> On 10/14/2011 01:08 PM, Kevin Wolf wrote:
>>>> Am 14.10.2011 10:41, schrieb Paolo Bonzini:
>>>>> Add coroutine support for flush and apply the same emulation that
>>>>> we already do for read/write.  bdrv_aio_flush is simplified to always
>>>>> go through a coroutine.
>>>>>
>>>>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>>>
>>>> To make the implementation more consistent with read/write operations,
>>>> wouldn't it make sense to provide a bdrv_co_flush() globally instead of
>>>> using the synchronous version as the preferred public interface?
>>>
>> What I was thinking of looks a bit different:
>>
>>      bdrv_flush
>>      ->  create coroutine or just fast-path to bdrv_flush_co_entry
>>         ->  bdrv_flush_co_entry
>>            ->  bdrv_co_flush
>>
>> and
>>
>>      bdrv_co_flush
>>      ->  driver
>>
>> And the reason for this is that bdrv_co_flush would be a function that
>> does only little more than passing the function to the driver (just like
>> most bdrv_* functions do), with no emulation going on at all.
> 
> It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush. 
>   It would be the same as bdrv_flush_co_entry is now, minus the 
> marshalling in/out of the RwCo.

Right.

By the way, I like how you handle all three backends in the same
function. I think this is a lot more readable than the solution used by
read/write (changing the function pointers on driver registration).

>> Instead of taking a void* and working on a RwCo structure that is really
>> meant for emulation, bdrv_co_flush would take a BlockDriverState and
>> improve readability this way.
> 
> I see.  Yeah, that's doable, but I'd still need two coroutines (one for 
> bdrv_flush, one for bdrv_aio_flush) and the patch would be bigger overall...

You already have two of them (bdrv_co_flush for AIO and
bdrv_flush_co_entry for synchronous), so I don't think it makes a
difference there.

>> The more complicated and ugly code would be left separated and only used
>> for emulation. I think that would make it easier to understand the
>> common path without being distracted by emulation code.
> 
> ... and on the other hand the length of the call chain would increase.
> It easily gets confusing; it already is for me in the read/write case.

Well, depends on what you're looking at. The call chain length would
increase for AIO and synchronous bdrv_flush, but it would become shorter
for bdrv_co_flush.

If we want to declare coroutines as the preferred interface, I think
such a change makes sense.

> Would bdrv_co_flush be static or not?  If not, you also get an 
> additional entry point of dubious additional value, i.e. more complexity.

I think I would make it public. The one that has to go eventually is the
synchronous bdrv_flush(). Which is another reason why I wouldn't design
everything around it.

>> Actually, I'm not so sure about qemu-img. I think we have thought of
>> scenarios where converting it to a coroutine based version with a main
>> loop would be helpful (can't remember the details, though).
> 
> qemu-img convert might benefit from multiple in-flight requests if one of 
> the endpoints is remote or perhaps even sparse, I guess.

Quite possible, yes.

Kevin


* Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
  2011-10-14 14:02           ` Kevin Wolf
@ 2011-10-14 14:04             ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 14:04 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On 10/14/2011 04:02 PM, Kevin Wolf wrote:
>> >
>> >  It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush.
>> >     It would be the same as bdrv_flush_co_entry is now, minus the
>> >  marshalling in/out of the RwCo.
> Right.
>
> By the way, I like how you handle all three backends in the same
> function. I think this is a lot more readable than the solution used by
> read/write (changing the function pointers on driver registration).
>

Yeah, and it makes sense to handle all of them in the bdrv_co_* version.

Will resubmit.

Paolo


* Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
  2011-10-14  8:41 ` [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support Paolo Bonzini
@ 2011-10-14 14:23   ` Kevin Wolf
  2011-10-14 14:24     ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-10-14 14:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.10.2011 10:41, schrieb Paolo Bonzini:
> This similarly adds support for coroutine and asynchronous discard.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Do we really need bdrv_discard and bdrv_aio_discard in the backends? I
think it makes sense to have a bdrv_aio_discard() in block.h as AIO
generally fits well for device models, but I would just require
bdrv_co_discard for any block drivers implementing discard.

> 	I was not sure if qcow2 could be changed to co_discard, though
> 	I suspected yes.

As discussed on IRC: Yes, it just must make sure to take s->lock.

Kevin


* Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
  2011-10-14 14:23   ` Kevin Wolf
@ 2011-10-14 14:24     ` Paolo Bonzini
  2011-10-14 14:32       ` Kevin Wolf
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 14:24 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On 10/14/2011 04:23 PM, Kevin Wolf wrote:
> >  This similarly adds support for coroutine and asynchronous discard.
> >
> >  Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>
> Do we really need bdrv_discard and bdrv_aio_discard in the backends? I
> think it makes sense to have a bdrv_aio_discard() in block.h as AIO
> generally fits well for device models, but I would just require
> bdrv_co_discard for any block drivers implementing discard.

bdrv_discard is needed for now since I wouldn't like to conflate this 
patch with the qcow2 patch.

I can certainly drop aio_discard from the backends, but I'm not sure how 
heavy fallocate can be (with FALLOC_FL_PUNCH_HOLE).  Probably not much, 
but I think there's no guarantee of O(1) behavior, especially with 
filesystems like ecryptfs.  So you would need to go through the thread 
pool, and aio_discard would come in handy.

Paolo


* Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
  2011-10-14 14:24     ` Paolo Bonzini
@ 2011-10-14 14:32       ` Kevin Wolf
  2011-10-14 14:35         ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2011-10-14 14:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, stefanha

Am 14.10.2011 16:24, schrieb Paolo Bonzini:
> On 10/14/2011 04:23 PM, Kevin Wolf wrote:
>>>  This similarly adds support for coroutine and asynchronous discard.
>>>
>>>  Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>
>> Do we really need bdrv_discard and bdrv_aio_discard in the backends? I
>> think it makes sense to have a bdrv_aio_discard() in block.h as AIO
>> generally fits well for device models, but I would just require
>> bdrv_co_discard for any block drivers implementing discard.
> 
> bdrv_discard is needed for now since I wouldn't like to conflate this 
> patch with the qcow2 patch.

Okay, that makes sense. Then I'll drop it when I convert qcow2.

> I can certainly drop aio_discard from the backends, but I'm not sure how 
> heavy fallocate can be (with FALLOC_FL_PUNCH_HOLE).  Probably not much, 
> but I think there's no guarantee of O(1) behavior, especially with 
> filesystems like ecryptfs.  So you would need to go through the thread 
> pool, and aio_discard would come in handy.

Sure, but the coroutine interface should be just as good for
implementing it this way in raw-posix.

Kevin


* Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
  2011-10-14 14:32       ` Kevin Wolf
@ 2011-10-14 14:35         ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2011-10-14 14:35 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, stefanha

On 10/14/2011 04:32 PM, Kevin Wolf wrote:
> >  I can certainly drop aio_discard from the backends, but I'm not sure how
> >  heavy fallocate can be (with FALLOC_FL_PUNCH_HOLE).  Probably not much,
> >  but I think there's no guarantee of O(1) behavior, especially with
> >  filesystems like ecryptfs.  So you would need to go through the thread
> >  pool, and aio_discard would come in handy.
>
> Sure, but the coroutine interface should be just as good for
> implementing it this way in raw-posix.

I think until someone rewrites raw_aio_submit/paio_submit in terms of 
coroutines, it's better to keep aio_discard.

Paolo


end of thread, other threads:[~2011-10-14 14:35 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-10-14  8:41 [Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series) Paolo Bonzini
2011-10-14  8:41 ` [Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh Paolo Bonzini
2011-10-14  8:41 ` [Qemu-devel] [PATCH 2/4] block: unify flush implementations Paolo Bonzini
2011-10-14 11:08   ` Kevin Wolf
2011-10-14 11:30     ` Paolo Bonzini
2011-10-14 11:54       ` Kevin Wolf
2011-10-14 12:42         ` Paolo Bonzini
2011-10-14 14:02           ` Kevin Wolf
2011-10-14 14:04             ` Paolo Bonzini
2011-10-14 13:20         ` Stefan Hajnoczi
2011-10-14 13:47           ` Paolo Bonzini
2011-10-14  8:41 ` [Qemu-devel] [PATCH 3/4] block: drop redundant bdrv_flush implementation Paolo Bonzini
2011-10-14  8:41 ` [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support Paolo Bonzini
2011-10-14 14:23   ` Kevin Wolf
2011-10-14 14:24     ` Paolo Bonzini
2011-10-14 14:32       ` Kevin Wolf
2011-10-14 14:35         ` Paolo Bonzini
