qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write
@ 2018-04-24 19:25 Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 1/6] block: Support byte-based aio callbacks Eric Blake
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, kwolf, qemu-block

While we would prefer that block drivers use coroutines instead
of aio callbacks, it is a fairly easy exercise to prove that
all existing drivers with aio callbacks are merely scaling
from bytes into sectors and back to bytes.  So, even though I
am not set up to completely run (or even compile-test) this
full series, it seems pretty straightforward to change the
signature to quit playing games with pointless scaling.

Incorporate Kevin's review on v1, which amounted to pretty much
rewriting the series to be saner (the block layer now defaults
to alignment of 1, so drivers that still need 512 for keeping the
patch conservative have to override that; and improve the code
in io.c to put byte-based access before sector-based fallbacks).

001/6:[0023] [FC] 'block: Support byte-based aio callbacks'
002/6:[0011] [FC] 'file-win32: Switch to byte-based callbacks'
003/6:[0007] [FC] 'null: Switch to byte-based read/write'
004/6:[0008] [FC] 'rbd: Switch to byte-based callbacks'
005/6:[0007] [FC] 'vxhs: Switch to byte-based callbacks'
006/6:[0093] [FC] 'block: Drop last of the sector-based aio callbacks'

Eric Blake (6):
  block: Support byte-based aio callbacks
  file-win32: Switch to byte-based callbacks
  null: Switch to byte-based read/write
  rbd: Switch to byte-based callbacks
  vxhs: Switch to byte-based callbacks
  block: Drop last of the sector-based aio callbacks

 include/block/block_int.h |  8 +++---
 include/block/raw-aio.h   |  2 +-
 block/io.c                | 64 ++++++++++++++++++++++++++---------------------
 block/file-win32.c        | 47 +++++++++++++++++++++-------------
 block/null.c              | 59 ++++++++++++++++++++++---------------------
 block/rbd.c               | 44 +++++++++++++++++---------------
 block/vxhs.c              | 43 +++++++++++++++----------------
 block/win32-aio.c         |  5 ++--
 8 files changed, 148 insertions(+), 124 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 1/6] block: Support byte-based aio callbacks
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks Eric Blake
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: jsnow, kwolf, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Add new sector-based aio callbacks for read and write,
to match the fact that bdrv_aio_pdiscard is already byte-based.

Ideally, drivers should be converted to use coroutine callbacks
rather than aio; but that is not quite as trivial (and if we were
to do that conversion, the null-aio driver would disappear), so for
the short term, converting the signature but keeping things with
aio is easier.  However, we CAN declare that a driver that uses
the byte-based aio interfaces now defaults to byte-based
operations, and must explicitly provide a refresh_limits override
to stick with larger alignments (making the alignment issues more
obvious directly in the drivers touched in the next few patches).

Once all drivers are converted, the sector-based aio callbacks will
be removed; in the meantime, a FIXME comment is added due to a
slight inefficiency that will be touched up as part of that later
cleanup.

Simplify some instances of 'bs->drv' into 'drv' while touching this,
since the local variable already exists to reduce typing.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: change default alignment to byte-based [Kevin], add comment
about temporary slight inefficiency [Kevin]
---
 include/block/block_int.h |  6 ++++++
 block/io.c                | 38 +++++++++++++++++++++++++++++---------
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index c4dd1d4bb8a..e772e3502b1 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -144,9 +144,15 @@ struct BlockDriver {
     BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+    BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
+        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs,
         BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs,
diff --git a/block/io.c b/block/io.c
index bd9a19a9c4d..407bc25df41 100644
--- a/block/io.c
+++ b/block/io.c
@@ -92,7 +92,8 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
     }

     /* Default alignment based on whether driver has byte interface */
-    bs->bl.request_alignment = drv->bdrv_co_preadv ? 1 : 512;
+    bs->bl.request_alignment = (drv->bdrv_co_preadv ||
+                                drv->bdrv_aio_preadv) ? 1 : 512;

     /* Take some limits from the children as a default */
     if (bs->file) {
@@ -924,12 +925,15 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
         return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
     }

+    /* FIXME - no need to calculate these if .bdrv_aio_preadv exists */
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_preadv) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }

     if (drv->bdrv_co_readv) {
         return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
@@ -939,8 +943,13 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        acb = bs->drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
+        if (drv->bdrv_aio_preadv) {
+            acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
+                                       bdrv_co_io_em_complete, &co);
+        } else {
+            acb = drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
                                       bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             return -EIO;
         } else {
@@ -972,12 +981,15 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
         goto emulate_flags;
     }

+    /* FIXME - no need to calculate these if .bdrv_aio_pwritev exists */
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    if (!drv->bdrv_aio_pwritev) {
+        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    }

     if (drv->bdrv_co_writev_flags) {
         ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
@@ -992,8 +1004,16 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };

-        acb = bs->drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
+        if (drv->bdrv_aio_pwritev) {
+            acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
+                                        flags & bs->supported_write_flags,
+                                        bdrv_co_io_em_complete, &co);
+            flags &= ~bs->supported_write_flags;
+        } else {
+            assert(!bs->supported_write_flags);
+            acb = drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
                                        bdrv_co_io_em_complete, &co);
+        }
         if (acb == NULL) {
             ret = -EIO;
         } else {
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 1/6] block: Support byte-based aio callbacks Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-25 10:57   ` Kevin Wolf
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 3/6] null: Switch to byte-based read/write Eric Blake
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, kwolf, qemu-block, Max Reitz, Stefan Weil

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the file-win32 driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, as I don't know if Windows is tolerant of
non-sector AIO operations, I went with the conservative approach
of modifying .bdrv_refresh_limits to override the block layer
defaults back to the pre-patch value of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
Compile-tested via 'make docker-test-mingw@fedora', but I don't
have a sane way to test whether it actually works.

v2: override new block layer default alignment [Kevin]
---
 include/block/raw-aio.h |  2 +-
 block/file-win32.c      | 47 +++++++++++++++++++++++++++++------------------
 block/win32-aio.c       |  5 ++---
 3 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index a4cdbbf1b7a..9e47b8a629d 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -57,7 +57,7 @@ void win32_aio_cleanup(QEMUWin32AIOState *aio);
 int win32_aio_attach(QEMUWin32AIOState *aio, HANDLE hfile);
 BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
         QEMUWin32AIOState *aio, HANDLE hfile,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
         BlockCompletionFunc *cb, void *opaque, int type);
 void win32_aio_detach_aio_context(QEMUWin32AIOState *aio,
                                   AioContext *old_context);
diff --git a/block/file-win32.c b/block/file-win32.c
index 2e2f746bb19..3c67db43364 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -251,7 +251,11 @@ static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
                          &dg.Geometry.BytesPerSector,
                          &freeClusters, &totalClusters);
         bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+        return;
     }
+
+    /* XXX Does Windows support AIO on less than 512-byte alignment? */
+    bs->bl.request_alignment = 512;
 }

 static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
@@ -410,32 +414,32 @@ fail:
     return ret;
 }

-static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
-                         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                         BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_preadv(BlockDriverState *bs,
+                                  uint64_t offset, uint64_t bytes,
+                                  QEMUIOVector *qiov, int flags,
+                                  BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_READ);
+        return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+                                cb, opaque, QEMU_AIO_READ);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, offset, qiov, bytes,
                            cb, opaque, QEMU_AIO_READ);
     }
 }

-static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
-                          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                          BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *raw_aio_pwritev(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
+                                   BlockCompletionFunc *cb, void *opaque)
 {
     BDRVRawState *s = bs->opaque;
     if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_WRITE);
+        return win32_aio_submit(bs, s->aio, s->hfile, offset, bytes, qiov,
+                                cb, opaque, QEMU_AIO_WRITE);
     } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
+        return paio_submit(bs, s->hfile, offset, qiov, bytes,
                            cb, opaque, QEMU_AIO_WRITE);
     }
 }
@@ -632,8 +636,8 @@ BlockDriver bdrv_file = {
     .bdrv_co_create_opts = raw_co_create_opts,
     .bdrv_has_zero_init = bdrv_has_zero_init_1,

-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_preadv    = raw_aio_preadv,
+    .bdrv_aio_pwritev   = raw_aio_pwritev,
     .bdrv_aio_flush     = raw_aio_flush,

     .bdrv_truncate	= raw_truncate,
@@ -708,6 +712,12 @@ static void hdev_parse_filename(const char *filename, QDict *options,
     bdrv_parse_filename_strip_prefix(filename, "host_device:", options);
 }

+static void hdev_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    /* XXX Does Windows support AIO on less than 512-byte alignment? */
+    bs->bl.request_alignment = 512;
+}
+
 static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
                      Error **errp)
 {
@@ -793,9 +803,10 @@ static BlockDriver bdrv_host_device = {
     .bdrv_probe_device	= hdev_probe_device,
     .bdrv_file_open	= hdev_open,
     .bdrv_close		= raw_close,
+    .bdrv_refresh_limits = hdev_refresh_limits,

-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_preadv    = raw_aio_preadv,
+    .bdrv_aio_pwritev   = raw_aio_pwritev,
     .bdrv_aio_flush     = raw_aio_flush,

     .bdrv_detach_aio_context = raw_detach_aio_context,
diff --git a/block/win32-aio.c b/block/win32-aio.c
index 3be8f458fab..9cd355d42f8 100644
--- a/block/win32-aio.c
+++ b/block/win32-aio.c
@@ -112,15 +112,14 @@ static const AIOCBInfo win32_aiocb_info = {

 BlockAIOCB *win32_aio_submit(BlockDriverState *bs,
         QEMUWin32AIOState *aio, HANDLE hfile,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+        uint64_t offset, uint64_t bytes, QEMUIOVector *qiov,
         BlockCompletionFunc *cb, void *opaque, int type)
 {
     struct QEMUWin32AIOCB *waiocb;
-    uint64_t offset = sector_num * 512;
     DWORD rc;

     waiocb = qemu_aio_get(&win32_aiocb_info, bs, cb, opaque);
-    waiocb->nbytes = nb_sectors * 512;
+    waiocb->nbytes = bytes;
     waiocb->qiov = qiov;
     waiocb->is_read = (type == QEMU_AIO_READ);

-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 3/6] null: Switch to byte-based read/write
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 1/6] block: Support byte-based aio callbacks Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks Eric Blake
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, kwolf, qemu-block, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the null-co and null-aio drivers.

Note that since the null driver does nothing on writes, it trivially
supports the BDRV_REQ_FUA flag (all writes have already landed to
the same bit-bucket without needing an extra flush call).  Also, since
the null driver does just as well with byte-based requests, we can
now avoid cycles wasted on read-modify-write by taking advantage of
the block layer now defaulting the alignment to 1 instead of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: rely on new block layer default alignment [Kevin]
---
 block/null.c | 59 ++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/block/null.c b/block/null.c
index 806a8631e4d..8fbbda52ea1 100644
--- a/block/null.c
+++ b/block/null.c
@@ -93,6 +93,7 @@ static int null_file_open(BlockDriverState *bs, QDict *options, int flags,
     }
     s->read_zeroes = qemu_opt_get_bool(opts, NULL_OPT_ZEROES, false);
     qemu_opts_del(opts);
+    bs->supported_write_flags = BDRV_REQ_FUA;
     return ret;
 }

@@ -116,22 +117,22 @@ static coroutine_fn int null_co_common(BlockDriverState *bs)
     return 0;
 }

-static coroutine_fn int null_co_readv(BlockDriverState *bs,
-                                      int64_t sector_num, int nb_sectors,
-                                      QEMUIOVector *qiov)
+static coroutine_fn int null_co_preadv(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags)
 {
     BDRVNullState *s = bs->opaque;

     if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, 0, 0, bytes);
     }

     return null_co_common(bs);
 }

-static coroutine_fn int null_co_writev(BlockDriverState *bs,
-                                       int64_t sector_num, int nb_sectors,
-                                       QEMUIOVector *qiov)
+static coroutine_fn int null_co_pwritev(BlockDriverState *bs,
+                                        uint64_t offset, uint64_t bytes,
+                                        QEMUIOVector *qiov, int flags)
 {
     return null_co_common(bs);
 }
@@ -186,26 +187,26 @@ static inline BlockAIOCB *null_aio_common(BlockDriverState *bs,
     return &acb->common;
 }

-static BlockAIOCB *null_aio_readv(BlockDriverState *bs,
-                                  int64_t sector_num, QEMUIOVector *qiov,
-                                  int nb_sectors,
-                                  BlockCompletionFunc *cb,
-                                  void *opaque)
-{
-    BDRVNullState *s = bs->opaque;
-
-    if (s->read_zeroes) {
-        qemu_iovec_memset(qiov, 0, 0, nb_sectors * BDRV_SECTOR_SIZE);
-    }
-
-    return null_aio_common(bs, cb, opaque);
-}
-
-static BlockAIOCB *null_aio_writev(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
+static BlockAIOCB *null_aio_preadv(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
                                    BlockCompletionFunc *cb,
                                    void *opaque)
+{
+    BDRVNullState *s = bs->opaque;
+
+    if (s->read_zeroes) {
+        qemu_iovec_memset(qiov, 0, 0, bytes);
+    }
+
+    return null_aio_common(bs, cb, opaque);
+}
+
+static BlockAIOCB *null_aio_pwritev(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t bytes,
+                                    QEMUIOVector *qiov, int flags,
+                                    BlockCompletionFunc *cb,
+                                    void *opaque)
 {
     return null_aio_common(bs, cb, opaque);
 }
@@ -266,8 +267,8 @@ static BlockDriver bdrv_null_co = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,

-    .bdrv_co_readv          = null_co_readv,
-    .bdrv_co_writev         = null_co_writev,
+    .bdrv_co_preadv         = null_co_preadv,
+    .bdrv_co_pwritev        = null_co_pwritev,
     .bdrv_co_flush_to_disk  = null_co_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,

@@ -286,8 +287,8 @@ static BlockDriver bdrv_null_aio = {
     .bdrv_close             = null_close,
     .bdrv_getlength         = null_getlength,

-    .bdrv_aio_readv         = null_aio_readv,
-    .bdrv_aio_writev        = null_aio_writev,
+    .bdrv_aio_preadv        = null_aio_preadv,
+    .bdrv_aio_pwritev       = null_aio_pwritev,
     .bdrv_aio_flush         = null_aio_flush,
     .bdrv_reopen_prepare    = null_reopen_prepare,

-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
                   ` (2 preceding siblings ...)
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 3/6] null: Switch to byte-based read/write Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-24 19:53   ` [Qemu-devel] [Qemu-block] " Jason Dillaman
  2018-04-25 13:00   ` Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 5/6] vxhs: " Eric Blake
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, kwolf, qemu-block, Josh Durgin, Jeff Cody, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the rbd driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, as I don't know if RBD is tolerant of
non-sector AIO operations, I went with the conservate approach
of adding .bdrv_refresh_limits to override the block layer
defaults back to the pre-patch value of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: override new block layer default alignment [Kevin]
---
 block/rbd.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index c9359d0ad84..638ecf8d986 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -231,6 +231,13 @@ done:
 }


+static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    /* XXX Does RBD support AIO on less than 512-byte alignment? */
+    bs->bl.request_alignment = 512;
+}
+
+
 static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
                              Error **errp)
 {
@@ -899,27 +906,23 @@ failed:
     return NULL;
 }

-static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
-                                      int64_t sector_num,
-                                      QEMUIOVector *qiov,
-                                      int nb_sectors,
-                                      BlockCompletionFunc *cb,
-                                      void *opaque)
-{
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
-                         RBD_AIO_READ);
-}
-
-static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
-                                       int64_t sector_num,
-                                       QEMUIOVector *qiov,
-                                       int nb_sectors,
+static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
+                                       uint64_t offset, uint64_t bytes,
+                                       QEMUIOVector *qiov, int flags,
                                        BlockCompletionFunc *cb,
                                        void *opaque)
 {
-    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
-                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
+    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
+                         RBD_AIO_READ);
+}
+
+static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
+                                        uint64_t offset, uint64_t bytes,
+                                        QEMUIOVector *qiov, int flags,
+                                        BlockCompletionFunc *cb,
+                                        void *opaque)
+{
+    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
                          RBD_AIO_WRITE);
 }

@@ -1158,6 +1161,7 @@ static BlockDriver bdrv_rbd = {
     .format_name            = "rbd",
     .instance_size          = sizeof(BDRVRBDState),
     .bdrv_parse_filename    = qemu_rbd_parse_filename,
+    .bdrv_refresh_limits    = qemu_rbd_refresh_limits,
     .bdrv_file_open         = qemu_rbd_open,
     .bdrv_close             = qemu_rbd_close,
     .bdrv_reopen_prepare    = qemu_rbd_reopen_prepare,
@@ -1170,8 +1174,8 @@ static BlockDriver bdrv_rbd = {
     .bdrv_truncate          = qemu_rbd_truncate,
     .protocol_name          = "rbd",

-    .bdrv_aio_readv         = qemu_rbd_aio_readv,
-    .bdrv_aio_writev        = qemu_rbd_aio_writev,
+    .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
+    .bdrv_aio_pwritev       = qemu_rbd_aio_pwritev,

 #ifdef LIBRBD_SUPPORTS_AIO_FLUSH
     .bdrv_aio_flush         = qemu_rbd_aio_flush,
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 5/6] vxhs: Switch to byte-based callbacks
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
                   ` (3 preceding siblings ...)
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
  2018-04-25 10:59 ` [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Kevin Wolf
  6 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, kwolf, qemu-block, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Make the change for the last few sector-based callbacks
in the vxhs driver.

Note that the driver was already using byte-based calls for
performing actual I/O, so this just gets rid of a round trip
of scaling; however, as I don't know if VxHS is tolerant of
non-sector AIO operations, I went with the conservative approach
of adding .bdrv_refresh_limits to override the block layer
defaults back to the pre-patch value of 512.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
I was unable to get this to compile, as I do not have the vxhs
devel headers installed.  By inspection, it should work, but
a careful review is appreciated.

v2: override new block layer default alignment [Kevin]
---
 block/vxhs.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/block/vxhs.c b/block/vxhs.c
index 75cc6c8672c..64be5324dd5 100644
--- a/block/vxhs.c
+++ b/block/vxhs.c
@@ -216,6 +216,12 @@ static void vxhs_parse_filename(const char *filename, QDict *options,
     }
 }

+static void vxhs_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    /* XXX Does VXHS support AIO on less than 512-byte alignment? */
+    bs->bl.request_alignment = 512;
+}
+
 static int vxhs_init_and_ref(void)
 {
     if (vxhs_ref++ == 0) {
@@ -424,21 +430,17 @@ static const AIOCBInfo vxhs_aiocb_info = {
  * and is passed to QNIO. When QNIO completes the work,
  * it will be passed back through the callback.
  */
-static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
-                               QEMUIOVector *qiov, int nb_sectors,
+static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, uint64_t offset,
+                               QEMUIOVector *qiov, uint64_t size,
                                BlockCompletionFunc *cb, void *opaque,
                                VDISKAIOCmd iodir)
 {
     VXHSAIOCB *acb = NULL;
     BDRVVXHSState *s = bs->opaque;
-    size_t size;
-    uint64_t offset;
     int iio_flags = 0;
     int ret = 0;
     void *dev_handle = s->vdisk_hostinfo.dev_handle;

-    offset = sector_num * BDRV_SECTOR_SIZE;
-    size = nb_sectors * BDRV_SECTOR_SIZE;
     acb = qemu_aio_get(&vxhs_aiocb_info, bs, cb, opaque);

     /*
@@ -451,11 +453,11 @@ static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
     switch (iodir) {
     case VDISK_AIO_WRITE:
             ret = iio_writev(dev_handle, acb, qiov->iov, qiov->niov,
-                             offset, (uint64_t)size, iio_flags);
+                             offset, size, iio_flags);
             break;
     case VDISK_AIO_READ:
             ret = iio_readv(dev_handle, acb, qiov->iov, qiov->niov,
-                            offset, (uint64_t)size, iio_flags);
+                            offset, size, iio_flags);
             break;
     default:
             trace_vxhs_aio_rw_invalid(iodir);
@@ -474,22 +476,20 @@ errout:
     return NULL;
 }

-static BlockAIOCB *vxhs_aio_readv(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
+static BlockAIOCB *vxhs_aio_preadv(BlockDriverState *bs,
+                                   uint64_t offset, uint64_t bytes,
+                                   QEMUIOVector *qiov, int flags,
                                    BlockCompletionFunc *cb, void *opaque)
 {
-    return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors, cb,
-                       opaque, VDISK_AIO_READ);
+    return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_READ);
 }

-static BlockAIOCB *vxhs_aio_writev(BlockDriverState *bs,
-                                   int64_t sector_num, QEMUIOVector *qiov,
-                                   int nb_sectors,
-                                   BlockCompletionFunc *cb, void *opaque)
+static BlockAIOCB *vxhs_aio_pwritev(BlockDriverState *bs,
+                                    uint64_t offset, uint64_t bytes,
+                                    QEMUIOVector *qiov, int flags,
+                                    BlockCompletionFunc *cb, void *opaque)
 {
-    return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors,
-                       cb, opaque, VDISK_AIO_WRITE);
+    return vxhs_aio_rw(bs, offset, qiov, bytes, cb, opaque, VDISK_AIO_WRITE);
 }

 static void vxhs_close(BlockDriverState *bs)
@@ -561,10 +561,11 @@ static BlockDriver bdrv_vxhs = {
     .instance_size                = sizeof(BDRVVXHSState),
     .bdrv_file_open               = vxhs_open,
     .bdrv_parse_filename          = vxhs_parse_filename,
+    .bdrv_refresh_limits          = vxhs_refresh_limits,
     .bdrv_close                   = vxhs_close,
     .bdrv_getlength               = vxhs_getlength,
-    .bdrv_aio_readv               = vxhs_aio_readv,
-    .bdrv_aio_writev              = vxhs_aio_writev,
+    .bdrv_aio_preadv              = vxhs_aio_preadv,
+    .bdrv_aio_pwritev             = vxhs_aio_pwritev,
 };

 static void bdrv_vxhs_init(void)
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [Qemu-devel] [PATCH v2 6/6] block: Drop last of the sector-based aio callbacks
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
                   ` (4 preceding siblings ...)
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 5/6] vxhs: " Eric Blake
@ 2018-04-24 19:25 ` Eric Blake
  2018-04-25 10:59 ` [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Kevin Wolf
  6 siblings, 0 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-24 19:25 UTC (permalink / raw)
  To: qemu-devel
  Cc: jsnow, kwolf, qemu-block, Stefan Hajnoczi, Fam Zheng, Max Reitz

We are gradually moving away from sector-based interfaces, towards
byte-based.  Now that all drivers with aio callbacks are using the
byte-based interfaces, we can remove the sector-based versions.

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: rearrange conditionals to put byte-based code first [Kevin]
---
 include/block/block_int.h |  6 ----
 block/io.c                | 86 ++++++++++++++++++++---------------------------
 2 files changed, 37 insertions(+), 55 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index e772e3502b1..0bba7ed024a 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -141,15 +141,9 @@ struct BlockDriver {
     void (*bdrv_refresh_filename)(BlockDriverState *bs, QDict *options);

     /* aio */
-    BlockAIOCB *(*bdrv_aio_readv)(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
         BlockCompletionFunc *cb, void *opaque);
-    BlockAIOCB *(*bdrv_aio_writev)(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque);
     BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags,
         BlockCompletionFunc *cb, void *opaque);
diff --git a/block/io.c b/block/io.c
index 407bc25df41..6b110b207a0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -925,31 +925,14 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
         return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
     }

-    /* FIXME - no need to calculate these if .bdrv_aio_preadv exists */
-    sector_num = offset >> BDRV_SECTOR_BITS;
-    nb_sectors = bytes >> BDRV_SECTOR_BITS;
-
-    if (!drv->bdrv_aio_preadv) {
-        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
-    }
-
-    if (drv->bdrv_co_readv) {
-        return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
-    } else {
+    if (drv->bdrv_aio_preadv) {
         BlockAIOCB *acb;
         CoroutineIOCompletion co = {
             .coroutine = qemu_coroutine_self(),
         };

-        if (drv->bdrv_aio_preadv) {
-            acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
-                                       bdrv_co_io_em_complete, &co);
-        } else {
-            acb = drv->bdrv_aio_readv(bs, sector_num, qiov, nb_sectors,
-                                      bdrv_co_io_em_complete, &co);
-        }
+        acb = drv->bdrv_aio_preadv(bs, offset, bytes, qiov, flags,
+                                   bdrv_co_io_em_complete, &co);
         if (acb == NULL) {
             return -EIO;
         } else {
@@ -957,6 +940,16 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
             return co.ret;
         }
     }
+
+    sector_num = offset >> BDRV_SECTOR_BITS;
+    nb_sectors = bytes >> BDRV_SECTOR_BITS;
+
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
+    assert(drv->bdrv_co_readv);
+
+    return drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
 }

 static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
@@ -981,45 +974,40 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
         goto emulate_flags;
     }

-    /* FIXME - no need to calculate these if .bdrv_aio_pwritev exists */
+    if (drv->bdrv_aio_pwritev) {
+        BlockAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
+                                    flags & bs->supported_write_flags,
+                                    bdrv_co_io_em_complete, &co);
+        flags &= ~bs->supported_write_flags;
+        if (acb == NULL) {
+            ret = -EIO;
+        } else {
+            qemu_coroutine_yield();
+            ret = co.ret;
+        }
+        goto emulate_flags;
+    }
+
     sector_num = offset >> BDRV_SECTOR_BITS;
     nb_sectors = bytes >> BDRV_SECTOR_BITS;

-    if (!drv->bdrv_aio_pwritev) {
-        assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
-        assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);
-    }
+    assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert((bytes >> BDRV_SECTOR_BITS) <= BDRV_REQUEST_MAX_SECTORS);

     if (drv->bdrv_co_writev_flags) {
         ret = drv->bdrv_co_writev_flags(bs, sector_num, nb_sectors, qiov,
                                         flags & bs->supported_write_flags);
         flags &= ~bs->supported_write_flags;
-    } else if (drv->bdrv_co_writev) {
+    } else {
+        assert(drv->bdrv_co_writev);
         assert(!bs->supported_write_flags);
         ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov);
-    } else {
-        BlockAIOCB *acb;
-        CoroutineIOCompletion co = {
-            .coroutine = qemu_coroutine_self(),
-        };
-
-        if (drv->bdrv_aio_pwritev) {
-            acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
-                                        flags & bs->supported_write_flags,
-                                        bdrv_co_io_em_complete, &co);
-            flags &= ~bs->supported_write_flags;
-        } else {
-            assert(!bs->supported_write_flags);
-            acb = drv->bdrv_aio_writev(bs, sector_num, qiov, nb_sectors,
-                                       bdrv_co_io_em_complete, &co);
-        }
-        if (acb == NULL) {
-            ret = -EIO;
-        } else {
-            qemu_coroutine_yield();
-            ret = co.ret;
-        }
     }

 emulate_flags:
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 4/6] rbd: Switch to byte-based callbacks
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks Eric Blake
@ 2018-04-24 19:53   ` Jason Dillaman
  2018-04-25 10:58     ` Kevin Wolf
  2018-04-25 13:00   ` Eric Blake
  1 sibling, 1 reply; 12+ messages in thread
From: Jason Dillaman @ 2018-04-24 19:53 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, Kevin Wolf, qemu-block, Max Reitz

On Tue, Apr 24, 2018 at 3:25 PM, Eric Blake <eblake@redhat.com> wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based callbacks
> in the rbd driver.
>
> Note that the driver was already using byte-based calls for
> performing actual I/O, so this just gets rid of a round trip
> of scaling; however, as I don't know if RBD is tolerant of
> non-sector AIO operations, I went with the conservate approach
> of adding .bdrv_refresh_limits to override the block layer
> defaults back to the pre-patch value of 512.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
> v2: override new block layer default alignment [Kevin]
> ---
>  block/rbd.c | 44 ++++++++++++++++++++++++--------------------
>  1 file changed, 24 insertions(+), 20 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index c9359d0ad84..638ecf8d986 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -231,6 +231,13 @@ done:
>  }
>
>
> +static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +    /* XXX Does RBD support AIO on less than 512-byte alignment? */

Yes, librbd internally supports 1-byte alignment for IO, but the
optimal alignment/length would be object size * stripe count.

> +    bs->bl.request_alignment = 512;
> +}
> +
> +
>  static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
>                               Error **errp)
>  {
> @@ -899,27 +906,23 @@ failed:
>      return NULL;
>  }
>
> -static BlockAIOCB *qemu_rbd_aio_readv(BlockDriverState *bs,
> -                                      int64_t sector_num,
> -                                      QEMUIOVector *qiov,
> -                                      int nb_sectors,
> -                                      BlockCompletionFunc *cb,
> -                                      void *opaque)
> -{
> -    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
> -                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
> -                         RBD_AIO_READ);
> -}
> -
> -static BlockAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
> -                                       int64_t sector_num,
> -                                       QEMUIOVector *qiov,
> -                                       int nb_sectors,
> +static BlockAIOCB *qemu_rbd_aio_preadv(BlockDriverState *bs,
> +                                       uint64_t offset, uint64_t bytes,
> +                                       QEMUIOVector *qiov, int flags,
>                                         BlockCompletionFunc *cb,
>                                         void *opaque)
>  {
> -    return rbd_start_aio(bs, sector_num << BDRV_SECTOR_BITS, qiov,
> -                         (int64_t) nb_sectors << BDRV_SECTOR_BITS, cb, opaque,
> +    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
> +                         RBD_AIO_READ);
> +}
> +
> +static BlockAIOCB *qemu_rbd_aio_pwritev(BlockDriverState *bs,
> +                                        uint64_t offset, uint64_t bytes,
> +                                        QEMUIOVector *qiov, int flags,
> +                                        BlockCompletionFunc *cb,
> +                                        void *opaque)
> +{
> +    return rbd_start_aio(bs, offset, qiov, bytes, cb, opaque,
>                           RBD_AIO_WRITE);
>  }
>
> @@ -1158,6 +1161,7 @@ static BlockDriver bdrv_rbd = {
>      .format_name            = "rbd",
>      .instance_size          = sizeof(BDRVRBDState),
>      .bdrv_parse_filename    = qemu_rbd_parse_filename,
> +    .bdrv_refresh_limits    = qemu_rbd_refresh_limits,
>      .bdrv_file_open         = qemu_rbd_open,
>      .bdrv_close             = qemu_rbd_close,
>      .bdrv_reopen_prepare    = qemu_rbd_reopen_prepare,
> @@ -1170,8 +1174,8 @@ static BlockDriver bdrv_rbd = {
>      .bdrv_truncate          = qemu_rbd_truncate,
>      .protocol_name          = "rbd",
>
> -    .bdrv_aio_readv         = qemu_rbd_aio_readv,
> -    .bdrv_aio_writev        = qemu_rbd_aio_writev,
> +    .bdrv_aio_preadv        = qemu_rbd_aio_preadv,
> +    .bdrv_aio_pwritev       = qemu_rbd_aio_pwritev,
>
>  #ifdef LIBRBD_SUPPORTS_AIO_FLUSH
>      .bdrv_aio_flush         = qemu_rbd_aio_flush,
> --
> 2.14.3
>
>



-- 
Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks Eric Blake
@ 2018-04-25 10:57   ` Kevin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2018-04-25 10:57 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, jsnow, qemu-block, Max Reitz, Stefan Weil

Am 24.04.2018 um 21:25 hat Eric Blake geschrieben:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based callbacks
> in the file-win32 driver.
> 
> Note that the driver was already using byte-based calls for
> performing actual I/O, so this just gets rid of a round trip
> of scaling; however, as I don't know if Windows is tolerant of
> non-sector AIO operations, I went with the conservative approach
> of modifying .bdrv_refresh_limits to override the block layer
> defaults back to the pre-patch value of 512.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> Compile-tested via 'make docker-test-mingw@fedora', but I don't
> have a sane way to test whether it actually works.

Tried to test it, and the only result is that something was broken even
before your patch:

$ ./qemu-img.exe create -f raw /tmp/test.raw 128M
Formatting '/tmp/test.raw', fmt=raw size=134217728
$ ./qemu-io.exe -f raw -c 'read 0 4k' /tmp/test.raw
read failed: Input/output error

For some reason, doing the same with qcow2 works fine. qemu-iotests
for qcow2 starts hanging in 013. Maybe someone should look into this,
qemu-iotests was working fairly well with mingw builds some time ago.
Nothing that will hold up this series, though.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 4/6] rbd: Switch to byte-based callbacks
  2018-04-24 19:53   ` [Qemu-devel] [Qemu-block] " Jason Dillaman
@ 2018-04-25 10:58     ` Kevin Wolf
  0 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2018-04-25 10:58 UTC (permalink / raw)
  To: dillaman; +Cc: Eric Blake, qemu-devel, qemu-block, Max Reitz

Am 24.04.2018 um 21:53 hat Jason Dillaman geschrieben:
> On Tue, Apr 24, 2018 at 3:25 PM, Eric Blake <eblake@redhat.com> wrote:
> > We are gradually moving away from sector-based interfaces, towards
> > byte-based.  Make the change for the last few sector-based callbacks
> > in the rbd driver.
> >
> > Note that the driver was already using byte-based calls for
> > performing actual I/O, so this just gets rid of a round trip
> > of scaling; however, as I don't know if RBD is tolerant of
> > non-sector AIO operations, I went with the conservate approach
> > of adding .bdrv_refresh_limits to override the block layer
> > defaults back to the pre-patch value of 512.
> >
> > Signed-off-by: Eric Blake <eblake@redhat.com>
> >
> > ---
> > v2: override new block layer default alignment [Kevin]
> > ---
> >  block/rbd.c | 44 ++++++++++++++++++++++++--------------------
> >  1 file changed, 24 insertions(+), 20 deletions(-)
> >
> > diff --git a/block/rbd.c b/block/rbd.c
> > index c9359d0ad84..638ecf8d986 100644
> > --- a/block/rbd.c
> > +++ b/block/rbd.c
> > @@ -231,6 +231,13 @@ done:
> >  }
> >
> >
> > +static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> > +{
> > +    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> 
> Yes, librbd internally supports 1-byte alignment for IO, but the
> optimal alignment/length would be object size * stripe count.

Would you like to post a follow-up patch to this series that removes the
.bdrv_refresh_limits implementation again with a commit message
explaining that RBD does support byte alignment?

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write
  2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
                   ` (5 preceding siblings ...)
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
@ 2018-04-25 10:59 ` Kevin Wolf
  6 siblings, 0 replies; 12+ messages in thread
From: Kevin Wolf @ 2018-04-25 10:59 UTC (permalink / raw)
  To: Eric Blake; +Cc: qemu-devel, jsnow, qemu-block

Am 24.04.2018 um 21:25 hat Eric Blake geschrieben:
> While we would prefer that block drivers use coroutines instead
> of aio callbacks, it is a fairly easy exercise to prove that
> all existing drivers with aio callbacks are merely scaling
> from bytes into sectors and back to bytes.  So, even though I
> am not set up to completely run (or even compile-test) this
> full series, it seems pretty straightforward to change the
> signature to quit playing games with pointless scaling.
> 
> Incorporate Kevin's review on v1, which amounted to pretty much
> rewriting the series to be saner (the block layer now defaults
> to alignment of 1, so drivers that still need 512 for keeping the
> patch conservative have to override that; and improve the code
> in io.c to put byte-based access before sector-based fallbacks).

Thanks, applied to the block-next branch.

Kevin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [Qemu-block] [PATCH v2 4/6] rbd: Switch to byte-based callbacks
  2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks Eric Blake
  2018-04-24 19:53   ` [Qemu-devel] [Qemu-block] " Jason Dillaman
@ 2018-04-25 13:00   ` Eric Blake
  1 sibling, 0 replies; 12+ messages in thread
From: Eric Blake @ 2018-04-25 13:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, qemu-block, Max Reitz

[-- Attachment #1: Type: text/plain, Size: 797 bytes --]

On 04/24/2018 02:25 PM, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Make the change for the last few sector-based callbacks
> in the rbd driver.
> 
> Note that the driver was already using byte-based calls for
> performing actual I/O, so this just gets rid of a round trip
> of scaling; however, as I don't know if RBD is tolerant of
> non-sector AIO operations, I went with the conservate approach

s/conservate/conservative/

> of adding .bdrv_refresh_limits to override the block layer
> defaults back to the pre-patch value of 512.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-04-25 13:00 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-04-24 19:25 [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Eric Blake
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 1/6] block: Support byte-based aio callbacks Eric Blake
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 2/6] file-win32: Switch to byte-based callbacks Eric Blake
2018-04-25 10:57   ` Kevin Wolf
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 3/6] null: Switch to byte-based read/write Eric Blake
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 4/6] rbd: Switch to byte-based callbacks Eric Blake
2018-04-24 19:53   ` [Qemu-devel] [Qemu-block] " Jason Dillaman
2018-04-25 10:58     ` Kevin Wolf
2018-04-25 13:00   ` Eric Blake
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 5/6] vxhs: " Eric Blake
2018-04-24 19:25 ` [Qemu-devel] [PATCH v2 6/6] block: Drop last of the sector-based aio callbacks Eric Blake
2018-04-25 10:59 ` [Qemu-devel] [PATCH v2 0/6] block: byte-based AIO read/write Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).