qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PULL 00/14] Block layer patches
@ 2017-01-09 13:44 Kevin Wolf
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

The following changes since commit ffe22bf51065dd33022cf91f77a821d1f11c250d:

  Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20161224' into staging (2017-01-06 15:18:09 +0000)

are available in the git repository at:


  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to c1bb86cd8ae67c14f79422b6e544d1e2bf40eeb2:

  block: Rename raw-{posix,win32} to file-*.c (2017-01-09 13:30:53 +0100)

----------------------------------------------------------------
Block layer patches

----------------------------------------------------------------
Eric Blake (2):
      block: Rename raw_bsd to raw-format.c
      block: Rename raw-{posix,win32} to file-*.c

Kevin Wolf (11):
      coroutine: Introduce qemu_coroutine_enter_if_inactive()
      quorum: Remove s from quorum_aio_get() arguments
      quorum: Implement .bdrv_co_readv/writev
      quorum: Do cleanup in caller coroutine
      quorum: Inline quorum_aio_cb()
      quorum: Avoid bdrv_aio_writev() for rewrites
      quorum: Implement .bdrv_co_preadv/pwritev()
      quorum: Inline quorum_fifo_aio_cb()
      quorum: Clean up quorum_aio_get()
      blkdebug: Implement bdrv_co_preadv/pwritev/flush
      blkverify: Implement bdrv_co_preadv/pwritev/flush

Paolo Bonzini (1):
      qemu-img: fix in-flight count for qemu-img bench

 MAINTAINERS                         |   6 +-
 block/Makefile.objs                 |   6 +-
 block/blkdebug.c                    |  86 ++++----
 block/blkverify.c                   | 201 +++++++++---------
 block/{raw-posix.c => file-posix.c} |   0
 block/{raw-win32.c => file-win32.c} |   0
 block/gluster.c                     |   4 +-
 block/quorum.c                      | 410 +++++++++++++++++++-----------------
 block/{raw_bsd.c => raw-format.c}   |   2 +-
 block/trace-events                  |   4 +-
 configure                           |   2 +-
 include/block/block_int.h           |   2 +-
 include/qemu/coroutine.h            |   6 +
 qemu-img.c                          |  17 +-
 tests/qemu-iotests/071.out          |   8 +-
 util/qemu-coroutine.c               |   7 +
 16 files changed, 392 insertions(+), 369 deletions(-)
 rename block/{raw-posix.c => file-posix.c} (100%)
 rename block/{raw-win32.c => file-win32.c} (100%)
 rename block/{raw_bsd.c => raw-format.c} (99%)


* [Qemu-devel] [PULL 01/14] qemu-img: fix in-flight count for qemu-img bench
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

From: Paolo Bonzini <pbonzini@redhat.com>

With aio=native (qemu-img bench -n) one or more requests can be completed
when a new request is submitted.  This in turn can cause bench_cb to
recurse before b->in_flight is updated.  This causes multiple I/Os
to be submitted with the same offset and, furthermore, the blk_aio_*
coroutines are never freed and qemu-img aborts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 qemu-img.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index 6949b73..5df66fe 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3559,20 +3559,23 @@ static void bench_cb(void *opaque, int ret)
     }
 
     while (b->n > b->in_flight && b->in_flight < b->nrreq) {
+        int64_t offset = b->offset;
+        /* blk_aio_* might look for completed I/Os and kick bench_cb
+         * again, so make sure this operation is counted by in_flight
+         * and b->offset is ready for the next submission.
+         */
+        b->in_flight++;
+        b->offset += b->step;
+        b->offset %= b->image_size;
         if (b->write) {
-            acb = blk_aio_pwritev(b->blk, b->offset, b->qiov, 0,
-                                  bench_cb, b);
+            acb = blk_aio_pwritev(b->blk, offset, b->qiov, 0, bench_cb, b);
         } else {
-            acb = blk_aio_preadv(b->blk, b->offset, b->qiov, 0,
-                                 bench_cb, b);
+            acb = blk_aio_preadv(b->blk, offset, b->qiov, 0, bench_cb, b);
         }
         if (!acb) {
             error_report("Failed to issue request");
             exit(EXIT_FAILURE);
         }
-        b->in_flight++;
-        b->offset += b->step;
-        b->offset %= b->image_size;
     }
 }
 
-- 
1.8.3.1
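
The re-entrancy problem described in the commit message above can be
reproduced in a small stand-alone model. This is only a sketch under
stated assumptions, not code from qemu-img: ToyBench, toy_submit(),
toy_bench_cb() and toy_run() are hypothetical names, and the "backend"
simply completes one earlier request synchronously during a later
submission, which is roughly what the message says can happen with
aio=native. The count_before_submit flag selects between the old
ordering (counters updated after the submit call) and the patched
ordering (counters updated before it).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LOG 16

/* Hypothetical, simplified stand-in for the benchmark state. */
typedef struct ToyBench {
    int n;                    /* requests not yet completed */
    int in_flight;            /* requests the benchmark believes are pending */
    int nrreq;                /* maximum number of parallel requests */
    int64_t offset, step, image_size;
    bool count_before_submit; /* true: patched ordering */
    int outstanding;          /* requests pending in the toy backend */
    int sync_budget;          /* synchronous completions still allowed */
    int64_t log[MAX_LOG];     /* offsets in submission order */
    int nlog;
} ToyBench;

static void toy_bench_cb(ToyBench *b, bool completed);

/* Toy backend: may complete an earlier request while submitting a new one. */
static void toy_submit(ToyBench *b, int64_t offset)
{
    bool earlier = b->outstanding > 0;

    assert(b->nlog < MAX_LOG);
    b->log[b->nlog++] = offset;
    b->outstanding++;
    if (earlier && b->sync_budget > 0) {
        b->sync_budget--;
        b->outstanding--;
        toy_bench_cb(b, true);   /* callback re-entered from inside submit */
    }
}

static void toy_bench_cb(ToyBench *b, bool completed)
{
    if (completed) {
        b->n--;
        b->in_flight--;
    }
    while (b->n > b->in_flight && b->in_flight < b->nrreq) {
        if (b->count_before_submit) {
            /* Patched ordering: account for the request first. */
            int64_t offset = b->offset;
            b->in_flight++;
            b->offset = (b->offset + b->step) % b->image_size;
            toy_submit(b, offset);
        } else {
            /* Old ordering: a nested callback sees stale counters. */
            toy_submit(b, b->offset);
            b->in_flight++;
            b->offset = (b->offset + b->step) % b->image_size;
        }
    }
}

/* Returns how many submissions reused an already submitted offset. */
int toy_run(bool count_before_submit, int *final_in_flight)
{
    ToyBench b = {
        .n = 4, .nrreq = 2, .step = 512, .image_size = 1 << 20,
        .count_before_submit = count_before_submit, .sync_budget = 1,
    };
    int dups = 0;

    toy_bench_cb(&b, false);
    for (int i = 1; i < b.nlog; i++) {
        for (int j = 0; j < i; j++) {
            if (b.log[i] == b.log[j]) {
                dups++;
                break;
            }
        }
    }
    *final_in_flight = b.in_flight;
    return dups;
}
```

In this model, toy_run(false, &n) submits offset 512 twice and leaves
n at 3, above the nrreq limit of 2, while toy_run(true, &n) submits
each offset once and leaves n at 2.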


* [Qemu-devel] [PULL 02/14] coroutine: Introduce qemu_coroutine_enter_if_inactive()
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

When asynchronous work is handed to a worker coroutine and the worker
finishes without ever yielding, the parent coroutine cannot be reentered
because it hasn't yielded yet. In this case the parent doesn't need to be
reentered at all: it will see that the work is already done and won't
yield.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 include/qemu/coroutine.h | 6 ++++++
 util/qemu-coroutine.c    | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index e6a60d5..12584ed 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -71,6 +71,12 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque);
 void qemu_coroutine_enter(Coroutine *coroutine);
 
 /**
+ * Transfer control to a coroutine if it's not active (i.e. part of the call
+ * stack of the running coroutine). Otherwise, do nothing.
+ */
+void qemu_coroutine_enter_if_inactive(Coroutine *co);
+
+/**
  * Transfer control back to a coroutine's caller
  *
  * This function does not return until the coroutine is re-entered using
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 737bffa..a5d2f6c 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -131,6 +131,13 @@ void qemu_coroutine_enter(Coroutine *co)
     }
 }
 
+void qemu_coroutine_enter_if_inactive(Coroutine *co)
+{
+    if (!qemu_coroutine_entered(co)) {
+        qemu_coroutine_enter(co);
+    }
+}
+
 void coroutine_fn qemu_coroutine_yield(void)
 {
     Coroutine *self = qemu_coroutine_self();
-- 
1.8.3.1
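
The situation the commit message above describes can be sketched
without a real coroutine implementation. The following is a toy model
under stated assumptions, not QEMU code: ToyCo, toy_co_enter(),
toy_co_enter_if_inactive(), toy_worker_done() and toy_parent() are
hypothetical names, a "coroutine" is reduced to a flag that says
whether it is currently active, and entering an active coroutine is
treated as a programming error via assert().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a coroutine: only tracks whether it is active. */
typedef struct ToyCo {
    bool entered;   /* true while the coroutine is running */
    int wakeups;    /* how often it was actually re-entered */
} ToyCo;

static void toy_co_enter(ToyCo *co)
{
    /* Entering an already active coroutine is treated as a bug. */
    assert(!co->entered);
    co->entered = true;
    co->wakeups++;
}

/* Same shape as the helper added by the patch: enter only if inactive. */
static void toy_co_enter_if_inactive(ToyCo *co)
{
    if (!co->entered) {
        toy_co_enter(co);
    }
}

/* Worker completion path: mark the work done, then wake the parent. */
static void toy_worker_done(ToyCo *parent, bool *done)
{
    *done = true;
    toy_co_enter_if_inactive(parent);
}

/*
 * Runs one parent/worker round trip and returns how often the parent
 * was re-entered. worker_is_synchronous selects whether the worker
 * finishes while the parent is still running or after it has yielded.
 */
int toy_parent(bool worker_is_synchronous)
{
    ToyCo parent = { .entered = true, .wakeups = 0 };
    bool done = false;

    if (worker_is_synchronous) {
        /* Parent is still active, so the wake-up is skipped. */
        toy_worker_done(&parent, &done);
    }
    if (!done) {
        parent.entered = false;          /* models the parent yielding */
        toy_worker_done(&parent, &done); /* completion arrives later */
    }
    return parent.wakeups;
}
```

With a synchronous worker the parent is never re-entered and skips
its yield because the work is already done; with a worker that
finishes after the parent has yielded, the parent is re-entered
exactly once. Calling toy_co_enter() directly in the synchronous case
would trip the assertion, which is the failure the guarded variant
avoids in this model.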


* [Qemu-devel] [PULL 03/14] quorum: Remove s from quorum_aio_get() arguments
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

There is no point in passing the value of bs->opaque in order to
overwrite it with itself.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index d122299..dfa9fd3 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -171,18 +171,17 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
     return a->l == b->l;
 }
 
-static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
-                                   BlockDriverState *bs,
+static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                    QEMUIOVector *qiov,
                                    uint64_t sector_num,
                                    int nb_sectors,
                                    BlockCompletionFunc *cb,
                                    void *opaque)
 {
+    BDRVQuorumState *s = bs->opaque;
     QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
     int i;
 
-    acb->common.bs->opaque = s;
     acb->sector_num = sector_num;
     acb->nb_sectors = nb_sectors;
     acb->qiov = qiov;
@@ -691,7 +690,7 @@ static BlockAIOCB *quorum_aio_readv(BlockDriverState *bs,
                                     void *opaque)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num,
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num,
                                       nb_sectors, cb, opaque);
     acb->is_read = true;
     acb->children_read = 0;
@@ -711,7 +710,7 @@ static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
                                      void *opaque)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(s, bs, qiov, sector_num, nb_sectors,
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors,
                                       cb, opaque);
     int i;
 
-- 
1.8.3.1


* [Qemu-devel] [PULL 04/14] quorum: Implement .bdrv_co_readv/writev
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

This converts the quorum block driver from the callback-based read/write
interfaces to coroutine-based ones. It is the first step towards further
simplification of the code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 192 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 115 insertions(+), 77 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index dfa9fd3..6a7bd91 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -97,7 +97,7 @@ typedef struct QuorumAIOCB QuorumAIOCB;
  * $children_count QuorumChildRequest.
  */
 typedef struct QuorumChildRequest {
-    BlockAIOCB *aiocb;
+    BlockDriverState *bs;
     QEMUIOVector qiov;
     uint8_t *buf;
     int ret;
@@ -110,7 +110,8 @@ typedef struct QuorumChildRequest {
  * used to do operations on each children and track overall progress.
  */
 struct QuorumAIOCB {
-    BlockAIOCB common;
+    BlockDriverState *bs;
+    Coroutine *co;
 
     /* Request metadata */
     uint64_t sector_num;
@@ -129,36 +130,23 @@ struct QuorumAIOCB {
     QuorumVotes votes;
 
     bool is_read;
+    bool has_completed;
     int vote_ret;
     int children_read;          /* how many children have been read from */
 };
 
-static bool quorum_vote(QuorumAIOCB *acb);
-
-static void quorum_aio_cancel(BlockAIOCB *blockacb)
-{
-    QuorumAIOCB *acb = container_of(blockacb, QuorumAIOCB, common);
-    BDRVQuorumState *s = acb->common.bs->opaque;
-    int i;
-
-    /* cancel all callbacks */
-    for (i = 0; i < s->num_children; i++) {
-        if (acb->qcrs[i].aiocb) {
-            bdrv_aio_cancel_async(acb->qcrs[i].aiocb);
-        }
-    }
-}
+typedef struct QuorumCo {
+    QuorumAIOCB *acb;
+    int idx;
+} QuorumCo;
 
-static AIOCBInfo quorum_aiocb_info = {
-    .aiocb_size         = sizeof(QuorumAIOCB),
-    .cancel_async       = quorum_aio_cancel,
-};
+static bool quorum_vote(QuorumAIOCB *acb);
 
 static void quorum_aio_finalize(QuorumAIOCB *acb)
 {
-    acb->common.cb(acb->common.opaque, acb->vote_ret);
+    acb->has_completed = true;
     g_free(acb->qcrs);
-    qemu_aio_unref(acb);
+    qemu_coroutine_enter_if_inactive(acb->co);
 }
 
 static bool quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b)
@@ -174,14 +162,14 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
 static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                    QEMUIOVector *qiov,
                                    uint64_t sector_num,
-                                   int nb_sectors,
-                                   BlockCompletionFunc *cb,
-                                   void *opaque)
+                                   int nb_sectors)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = qemu_aio_get(&quorum_aiocb_info, bs, cb, opaque);
+    QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
     int i;
 
+    acb->co = qemu_coroutine_self();
+    acb->bs = bs;
     acb->sector_num = sector_num;
     acb->nb_sectors = nb_sectors;
     acb->qiov = qiov;
@@ -191,6 +179,7 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
     acb->rewrite_count = 0;
     acb->votes.compare = quorum_sha256_compare;
     QLIST_INIT(&acb->votes.vote_list);
+    acb->has_completed = false;
     acb->is_read = false;
     acb->vote_ret = 0;
 
@@ -217,7 +206,7 @@ static void quorum_report_bad(QuorumOpType type, uint64_t sector_num,
 
 static void quorum_report_failure(QuorumAIOCB *acb)
 {
-    const char *reference = bdrv_get_device_or_node_name(acb->common.bs);
+    const char *reference = bdrv_get_device_or_node_name(acb->bs);
     qapi_event_send_quorum_failure(reference, acb->sector_num,
                                    acb->nb_sectors, &error_abort);
 }
@@ -226,7 +215,7 @@ static int quorum_vote_error(QuorumAIOCB *acb);
 
 static bool quorum_has_too_much_io_failed(QuorumAIOCB *acb)
 {
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
 
     if (acb->success_count < s->threshold) {
         acb->vote_ret = quorum_vote_error(acb);
@@ -252,7 +241,7 @@ static void quorum_rewrite_aio_cb(void *opaque, int ret)
     quorum_aio_finalize(acb);
 }
 
-static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb);
+static int read_fifo_child(QuorumAIOCB *acb);
 
 static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
 {
@@ -272,14 +261,14 @@ static void quorum_report_bad_acb(QuorumChildRequest *sacb, int ret)
     QuorumAIOCB *acb = sacb->parent;
     QuorumOpType type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
     quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
-                      sacb->aiocb->bs->node_name, ret);
+                      sacb->bs->node_name, ret);
 }
 
-static void quorum_fifo_aio_cb(void *opaque, int ret)
+static int quorum_fifo_aio_cb(void *opaque, int ret)
 {
     QuorumChildRequest *sacb = opaque;
     QuorumAIOCB *acb = sacb->parent;
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
 
     assert(acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO);
 
@@ -288,8 +277,7 @@ static void quorum_fifo_aio_cb(void *opaque, int ret)
 
         /* We try to read next child in FIFO order if we fail to read */
         if (acb->children_read < s->num_children) {
-            read_fifo_child(acb);
-            return;
+            return read_fifo_child(acb);
         }
     }
 
@@ -297,13 +285,14 @@ static void quorum_fifo_aio_cb(void *opaque, int ret)
 
     /* FIXME: rewrite failed children if acb->children_read > 1? */
     quorum_aio_finalize(acb);
+    return ret;
 }
 
 static void quorum_aio_cb(void *opaque, int ret)
 {
     QuorumChildRequest *sacb = opaque;
     QuorumAIOCB *acb = sacb->parent;
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
     bool rewrite = false;
     int i;
 
@@ -518,7 +507,7 @@ static bool quorum_compare(QuorumAIOCB *acb,
                            QEMUIOVector *a,
                            QEMUIOVector *b)
 {
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
     ssize_t offset;
 
     /* This driver will replace blkverify in this particular case */
@@ -538,7 +527,7 @@ static bool quorum_compare(QuorumAIOCB *acb,
 /* Do a vote to get the error code */
 static int quorum_vote_error(QuorumAIOCB *acb)
 {
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
     QuorumVoteVersion *winner = NULL;
     QuorumVotes error_votes;
     QuorumVoteValue result_value;
@@ -573,7 +562,7 @@ static bool quorum_vote(QuorumAIOCB *acb)
     bool rewrite = false;
     int i, j, ret;
     QuorumVoteValue hash;
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
     QuorumVoteVersion *winner;
 
     if (quorum_has_too_much_io_failed(acb)) {
@@ -649,10 +638,25 @@ free_exit:
     return rewrite;
 }
 
-static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
+static void read_quorum_children_entry(void *opaque)
 {
-    BDRVQuorumState *s = acb->common.bs->opaque;
-    int i;
+    QuorumCo *co = opaque;
+    QuorumAIOCB *acb = co->acb;
+    BDRVQuorumState *s = acb->bs->opaque;
+    int i = co->idx;
+    int ret;
+
+    acb->qcrs[i].bs = s->children[i]->bs;
+    ret = bdrv_co_preadv(s->children[i], acb->sector_num * BDRV_SECTOR_SIZE,
+                         acb->nb_sectors * BDRV_SECTOR_SIZE,
+                         &acb->qcrs[i].qiov, 0);
+    quorum_aio_cb(&acb->qcrs[i], ret);
+}
+
+static int read_quorum_children(QuorumAIOCB *acb)
+{
+    BDRVQuorumState *s = acb->bs->opaque;
+    int i, ret;
 
     acb->children_read = s->num_children;
     for (i = 0; i < s->num_children; i++) {
@@ -662,65 +666,99 @@ static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
     }
 
     for (i = 0; i < s->num_children; i++) {
-        acb->qcrs[i].aiocb = bdrv_aio_readv(s->children[i], acb->sector_num,
-                                            &acb->qcrs[i].qiov, acb->nb_sectors,
-                                            quorum_aio_cb, &acb->qcrs[i]);
+        Coroutine *co;
+        QuorumCo data = {
+            .acb = acb,
+            .idx = i,
+        };
+
+        co = qemu_coroutine_create(read_quorum_children_entry, &data);
+        qemu_coroutine_enter(co);
     }
 
-    return &acb->common;
+    if (!acb->has_completed) {
+        qemu_coroutine_yield();
+    }
+
+    ret = acb->vote_ret;
+
+    return ret;
 }
 
-static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
+static int read_fifo_child(QuorumAIOCB *acb)
 {
-    BDRVQuorumState *s = acb->common.bs->opaque;
+    BDRVQuorumState *s = acb->bs->opaque;
     int n = acb->children_read++;
+    int ret;
 
-    acb->qcrs[n].aiocb = bdrv_aio_readv(s->children[n], acb->sector_num,
-                                        acb->qiov, acb->nb_sectors,
-                                        quorum_fifo_aio_cb, &acb->qcrs[n]);
+    acb->qcrs[n].bs = s->children[n]->bs;
+    ret = bdrv_co_preadv(s->children[n], acb->sector_num * BDRV_SECTOR_SIZE,
+                         acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
+    ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);
 
-    return &acb->common;
+    return ret;
 }
 
-static BlockAIOCB *quorum_aio_readv(BlockDriverState *bs,
-                                    int64_t sector_num,
-                                    QEMUIOVector *qiov,
-                                    int nb_sectors,
-                                    BlockCompletionFunc *cb,
-                                    void *opaque)
+static int quorum_co_readv(BlockDriverState *bs,
+                           int64_t sector_num, int nb_sectors,
+                           QEMUIOVector *qiov)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num,
-                                      nb_sectors, cb, opaque);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors);
+    int ret;
+
     acb->is_read = true;
     acb->children_read = 0;
 
     if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
-        return read_quorum_children(acb);
+        ret = read_quorum_children(acb);
+    } else {
+        ret = read_fifo_child(acb);
     }
+    g_free(acb);
+    return ret;
+}
 
-    return read_fifo_child(acb);
+static void write_quorum_entry(void *opaque)
+{
+    QuorumCo *co = opaque;
+    QuorumAIOCB *acb = co->acb;
+    BDRVQuorumState *s = acb->bs->opaque;
+    int i = co->idx;
+    int ret;
+
+    acb->qcrs[i].bs = s->children[i]->bs;
+    ret = bdrv_co_pwritev(s->children[i], acb->sector_num * BDRV_SECTOR_SIZE,
+                          acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
+    quorum_aio_cb(&acb->qcrs[i], ret);
 }
 
-static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
-                                     int64_t sector_num,
-                                     QEMUIOVector *qiov,
-                                     int nb_sectors,
-                                     BlockCompletionFunc *cb,
-                                     void *opaque)
+static int quorum_co_writev(BlockDriverState *bs,
+                            int64_t sector_num, int nb_sectors,
+                            QEMUIOVector *qiov)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors,
-                                      cb, opaque);
-    int i;
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors);
+    int i, ret;
 
     for (i = 0; i < s->num_children; i++) {
-        acb->qcrs[i].aiocb = bdrv_aio_writev(s->children[i], sector_num,
-                                             qiov, nb_sectors, &quorum_aio_cb,
-                                             &acb->qcrs[i]);
+        Coroutine *co;
+        QuorumCo data = {
+            .acb = acb,
+            .idx = i,
+        };
+
+        co = qemu_coroutine_create(write_quorum_entry, &data);
+        qemu_coroutine_enter(co);
     }
 
-    return &acb->common;
+    if (!acb->has_completed) {
+        qemu_coroutine_yield();
+    }
+
+    ret = acb->vote_ret;
+
+    return ret;
 }
 
 static int64_t quorum_getlength(BlockDriverState *bs)
@@ -1097,8 +1135,8 @@ static BlockDriver bdrv_quorum = {
 
     .bdrv_getlength                     = quorum_getlength,
 
-    .bdrv_aio_readv                     = quorum_aio_readv,
-    .bdrv_aio_writev                    = quorum_aio_writev,
+    .bdrv_co_readv                      = quorum_co_readv,
+    .bdrv_co_writev                     = quorum_co_writev,
 
     .bdrv_add_child                     = quorum_add_child,
     .bdrv_del_child                     = quorum_del_child,
-- 
1.8.3.1


* [Qemu-devel] [PULL 05/14] quorum: Do cleanup in caller coroutine
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

Instead of calling quorum_aio_finalize() deeply nested in what used
to be an AIO callback, do it in the same functions that allocated the
AIOCB.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/quorum.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 6a7bd91..e044010 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -144,9 +144,8 @@ static bool quorum_vote(QuorumAIOCB *acb);
 
 static void quorum_aio_finalize(QuorumAIOCB *acb)
 {
-    acb->has_completed = true;
     g_free(acb->qcrs);
-    qemu_coroutine_enter_if_inactive(acb->co);
+    g_free(acb);
 }
 
 static bool quorum_sha256_compare(QuorumVoteValue *a, QuorumVoteValue *b)
@@ -238,7 +237,8 @@ static void quorum_rewrite_aio_cb(void *opaque, int ret)
         return;
     }
 
-    quorum_aio_finalize(acb);
+    acb->has_completed = true;
+    qemu_coroutine_enter_if_inactive(acb->co);
 }
 
 static int read_fifo_child(QuorumAIOCB *acb);
@@ -284,7 +284,7 @@ static int quorum_fifo_aio_cb(void *opaque, int ret)
     acb->vote_ret = ret;
 
     /* FIXME: rewrite failed children if acb->children_read > 1? */
-    quorum_aio_finalize(acb);
+
     return ret;
 }
 
@@ -322,7 +322,8 @@ static void quorum_aio_cb(void *opaque, int ret)
 
     /* if no rewrite is done the code will finish right away */
     if (!rewrite) {
-        quorum_aio_finalize(acb);
+        acb->has_completed = true;
+        qemu_coroutine_enter_if_inactive(acb->co);
     }
 }
 
@@ -715,7 +716,8 @@ static int quorum_co_readv(BlockDriverState *bs,
     } else {
         ret = read_fifo_child(acb);
     }
-    g_free(acb);
+    quorum_aio_finalize(acb);
+
     return ret;
 }
 
@@ -757,6 +759,7 @@ static int quorum_co_writev(BlockDriverState *bs,
     }
 
     ret = acb->vote_ret;
+    quorum_aio_finalize(acb);
 
     return ret;
 }
-- 
1.8.3.1


* [Qemu-devel] [PULL 06/14] quorum: Inline quorum_aio_cb()
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

This is a conversion to a more natural coroutine style and improves the
readability of the driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/quorum.c | 128 ++++++++++++++++++++++++++-------------------------------
 1 file changed, 59 insertions(+), 69 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index e044010..2c280bb 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -130,7 +130,6 @@ struct QuorumAIOCB {
     QuorumVotes votes;
 
     bool is_read;
-    bool has_completed;
     int vote_ret;
     int children_read;          /* how many children have been read from */
 };
@@ -140,8 +139,6 @@ typedef struct QuorumCo {
     int idx;
 } QuorumCo;
 
-static bool quorum_vote(QuorumAIOCB *acb);
-
 static void quorum_aio_finalize(QuorumAIOCB *acb)
 {
     g_free(acb->qcrs);
@@ -178,7 +175,6 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
     acb->rewrite_count = 0;
     acb->votes.compare = quorum_sha256_compare;
     QLIST_INIT(&acb->votes.vote_list);
-    acb->has_completed = false;
     acb->is_read = false;
     acb->vote_ret = 0;
 
@@ -231,13 +227,6 @@ static void quorum_rewrite_aio_cb(void *opaque, int ret)
 
     /* one less rewrite to do */
     acb->rewrite_count--;
-
-    /* wait until all rewrite callbacks have completed */
-    if (acb->rewrite_count) {
-        return;
-    }
-
-    acb->has_completed = true;
     qemu_coroutine_enter_if_inactive(acb->co);
 }
 
@@ -288,45 +277,6 @@ static int quorum_fifo_aio_cb(void *opaque, int ret)
     return ret;
 }
 
-static void quorum_aio_cb(void *opaque, int ret)
-{
-    QuorumChildRequest *sacb = opaque;
-    QuorumAIOCB *acb = sacb->parent;
-    BDRVQuorumState *s = acb->bs->opaque;
-    bool rewrite = false;
-    int i;
-
-    sacb->ret = ret;
-    if (ret == 0) {
-        acb->success_count++;
-    } else {
-        quorum_report_bad_acb(sacb, ret);
-    }
-    acb->count++;
-    assert(acb->count <= s->num_children);
-    assert(acb->success_count <= s->num_children);
-    if (acb->count < s->num_children) {
-        return;
-    }
-
-    /* Do the vote on read */
-    if (acb->is_read) {
-        rewrite = quorum_vote(acb);
-        for (i = 0; i < s->num_children; i++) {
-            qemu_vfree(acb->qcrs[i].buf);
-            qemu_iovec_destroy(&acb->qcrs[i].qiov);
-        }
-    } else {
-        quorum_has_too_much_io_failed(acb);
-    }
-
-    /* if no rewrite is done the code will finish right away */
-    if (!rewrite) {
-        acb->has_completed = true;
-        qemu_coroutine_enter_if_inactive(acb->co);
-    }
-}
-
 static void quorum_report_bad_versions(BDRVQuorumState *s,
                                        QuorumAIOCB *acb,
                                        QuorumVoteValue *value)
@@ -557,17 +507,16 @@ static int quorum_vote_error(QuorumAIOCB *acb)
     return ret;
 }
 
-static bool quorum_vote(QuorumAIOCB *acb)
+static void quorum_vote(QuorumAIOCB *acb)
 {
     bool quorum = true;
-    bool rewrite = false;
     int i, j, ret;
     QuorumVoteValue hash;
     BDRVQuorumState *s = acb->bs->opaque;
     QuorumVoteVersion *winner;
 
     if (quorum_has_too_much_io_failed(acb)) {
-        return false;
+        return;
     }
 
     /* get the index of the first successful read */
@@ -595,7 +544,7 @@ static bool quorum_vote(QuorumAIOCB *acb)
     /* Every successful read agrees */
     if (quorum) {
         quorum_copy_qiov(acb->qiov, &acb->qcrs[i].qiov);
-        return false;
+        return;
     }
 
     /* compute hashes for each successful read, also store indexes */
@@ -630,13 +579,12 @@ static bool quorum_vote(QuorumAIOCB *acb)
 
     /* corruption correction is enabled */
     if (s->rewrite_corrupted) {
-        rewrite = quorum_rewrite_bad_versions(s, acb, &winner->value);
+        quorum_rewrite_bad_versions(s, acb, &winner->value);
     }
 
 free_exit:
     /* free lists */
     quorum_free_vote_list(&acb->votes);
-    return rewrite;
 }
 
 static void read_quorum_children_entry(void *opaque)
@@ -645,13 +593,28 @@ static void read_quorum_children_entry(void *opaque)
     QuorumAIOCB *acb = co->acb;
     BDRVQuorumState *s = acb->bs->opaque;
     int i = co->idx;
-    int ret;
+    QuorumChildRequest *sacb = &acb->qcrs[i];
+
+    sacb->bs = s->children[i]->bs;
+    sacb->ret = bdrv_co_preadv(s->children[i],
+                               acb->sector_num * BDRV_SECTOR_SIZE,
+                               acb->nb_sectors * BDRV_SECTOR_SIZE,
+                               &acb->qcrs[i].qiov, 0);
+
+    if (sacb->ret == 0) {
+        acb->success_count++;
+    } else {
+        quorum_report_bad_acb(sacb, sacb->ret);
+    }
 
-    acb->qcrs[i].bs = s->children[i]->bs;
-    ret = bdrv_co_preadv(s->children[i], acb->sector_num * BDRV_SECTOR_SIZE,
-                         acb->nb_sectors * BDRV_SECTOR_SIZE,
-                         &acb->qcrs[i].qiov, 0);
-    quorum_aio_cb(&acb->qcrs[i], ret);
+    acb->count++;
+    assert(acb->count <= s->num_children);
+    assert(acb->success_count <= s->num_children);
+
+    /* Wake up the caller after the last read */
+    if (acb->count == s->num_children) {
+        qemu_coroutine_enter_if_inactive(acb->co);
+    }
 }
 
 static int read_quorum_children(QuorumAIOCB *acb)
@@ -677,7 +640,18 @@ static int read_quorum_children(QuorumAIOCB *acb)
         qemu_coroutine_enter(co);
     }
 
-    if (!acb->has_completed) {
+    while (acb->count < s->num_children) {
+        qemu_coroutine_yield();
+    }
+
+    /* Do the vote on read */
+    quorum_vote(acb);
+    for (i = 0; i < s->num_children; i++) {
+        qemu_vfree(acb->qcrs[i].buf);
+        qemu_iovec_destroy(&acb->qcrs[i].qiov);
+    }
+
+    while (acb->rewrite_count) {
         qemu_coroutine_yield();
     }
 
@@ -727,12 +701,26 @@ static void write_quorum_entry(void *opaque)
     QuorumAIOCB *acb = co->acb;
     BDRVQuorumState *s = acb->bs->opaque;
     int i = co->idx;
-    int ret;
+    QuorumChildRequest *sacb = &acb->qcrs[i];
+
+    sacb->bs = s->children[i]->bs;
+    sacb->ret = bdrv_co_pwritev(s->children[i],
+                                acb->sector_num * BDRV_SECTOR_SIZE,
+                                acb->nb_sectors * BDRV_SECTOR_SIZE,
+                                acb->qiov, 0);
+    if (sacb->ret == 0) {
+        acb->success_count++;
+    } else {
+        quorum_report_bad_acb(sacb, sacb->ret);
+    }
+    acb->count++;
+    assert(acb->count <= s->num_children);
+    assert(acb->success_count <= s->num_children);
 
-    acb->qcrs[i].bs = s->children[i]->bs;
-    ret = bdrv_co_pwritev(s->children[i], acb->sector_num * BDRV_SECTOR_SIZE,
-                          acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
-    quorum_aio_cb(&acb->qcrs[i], ret);
+    /* Wake up the caller after the last write */
+    if (acb->count == s->num_children) {
+        qemu_coroutine_enter_if_inactive(acb->co);
+    }
 }
 
 static int quorum_co_writev(BlockDriverState *bs,
@@ -754,10 +742,12 @@ static int quorum_co_writev(BlockDriverState *bs,
         qemu_coroutine_enter(co);
     }
 
-    if (!acb->has_completed) {
+    while (acb->count < s->num_children) {
         qemu_coroutine_yield();
     }
 
+    quorum_has_too_much_io_failed(acb);
+
     ret = acb->vote_ret;
     quorum_aio_finalize(acb);
 
-- 
1.8.3.1


* [Qemu-devel] [PULL 07/14] quorum: Avoid bdrv_aio_writev() for rewrites
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (5 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 06/14] quorum: Inline quorum_aio_cb() Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 08/14] quorum: Implement .bdrv_co_preadv/pwritev() Kevin Wolf
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

Replacing it with bdrv_co_pwritev() prepares us for byte granularity
requests and removes the last bdrv_aio_*() user in quorum.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/quorum.c | 46 +++++++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 15 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 2c280bb..690fd36 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -221,15 +221,6 @@ static bool quorum_has_too_much_io_failed(QuorumAIOCB *acb)
     return false;
 }
 
-static void quorum_rewrite_aio_cb(void *opaque, int ret)
-{
-    QuorumAIOCB *acb = opaque;
-
-    /* one less rewrite to do */
-    acb->rewrite_count--;
-    qemu_coroutine_enter_if_inactive(acb->co);
-}
-
 static int read_fifo_child(QuorumAIOCB *acb);
 
 static void quorum_copy_qiov(QEMUIOVector *dest, QEMUIOVector *source)
@@ -296,7 +287,27 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
     }
 }
 
-static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
+static void quorum_rewrite_entry(void *opaque)
+{
+    QuorumCo *co = opaque;
+    QuorumAIOCB *acb = co->acb;
+    BDRVQuorumState *s = acb->bs->opaque;
+
+    /* Ignore any errors, it's just a correction attempt for already
+     * corrupted data. */
+    bdrv_co_pwritev(s->children[co->idx],
+                    acb->sector_num * BDRV_SECTOR_SIZE,
+                    acb->nb_sectors * BDRV_SECTOR_SIZE,
+                    acb->qiov, 0);
+
+    /* Wake up the caller after the last rewrite */
+    acb->rewrite_count--;
+    if (!acb->rewrite_count) {
+        qemu_coroutine_enter_if_inactive(acb->co);
+    }
+}
+
+static bool quorum_rewrite_bad_versions(QuorumAIOCB *acb,
                                         QuorumVoteValue *value)
 {
     QuorumVoteVersion *version;
@@ -315,7 +326,7 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
         }
     }
 
-    /* quorum_rewrite_aio_cb will count down this to zero */
+    /* quorum_rewrite_entry will count down this to zero */
     acb->rewrite_count = count;
 
     /* now fire the correcting rewrites */
@@ -324,9 +335,14 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, QuorumAIOCB *acb,
             continue;
         }
         QLIST_FOREACH(item, &version->items, next) {
-            bdrv_aio_writev(s->children[item->index], acb->sector_num,
-                            acb->qiov, acb->nb_sectors, quorum_rewrite_aio_cb,
-                            acb);
+            Coroutine *co;
+            QuorumCo data = {
+                .acb = acb,
+                .idx = item->index,
+            };
+
+            co = qemu_coroutine_create(quorum_rewrite_entry, &data);
+            qemu_coroutine_enter(co);
         }
     }
 
@@ -579,7 +595,7 @@ static void quorum_vote(QuorumAIOCB *acb)
 
     /* corruption correction is enabled */
     if (s->rewrite_corrupted) {
-        quorum_rewrite_bad_versions(s, acb, &winner->value);
+        quorum_rewrite_bad_versions(acb, &winner->value);
     }
 
 free_exit:
-- 
1.8.3.1


* [Qemu-devel] [PULL 08/14] quorum: Implement .bdrv_co_preadv/pwritev()
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (6 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 07/14] quorum: Avoid bdrv_aio_writev() for rewrites Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 09/14] quorum: Inline quorum_fifo_aio_cb() Kevin Wolf
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

This enables byte granularity requests on quorum nodes.

Note that the QMP events emitted by the driver are an external API that
we were careless enough to define as sector based. The offset and length
of requests reported in events are therefore rounded to sector
boundaries.
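
As a rough illustration of that rounding, here is a minimal standalone
sketch. SECTOR_SIZE, SectorRange and byte_range_to_sectors() are
hypothetical names made up for this example and are not part of the
patch; the 512-byte sector size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 512-byte sector size, defined locally for this sketch. */
#define SECTOR_SIZE 512ULL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper type, not part of the patch. */
typedef struct SectorRange {
    int64_t start_sector;
    int64_t nb_sectors;
} SectorRange;

/* Round a byte range outwards to whole sectors: the start is rounded
 * down and the end is rounded up, so the reported range never shrinks. */
static SectorRange byte_range_to_sectors(uint64_t offset, uint64_t bytes)
{
    SectorRange r;
    int64_t end_sector;

    assert(offset + bytes >= offset);   /* no wrap-around */

    end_sector = DIV_ROUND_UP(offset + bytes, SECTOR_SIZE);
    r.start_sector = offset / SECTOR_SIZE;
    r.nb_sectors = end_sector - r.start_sector;
    return r;
}
```

With this sketch, a 4-byte request at offset 510 straddles a sector
boundary and would be reported as 2 sectors starting at sector 0.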

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 81 +++++++++++++++++++++++++++-------------------------------
 1 file changed, 38 insertions(+), 43 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 690fd36..4bba9fd 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -114,8 +114,8 @@ struct QuorumAIOCB {
     Coroutine *co;
 
     /* Request metadata */
-    uint64_t sector_num;
-    int nb_sectors;
+    uint64_t offset;
+    uint64_t bytes;
 
     QEMUIOVector *qiov;         /* calling IOV */
 
@@ -157,8 +157,8 @@ static bool quorum_64bits_compare(QuorumVoteValue *a, QuorumVoteValue *b)
 
 static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
                                    QEMUIOVector *qiov,
-                                   uint64_t sector_num,
-                                   int nb_sectors)
+                                   uint64_t offset,
+                                   uint64_t bytes)
 {
     BDRVQuorumState *s = bs->opaque;
     QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
@@ -166,8 +166,8 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
 
     acb->co = qemu_coroutine_self();
     acb->bs = bs;
-    acb->sector_num = sector_num;
-    acb->nb_sectors = nb_sectors;
+    acb->offset = offset;
+    acb->bytes = bytes;
     acb->qiov = qiov;
     acb->qcrs = g_new0(QuorumChildRequest, s->num_children);
     acb->count = 0;
@@ -187,23 +187,30 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
     return acb;
 }
 
-static void quorum_report_bad(QuorumOpType type, uint64_t sector_num,
-                              int nb_sectors, char *node_name, int ret)
+static void quorum_report_bad(QuorumOpType type, uint64_t offset,
+                              uint64_t bytes, char *node_name, int ret)
 {
     const char *msg = NULL;
+    int64_t start_sector = offset / BDRV_SECTOR_SIZE;
+    int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
     if (ret < 0) {
         msg = strerror(-ret);
     }
 
-    qapi_event_send_quorum_report_bad(type, !!msg, msg, node_name,
-                                      sector_num, nb_sectors, &error_abort);
+    qapi_event_send_quorum_report_bad(type, !!msg, msg, node_name, start_sector,
+                                      end_sector - start_sector, &error_abort);
 }
 
 static void quorum_report_failure(QuorumAIOCB *acb)
 {
     const char *reference = bdrv_get_device_or_node_name(acb->bs);
-    qapi_event_send_quorum_failure(reference, acb->sector_num,
-                                   acb->nb_sectors, &error_abort);
+    int64_t start_sector = acb->offset / BDRV_SECTOR_SIZE;
+    int64_t end_sector = DIV_ROUND_UP(acb->offset + acb->bytes,
+                                      BDRV_SECTOR_SIZE);
+
+    qapi_event_send_quorum_failure(reference, start_sector,
+                                   end_sector - start_sector, &error_abort);
 }
 
 static int quorum_vote_error(QuorumAIOCB *acb);
@@ -240,8 +247,7 @@ static void quorum_report_bad_acb(QuorumChildRequest *sacb, int ret)
 {
     QuorumAIOCB *acb = sacb->parent;
     QuorumOpType type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
-    quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
-                      sacb->bs->node_name, ret);
+    quorum_report_bad(type, acb->offset, acb->bytes, sacb->bs->node_name, ret);
 }
 
 static int quorum_fifo_aio_cb(void *opaque, int ret)
@@ -280,8 +286,7 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
             continue;
         }
         QLIST_FOREACH(item, &version->items, next) {
-            quorum_report_bad(QUORUM_OP_TYPE_READ, acb->sector_num,
-                              acb->nb_sectors,
+            quorum_report_bad(QUORUM_OP_TYPE_READ, acb->offset, acb->bytes,
                               s->children[item->index]->bs->node_name, 0);
         }
     }
@@ -295,9 +300,7 @@ static void quorum_rewrite_entry(void *opaque)
 
     /* Ignore any errors, it's just a correction attempt for already
      * corrupted data. */
-    bdrv_co_pwritev(s->children[co->idx],
-                    acb->sector_num * BDRV_SECTOR_SIZE,
-                    acb->nb_sectors * BDRV_SECTOR_SIZE,
+    bdrv_co_pwritev(s->children[co->idx], acb->offset, acb->bytes,
                     acb->qiov, 0);
 
     /* Wake up the caller after the last rewrite */
@@ -462,8 +465,8 @@ static void GCC_FMT_ATTR(2, 3) quorum_err(QuorumAIOCB *acb,
     va_list ap;
 
     va_start(ap, fmt);
-    fprintf(stderr, "quorum: sector_num=%" PRId64 " nb_sectors=%d ",
-            acb->sector_num, acb->nb_sectors);
+    fprintf(stderr, "quorum: offset=%" PRIu64 " bytes=%" PRIu64 " ",
+            acb->offset, acb->bytes);
     vfprintf(stderr, fmt, ap);
     fprintf(stderr, "\n");
     va_end(ap);
@@ -481,9 +484,8 @@ static bool quorum_compare(QuorumAIOCB *acb,
     if (s->is_blkverify) {
         offset = qemu_iovec_compare(a, b);
         if (offset != -1) {
-            quorum_err(acb, "contents mismatch in sector %" PRId64,
-                       acb->sector_num +
-                       (uint64_t)(offset / BDRV_SECTOR_SIZE));
+            quorum_err(acb, "contents mismatch at offset %" PRIu64,
+                       acb->offset + offset);
         }
         return true;
     }
@@ -612,9 +614,7 @@ static void read_quorum_children_entry(void *opaque)
     QuorumChildRequest *sacb = &acb->qcrs[i];
 
     sacb->bs = s->children[i]->bs;
-    sacb->ret = bdrv_co_preadv(s->children[i],
-                               acb->sector_num * BDRV_SECTOR_SIZE,
-                               acb->nb_sectors * BDRV_SECTOR_SIZE,
+    sacb->ret = bdrv_co_preadv(s->children[i], acb->offset, acb->bytes,
                                &acb->qcrs[i].qiov, 0);
 
     if (sacb->ret == 0) {
@@ -683,19 +683,17 @@ static int read_fifo_child(QuorumAIOCB *acb)
     int ret;
 
     acb->qcrs[n].bs = s->children[n]->bs;
-    ret = bdrv_co_preadv(s->children[n], acb->sector_num * BDRV_SECTOR_SIZE,
-                         acb->nb_sectors * BDRV_SECTOR_SIZE, acb->qiov, 0);
+    ret = bdrv_co_preadv(s->children[n], acb->offset, acb->bytes, acb->qiov, 0);
     ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);
 
     return ret;
 }
 
-static int quorum_co_readv(BlockDriverState *bs,
-                           int64_t sector_num, int nb_sectors,
-                           QEMUIOVector *qiov)
+static int quorum_co_preadv(BlockDriverState *bs, uint64_t offset,
+                            uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
     int ret;
 
     acb->is_read = true;
@@ -720,9 +718,7 @@ static void write_quorum_entry(void *opaque)
     QuorumChildRequest *sacb = &acb->qcrs[i];
 
     sacb->bs = s->children[i]->bs;
-    sacb->ret = bdrv_co_pwritev(s->children[i],
-                                acb->sector_num * BDRV_SECTOR_SIZE,
-                                acb->nb_sectors * BDRV_SECTOR_SIZE,
+    sacb->ret = bdrv_co_pwritev(s->children[i], acb->offset, acb->bytes,
                                 acb->qiov, 0);
     if (sacb->ret == 0) {
         acb->success_count++;
@@ -739,12 +735,11 @@ static void write_quorum_entry(void *opaque)
     }
 }
 
-static int quorum_co_writev(BlockDriverState *bs,
-                            int64_t sector_num, int nb_sectors,
-                            QEMUIOVector *qiov)
+static int quorum_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                             uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
     BDRVQuorumState *s = bs->opaque;
-    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, sector_num, nb_sectors);
+    QuorumAIOCB *acb = quorum_aio_get(bs, qiov, offset, bytes);
     int i, ret;
 
     for (i = 0; i < s->num_children; i++) {
@@ -811,7 +806,7 @@ static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
         result = bdrv_co_flush(s->children[i]->bs);
         if (result) {
             quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
-                              bdrv_nb_sectors(s->children[i]->bs),
+                              bdrv_getlength(s->children[i]->bs),
                               s->children[i]->bs->node_name, result);
             result_value.l = result;
             quorum_count_vote(&error_votes, &result_value, i);
@@ -1144,8 +1139,8 @@ static BlockDriver bdrv_quorum = {
 
     .bdrv_getlength                     = quorum_getlength,
 
-    .bdrv_co_readv                      = quorum_co_readv,
-    .bdrv_co_writev                     = quorum_co_writev,
+    .bdrv_co_preadv                     = quorum_co_preadv,
+    .bdrv_co_pwritev                    = quorum_co_pwritev,
 
     .bdrv_add_child                     = quorum_add_child,
     .bdrv_del_child                     = quorum_del_child,
-- 
1.8.3.1


* [Qemu-devel] [PULL 09/14] quorum: Inline quorum_fifo_aio_cb()
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (7 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 08/14] quorum: Implement .bdrv_co_preadv/pwritev() Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 10/14] quorum: Clean up quorum_aio_get() Kevin Wolf
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

Inlining the function removes some boilerplate code and replaces
recursion with a simple loop, so the code becomes somewhat easier to
understand.
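
The loop shape can be sketched in isolation as below. This is only an
illustration under assumptions: ChildReadFn, read_first_working_child()
and fail_below() are hypothetical names invented here, and the callback
stands in for the real per-child read.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a per-child read: 0 or a negative errno. */
typedef int (*ChildReadFn)(int child_index, void *opaque);

/* Try children in FIFO order until one read succeeds or all have been
 * tried. *children_read counts how many children were attempted. */
static int read_first_working_child(ChildReadFn read_child, void *opaque,
                                    int num_children, int *children_read)
{
    int n, ret;

    assert(num_children > 0 && *children_read < num_children);

    do {
        n = (*children_read)++;
        ret = read_child(n, opaque);
    } while (ret < 0 && *children_read < num_children);

    return ret;
}

/* Example callback: children with an index below *opaque fail with -EIO. */
static int fail_below(int child_index, void *opaque)
{
    int first_good = *(int *)opaque;

    return child_index < first_good ? -EIO : 0;
}
```

If every child fails, the loop ends after the last one and the caller
sees the error of the last attempt.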

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/quorum.c | 42 +++++++++++++-----------------------------
 1 file changed, 13 insertions(+), 29 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 4bba9fd..e244389 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -250,30 +250,6 @@ static void quorum_report_bad_acb(QuorumChildRequest *sacb, int ret)
     quorum_report_bad(type, acb->offset, acb->bytes, sacb->bs->node_name, ret);
 }
 
-static int quorum_fifo_aio_cb(void *opaque, int ret)
-{
-    QuorumChildRequest *sacb = opaque;
-    QuorumAIOCB *acb = sacb->parent;
-    BDRVQuorumState *s = acb->bs->opaque;
-
-    assert(acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO);
-
-    if (ret < 0) {
-        quorum_report_bad_acb(sacb, ret);
-
-        /* We try to read next child in FIFO order if we fail to read */
-        if (acb->children_read < s->num_children) {
-            return read_fifo_child(acb);
-        }
-    }
-
-    acb->vote_ret = ret;
-
-    /* FIXME: rewrite failed children if acb->children_read > 1? */
-
-    return ret;
-}
-
 static void quorum_report_bad_versions(BDRVQuorumState *s,
                                        QuorumAIOCB *acb,
                                        QuorumVoteValue *value)
@@ -679,12 +655,20 @@ static int read_quorum_children(QuorumAIOCB *acb)
 static int read_fifo_child(QuorumAIOCB *acb)
 {
     BDRVQuorumState *s = acb->bs->opaque;
-    int n = acb->children_read++;
-    int ret;
+    int n, ret;
+
+    /* We try to read the next child in FIFO order if we failed to read */
+    do {
+        n = acb->children_read++;
+        acb->qcrs[n].bs = s->children[n]->bs;
+        ret = bdrv_co_preadv(s->children[n], acb->offset, acb->bytes,
+                             acb->qiov, 0);
+        if (ret < 0) {
+            quorum_report_bad_acb(&acb->qcrs[n], ret);
+        }
+    } while (ret < 0 && acb->children_read < s->num_children);
 
-    acb->qcrs[n].bs = s->children[n]->bs;
-    ret = bdrv_co_preadv(s->children[n], acb->offset, acb->bytes, acb->qiov, 0);
-    ret = quorum_fifo_aio_cb(&acb->qcrs[n], ret);
+    /* FIXME: rewrite failed children if acb->children_read > 1? */
 
     return ret;
 }
-- 
1.8.3.1


* [Qemu-devel] [PULL 10/14] quorum: Clean up quorum_aio_get()
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (8 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 09/14] quorum: Inline quorum_fifo_aio_cb() Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 11/14] blkdebug: Implement bdrv_co_preadv/pwritev/flush Kevin Wolf
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

Make sure that all fields of the new QuorumAIOCB are zeroed when the
function returns even without explicitly setting them. This protects
us when new fields are added, removes some explicit zero assignments and
makes the code a little nicer to read.
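
The underlying C idiom is assignment from a compound literal with
designated initializers. A minimal sketch follows; the Request type and
request_new() are hypothetical and only loosely modelled on the idea,
and plain malloc() is used instead of the GLib allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request structure for illustration only. */
typedef struct Request {
    int id;
    int count;
    int success_count;
    void *buffer;
} Request;

static Request *request_new(int id)
{
    Request *req = malloc(sizeof(*req));   /* contents indeterminate */

    assert(req != NULL);

    /* Assigning a compound literal sets every member: the ones named
     * here get the given value, all others become zero or NULL. */
    *req = (Request) {
        .id = id,
    };
    return req;
}
```

A member added to the structure later is zeroed by the same assignment
without any change to request_new().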

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
---
 block/quorum.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index e244389..86e2072 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -164,20 +164,17 @@ static QuorumAIOCB *quorum_aio_get(BlockDriverState *bs,
     QuorumAIOCB *acb = g_new(QuorumAIOCB, 1);
     int i;
 
-    acb->co = qemu_coroutine_self();
-    acb->bs = bs;
-    acb->offset = offset;
-    acb->bytes = bytes;
-    acb->qiov = qiov;
-    acb->qcrs = g_new0(QuorumChildRequest, s->num_children);
-    acb->count = 0;
-    acb->success_count = 0;
-    acb->rewrite_count = 0;
-    acb->votes.compare = quorum_sha256_compare;
-    QLIST_INIT(&acb->votes.vote_list);
-    acb->is_read = false;
-    acb->vote_ret = 0;
+    *acb = (QuorumAIOCB) {
+        .co                 = qemu_coroutine_self(),
+        .bs                 = bs,
+        .offset             = offset,
+        .bytes              = bytes,
+        .qiov               = qiov,
+        .votes.compare      = quorum_sha256_compare,
+        .votes.vote_list    = QLIST_HEAD_INITIALIZER(acb.votes.vote_list),
+    };
 
+    acb->qcrs = g_new0(QuorumChildRequest, s->num_children);
     for (i = 0; i < s->num_children; i++) {
         acb->qcrs[i].buf = NULL;
         acb->qcrs[i].ret = 0;
-- 
1.8.3.1


* [Qemu-devel] [PULL 11/14] blkdebug: Implement bdrv_co_preadv/pwritev/flush
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (9 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 10/14] quorum: Clean up quorum_aio_get() Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 12/14] blkverify: " Kevin Wolf
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

This enables byte granularity requests for blkdebug, and at the same
time removes another user of the BDS-level AIO emulation.

Note that unless align=512 is specified, this can behave subtly
differently from the old behaviour because bdrv_co_preadv/pwritev don't
have to perform alignment adjustments any more.
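
One place where such a difference might show up is error injection
matching. The sketch below is an assumption-laden illustration, not the
driver code: SECTOR_SIZE, rule_sector_to_offset() and rule_matches() are
hypothetical names, and the 512-byte sector size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512   /* assumed sector size for this sketch */

/* Convert a "sector" option value to a byte offset, keeping -1 as the
 * "match any request" sentinel. */
static int64_t rule_sector_to_offset(int64_t sector)
{
    assert(sector >= -1);
    return sector == -1 ? -1 : sector * SECTOR_SIZE;
}

/* True if the rule applies to the byte range [offset, offset + bytes). */
static bool rule_matches(int64_t inject_offset, uint64_t offset,
                         uint64_t bytes)
{
    if (inject_offset == -1) {
        return true;
    }
    return (uint64_t)inject_offset >= offset &&
           (uint64_t)inject_offset < offset + bytes;
}
```

In this sketch a rule for sector 0 matches a request covering the whole
first sector, but not a 1-byte request at offset 100, because that
request is no longer widened to a full sector before it is checked.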

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/blkdebug.c | 86 ++++++++++++++++++++++++++------------------------------
 1 file changed, 40 insertions(+), 46 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index 4127571..acccf85 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -58,10 +58,6 @@ typedef struct BlkdebugSuspendedReq {
     QLIST_ENTRY(BlkdebugSuspendedReq) next;
 } BlkdebugSuspendedReq;
 
-static const AIOCBInfo blkdebug_aiocb_info = {
-    .aiocb_size    = sizeof(BlkdebugAIOCB),
-};
-
 enum {
     ACTION_INJECT_ERROR,
     ACTION_SET_STATE,
@@ -77,7 +73,7 @@ typedef struct BlkdebugRule {
             int error;
             int immediately;
             int once;
-            int64_t sector;
+            int64_t offset;
         } inject;
         struct {
             int new_state;
@@ -174,6 +170,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
     const char* event_name;
     BlkdebugEvent event;
     struct BlkdebugRule *rule;
+    int64_t sector;
 
     /* Find the right event for the rule */
     event_name = qemu_opt_get(opts, "event");
@@ -200,7 +197,9 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
         rule->options.inject.once  = qemu_opt_get_bool(opts, "once", 0);
         rule->options.inject.immediately =
             qemu_opt_get_bool(opts, "immediately", 0);
-        rule->options.inject.sector = qemu_opt_get_number(opts, "sector", -1);
+        sector = qemu_opt_get_number(opts, "sector", -1);
+        rule->options.inject.offset =
+            sector == -1 ? -1 : sector * BDRV_SECTOR_SIZE;
         break;
 
     case ACTION_SET_STATE:
@@ -408,17 +407,14 @@ out:
 
 static void error_callback_bh(void *opaque)
 {
-    struct BlkdebugAIOCB *acb = opaque;
-    acb->common.cb(acb->common.opaque, acb->ret);
-    qemu_aio_unref(acb);
+    Coroutine *co = opaque;
+    qemu_coroutine_enter(co);
 }
 
-static BlockAIOCB *inject_error(BlockDriverState *bs,
-    BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
+static int inject_error(BlockDriverState *bs, BlkdebugRule *rule)
 {
     BDRVBlkdebugState *s = bs->opaque;
     int error = rule->options.inject.error;
-    struct BlkdebugAIOCB *acb;
     bool immediately = rule->options.inject.immediately;
 
     if (rule->options.inject.once) {
@@ -426,81 +422,79 @@ static BlockAIOCB *inject_error(BlockDriverState *bs,
         remove_rule(rule);
     }
 
-    if (immediately) {
-        return NULL;
+    if (!immediately) {
+        aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), error_callback_bh,
+                                qemu_coroutine_self());
+        qemu_coroutine_yield();
     }
 
-    acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
-    acb->ret = -error;
-
-    aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), error_callback_bh, acb);
-
-    return &acb->common;
+    return -error;
 }
 
-static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
-    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-    BlockCompletionFunc *cb, void *opaque)
+static int coroutine_fn
+blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                   QEMUIOVector *qiov, int flags)
 {
     BDRVBlkdebugState *s = bs->opaque;
     BlkdebugRule *rule = NULL;
 
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
-        if (rule->options.inject.sector == -1 ||
-            (rule->options.inject.sector >= sector_num &&
-             rule->options.inject.sector < sector_num + nb_sectors)) {
+        uint64_t inject_offset = rule->options.inject.offset;
+
+        if (inject_offset == -1 ||
+            (inject_offset >= offset && inject_offset < offset + bytes))
+        {
             break;
         }
     }
 
     if (rule && rule->options.inject.error) {
-        return inject_error(bs, cb, opaque, rule);
+        return inject_error(bs, rule);
     }
 
-    return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors,
-                          cb, opaque);
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
 }
 
-static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
-    int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-    BlockCompletionFunc *cb, void *opaque)
+static int coroutine_fn
+blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                    QEMUIOVector *qiov, int flags)
 {
     BDRVBlkdebugState *s = bs->opaque;
     BlkdebugRule *rule = NULL;
 
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
-        if (rule->options.inject.sector == -1 ||
-            (rule->options.inject.sector >= sector_num &&
-             rule->options.inject.sector < sector_num + nb_sectors)) {
+        uint64_t inject_offset = rule->options.inject.offset;
+
+        if (inject_offset == -1 ||
+            (inject_offset >= offset && inject_offset < offset + bytes))
+        {
             break;
         }
     }
 
     if (rule && rule->options.inject.error) {
-        return inject_error(bs, cb, opaque, rule);
+        return inject_error(bs, rule);
     }
 
-    return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
-                           cb, opaque);
+    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
 }
 
-static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
-    BlockCompletionFunc *cb, void *opaque)
+static int blkdebug_co_flush(BlockDriverState *bs)
 {
     BDRVBlkdebugState *s = bs->opaque;
     BlkdebugRule *rule = NULL;
 
     QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
-        if (rule->options.inject.sector == -1) {
+        if (rule->options.inject.offset == -1) {
             break;
         }
     }
 
     if (rule && rule->options.inject.error) {
-        return inject_error(bs, cb, opaque, rule);
+        return inject_error(bs, rule);
     }
 
-    return bdrv_aio_flush(bs->file->bs, cb, opaque);
+    return bdrv_co_flush(bs->file->bs);
 }
 
 
@@ -752,9 +746,9 @@ static BlockDriver bdrv_blkdebug = {
     .bdrv_refresh_filename  = blkdebug_refresh_filename,
     .bdrv_refresh_limits    = blkdebug_refresh_limits,
 
-    .bdrv_aio_readv         = blkdebug_aio_readv,
-    .bdrv_aio_writev        = blkdebug_aio_writev,
-    .bdrv_aio_flush         = blkdebug_aio_flush,
+    .bdrv_co_preadv         = blkdebug_co_preadv,
+    .bdrv_co_pwritev        = blkdebug_co_pwritev,
+    .bdrv_co_flush_to_disk  = blkdebug_co_flush,
 
     .bdrv_debug_event           = blkdebug_debug_event,
     .bdrv_debug_breakpoint      = blkdebug_debug_breakpoint,
-- 
1.8.3.1


* [Qemu-devel] [PULL 12/14] blkverify: Implement bdrv_co_preadv/pwritev/flush
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (10 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 11/14] blkdebug: Implement bdrv_co_preadv/pwritev/flush Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 13/14] block: Rename raw_bsd to raw-format.c Kevin Wolf
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

This enables byte granularity requests for blkverify, and at the same
time removes another user of the BDS-level AIO emulation.

The reference output of a test case must be changed because the
verification failure message reports byte offsets instead of sectors
now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 block/blkverify.c          | 201 ++++++++++++++++++++++-----------------------
 tests/qemu-iotests/071.out |   8 +-
 2 files changed, 100 insertions(+), 109 deletions(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index 28f9af6..43a940c 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -19,38 +19,36 @@ typedef struct {
     BdrvChild *test_file;
 } BDRVBlkverifyState;
 
-typedef struct BlkverifyAIOCB BlkverifyAIOCB;
-struct BlkverifyAIOCB {
-    BlockAIOCB common;
+typedef struct BlkverifyRequest {
+    Coroutine *co;
+    BlockDriverState *bs;
 
     /* Request metadata */
     bool is_write;
-    int64_t sector_num;
-    int nb_sectors;
+    uint64_t offset;
+    uint64_t bytes;
+    int flags;
 
-    int ret;                    /* first completed request's result */
-    unsigned int done;          /* completion counter */
+    int (*request_fn)(BdrvChild *, int64_t, unsigned int, QEMUIOVector *,
+                      BdrvRequestFlags);
 
-    QEMUIOVector *qiov;         /* user I/O vector */
-    QEMUIOVector raw_qiov;      /* cloned I/O vector for raw file */
-    void *buf;                  /* buffer for raw file I/O */
+    int ret;                    /* test image result */
+    int raw_ret;                /* raw image result */
 
-    void (*verify)(BlkverifyAIOCB *acb);
-};
+    unsigned int done;          /* completion counter */
 
-static const AIOCBInfo blkverify_aiocb_info = {
-    .aiocb_size         = sizeof(BlkverifyAIOCB),
-};
+    QEMUIOVector *qiov;         /* user I/O vector */
+    QEMUIOVector *raw_qiov;     /* cloned I/O vector for raw file */
+} BlkverifyRequest;
 
-static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
+static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyRequest *r,
                                              const char *fmt, ...)
 {
     va_list ap;
 
     va_start(ap, fmt);
-    fprintf(stderr, "blkverify: %s sector_num=%" PRId64 " nb_sectors=%d ",
-            acb->is_write ? "write" : "read", acb->sector_num,
-            acb->nb_sectors);
+    fprintf(stderr, "blkverify: %s offset=%" PRIu64 " bytes=%" PRIu64 " ",
+            r->is_write ? "write" : "read", r->offset, r->bytes);
     vfprintf(stderr, fmt, ap);
     fprintf(stderr, "\n");
     va_end(ap);
@@ -166,113 +164,106 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
     return bdrv_getlength(s->test_file->bs);
 }
 
-static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
-                                         int64_t sector_num, QEMUIOVector *qiov,
-                                         int nb_sectors,
-                                         BlockCompletionFunc *cb,
-                                         void *opaque)
+static void coroutine_fn blkverify_do_test_req(void *opaque)
 {
-    BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
-
-    acb->is_write = is_write;
-    acb->sector_num = sector_num;
-    acb->nb_sectors = nb_sectors;
-    acb->ret = -EINPROGRESS;
-    acb->done = 0;
-    acb->qiov = qiov;
-    acb->buf = NULL;
-    acb->verify = NULL;
-    return acb;
+    BlkverifyRequest *r = opaque;
+    BDRVBlkverifyState *s = r->bs->opaque;
+
+    r->ret = r->request_fn(s->test_file, r->offset, r->bytes, r->qiov,
+                           r->flags);
+    r->done++;
+    qemu_coroutine_enter_if_inactive(r->co);
 }
 
-static void blkverify_aio_bh(void *opaque)
+static void coroutine_fn blkverify_do_raw_req(void *opaque)
 {
-    BlkverifyAIOCB *acb = opaque;
+    BlkverifyRequest *r = opaque;
 
-    if (acb->buf) {
-        qemu_iovec_destroy(&acb->raw_qiov);
-        qemu_vfree(acb->buf);
-    }
-    acb->common.cb(acb->common.opaque, acb->ret);
-    qemu_aio_unref(acb);
+    r->raw_ret = r->request_fn(r->bs->file, r->offset, r->bytes, r->raw_qiov,
+                               r->flags);
+    r->done++;
+    qemu_coroutine_enter_if_inactive(r->co);
 }
 
-static void blkverify_aio_cb(void *opaque, int ret)
+static int coroutine_fn
+blkverify_co_prwv(BlockDriverState *bs, BlkverifyRequest *r, uint64_t offset,
+                  uint64_t bytes, QEMUIOVector *qiov, QEMUIOVector *raw_qiov,
+                  int flags, bool is_write)
 {
-    BlkverifyAIOCB *acb = opaque;
-
-    switch (++acb->done) {
-    case 1:
-        acb->ret = ret;
-        break;
-
-    case 2:
-        if (acb->ret != ret) {
-            blkverify_err(acb, "return value mismatch %d != %d", acb->ret, ret);
-        }
-
-        if (acb->verify) {
-            acb->verify(acb);
-        }
+    Coroutine *co_a, *co_b;
+
+    *r = (BlkverifyRequest) {
+        .co         = qemu_coroutine_self(),
+        .bs         = bs,
+        .offset     = offset,
+        .bytes      = bytes,
+        .qiov       = qiov,
+        .raw_qiov   = raw_qiov,
+        .flags      = flags,
+        .is_write   = is_write,
+        .request_fn = is_write ? bdrv_co_pwritev : bdrv_co_preadv,
+    };
+
+    co_a = qemu_coroutine_create(blkverify_do_test_req, r);
+    co_b = qemu_coroutine_create(blkverify_do_raw_req, r);
+
+    qemu_coroutine_enter(co_a);
+    qemu_coroutine_enter(co_b);
+
+    while (r->done < 2) {
+        qemu_coroutine_yield();
+    }
 
-        aio_bh_schedule_oneshot(bdrv_get_aio_context(acb->common.bs),
-                                blkverify_aio_bh, acb);
-        break;
+    if (r->ret != r->raw_ret) {
+        blkverify_err(r, "return value mismatch %d != %d", r->ret, r->raw_ret);
     }
+
+    return r->ret;
 }
 
-static void blkverify_verify_readv(BlkverifyAIOCB *acb)
+static int coroutine_fn
+blkverify_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                    QEMUIOVector *qiov, int flags)
 {
-    ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
-    if (offset != -1) {
-        blkverify_err(acb, "contents mismatch in sector %" PRId64,
-                      acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
+    BlkverifyRequest r;
+    QEMUIOVector raw_qiov;
+    void *buf;
+    ssize_t cmp_offset;
+    int ret;
+
+    buf = qemu_blockalign(bs->file->bs, qiov->size);
+    qemu_iovec_init(&raw_qiov, qiov->niov);
+    qemu_iovec_clone(&raw_qiov, qiov, buf);
+
+    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags,
+                            false);
+
+    cmp_offset = qemu_iovec_compare(qiov, &raw_qiov);
+    if (cmp_offset != -1) {
+        blkverify_err(&r, "contents mismatch at offset %" PRIu64,
+                      offset + cmp_offset);
     }
-}
 
-static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVBlkverifyState *s = bs->opaque;
-    BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
-                                            nb_sectors, cb, opaque);
-
-    acb->verify = blkverify_verify_readv;
-    acb->buf = qemu_blockalign(bs->file->bs, qiov->size);
-    qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
-    qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
-
-    bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
-                   blkverify_aio_cb, acb);
-    bdrv_aio_readv(bs->file, sector_num, &acb->raw_qiov, nb_sectors,
-                   blkverify_aio_cb, acb);
-    return &acb->common;
+    qemu_iovec_destroy(&raw_qiov);
+    qemu_vfree(buf);
+
+    return ret;
 }
 
-static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
-        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-        BlockCompletionFunc *cb, void *opaque)
+static int coroutine_fn
+blkverify_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+                     QEMUIOVector *qiov, int flags)
 {
-    BDRVBlkverifyState *s = bs->opaque;
-    BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
-                                            nb_sectors, cb, opaque);
-
-    bdrv_aio_writev(s->test_file, sector_num, qiov, nb_sectors,
-                    blkverify_aio_cb, acb);
-    bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
-                    blkverify_aio_cb, acb);
-    return &acb->common;
+    BlkverifyRequest r;
+    return blkverify_co_prwv(bs, &r, offset, bytes, qiov, qiov, flags, true);
 }
 
-static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
-                                       BlockCompletionFunc *cb,
-                                       void *opaque)
+static int blkverify_co_flush(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;
 
     /* Only flush test file, the raw file is not important */
-    return bdrv_aio_flush(s->test_file->bs, cb, opaque);
+    return bdrv_co_flush(s->test_file->bs);
 }
 
 static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
@@ -332,9 +323,9 @@ static BlockDriver bdrv_blkverify = {
     .bdrv_getlength                   = blkverify_getlength,
     .bdrv_refresh_filename            = blkverify_refresh_filename,
 
-    .bdrv_aio_readv                   = blkverify_aio_readv,
-    .bdrv_aio_writev                  = blkverify_aio_writev,
-    .bdrv_aio_flush                   = blkverify_aio_flush,
+    .bdrv_co_preadv                   = blkverify_co_preadv,
+    .bdrv_co_pwritev                  = blkverify_co_pwritev,
+    .bdrv_co_flush                    = blkverify_co_flush,
 
     .is_filter                        = true,
     .bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
diff --git a/tests/qemu-iotests/071.out b/tests/qemu-iotests/071.out
index 8ff423f..dd879f1 100644
--- a/tests/qemu-iotests/071.out
+++ b/tests/qemu-iotests/071.out
@@ -12,7 +12,7 @@ read 512/512 bytes at offset 229376
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0
+blkverify: read offset=0 bytes=512 contents mismatch at offset 0
 
 === Testing blkverify through file blockref ===
 
@@ -26,7 +26,7 @@ read 512/512 bytes at offset 229376
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0
+blkverify: read offset=0 bytes=512 contents mismatch at offset 0
 
 === Testing blkdebug through filename ===
 
@@ -56,7 +56,7 @@ QMP_VERSION
 {"return": {}}
 {"return": {}}
 {"return": {}}
-blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0
+blkverify: read offset=0 bytes=512 contents mismatch at offset 0
 
 
 === Testing blkverify on existing raw block device ===
@@ -66,7 +66,7 @@ QMP_VERSION
 {"return": {}}
 {"return": {}}
 {"return": {}}
-blkverify: read sector_num=0 nb_sectors=1 contents mismatch in sector 0
+blkverify: read offset=0 bytes=512 contents mismatch at offset 0
 
 
 === Testing blkdebug's set-state through QMP ===
-- 
1.8.3.1

* [Qemu-devel] [PULL 13/14] block: Rename raw_bsd to raw-format.c
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (11 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 12/14] blkverify: " Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 13:44 ` [Qemu-devel] [PULL 14/14] block: Rename raw-{posix, win32} to file-*.c Kevin Wolf
  2017-01-09 15:30 ` [Qemu-devel] [PULL 00/14] Block layer patches Peter Maydell
  14 siblings, 0 replies; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

From: Eric Blake <eblake@redhat.com>

Given that we have raw-win32.c and raw-posix.c, my initial guess at
raw_bsd.c was that it was for dealing with raw files using code
specific to the BSD operating system (beyond what raw-posix could
do).  Not so - this name was chosen back in commit e1c66c6 to
distinguish that it was a BSD licensed file, in contrast to the
then-existing raw.c with an unclear and potentially unusable
license.  But since it has been more than three years since the
rewrite, it's time to pick a more useful name for this file, to
spare future contributors who don't know the backstory this kind of
confusion; none of our other files are named solely by the
license they use.

In reality, this file deals with the raw format, which is useful
with any number of protocols, while raw-{win32,posix} deal with
the file protocol (and in turn, that protocol is not limited to
use with the raw format).  So rename raw_bsd to raw-format.c.  We
could have also used the shorter name raw.c, except that it collides
with the earlier use of that filename for a different license,
and it's better to be safe than risk license pollution.

The next patch will also rename raw-win32.c and raw-posix.c to
further distinguish the difference in roles.

It doesn't hurt that this gets rid of an underscore in the filename,
thereby making tab-completion on 'ra<TAB>' easier (now I don't have
to type the shift key, which slows things down :)

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 MAINTAINERS         |   2 +-
 block/Makefile.objs |   2 +-
 block/raw-format.c  | 490 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 block/raw_bsd.c     | 490 ----------------------------------------------------
 4 files changed, 492 insertions(+), 492 deletions(-)
 create mode 100644 block/raw-format.c
 delete mode 100644 block/raw_bsd.c
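
To illustrate what "dealing with the raw format" means independently of
the protocol underneath, here is a minimal standalone sketch of the two
checks the driver applies before forwarding a write: the guest offset is
shifted by the configured offset (guarding against overflow), and a write
is rejected if it would leave the configured size.  It mirrors the logic
visible in raw_co_pwritev() in the diff, but RawWindow,
raw_translate_offset() and raw_check_write() are hypothetical names
assumed for this example and do not exist in QEMU.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for BDRVRawState. */
typedef struct RawWindow {
    uint64_t offset;   /* where the image starts in the underlying file */
    uint64_t size;     /* window size, only enforced if has_size is set */
    bool has_size;
} RawWindow;

/* Map a guest offset to an offset in the underlying file.
 * Returns 0 on success, -EINVAL if the addition would overflow. */
static int raw_translate_offset(const RawWindow *w, uint64_t offset,
                                uint64_t *out)
{
    assert(w != NULL && out != NULL);
    if (offset > UINT64_MAX - w->offset) {
        return -EINVAL;
    }
    *out = offset + w->offset;
    return 0;
}

/* Reject writes that would leak out of an explicitly sized window.
 * Returns 0 if the write fits, -ENOSPC otherwise. */
static int raw_check_write(const RawWindow *w, uint64_t offset,
                           uint64_t bytes)
{
    assert(w != NULL);
    if (w->has_size && (offset > w->size || bytes > w->size - offset)) {
        return -ENOSPC;
    }
    return 0;
}
```

Neither check cares whether the bytes end up in a POSIX file, a Win32
file, or some other protocol, which is the distinction the new file name
is meant to express.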

diff --git a/MAINTAINERS b/MAINTAINERS
index 585cd5a..044a324 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1722,7 +1722,7 @@ F: block/linux-aio.c
 F: include/block/raw-aio.h
 F: block/raw-posix.c
 F: block/raw-win32.c
-F: block/raw_bsd.c
+F: block/raw-format.c
 F: block/win32-aio.c
 
 qcow2
diff --git a/block/Makefile.objs b/block/Makefile.objs
index 67a036a..bde742f 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -1,4 +1,4 @@
-block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o dmg.o
+block-obj-y += raw-format.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o dmg.o
 block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
 block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
 block-obj-y += qed-check.o
diff --git a/block/raw-format.c b/block/raw-format.c
new file mode 100644
index 0000000..8404a82
--- /dev/null
+++ b/block/raw-format.c
@@ -0,0 +1,490 @@
+/* BlockDriver implementation for "raw" format driver
+ *
+ * Copyright (C) 2010-2016 Red Hat, Inc.
+ * Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
+ * Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * Author:
+ *   Laszlo Ersek <lersek@redhat.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+#include "qapi/error.h"
+#include "qemu/option.h"
+
+typedef struct BDRVRawState {
+    uint64_t offset;
+    uint64_t size;
+    bool has_size;
+} BDRVRawState;
+
+static QemuOptsList raw_runtime_opts = {
+    .name = "raw",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
+    .desc = {
+        {
+            .name = "offset",
+            .type = QEMU_OPT_SIZE,
+            .help = "offset in the disk where the image starts",
+        },
+        {
+            .name = "size",
+            .type = QEMU_OPT_SIZE,
+            .help = "virtual disk size",
+        },
+        { /* end of list */ }
+    },
+};
+
+static QemuOptsList raw_create_opts = {
+    .name = "raw-create-opts",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
+    .desc = {
+        {
+            .name = BLOCK_OPT_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Virtual disk size"
+        },
+        { /* end of list */ }
+    }
+};
+
+static int raw_read_options(QDict *options, BlockDriverState *bs,
+    BDRVRawState *s, Error **errp)
+{
+    Error *local_err = NULL;
+    QemuOpts *opts = NULL;
+    int64_t real_size = 0;
+    int ret;
+
+    real_size = bdrv_getlength(bs->file->bs);
+    if (real_size < 0) {
+        error_setg_errno(errp, -real_size, "Could not get image size");
+        return real_size;
+    }
+
+    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto end;
+    }
+
+    s->offset = qemu_opt_get_size(opts, "offset", 0);
+    if (s->offset > real_size) {
+        error_setg(errp, "Offset (%" PRIu64 ") cannot be greater than "
+            "size of the containing file (%" PRId64 ")",
+            s->offset, real_size);
+        ret = -EINVAL;
+        goto end;
+    }
+
+    if (qemu_opt_find(opts, "size") != NULL) {
+        s->size = qemu_opt_get_size(opts, "size", 0);
+        s->has_size = true;
+    } else {
+        s->has_size = false;
+        s->size = real_size - s->offset;
+    }
+
+    /* Check size and offset */
+    if ((real_size - s->offset) < s->size) {
+        error_setg(errp, "The sum of offset (%" PRIu64 ") and size "
+            "(%" PRIu64 ") has to be smaller or equal to the "
+            " actual size of the containing file (%" PRId64 ")",
+            s->offset, s->size, real_size);
+        ret = -EINVAL;
+        goto end;
+    }
+
+    /* Make sure size is multiple of BDRV_SECTOR_SIZE to prevent rounding
+     * up and leaking out of the specified area. */
+    if (s->has_size && !QEMU_IS_ALIGNED(s->size, BDRV_SECTOR_SIZE)) {
+        error_setg(errp, "Specified size is not multiple of %llu",
+            BDRV_SECTOR_SIZE);
+        ret = -EINVAL;
+        goto end;
+    }
+
+    ret = 0;
+
+end:
+
+    qemu_opts_del(opts);
+
+    return ret;
+}
+
+static int raw_reopen_prepare(BDRVReopenState *reopen_state,
+                              BlockReopenQueue *queue, Error **errp)
+{
+    assert(reopen_state != NULL);
+    assert(reopen_state->bs != NULL);
+
+    reopen_state->opaque = g_new0(BDRVRawState, 1);
+
+    return raw_read_options(
+        reopen_state->options,
+        reopen_state->bs,
+        reopen_state->opaque,
+        errp);
+}
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+    BDRVRawState *new_s = state->opaque;
+    BDRVRawState *s = state->bs->opaque;
+
+    memcpy(s, new_s, sizeof(BDRVRawState));
+
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
+                                      uint64_t bytes, QEMUIOVector *qiov,
+                                      int flags)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (offset > UINT64_MAX - s->offset) {
+        return -EINVAL;
+    }
+    offset += s->offset;
+
+    BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
+    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
+}
+
+static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                                       uint64_t bytes, QEMUIOVector *qiov,
+                                       int flags)
+{
+    BDRVRawState *s = bs->opaque;
+    void *buf = NULL;
+    BlockDriver *drv;
+    QEMUIOVector local_qiov;
+    int ret;
+
+    if (s->has_size && (offset > s->size || bytes > (s->size - offset))) {
+        /* There's not enough space for the data. Don't write anything and just
+         * fail to prevent leaking out of the size specified in options. */
+        return -ENOSPC;
+    }
+
+    if (offset > UINT64_MAX - s->offset) {
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) {
+        /* Handling partial writes would be a pain - so we just
+         * require that guests have 512-byte request alignment if
+         * probing occurred */
+        QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
+        QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
+        assert(offset == 0 && bytes >= BLOCK_PROBE_BUF_SIZE);
+
+        buf = qemu_try_blockalign(bs->file->bs, 512);
+        if (!buf) {
+            ret = -ENOMEM;
+            goto fail;
+        }
+
+        ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
+        if (ret != 512) {
+            ret = -EINVAL;
+            goto fail;
+        }
+
+        drv = bdrv_probe_all(buf, 512, NULL);
+        if (drv != bs->drv) {
+            ret = -EPERM;
+            goto fail;
+        }
+
+        /* Use the checked buffer, a malicious guest might be overwriting its
+         * original buffer in the background. */
+        qemu_iovec_init(&local_qiov, qiov->niov + 1);
+        qemu_iovec_add(&local_qiov, buf, 512);
+        qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
+        qiov = &local_qiov;
+    }
+
+    offset += s->offset;
+
+    BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
+    ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+
+fail:
+    if (qiov == &local_qiov) {
+        qemu_iovec_destroy(&local_qiov);
+    }
+    qemu_vfree(buf);
+    return ret;
+}
+
+static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
+                                            int64_t sector_num,
+                                            int nb_sectors, int *pnum,
+                                            BlockDriverState **file)
+{
+    BDRVRawState *s = bs->opaque;
+    *pnum = nb_sectors;
+    *file = bs->file->bs;
+    sector_num += s->offset / BDRV_SECTOR_SIZE;
+    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
+           (sector_num << BDRV_SECTOR_BITS);
+}
+
+static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
+                                             int64_t offset, int count,
+                                             BdrvRequestFlags flags)
+{
+    BDRVRawState *s = bs->opaque;
+    if (offset > UINT64_MAX - s->offset) {
+        return -EINVAL;
+    }
+    offset += s->offset;
+    return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
+}
+
+static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
+                                        int64_t offset, int count)
+{
+    BDRVRawState *s = bs->opaque;
+    if (offset > UINT64_MAX - s->offset) {
+        return -EINVAL;
+    }
+    offset += s->offset;
+    return bdrv_co_pdiscard(bs->file->bs, offset, count);
+}
+
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    int64_t len;
+    BDRVRawState *s = bs->opaque;
+
+    /* Update size. It should not change unless the file was externally
+     * modified. */
+    len = bdrv_getlength(bs->file->bs);
+    if (len < 0) {
+        return len;
+    }
+
+    if (len < s->offset) {
+        s->size = 0;
+    } else {
+        if (s->has_size) {
+            /* Try to honour the size */
+            s->size = MIN(s->size, len - s->offset);
+        } else {
+            s->size = len - s->offset;
+        }
+    }
+
+    return s->size;
+}
+
+static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    return bdrv_get_info(bs->file->bs, bdi);
+}
+
+static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    if (bs->probed) {
+        /* To make it easier to protect the first sector, any probed
+         * image is restricted to read-modify-write on sub-sector
+         * operations. */
+        bs->bl.request_alignment = BDRV_SECTOR_SIZE;
+    }
+}
+
+static int raw_truncate(BlockDriverState *bs, int64_t offset)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->has_size) {
+        return -ENOTSUP;
+    }
+
+    if (INT64_MAX - offset < s->offset) {
+        return -EINVAL;
+    }
+
+    s->size = offset;
+    offset += s->offset;
+    return bdrv_truncate(bs->file->bs, offset);
+}
+
+static int raw_media_changed(BlockDriverState *bs)
+{
+    return bdrv_media_changed(bs->file->bs);
+}
+
+static void raw_eject(BlockDriverState *bs, bool eject_flag)
+{
+    bdrv_eject(bs->file->bs, eject_flag);
+}
+
+static void raw_lock_medium(BlockDriverState *bs, bool locked)
+{
+    bdrv_lock_medium(bs->file->bs, locked);
+}
+
+static int raw_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
+{
+    BDRVRawState *s = bs->opaque;
+    if (s->offset || s->has_size) {
+        return -ENOTSUP;
+    }
+    return bdrv_co_ioctl(bs->file->bs, req, buf);
+}
+
+static int raw_has_zero_init(BlockDriverState *bs)
+{
+    return bdrv_has_zero_init(bs->file->bs);
+}
+
+static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
+{
+    return bdrv_create_file(filename, opts, errp);
+}
+
+static int raw_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+
+    bs->sg = bs->file->bs->sg;
+    bs->supported_write_flags = BDRV_REQ_FUA &
+        bs->file->bs->supported_write_flags;
+    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
+        bs->file->bs->supported_zero_flags;
+
+    if (bs->probed && !bdrv_is_read_only(bs)) {
+        fprintf(stderr,
+                "WARNING: Image format was not specified for '%s' and probing "
+                "guessed raw.\n"
+                "         Automatically detecting the format is dangerous for "
+                "raw images, write operations on block 0 will be restricted.\n"
+                "         Specify the 'raw' format explicitly to remove the "
+                "restrictions.\n",
+                bs->file->bs->filename);
+    }
+
+    ret = raw_read_options(options, bs, s, errp);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (bs->sg && (s->offset || s->has_size)) {
+        error_setg(errp, "Cannot use offset/size with SCSI generic devices");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static void raw_close(BlockDriverState *bs)
+{
+}
+
+static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
+{
+    /* smallest possible positive score so that raw is used if and only if no
+     * other block driver works
+     */
+    return 1;
+}
+
+static int raw_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+
+    ret = bdrv_probe_blocksizes(bs->file->bs, bsz);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (!QEMU_IS_ALIGNED(s->offset, MAX(bsz->log, bsz->phys))) {
+        return -ENOTSUP;
+    }
+
+    return 0;
+}
+
+static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
+{
+    BDRVRawState *s = bs->opaque;
+    if (s->offset || s->has_size) {
+        return -ENOTSUP;
+    }
+    return bdrv_probe_geometry(bs->file->bs, geo);
+}
+
+BlockDriver bdrv_raw = {
+    .format_name          = "raw",
+    .instance_size        = sizeof(BDRVRawState),
+    .bdrv_probe           = &raw_probe,
+    .bdrv_reopen_prepare  = &raw_reopen_prepare,
+    .bdrv_reopen_commit   = &raw_reopen_commit,
+    .bdrv_reopen_abort    = &raw_reopen_abort,
+    .bdrv_open            = &raw_open,
+    .bdrv_close           = &raw_close,
+    .bdrv_create          = &raw_create,
+    .bdrv_co_preadv       = &raw_co_preadv,
+    .bdrv_co_pwritev      = &raw_co_pwritev,
+    .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
+    .bdrv_co_pdiscard     = &raw_co_pdiscard,
+    .bdrv_co_get_block_status = &raw_co_get_block_status,
+    .bdrv_truncate        = &raw_truncate,
+    .bdrv_getlength       = &raw_getlength,
+    .has_variable_length  = true,
+    .bdrv_get_info        = &raw_get_info,
+    .bdrv_refresh_limits  = &raw_refresh_limits,
+    .bdrv_probe_blocksizes = &raw_probe_blocksizes,
+    .bdrv_probe_geometry  = &raw_probe_geometry,
+    .bdrv_media_changed   = &raw_media_changed,
+    .bdrv_eject           = &raw_eject,
+    .bdrv_lock_medium     = &raw_lock_medium,
+    .bdrv_co_ioctl        = &raw_co_ioctl,
+    .create_opts          = &raw_create_opts,
+    .bdrv_has_zero_init   = &raw_has_zero_init
+};
+
+static void bdrv_raw_init(void)
+{
+    bdrv_register(&bdrv_raw);
+}
+
+block_init(bdrv_raw_init);
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
deleted file mode 100644
index 8a5b9b0..0000000
--- a/block/raw_bsd.c
+++ /dev/null
@@ -1,490 +0,0 @@
-/* BlockDriver implementation for "raw"
- *
- * Copyright (C) 2010-2016 Red Hat, Inc.
- * Copyright (C) 2010, Blue Swirl <blauwirbel@gmail.com>
- * Copyright (C) 2009, Anthony Liguori <aliguori@us.ibm.com>
- *
- * Author:
- *   Laszlo Ersek <lersek@redhat.com>
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to
- * deal in the Software without restriction, including without limitation the
- * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
- * sell copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-#include "qemu/osdep.h"
-#include "block/block_int.h"
-#include "qapi/error.h"
-#include "qemu/option.h"
-
-typedef struct BDRVRawState {
-    uint64_t offset;
-    uint64_t size;
-    bool has_size;
-} BDRVRawState;
-
-static QemuOptsList raw_runtime_opts = {
-    .name = "raw",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
-    .desc = {
-        {
-            .name = "offset",
-            .type = QEMU_OPT_SIZE,
-            .help = "offset in the disk where the image starts",
-        },
-        {
-            .name = "size",
-            .type = QEMU_OPT_SIZE,
-            .help = "virtual disk size",
-        },
-        { /* end of list */ }
-    },
-};
-
-static QemuOptsList raw_create_opts = {
-    .name = "raw-create-opts",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
-    .desc = {
-        {
-            .name = BLOCK_OPT_SIZE,
-            .type = QEMU_OPT_SIZE,
-            .help = "Virtual disk size"
-        },
-        { /* end of list */ }
-    }
-};
-
-static int raw_read_options(QDict *options, BlockDriverState *bs,
-    BDRVRawState *s, Error **errp)
-{
-    Error *local_err = NULL;
-    QemuOpts *opts = NULL;
-    int64_t real_size = 0;
-    int ret;
-
-    real_size = bdrv_getlength(bs->file->bs);
-    if (real_size < 0) {
-        error_setg_errno(errp, -real_size, "Could not get image size");
-        return real_size;
-    }
-
-    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto end;
-    }
-
-    s->offset = qemu_opt_get_size(opts, "offset", 0);
-    if (s->offset > real_size) {
-        error_setg(errp, "Offset (%" PRIu64 ") cannot be greater than "
-            "size of the containing file (%" PRId64 ")",
-            s->offset, real_size);
-        ret = -EINVAL;
-        goto end;
-    }
-
-    if (qemu_opt_find(opts, "size") != NULL) {
-        s->size = qemu_opt_get_size(opts, "size", 0);
-        s->has_size = true;
-    } else {
-        s->has_size = false;
-        s->size = real_size - s->offset;
-    }
-
-    /* Check size and offset */
-    if ((real_size - s->offset) < s->size) {
-        error_setg(errp, "The sum of offset (%" PRIu64 ") and size "
-            "(%" PRIu64 ") has to be smaller or equal to the "
-            " actual size of the containing file (%" PRId64 ")",
-            s->offset, s->size, real_size);
-        ret = -EINVAL;
-        goto end;
-    }
-
-    /* Make sure size is multiple of BDRV_SECTOR_SIZE to prevent rounding
-     * up and leaking out of the specified area. */
-    if (s->has_size && !QEMU_IS_ALIGNED(s->size, BDRV_SECTOR_SIZE)) {
-        error_setg(errp, "Specified size is not multiple of %llu",
-            BDRV_SECTOR_SIZE);
-        ret = -EINVAL;
-        goto end;
-    }
-
-    ret = 0;
-
-end:
-
-    qemu_opts_del(opts);
-
-    return ret;
-}
-
-static int raw_reopen_prepare(BDRVReopenState *reopen_state,
-                              BlockReopenQueue *queue, Error **errp)
-{
-    assert(reopen_state != NULL);
-    assert(reopen_state->bs != NULL);
-
-    reopen_state->opaque = g_new0(BDRVRawState, 1);
-
-    return raw_read_options(
-        reopen_state->options,
-        reopen_state->bs,
-        reopen_state->opaque,
-        errp);
-}
-
-static void raw_reopen_commit(BDRVReopenState *state)
-{
-    BDRVRawState *new_s = state->opaque;
-    BDRVRawState *s = state->bs->opaque;
-
-    memcpy(s, new_s, sizeof(BDRVRawState));
-
-    g_free(state->opaque);
-    state->opaque = NULL;
-}
-
-static void raw_reopen_abort(BDRVReopenState *state)
-{
-    g_free(state->opaque);
-    state->opaque = NULL;
-}
-
-static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
-                                      uint64_t bytes, QEMUIOVector *qiov,
-                                      int flags)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (offset > UINT64_MAX - s->offset) {
-        return -EINVAL;
-    }
-    offset += s->offset;
-
-    BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
-}
-
-static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
-                                       uint64_t bytes, QEMUIOVector *qiov,
-                                       int flags)
-{
-    BDRVRawState *s = bs->opaque;
-    void *buf = NULL;
-    BlockDriver *drv;
-    QEMUIOVector local_qiov;
-    int ret;
-
-    if (s->has_size && (offset > s->size || bytes > (s->size - offset))) {
-        /* There's not enough space for the data. Don't write anything and just
-         * fail to prevent leaking out of the size specified in options. */
-        return -ENOSPC;
-    }
-
-    if (offset > UINT64_MAX - s->offset) {
-        ret = -EINVAL;
-        goto fail;
-    }
-
-    if (bs->probed && offset < BLOCK_PROBE_BUF_SIZE && bytes) {
-        /* Handling partial writes would be a pain - so we just
-         * require that guests have 512-byte request alignment if
-         * probing occurred */
-        QEMU_BUILD_BUG_ON(BLOCK_PROBE_BUF_SIZE != 512);
-        QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != 512);
-        assert(offset == 0 && bytes >= BLOCK_PROBE_BUF_SIZE);
-
-        buf = qemu_try_blockalign(bs->file->bs, 512);
-        if (!buf) {
-            ret = -ENOMEM;
-            goto fail;
-        }
-
-        ret = qemu_iovec_to_buf(qiov, 0, buf, 512);
-        if (ret != 512) {
-            ret = -EINVAL;
-            goto fail;
-        }
-
-        drv = bdrv_probe_all(buf, 512, NULL);
-        if (drv != bs->drv) {
-            ret = -EPERM;
-            goto fail;
-        }
-
-        /* Use the checked buffer, a malicious guest might be overwriting its
-         * original buffer in the background. */
-        qemu_iovec_init(&local_qiov, qiov->niov + 1);
-        qemu_iovec_add(&local_qiov, buf, 512);
-        qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
-        qiov = &local_qiov;
-    }
-
-    offset += s->offset;
-
-    BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
-    ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
-
-fail:
-    if (qiov == &local_qiov) {
-        qemu_iovec_destroy(&local_qiov);
-    }
-    qemu_vfree(buf);
-    return ret;
-}
-
-static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
-                                            int64_t sector_num,
-                                            int nb_sectors, int *pnum,
-                                            BlockDriverState **file)
-{
-    BDRVRawState *s = bs->opaque;
-    *pnum = nb_sectors;
-    *file = bs->file->bs;
-    sector_num += s->offset / BDRV_SECTOR_SIZE;
-    return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
-           (sector_num << BDRV_SECTOR_BITS);
-}
-
-static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
-                                             int64_t offset, int count,
-                                             BdrvRequestFlags flags)
-{
-    BDRVRawState *s = bs->opaque;
-    if (offset > UINT64_MAX - s->offset) {
-        return -EINVAL;
-    }
-    offset += s->offset;
-    return bdrv_co_pwrite_zeroes(bs->file, offset, count, flags);
-}
-
-static int coroutine_fn raw_co_pdiscard(BlockDriverState *bs,
-                                        int64_t offset, int count)
-{
-    BDRVRawState *s = bs->opaque;
-    if (offset > UINT64_MAX - s->offset) {
-        return -EINVAL;
-    }
-    offset += s->offset;
-    return bdrv_co_pdiscard(bs->file->bs, offset, count);
-}
-
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    int64_t len;
-    BDRVRawState *s = bs->opaque;
-
-    /* Update size. It should not change unless the file was externally
-     * modified. */
-    len = bdrv_getlength(bs->file->bs);
-    if (len < 0) {
-        return len;
-    }
-
-    if (len < s->offset) {
-        s->size = 0;
-    } else {
-        if (s->has_size) {
-            /* Try to honour the size */
-            s->size = MIN(s->size, len - s->offset);
-        } else {
-            s->size = len - s->offset;
-        }
-    }
-
-    return s->size;
-}
-
-static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
-{
-    return bdrv_get_info(bs->file->bs, bdi);
-}
-
-static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    if (bs->probed) {
-        /* To make it easier to protect the first sector, any probed
-         * image is restricted to read-modify-write on sub-sector
-         * operations. */
-        bs->bl.request_alignment = BDRV_SECTOR_SIZE;
-    }
-}
-
-static int raw_truncate(BlockDriverState *bs, int64_t offset)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->has_size) {
-        return -ENOTSUP;
-    }
-
-    if (INT64_MAX - offset < s->offset) {
-        return -EINVAL;
-    }
-
-    s->size = offset;
-    offset += s->offset;
-    return bdrv_truncate(bs->file->bs, offset);
-}
-
-static int raw_media_changed(BlockDriverState *bs)
-{
-    return bdrv_media_changed(bs->file->bs);
-}
-
-static void raw_eject(BlockDriverState *bs, bool eject_flag)
-{
-    bdrv_eject(bs->file->bs, eject_flag);
-}
-
-static void raw_lock_medium(BlockDriverState *bs, bool locked)
-{
-    bdrv_lock_medium(bs->file->bs, locked);
-}
-
-static int raw_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-    BDRVRawState *s = bs->opaque;
-    if (s->offset || s->has_size) {
-        return -ENOTSUP;
-    }
-    return bdrv_co_ioctl(bs->file->bs, req, buf);
-}
-
-static int raw_has_zero_init(BlockDriverState *bs)
-{
-    return bdrv_has_zero_init(bs->file->bs);
-}
-
-static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
-{
-    return bdrv_create_file(filename, opts, errp);
-}
-
-static int raw_open(BlockDriverState *bs, QDict *options, int flags,
-                    Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-
-    bs->sg = bs->file->bs->sg;
-    bs->supported_write_flags = BDRV_REQ_FUA &
-        bs->file->bs->supported_write_flags;
-    bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
-        bs->file->bs->supported_zero_flags;
-
-    if (bs->probed && !bdrv_is_read_only(bs)) {
-        fprintf(stderr,
-                "WARNING: Image format was not specified for '%s' and probing "
-                "guessed raw.\n"
-                "         Automatically detecting the format is dangerous for "
-                "raw images, write operations on block 0 will be restricted.\n"
-                "         Specify the 'raw' format explicitly to remove the "
-                "restrictions.\n",
-                bs->file->bs->filename);
-    }
-
-    ret = raw_read_options(options, bs, s, errp);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (bs->sg && (s->offset || s->has_size)) {
-        error_setg(errp, "Cannot use offset/size with SCSI generic devices");
-        return -EINVAL;
-    }
-
-    return 0;
-}
-
-static void raw_close(BlockDriverState *bs)
-{
-}
-
-static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
-{
-    /* smallest possible positive score so that raw is used if and only if no
-     * other block driver works
-     */
-    return 1;
-}
-
-static int raw_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-
-    ret = bdrv_probe_blocksizes(bs->file->bs, bsz);
-    if (ret < 0) {
-        return ret;
-    }
-
-    if (!QEMU_IS_ALIGNED(s->offset, MAX(bsz->log, bsz->phys))) {
-        return -ENOTSUP;
-    }
-
-    return 0;
-}
-
-static int raw_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
-{
-    BDRVRawState *s = bs->opaque;
-    if (s->offset || s->has_size) {
-        return -ENOTSUP;
-    }
-    return bdrv_probe_geometry(bs->file->bs, geo);
-}
-
-BlockDriver bdrv_raw = {
-    .format_name          = "raw",
-    .instance_size        = sizeof(BDRVRawState),
-    .bdrv_probe           = &raw_probe,
-    .bdrv_reopen_prepare  = &raw_reopen_prepare,
-    .bdrv_reopen_commit   = &raw_reopen_commit,
-    .bdrv_reopen_abort    = &raw_reopen_abort,
-    .bdrv_open            = &raw_open,
-    .bdrv_close           = &raw_close,
-    .bdrv_create          = &raw_create,
-    .bdrv_co_preadv       = &raw_co_preadv,
-    .bdrv_co_pwritev      = &raw_co_pwritev,
-    .bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
-    .bdrv_co_pdiscard     = &raw_co_pdiscard,
-    .bdrv_co_get_block_status = &raw_co_get_block_status,
-    .bdrv_truncate        = &raw_truncate,
-    .bdrv_getlength       = &raw_getlength,
-    .has_variable_length  = true,
-    .bdrv_get_info        = &raw_get_info,
-    .bdrv_refresh_limits  = &raw_refresh_limits,
-    .bdrv_probe_blocksizes = &raw_probe_blocksizes,
-    .bdrv_probe_geometry  = &raw_probe_geometry,
-    .bdrv_media_changed   = &raw_media_changed,
-    .bdrv_eject           = &raw_eject,
-    .bdrv_lock_medium     = &raw_lock_medium,
-    .bdrv_co_ioctl        = &raw_co_ioctl,
-    .create_opts          = &raw_create_opts,
-    .bdrv_has_zero_init   = &raw_has_zero_init
-};
-
-static void bdrv_raw_init(void)
-{
-    bdrv_register(&bdrv_raw);
-}
-
-block_init(bdrv_raw_init);
-- 
1.8.3.1
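
The offset and size handling in the raw_co_pwritev() shown above can be
sketched on its own, outside the block layer. This is only an illustration,
not QEMU code: RawWindow and raw_window_translate() are hypothetical names
standing in for the offset/size/has_size fields of BDRVRawState and for the
checks done before the request is forwarded to bs->file.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the offset/size fields of BDRVRawState. */
typedef struct RawWindow {
    uint64_t offset;   /* where the image starts inside the containing file */
    uint64_t size;     /* length of the window; only checked if has_size */
    bool has_size;     /* true if the user gave an explicit size */
} RawWindow;

/*
 * Translate a guest write request into an offset in the containing file.
 * Returns 0 and stores the result in *host_offset on success, -ENOSPC if
 * the request would leave the configured window, and -EINVAL if adding
 * the base offset would overflow uint64_t.
 */
static int raw_window_translate(const RawWindow *w, uint64_t offset,
                                uint64_t bytes, uint64_t *host_offset)
{
    /* Compare against size - offset rather than offset + bytes, so the
     * bounds check itself cannot overflow. */
    if (w->has_size && (offset > w->size || bytes > w->size - offset)) {
        return -ENOSPC;
    }

    /* Same overflow guard as in the driver, applied before the addition. */
    if (offset > UINT64_MAX - w->offset) {
        return -EINVAL;
    }

    *host_offset = offset + w->offset;
    return 0;
}
```

With a window of offset 512 and size 1024, a 1024-byte write at guest
offset 0 maps to host offset 512, while a 1-byte write at guest offset 1024
is rejected with -ENOSPC.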


* [Qemu-devel] [PULL 14/14] block: Rename raw-{posix,win32} to file-*.c
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (12 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 13/14] block: Rename raw_bsd to raw-format.c Kevin Wolf
@ 2017-01-09 13:44 ` Kevin Wolf
  2017-01-09 14:32   ` Eric Blake
  2017-01-09 15:30 ` [Qemu-devel] [PULL 00/14] Block layer patches Peter Maydell
  14 siblings, 1 reply; 17+ messages in thread
From: Kevin Wolf @ 2017-01-09 13:44 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, qemu-devel

From: Eric Blake <eblake@redhat.com>

These files deal with the file protocol, not the raw format (the
file protocol is often used with other formats, and the raw
format is not forced to use the file protocol).  Rename things
to make it a bit easier to follow.

Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 MAINTAINERS               |    4 +-
 block/Makefile.objs       |    4 +-
 block/file-posix.c        | 2616 +++++++++++++++++++++++++++++++++++++++++++++
 block/file-win32.c        |  781 ++++++++++++++
 block/gluster.c           |    4 +-
 block/raw-posix.c         | 2616 ---------------------------------------------
 block/raw-win32.c         |  781 --------------
 block/trace-events        |    4 +-
 configure                 |    2 +-
 include/block/block_int.h |    2 +-
 10 files changed, 3407 insertions(+), 3407 deletions(-)
 create mode 100644 block/file-posix.c
 create mode 100644 block/file-win32.c
 delete mode 100644 block/raw-posix.c
 delete mode 100644 block/raw-win32.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 044a324..7868758 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1720,9 +1720,9 @@ L: qemu-block@nongnu.org
 S: Supported
 F: block/linux-aio.c
 F: include/block/raw-aio.h
-F: block/raw-posix.c
-F: block/raw-win32.c
 F: block/raw-format.c
+F: block/file-posix.c
+F: block/file-win32.c
 F: block/win32-aio.c
 
 qcow2
diff --git a/block/Makefile.objs b/block/Makefile.objs
index bde742f..0b8fd06 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -6,8 +6,8 @@ block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
 block-obj-y += quorum.o
 block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
 block-obj-y += block-backend.o snapshot.o qapi.o
-block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
-block-obj-$(CONFIG_POSIX) += raw-posix.o
+block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
+block-obj-$(CONFIG_POSIX) += file-posix.o
 block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 block-obj-y += null.o mirror.o commit.o io.o
 block-obj-y += throttle-groups.o
diff --git a/block/file-posix.c b/block/file-posix.c
new file mode 100644
index 0000000..28b47d9
--- /dev/null
+++ b/block/file-posix.c
@@ -0,0 +1,2616 @@
+/*
+ * Block driver for RAW files (posix)
+ *
+ * Copyright (c) 2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/timer.h"
+#include "qemu/log.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+#include "trace.h"
+#include "block/thread-pool.h"
+#include "qemu/iov.h"
+#include "block/raw-aio.h"
+#include "qapi/util.h"
+#include "qapi/qmp/qstring.h"
+
+#if defined(__APPLE__) && (__MACH__)
+#include <paths.h>
+#include <sys/param.h>
+#include <IOKit/IOKitLib.h>
+#include <IOKit/IOBSD.h>
+#include <IOKit/storage/IOMediaBSDClient.h>
+#include <IOKit/storage/IOMedia.h>
+#include <IOKit/storage/IOCDMedia.h>
+//#include <IOKit/storage/IOCDTypes.h>
+#include <IOKit/storage/IODVDMedia.h>
+#include <CoreFoundation/CoreFoundation.h>
+#endif
+
+#ifdef __sun__
+#define _POSIX_PTHREAD_SEMANTICS 1
+#include <sys/dkio.h>
+#endif
+#ifdef __linux__
+#include <sys/ioctl.h>
+#include <sys/param.h>
+#include <linux/cdrom.h>
+#include <linux/fd.h>
+#include <linux/fs.h>
+#include <linux/hdreg.h>
+#include <scsi/sg.h>
+#ifdef __s390__
+#include <asm/dasd.h>
+#endif
+#ifndef FS_NOCOW_FL
+#define FS_NOCOW_FL                     0x00800000 /* Do not cow file */
+#endif
+#endif
+#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
+#include <linux/falloc.h>
+#endif
+#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
+#include <sys/disk.h>
+#include <sys/cdio.h>
+#endif
+
+#ifdef __OpenBSD__
+#include <sys/ioctl.h>
+#include <sys/disklabel.h>
+#include <sys/dkio.h>
+#endif
+
+#ifdef __NetBSD__
+#include <sys/ioctl.h>
+#include <sys/disklabel.h>
+#include <sys/dkio.h>
+#include <sys/disk.h>
+#endif
+
+#ifdef __DragonFly__
+#include <sys/ioctl.h>
+#include <sys/diskslice.h>
+#endif
+
+#ifdef CONFIG_XFS
+#include <xfs/xfs.h>
+#endif
+
+//#define DEBUG_BLOCK
+
+#ifdef DEBUG_BLOCK
+# define DEBUG_BLOCK_PRINT 1
+#else
+# define DEBUG_BLOCK_PRINT 0
+#endif
+#define DPRINTF(fmt, ...) \
+do { \
+    if (DEBUG_BLOCK_PRINT) { \
+        printf(fmt, ## __VA_ARGS__); \
+    } \
+} while (0)
+
+/* OS X does not have O_DSYNC */
+#ifndef O_DSYNC
+#ifdef O_SYNC
+#define O_DSYNC O_SYNC
+#elif defined(O_FSYNC)
+#define O_DSYNC O_FSYNC
+#endif
+#endif
+
+/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
+#ifndef O_DIRECT
+#define O_DIRECT O_DSYNC
+#endif
+
+#define FTYPE_FILE   0
+#define FTYPE_CD     1
+
+#define MAX_BLOCKSIZE	4096
+
+typedef struct BDRVRawState {
+    int fd;
+    int type;
+    int open_flags;
+    size_t buf_align;
+
+#ifdef CONFIG_XFS
+    bool is_xfs:1;
+#endif
+    bool has_discard:1;
+    bool has_write_zeroes:1;
+    bool discard_zeroes:1;
+    bool use_linux_aio:1;
+    bool has_fallocate;
+    bool needs_alignment;
+} BDRVRawState;
+
+typedef struct BDRVRawReopenState {
+    int fd;
+    int open_flags;
+} BDRVRawReopenState;
+
+static int fd_open(BlockDriverState *bs);
+static int64_t raw_getlength(BlockDriverState *bs);
+
+typedef struct RawPosixAIOData {
+    BlockDriverState *bs;
+    int aio_fildes;
+    union {
+        struct iovec *aio_iov;
+        void *aio_ioctl_buf;
+    };
+    int aio_niov;
+    uint64_t aio_nbytes;
+#define aio_ioctl_cmd   aio_nbytes /* for QEMU_AIO_IOCTL */
+    off_t aio_offset;
+    int aio_type;
+} RawPosixAIOData;
+
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+static int cdrom_reopen(BlockDriverState *bs);
+#endif
+
+#if defined(__NetBSD__)
+static int raw_normalize_devicepath(const char **filename)
+{
+    static char namebuf[PATH_MAX];
+    const char *dp, *fname;
+    struct stat sb;
+
+    fname = *filename;
+    dp = strrchr(fname, '/');
+    if (lstat(fname, &sb) < 0) {
+        fprintf(stderr, "%s: stat failed: %s\n",
+            fname, strerror(errno));
+        return -errno;
+    }
+
+    if (!S_ISBLK(sb.st_mode)) {
+        return 0;
+    }
+
+    if (dp == NULL) {
+        snprintf(namebuf, PATH_MAX, "r%s", fname);
+    } else {
+        snprintf(namebuf, PATH_MAX, "%.*s/r%s",
+            (int)(dp - fname), fname, dp + 1);
+    }
+    fprintf(stderr, "%s is a block device", fname);
+    *filename = namebuf;
+    fprintf(stderr, ", using %s\n", *filename);
+
+    return 0;
+}
+#else
+static int raw_normalize_devicepath(const char **filename)
+{
+    return 0;
+}
+#endif
+
+/*
+ * Get logical block size via ioctl. On success store it in @sector_size_p.
+ */
+static int probe_logical_blocksize(int fd, unsigned int *sector_size_p)
+{
+    unsigned int sector_size;
+    bool success = false;
+
+    errno = ENOTSUP;
+
+    /* Try a few ioctls to get the right size */
+#ifdef BLKSSZGET
+    if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
+        *sector_size_p = sector_size;
+        success = true;
+    }
+#endif
+#ifdef DKIOCGETBLOCKSIZE
+    if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
+        *sector_size_p = sector_size;
+        success = true;
+    }
+#endif
+#ifdef DIOCGSECTORSIZE
+    if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
+        *sector_size_p = sector_size;
+        success = true;
+    }
+#endif
+
+    return success ? 0 : -errno;
+}
+
+/**
+ * Get physical block size of @fd.
+ * On success, store it in @blk_size and return 0.
+ * On failure, return -errno.
+ */
+static int probe_physical_blocksize(int fd, unsigned int *blk_size)
+{
+#ifdef BLKPBSZGET
+    if (ioctl(fd, BLKPBSZGET, blk_size) < 0) {
+        return -errno;
+    }
+    return 0;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+/* Check if read is allowed with given memory buffer and length.
+ *
+ * This function is used to check O_DIRECT memory buffer and request alignment.
+ */
+static bool raw_is_io_aligned(int fd, void *buf, size_t len)
+{
+    ssize_t ret = pread(fd, buf, len, 0);
+
+    if (ret >= 0) {
+        return true;
+    }
+
+#ifdef __linux__
+    /* The Linux kernel returns EINVAL for misaligned O_DIRECT reads.  Ignore
+     * other errors (e.g. real I/O error), which could happen on a failed
+     * drive, since we only care about probing alignment.
+     */
+    if (errno != EINVAL) {
+        return true;
+    }
+#endif
+
+    return false;
+}
+
+static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    char *buf;
+    size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+
+    /* For SCSI generic devices the alignment is not really used.
+       With buffered I/O, we don't have any restrictions. */
+    if (bdrv_is_sg(bs) || !s->needs_alignment) {
+        bs->bl.request_alignment = 1;
+        s->buf_align = 1;
+        return;
+    }
+
+    bs->bl.request_alignment = 0;
+    s->buf_align = 0;
+    /* Let's try to use the logical blocksize for the alignment. */
+    if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {
+        bs->bl.request_alignment = 0;
+    }
+#ifdef CONFIG_XFS
+    if (s->is_xfs) {
+        struct dioattr da;
+        if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) {
+            bs->bl.request_alignment = da.d_miniosz;
+            /* The kernel returns wrong information for d_mem */
+            /* s->buf_align = da.d_mem; */
+        }
+    }
+#endif
+
+    /* If we could not get the sizes so far, we can only guess them */
+    if (!s->buf_align) {
+        size_t align;
+        buf = qemu_memalign(max_align, 2 * max_align);
+        for (align = 512; align <= max_align; align <<= 1) {
+            if (raw_is_io_aligned(fd, buf + align, max_align)) {
+                s->buf_align = align;
+                break;
+            }
+        }
+        qemu_vfree(buf);
+    }
+
+    if (!bs->bl.request_alignment) {
+        size_t align;
+        buf = qemu_memalign(s->buf_align, max_align);
+        for (align = 512; align <= max_align; align <<= 1) {
+            if (raw_is_io_aligned(fd, buf, align)) {
+                bs->bl.request_alignment = align;
+                break;
+            }
+        }
+        qemu_vfree(buf);
+    }
+
+    if (!s->buf_align || !bs->bl.request_alignment) {
+        error_setg(errp, "Could not find working O_DIRECT alignment");
+        error_append_hint(errp, "Try cache.direct=off\n");
+    }
+}
+
+static void raw_parse_flags(int bdrv_flags, int *open_flags)
+{
+    assert(open_flags != NULL);
+
+    *open_flags |= O_BINARY;
+    *open_flags &= ~O_ACCMODE;
+    if (bdrv_flags & BDRV_O_RDWR) {
+        *open_flags |= O_RDWR;
+    } else {
+        *open_flags |= O_RDONLY;
+    }
+
+    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
+     * and O_DIRECT for no caching. */
+    if ((bdrv_flags & BDRV_O_NOCACHE)) {
+        *open_flags |= O_DIRECT;
+    }
+}
+
+static void raw_parse_filename(const char *filename, QDict *options,
+                               Error **errp)
+{
+    /* The filename does not have to be prefixed by the protocol name, since
+     * "file" is the default protocol; therefore, the return value of this
+     * function call can be ignored. */
+    strstart(filename, "file:", &filename);
+
+    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
+}
+
+static QemuOptsList raw_runtime_opts = {
+    .name = "raw",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
+    .desc = {
+        {
+            .name = "filename",
+            .type = QEMU_OPT_STRING,
+            .help = "File name of the image",
+        },
+        {
+            .name = "aio",
+            .type = QEMU_OPT_STRING,
+            .help = "host AIO implementation (threads, native)",
+        },
+        { /* end of list */ }
+    },
+};
+
+static int raw_open_common(BlockDriverState *bs, QDict *options,
+                           int bdrv_flags, int open_flags, Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    QemuOpts *opts;
+    Error *local_err = NULL;
+    const char *filename = NULL;
+    BlockdevAioOptions aio, aio_default;
+    int fd, ret;
+    struct stat st;
+
+    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    filename = qemu_opt_get(opts, "filename");
+
+    ret = raw_normalize_devicepath(&filename);
+    if (ret != 0) {
+        error_setg_errno(errp, -ret, "Could not normalize device path");
+        goto fail;
+    }
+
+    aio_default = (bdrv_flags & BDRV_O_NATIVE_AIO)
+                  ? BLOCKDEV_AIO_OPTIONS_NATIVE
+                  : BLOCKDEV_AIO_OPTIONS_THREADS;
+    aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
+                          BLOCKDEV_AIO_OPTIONS__MAX, aio_default, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto fail;
+    }
+    s->use_linux_aio = (aio == BLOCKDEV_AIO_OPTIONS_NATIVE);
+
+    s->open_flags = open_flags;
+    raw_parse_flags(bdrv_flags, &s->open_flags);
+
+    s->fd = -1;
+    fd = qemu_open(filename, s->open_flags, 0644);
+    if (fd < 0) {
+        ret = -errno;
+        error_setg_errno(errp, errno, "Could not open '%s'", filename);
+        if (ret == -EROFS) {
+            ret = -EACCES;
+        }
+        goto fail;
+    }
+    s->fd = fd;
+
+#ifdef CONFIG_LINUX_AIO
+     /* Currently Linux does AIO only for files opened with O_DIRECT */
+    if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
+        error_setg(errp, "aio=native was specified, but it requires "
+                         "cache.direct=on, which was not specified.");
+        ret = -EINVAL;
+        goto fail;
+    }
+#else
+    if (s->use_linux_aio) {
+        error_setg(errp, "aio=native was specified, but is not supported "
+                         "in this build.");
+        ret = -EINVAL;
+        goto fail;
+    }
+#endif /* !defined(CONFIG_LINUX_AIO) */
+
+    s->has_discard = true;
+    s->has_write_zeroes = true;
+    bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
+    if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
+        s->needs_alignment = true;
+    }
+
+    if (fstat(s->fd, &st) < 0) {
+        ret = -errno;
+        error_setg_errno(errp, errno, "Could not stat file");
+        goto fail;
+    }
+    if (S_ISREG(st.st_mode)) {
+        s->discard_zeroes = true;
+        s->has_fallocate = true;
+    }
+    if (S_ISBLK(st.st_mode)) {
+#ifdef BLKDISCARDZEROES
+        unsigned int arg;
+        if (ioctl(s->fd, BLKDISCARDZEROES, &arg) == 0 && arg) {
+            s->discard_zeroes = true;
+        }
+#endif
+#ifdef __linux__
+        /* On Linux 3.10, BLKDISCARD leaves stale data in the page cache.  Do
+         * not rely on the contents of discarded blocks unless using O_DIRECT.
+         * Same for BLKZEROOUT.
+         */
+        if (!(bs->open_flags & BDRV_O_NOCACHE)) {
+            s->discard_zeroes = false;
+            s->has_write_zeroes = false;
+        }
+#endif
+    }
+#ifdef __FreeBSD__
+    if (S_ISCHR(st.st_mode)) {
+        /*
+         * The file is a char device (disk), which on FreeBSD isn't behind
+         * a pager, so force all requests to be aligned. This is needed
+         * so QEMU makes sure all IO operations on the device are aligned
+         * to sector size, or else FreeBSD will reject them with EINVAL.
+         */
+        s->needs_alignment = true;
+    }
+#endif
+
+#ifdef CONFIG_XFS
+    if (platform_test_xfs_fd(s->fd)) {
+        s->is_xfs = true;
+    }
+#endif
+
+    ret = 0;
+fail:
+    if (filename && (bdrv_flags & BDRV_O_TEMPORARY)) {
+        unlink(filename);
+    }
+    qemu_opts_del(opts);
+    return ret;
+}
+
+static int raw_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+
+    s->type = FTYPE_FILE;
+    return raw_open_common(bs, options, flags, 0, errp);
+}
+
+static int raw_reopen_prepare(BDRVReopenState *state,
+                              BlockReopenQueue *queue, Error **errp)
+{
+    BDRVRawState *s;
+    BDRVRawReopenState *rs;
+    int ret = 0;
+    Error *local_err = NULL;
+
+    assert(state != NULL);
+    assert(state->bs != NULL);
+
+    s = state->bs->opaque;
+
+    state->opaque = g_new0(BDRVRawReopenState, 1);
+    rs = state->opaque;
+
+    if (s->type == FTYPE_CD) {
+        rs->open_flags |= O_NONBLOCK;
+    }
+
+    raw_parse_flags(state->flags, &rs->open_flags);
+
+    rs->fd = -1;
+
+    int fcntl_flags = O_APPEND | O_NONBLOCK;
+#ifdef O_NOATIME
+    fcntl_flags |= O_NOATIME;
+#endif
+
+#ifdef O_ASYNC
+    /* Not all operating systems have O_ASYNC, and those that don't
+     * will not let us track the state into rs->open_flags (typically
+     * you achieve the same effect with an ioctl, for example I_SETSIG
+     * on Solaris). But we do not use O_ASYNC, so that's fine.
+     */
+    assert((s->open_flags & O_ASYNC) == 0);
+#endif
+
+    if ((rs->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
+        /* dup the original fd */
+        rs->fd = qemu_dup(s->fd);
+        if (rs->fd >= 0) {
+            ret = fcntl_setfl(rs->fd, rs->open_flags);
+            if (ret) {
+                qemu_close(rs->fd);
+                rs->fd = -1;
+            }
+        }
+    }
+
+    /* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
+    if (rs->fd == -1) {
+        const char *normalized_filename = state->bs->filename;
+        ret = raw_normalize_devicepath(&normalized_filename);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Could not normalize device path");
+        } else {
+            assert(!(rs->open_flags & O_CREAT));
+            rs->fd = qemu_open(normalized_filename, rs->open_flags);
+            if (rs->fd == -1) {
+                error_setg_errno(errp, errno, "Could not reopen file");
+                ret = -1;
+            }
+        }
+    }
+
+    /* Fail already reopen_prepare() if we can't get a working O_DIRECT
+     * alignment with the new fd. */
+    if (rs->fd != -1) {
+        raw_probe_alignment(state->bs, rs->fd, &local_err);
+        if (local_err) {
+            qemu_close(rs->fd);
+            rs->fd = -1;
+            error_propagate(errp, local_err);
+            ret = -EINVAL;
+        }
+    }
+
+    return ret;
+}
+
+static void raw_reopen_commit(BDRVReopenState *state)
+{
+    BDRVRawReopenState *rs = state->opaque;
+    BDRVRawState *s = state->bs->opaque;
+
+    s->open_flags = rs->open_flags;
+
+    qemu_close(s->fd);
+    s->fd = rs->fd;
+
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+
+static void raw_reopen_abort(BDRVReopenState *state)
+{
+    BDRVRawReopenState *rs = state->opaque;
+
+     /* nothing to do if NULL, we didn't get far enough */
+    if (rs == NULL) {
+        return;
+    }
+
+    if (rs->fd >= 0) {
+        qemu_close(rs->fd);
+        rs->fd = -1;
+    }
+    g_free(state->opaque);
+    state->opaque = NULL;
+}
+
+static int hdev_get_max_transfer_length(int fd)
+{
+#ifdef BLKSECTGET
+    int max_sectors = 0;
+    if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
+        return max_sectors;
+    } else {
+        return -errno;
+    }
+#else
+    return -ENOSYS;
+#endif
+}
+
+static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    struct stat st;
+
+    if (!fstat(s->fd, &st)) {
+        if (S_ISBLK(st.st_mode)) {
+            int ret = hdev_get_max_transfer_length(s->fd);
+            if (ret > 0 && ret <= BDRV_REQUEST_MAX_SECTORS) {
+                bs->bl.max_transfer = pow2floor(ret << BDRV_SECTOR_BITS);
+            }
+        }
+    }
+
+    raw_probe_alignment(bs, s->fd, errp);
+    bs->bl.min_mem_alignment = s->buf_align;
+    bs->bl.opt_mem_alignment = MAX(s->buf_align, getpagesize());
+}
+
+static int check_for_dasd(int fd)
+{
+#ifdef BIODASDINFO2
+    struct dasd_information2_t info = {0};
+
+    return ioctl(fd, BIODASDINFO2, &info);
+#else
+    return -1;
+#endif
+}
+
+/**
+ * Try to get @bs's logical and physical block size.
+ * On success, store them in @bsz and return zero.
+ * On failure, return negative errno.
+ */
+static int hdev_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+
+    /* If DASD, get blocksizes */
+    if (check_for_dasd(s->fd) < 0) {
+        return -ENOTSUP;
+    }
+    ret = probe_logical_blocksize(s->fd, &bsz->log);
+    if (ret < 0) {
+        return ret;
+    }
+    return probe_physical_blocksize(s->fd, &bsz->phys);
+}
+
+/**
+ * Try to get @bs's geometry: cyls, heads, sectors.
+ * On success, store them in @geo and return 0.
+ * On failure return -errno.
+ * (Allows block driver to assign default geometry values that guest sees)
+ */
+#ifdef __linux__
+static int hdev_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
+{
+    BDRVRawState *s = bs->opaque;
+    struct hd_geometry ioctl_geo = {0};
+
+    /* If DASD, get its geometry */
+    if (check_for_dasd(s->fd) < 0) {
+        return -ENOTSUP;
+    }
+    if (ioctl(s->fd, HDIO_GETGEO, &ioctl_geo) < 0) {
+        return -errno;
+    }
+    /* HDIO_GETGEO may return success even though geo contains zeros
+       (e.g. certain multipath setups) */
+    if (!ioctl_geo.heads || !ioctl_geo.sectors || !ioctl_geo.cylinders) {
+        return -ENOTSUP;
+    }
+    /* Do not return a geometry for partition */
+    if (ioctl_geo.start != 0) {
+        return -ENOTSUP;
+    }
+    geo->heads = ioctl_geo.heads;
+    geo->sectors = ioctl_geo.sectors;
+    geo->cylinders = ioctl_geo.cylinders;
+
+    return 0;
+}
+#else /* __linux__ */
+static int hdev_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
+{
+    return -ENOTSUP;
+}
+#endif
+
+static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
+{
+    int ret;
+
+    ret = ioctl(aiocb->aio_fildes, aiocb->aio_ioctl_cmd, aiocb->aio_ioctl_buf);
+    if (ret == -1) {
+        return -errno;
+    }
+
+    return 0;
+}
+
+static ssize_t handle_aiocb_flush(RawPosixAIOData *aiocb)
+{
+    int ret;
+
+    ret = qemu_fdatasync(aiocb->aio_fildes);
+    if (ret == -1) {
+        return -errno;
+    }
+    return 0;
+}
+
+#ifdef CONFIG_PREADV
+
+static bool preadv_present = true;
+
+static ssize_t
+qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
+{
+    return preadv(fd, iov, nr_iov, offset);
+}
+
+static ssize_t
+qemu_pwritev(int fd, const struct iovec *iov, int nr_iov, off_t offset)
+{
+    return pwritev(fd, iov, nr_iov, offset);
+}
+
+#else
+
+static bool preadv_present = false;
+
+static ssize_t
+qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
+{
+    return -ENOSYS;
+}
+
+static ssize_t
+qemu_pwritev(int fd, const struct iovec *iov, int nr_iov, off_t offset)
+{
+    return -ENOSYS;
+}
+
+#endif
+
+static ssize_t handle_aiocb_rw_vector(RawPosixAIOData *aiocb)
+{
+    ssize_t len;
+
+    do {
+        if (aiocb->aio_type & QEMU_AIO_WRITE)
+            len = qemu_pwritev(aiocb->aio_fildes,
+                               aiocb->aio_iov,
+                               aiocb->aio_niov,
+                               aiocb->aio_offset);
+        else
+            len = qemu_preadv(aiocb->aio_fildes,
+                              aiocb->aio_iov,
+                              aiocb->aio_niov,
+                              aiocb->aio_offset);
+    } while (len == -1 && errno == EINTR);
+
+    if (len == -1) {
+        return -errno;
+    }
+    return len;
+}
+
+/*
+ * Reads/writes the data to/from a given linear buffer.
+ *
+ * Returns the number of bytes handled or -errno in case of an error. Short
+ * reads are only returned if the end of the file is reached.
+ */
+static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
+{
+    ssize_t offset = 0;
+    ssize_t len;
+
+    while (offset < aiocb->aio_nbytes) {
+        if (aiocb->aio_type & QEMU_AIO_WRITE) {
+            len = pwrite(aiocb->aio_fildes,
+                         (const char *)buf + offset,
+                         aiocb->aio_nbytes - offset,
+                         aiocb->aio_offset + offset);
+        } else {
+            len = pread(aiocb->aio_fildes,
+                        buf + offset,
+                        aiocb->aio_nbytes - offset,
+                        aiocb->aio_offset + offset);
+        }
+        if (len == -1 && errno == EINTR) {
+            continue;
+        } else if (len == -1 && errno == EINVAL &&
+                   (aiocb->bs->open_flags & BDRV_O_NOCACHE) &&
+                   !(aiocb->aio_type & QEMU_AIO_WRITE) &&
+                   offset > 0) {
+            /* O_DIRECT pread() may fail with EINVAL when offset is unaligned
+             * after a short read.  Assume that O_DIRECT short reads only occur
+             * at EOF.  Therefore this is a short read, not an I/O error.
+             */
+            break;
+        } else if (len == -1) {
+            offset = -errno;
+            break;
+        } else if (len == 0) {
+            break;
+        }
+        offset += len;
+    }
+
+    return offset;
+}
+
+static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
+{
+    ssize_t nbytes;
+    char *buf;
+
+    if (!(aiocb->aio_type & QEMU_AIO_MISALIGNED)) {
+        /*
+         * If there is just a single buffer, and it is properly aligned
+         * we can just use plain pread/pwrite without any problems.
+         */
+        if (aiocb->aio_niov == 1) {
+            return handle_aiocb_rw_linear(aiocb, aiocb->aio_iov->iov_base);
+        }
+        /*
+         * We have more than one iovec, and all are properly aligned.
+         *
+         * Try preadv/pwritev first and fall back to linearizing the
+         * buffer if it's not supported.
+         */
+        if (preadv_present) {
+            nbytes = handle_aiocb_rw_vector(aiocb);
+            if (nbytes == aiocb->aio_nbytes ||
+                (nbytes < 0 && nbytes != -ENOSYS)) {
+                return nbytes;
+            }
+            preadv_present = false;
+        }
+
+        /*
+         * XXX(hch): short read/write.  no easy way to handle the remainder
+         * using these interfaces.  For now retry using plain
+         * pread/pwrite?
+         */
+    }
+
+    /*
+     * Ok, we have to do it the hard way, copy all segments into
+     * a single aligned buffer.
+     */
+    buf = qemu_try_blockalign(aiocb->bs, aiocb->aio_nbytes);
+    if (buf == NULL) {
+        return -ENOMEM;
+    }
+
+    if (aiocb->aio_type & QEMU_AIO_WRITE) {
+        char *p = buf;
+        int i;
+
+        for (i = 0; i < aiocb->aio_niov; ++i) {
+            memcpy(p, aiocb->aio_iov[i].iov_base, aiocb->aio_iov[i].iov_len);
+            p += aiocb->aio_iov[i].iov_len;
+        }
+        assert(p - buf == aiocb->aio_nbytes);
+    }
+
+    nbytes = handle_aiocb_rw_linear(aiocb, buf);
+    if (!(aiocb->aio_type & QEMU_AIO_WRITE)) {
+        char *p = buf;
+        size_t count = aiocb->aio_nbytes, copy;
+        int i;
+
+        for (i = 0; i < aiocb->aio_niov && count; ++i) {
+            copy = count;
+            if (copy > aiocb->aio_iov[i].iov_len) {
+                copy = aiocb->aio_iov[i].iov_len;
+            }
+            memcpy(aiocb->aio_iov[i].iov_base, p, copy);
+            assert(count >= copy);
+            p     += copy;
+            count -= copy;
+        }
+        assert(count == 0);
+    }
+    qemu_vfree(buf);
+
+    return nbytes;
+}
+
+#ifdef CONFIG_XFS
+static int xfs_write_zeroes(BDRVRawState *s, int64_t offset, uint64_t bytes)
+{
+    struct xfs_flock64 fl;
+    int err;
+
+    memset(&fl, 0, sizeof(fl));
+    fl.l_whence = SEEK_SET;
+    fl.l_start = offset;
+    fl.l_len = bytes;
+
+    if (xfsctl(NULL, s->fd, XFS_IOC_ZERO_RANGE, &fl) < 0) {
+        err = errno;
+        DPRINTF("cannot write zero range (%s)\n", strerror(errno));
+        return -err;
+    }
+
+    return 0;
+}
+
+static int xfs_discard(BDRVRawState *s, int64_t offset, uint64_t bytes)
+{
+    struct xfs_flock64 fl;
+    int err;
+
+    memset(&fl, 0, sizeof(fl));
+    fl.l_whence = SEEK_SET;
+    fl.l_start = offset;
+    fl.l_len = bytes;
+
+    if (xfsctl(NULL, s->fd, XFS_IOC_UNRESVSP64, &fl) < 0) {
+        err = errno;
+        DPRINTF("cannot punch hole (%s)\n", strerror(errno));
+        return -err;
+    }
+
+    return 0;
+}
+#endif
+
+static int translate_err(int err)
+{
+    if (err == -ENODEV || err == -ENOSYS || err == -EOPNOTSUPP ||
+        err == -ENOTTY) {
+        err = -ENOTSUP;
+    }
+    return err;
+}
+
+#ifdef CONFIG_FALLOCATE
+static int do_fallocate(int fd, int mode, off_t offset, off_t len)
+{
+    do {
+        if (fallocate(fd, mode, offset, len) == 0) {
+            return 0;
+        }
+    } while (errno == EINTR);
+    return translate_err(-errno);
+}
+#endif
+
+static ssize_t handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
+{
+    int ret = -ENOTSUP;
+    BDRVRawState *s = aiocb->bs->opaque;
+
+    if (!s->has_write_zeroes) {
+        return -ENOTSUP;
+    }
+
+#ifdef BLKZEROOUT
+    do {
+        uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
+        if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
+            return 0;
+        }
+    } while (errno == EINTR);
+
+    ret = translate_err(-errno);
+#endif
+
+    if (ret == -ENOTSUP) {
+        s->has_write_zeroes = false;
+    }
+    return ret;
+}
+
+static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
+{
+#if defined(CONFIG_FALLOCATE) || defined(CONFIG_XFS)
+    BDRVRawState *s = aiocb->bs->opaque;
+#endif
+
+    if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
+        return handle_aiocb_write_zeroes_block(aiocb);
+    }
+
+#ifdef CONFIG_XFS
+    if (s->is_xfs) {
+        return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
+    }
+#endif
+
+#ifdef CONFIG_FALLOCATE_ZERO_RANGE
+    if (s->has_write_zeroes) {
+        int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
+                               aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret == 0 || ret != -ENOTSUP) {
+            return ret;
+        }
+        s->has_write_zeroes = false;
+    }
+#endif
+
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+    if (s->has_discard && s->has_fallocate) {
+        int ret = do_fallocate(s->fd,
+                               FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+                               aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret == 0) {
+            ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+            if (ret == 0 || ret != -ENOTSUP) {
+                return ret;
+            }
+            s->has_fallocate = false;
+        } else if (ret != -ENOTSUP) {
+            return ret;
+        } else {
+            s->has_discard = false;
+        }
+    }
+#endif
+
+#ifdef CONFIG_FALLOCATE
+    if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) {
+        int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret == 0 || ret != -ENOTSUP) {
+            return ret;
+        }
+        s->has_fallocate = false;
+    }
+#endif
+
+    return -ENOTSUP;
+}
+
+static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
+{
+    int ret = -EOPNOTSUPP;
+    BDRVRawState *s = aiocb->bs->opaque;
+
+    if (!s->has_discard) {
+        return -ENOTSUP;
+    }
+
+    if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
+#ifdef BLKDISCARD
+        do {
+            uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
+            if (ioctl(aiocb->aio_fildes, BLKDISCARD, range) == 0) {
+                return 0;
+            }
+        } while (errno == EINTR);
+
+        ret = -errno;
+#endif
+    } else {
+#ifdef CONFIG_XFS
+        if (s->is_xfs) {
+            return xfs_discard(s, aiocb->aio_offset, aiocb->aio_nbytes);
+        }
+#endif
+
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+        ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+                           aiocb->aio_offset, aiocb->aio_nbytes);
+#endif
+    }
+
+    ret = translate_err(ret);
+    if (ret == -ENOTSUP) {
+        s->has_discard = false;
+    }
+    return ret;
+}
+
+static int aio_worker(void *arg)
+{
+    RawPosixAIOData *aiocb = arg;
+    ssize_t ret = 0;
+
+    switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
+    case QEMU_AIO_READ:
+        ret = handle_aiocb_rw(aiocb);
+        if (ret >= 0 && ret < aiocb->aio_nbytes) {
+            iov_memset(aiocb->aio_iov, aiocb->aio_niov, ret,
+                      0, aiocb->aio_nbytes - ret);
+
+            ret = aiocb->aio_nbytes;
+        }
+        if (ret == aiocb->aio_nbytes) {
+            ret = 0;
+        } else if (ret >= 0 && ret < aiocb->aio_nbytes) {
+            ret = -EINVAL;
+        }
+        break;
+    case QEMU_AIO_WRITE:
+        ret = handle_aiocb_rw(aiocb);
+        if (ret == aiocb->aio_nbytes) {
+            ret = 0;
+        } else if (ret >= 0 && ret < aiocb->aio_nbytes) {
+            ret = -EINVAL;
+        }
+        break;
+    case QEMU_AIO_FLUSH:
+        ret = handle_aiocb_flush(aiocb);
+        break;
+    case QEMU_AIO_IOCTL:
+        ret = handle_aiocb_ioctl(aiocb);
+        break;
+    case QEMU_AIO_DISCARD:
+        ret = handle_aiocb_discard(aiocb);
+        break;
+    case QEMU_AIO_WRITE_ZEROES:
+        ret = handle_aiocb_write_zeroes(aiocb);
+        break;
+    default:
+        fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
+        ret = -EINVAL;
+        break;
+    }
+
+    g_free(aiocb);
+    return ret;
+}
+
+static int paio_submit_co(BlockDriverState *bs, int fd,
+                          int64_t offset, QEMUIOVector *qiov,
+                          int count, int type)
+{
+    RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
+    ThreadPool *pool;
+
+    acb->bs = bs;
+    acb->aio_type = type;
+    acb->aio_fildes = fd;
+
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;
+
+    if (qiov) {
+        acb->aio_iov = qiov->iov;
+        acb->aio_niov = qiov->niov;
+        assert(qiov->size == count);
+    }
+
+    trace_paio_submit_co(offset, count, type);
+    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    return thread_pool_submit_co(pool, aio_worker, acb);
+}
+
+static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
+        int64_t offset, QEMUIOVector *qiov, int count,
+        BlockCompletionFunc *cb, void *opaque, int type)
+{
+    RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
+    ThreadPool *pool;
+
+    acb->bs = bs;
+    acb->aio_type = type;
+    acb->aio_fildes = fd;
+
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;
+
+    if (qiov) {
+        acb->aio_iov = qiov->iov;
+        acb->aio_niov = qiov->niov;
+        assert(qiov->size == acb->aio_nbytes);
+    }
+
+    trace_paio_submit(acb, opaque, offset, count, type);
+    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
+}
+
+static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
+                                   uint64_t bytes, QEMUIOVector *qiov, int type)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (fd_open(bs) < 0)
+        return -EIO;
+
+    /*
+     * Check if the underlying device requires requests to be aligned,
+     * and if the request we are trying to submit is aligned or not.
+     * If this is the case tell the low-level driver that it needs
+     * to copy the buffer.
+     */
+    if (s->needs_alignment) {
+        if (!bdrv_qiov_is_aligned(bs, qiov)) {
+            type |= QEMU_AIO_MISALIGNED;
+#ifdef CONFIG_LINUX_AIO
+        } else if (s->use_linux_aio) {
+            LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
+            assert(qiov->size == bytes);
+            return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
+#endif
+        }
+    }
+
+    return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);
+}
+
+static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
+                                      uint64_t bytes, QEMUIOVector *qiov,
+                                      int flags)
+{
+    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);
+}
+
+static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
+                                       uint64_t bytes, QEMUIOVector *qiov,
+                                       int flags)
+{
+    assert(flags == 0);
+    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
+}
+
+static void raw_aio_plug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_linux_aio) {
+        LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
+        laio_io_plug(bs, aio);
+    }
+#endif
+}
+
+static void raw_aio_unplug(BlockDriverState *bs)
+{
+#ifdef CONFIG_LINUX_AIO
+    BDRVRawState *s = bs->opaque;
+    if (s->use_linux_aio) {
+        LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
+        laio_io_unplug(bs, aio);
+    }
+#endif
+}
+
+static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
+        BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (fd_open(bs) < 0)
+        return NULL;
+
+    return paio_submit(bs, s->fd, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
+}
+
+static void raw_close(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->fd >= 0) {
+        qemu_close(s->fd);
+        s->fd = -1;
+    }
+}
+
+static int raw_truncate(BlockDriverState *bs, int64_t offset)
+{
+    BDRVRawState *s = bs->opaque;
+    struct stat st;
+
+    if (fstat(s->fd, &st)) {
+        return -errno;
+    }
+
+    if (S_ISREG(st.st_mode)) {
+        if (ftruncate(s->fd, offset) < 0) {
+            return -errno;
+        }
+    } else if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
+        if (offset > raw_getlength(bs)) {
+            return -EINVAL;
+        }
+    } else {
+        return -ENOTSUP;
+    }
+
+    return 0;
+}
+
+#ifdef __OpenBSD__
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int fd = s->fd;
+    struct stat st;
+
+    if (fstat(fd, &st))
+        return -errno;
+    if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
+        struct disklabel dl;
+
+        if (ioctl(fd, DIOCGDINFO, &dl))
+            return -errno;
+        return (uint64_t)dl.d_secsize *
+            dl.d_partitions[DISKPART(st.st_rdev)].p_size;
+    } else
+        return st.st_size;
+}
+#elif defined(__NetBSD__)
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int fd = s->fd;
+    struct stat st;
+
+    if (fstat(fd, &st))
+        return -errno;
+    if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
+        struct dkwedge_info dkw;
+
+        if (ioctl(fd, DIOCGWEDGEINFO, &dkw) != -1) {
+            return dkw.dkw_size * 512;
+        } else {
+            struct disklabel dl;
+
+            if (ioctl(fd, DIOCGDINFO, &dl))
+                return -errno;
+            return (uint64_t)dl.d_secsize *
+                dl.d_partitions[DISKPART(st.st_rdev)].p_size;
+        }
+    } else
+        return st.st_size;
+}
+#elif defined(__sun__)
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    struct dk_minfo minfo;
+    int ret;
+    int64_t size;
+
+    ret = fd_open(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    /*
+     * Use the DKIOCGMEDIAINFO ioctl to read the size.
+     */
+    ret = ioctl(s->fd, DKIOCGMEDIAINFO, &minfo);
+    if (ret != -1) {
+        return minfo.dki_lbsize * minfo.dki_capacity;
+    }
+
+    /*
+     * There are reports that lseek on some devices fails, but
+     * irc discussion said that contingency on contingency was overkill.
+     */
+    size = lseek(s->fd, 0, SEEK_END);
+    if (size < 0) {
+        return -errno;
+    }
+    return size;
+}
+#elif defined(CONFIG_BSD)
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int fd = s->fd;
+    int64_t size;
+    struct stat sb;
+#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
+    int reopened = 0;
+#endif
+    int ret;
+
+    ret = fd_open(bs);
+    if (ret < 0)
+        return ret;
+
+#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
+again:
+#endif
+    if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
+#ifdef DIOCGMEDIASIZE
+	if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
+#elif defined(DIOCGPART)
+        {
+                struct partinfo pi;
+                if (ioctl(fd, DIOCGPART, &pi) == 0)
+                        size = pi.media_size;
+                else
+                        size = 0;
+        }
+        if (size == 0)
+#endif
+#if defined(__APPLE__) && defined(__MACH__)
+        {
+            uint64_t sectors = 0;
+            uint32_t sector_size = 0;
+
+            if (ioctl(fd, DKIOCGETBLOCKCOUNT, &sectors) == 0
+               && ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) == 0) {
+                size = sectors * sector_size;
+            } else {
+                size = lseek(fd, 0LL, SEEK_END);
+                if (size < 0) {
+                    return -errno;
+                }
+            }
+        }
+#else
+        size = lseek(fd, 0LL, SEEK_END);
+        if (size < 0) {
+            return -errno;
+        }
+#endif
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+        switch(s->type) {
+        case FTYPE_CD:
+            /* XXX FreeBSD acd returns UINT_MAX sectors for an empty drive */
+            if (size == 2048LL * (unsigned)-1)
+                size = 0;
+            /* XXX no disc?  maybe we need to reopen... */
+            if (size <= 0 && !reopened && cdrom_reopen(bs) >= 0) {
+                reopened = 1;
+                goto again;
+            }
+        }
+#endif
+    } else {
+        size = lseek(fd, 0, SEEK_END);
+        if (size < 0) {
+            return -errno;
+        }
+    }
+    return size;
+}
+#else
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+    int64_t size;
+
+    ret = fd_open(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    size = lseek(s->fd, 0, SEEK_END);
+    if (size < 0) {
+        return -errno;
+    }
+    return size;
+}
+#endif
+
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+    struct stat st;
+    BDRVRawState *s = bs->opaque;
+
+    if (fstat(s->fd, &st) < 0) {
+        return -errno;
+    }
+    return (int64_t)st.st_blocks * 512;
+}
+
+static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
+{
+    int fd;
+    int result = 0;
+    int64_t total_size = 0;
+    bool nocow = false;
+    PreallocMode prealloc;
+    char *buf = NULL;
+    Error *local_err = NULL;
+
+    strstart(filename, "file:", &filename);
+
+    /* Read out options */
+    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                          BDRV_SECTOR_SIZE);
+    nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
+    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
+    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
+                               PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
+                               &local_err);
+    g_free(buf);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        result = -EINVAL;
+        goto out;
+    }
+
+    fd = qemu_open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
+    if (fd < 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not create file");
+        goto out;
+    }
+
+    if (nocow) {
+#ifdef __linux__
+        /* Set the NOCOW flag to avoid performance issues on file systems
+         * like btrfs.  This is only an optimisation, so the return value of
+         * the FS_IOC_SETFLAGS ioctl is ignored: a failure here must not
+         * block the remaining work.
+         */
+        int attr;
+        if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
+            attr |= FS_NOCOW_FL;
+            ioctl(fd, FS_IOC_SETFLAGS, &attr);
+        }
+#endif
+    }
+
+    if (ftruncate(fd, total_size) != 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not resize file");
+        goto out_close;
+    }
+
+    switch (prealloc) {
+#ifdef CONFIG_POSIX_FALLOCATE
+    case PREALLOC_MODE_FALLOC:
+        /* posix_fallocate() doesn't set errno. */
+        result = -posix_fallocate(fd, 0, total_size);
+        if (result != 0) {
+            error_setg_errno(errp, -result,
+                             "Could not preallocate data for the new file");
+        }
+        break;
+#endif
+    case PREALLOC_MODE_FULL:
+    {
+        int64_t num = 0, left = total_size;
+        buf = g_malloc0(65536);
+
+        while (left > 0) {
+            num = MIN(left, 65536);
+            result = write(fd, buf, num);
+            if (result < 0) {
+                result = -errno;
+                error_setg_errno(errp, -result,
+                                 "Could not write to the new file");
+                break;
+            }
+            left -= result;
+        }
+        if (result >= 0) {
+            result = fsync(fd);
+            if (result < 0) {
+                result = -errno;
+                error_setg_errno(errp, -result,
+                                 "Could not flush new file to disk");
+            }
+        }
+        g_free(buf);
+        break;
+    }
+    case PREALLOC_MODE_OFF:
+        break;
+    default:
+        result = -EINVAL;
+        error_setg(errp, "Unsupported preallocation mode: %s",
+                   PreallocMode_lookup[prealloc]);
+        break;
+    }
+
+out_close:
+    if (qemu_close(fd) != 0 && result == 0) {
+        result = -errno;
+        error_setg_errno(errp, -result, "Could not close the new file");
+    }
+out:
+    return result;
+}
+
+/*
+ * Find allocation range in @bs around offset @start.
+ * May change underlying file descriptor's file offset.
+ * If @start is not in a hole, store @start in @data, and the
+ * beginning of the next hole in @hole, and return 0.
+ * If @start is in a non-trailing hole, store @start in @hole and the
+ * beginning of the next non-hole in @data, and return 0.
+ * If @start is in a trailing hole or beyond EOF, return -ENXIO.
+ * If we can't find out, return a negative errno other than -ENXIO.
+ */
+static int find_allocation(BlockDriverState *bs, off_t start,
+                           off_t *data, off_t *hole)
+{
+#if defined SEEK_HOLE && defined SEEK_DATA
+    BDRVRawState *s = bs->opaque;
+    off_t offs;
+
+    /*
+     * SEEK_DATA cases:
+     * D1. offs == start: start is in data
+     * D2. offs > start: start is in a hole, next data at offs
+     * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
+     *                              or start is beyond EOF
+     *     If the latter happens, the file has been truncated behind
+     *     our back since we opened it.  All bets are off then.
+     *     Treating like a trailing hole is simplest.
+     * D4. offs < 0, errno != ENXIO: we learned nothing
+     */
+    offs = lseek(s->fd, start, SEEK_DATA);
+    if (offs < 0) {
+        return -errno;          /* D3 or D4 */
+    }
+    assert(offs >= start);
+
+    if (offs > start) {
+        /* D2: in hole, next data at offs */
+        *hole = start;
+        *data = offs;
+        return 0;
+    }
+
+    /* D1: in data, end not yet known */
+
+    /*
+     * SEEK_HOLE cases:
+     * H1. offs == start: start is in a hole
+     *     If this happens here, a hole has been dug behind our back
+     *     since the previous lseek().
+     * H2. offs > start: either start is in data, next hole at offs,
+     *                   or start is in trailing hole, EOF at offs
+     *     Linux treats trailing holes like any other hole: offs ==
+     *     start.  Solaris seeks to EOF instead: offs > start (blech).
+     *     If that happens here, a hole has been dug behind our back
+     *     since the previous lseek().
+     * H3. offs < 0, errno = ENXIO: start is beyond EOF
+     *     If this happens, the file has been truncated behind our
+     *     back since we opened it.  Treat it like a trailing hole.
+     * H4. offs < 0, errno != ENXIO: we learned nothing
+     *     Pretend we know nothing at all, i.e. "forget" about D1.
+     */
+    offs = lseek(s->fd, start, SEEK_HOLE);
+    if (offs < 0) {
+        return -errno;          /* D1 and (H3 or H4) */
+    }
+    assert(offs >= start);
+
+    if (offs > start) {
+        /*
+         * D1 and H2: either in data, next hole at offs, or it was in
+         * data but is now in a trailing hole.  In the latter case,
+         * all bets are off.  Treating it as if there was data all
+         * the way to EOF is safe, so simply do that.
+         */
+        *data = start;
+        *hole = offs;
+        return 0;
+    }
+
+    /* D1 and H1 */
+    return -EBUSY;
+#else
+    return -ENOTSUP;
+#endif
+}
+
+/*
+ * Returns the allocation status of the specified sectors.
+ *
+ * If 'sector_num' is beyond the end of the disk image the return value is 0
+ * and 'pnum' is set to 0.
+ *
+ * 'pnum' is set to the number of sectors (including and immediately following
+ * the specified sector) that are known to be in the same
+ * allocated/unallocated state.
+ *
+ * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
+ * beyond the end of the disk image it will be clamped.
+ */
+static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
+                                                    int64_t sector_num,
+                                                    int nb_sectors, int *pnum,
+                                                    BlockDriverState **file)
+{
+    off_t start, data = 0, hole = 0;
+    int64_t total_size;
+    int ret;
+
+    ret = fd_open(bs);
+    if (ret < 0) {
+        return ret;
+    }
+
+    start = sector_num * BDRV_SECTOR_SIZE;
+    total_size = bdrv_getlength(bs);
+    if (total_size < 0) {
+        return total_size;
+    } else if (start >= total_size) {
+        *pnum = 0;
+        return 0;
+    } else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
+        nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
+    }
+
+    ret = find_allocation(bs, start, &data, &hole);
+    if (ret == -ENXIO) {
+        /* Trailing hole */
+        *pnum = nb_sectors;
+        ret = BDRV_BLOCK_ZERO;
+    } else if (ret < 0) {
+        /* No info available, so pretend there are no holes */
+        *pnum = nb_sectors;
+        ret = BDRV_BLOCK_DATA;
+    } else if (data == start) {
+        /* On a data extent, compute sectors to the end of the extent,
+         * possibly including a partial sector at EOF. */
+        *pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, BDRV_SECTOR_SIZE));
+        ret = BDRV_BLOCK_DATA;
+    } else {
+        /* On a hole, compute sectors to the beginning of the next extent.  */
+        assert(hole == start);
+        *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
+        ret = BDRV_BLOCK_ZERO;
+    }
+    *file = bs;
+    return ret | BDRV_BLOCK_OFFSET_VALID | start;
+}
+
+static coroutine_fn BlockAIOCB *raw_aio_pdiscard(BlockDriverState *bs,
+    int64_t offset, int count,
+    BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+
+    return paio_submit(bs, s->fd, offset, NULL, count,
+                       cb, opaque, QEMU_AIO_DISCARD);
+}
+
+static int coroutine_fn raw_co_pwrite_zeroes(
+    BlockDriverState *bs, int64_t offset,
+    int count, BdrvRequestFlags flags)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (!(flags & BDRV_REQ_MAY_UNMAP)) {
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
+                              QEMU_AIO_WRITE_ZEROES);
+    } else if (s->discard_zeroes) {
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
+                              QEMU_AIO_DISCARD);
+    }
+    return -ENOTSUP;
+}
+
+static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    BDRVRawState *s = bs->opaque;
+
+    bdi->unallocated_blocks_are_zero = s->discard_zeroes;
+    bdi->can_write_zeroes_with_unmap = s->discard_zeroes;
+    return 0;
+}
+
+static QemuOptsList raw_create_opts = {
+    .name = "raw-create-opts",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
+    .desc = {
+        {
+            .name = BLOCK_OPT_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Virtual disk size"
+        },
+        {
+            .name = BLOCK_OPT_NOCOW,
+            .type = QEMU_OPT_BOOL,
+            .help = "Turn off copy-on-write (valid only on btrfs)"
+        },
+        {
+            .name = BLOCK_OPT_PREALLOC,
+            .type = QEMU_OPT_STRING,
+            .help = "Preallocation mode (allowed values: off, falloc, full)"
+        },
+        { /* end of list */ }
+    }
+};
+
+BlockDriver bdrv_file = {
+    .format_name = "file",
+    .protocol_name = "file",
+    .instance_size = sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_probe = NULL, /* no probe for protocols */
+    .bdrv_parse_filename = raw_parse_filename,
+    .bdrv_file_open = raw_open,
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit = raw_reopen_commit,
+    .bdrv_reopen_abort = raw_reopen_abort,
+    .bdrv_close = raw_close,
+    .bdrv_create = raw_create,
+    .bdrv_has_zero_init = bdrv_has_zero_init_1,
+    .bdrv_co_get_block_status = raw_co_get_block_status,
+    .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
+
+    .bdrv_co_preadv         = raw_co_preadv,
+    .bdrv_co_pwritev        = raw_co_pwritev,
+    .bdrv_aio_flush = raw_aio_flush,
+    .bdrv_aio_pdiscard = raw_aio_pdiscard,
+    .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+
+    .bdrv_truncate = raw_truncate,
+    .bdrv_getlength = raw_getlength,
+    .bdrv_get_info = raw_get_info,
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+
+    .create_opts = &raw_create_opts,
+};
+
+/***********************************************/
+/* host device */
+
+#if defined(__APPLE__) && defined(__MACH__)
+static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
+                                CFIndex maxPathSize, int flags);
+static char *FindEjectableOpticalMedia(io_iterator_t *mediaIterator)
+{
+    kern_return_t kernResult = KERN_FAILURE;
+    mach_port_t     masterPort;
+    CFMutableDictionaryRef  classesToMatch;
+    const char *matching_array[] = {kIODVDMediaClass, kIOCDMediaClass};
+    char *mediaType = NULL;
+
+    kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
+    if ( KERN_SUCCESS != kernResult ) {
+        printf( "IOMasterPort returned %d\n", kernResult );
+    }
+
+    int index;
+    for (index = 0; index < ARRAY_SIZE(matching_array); index++) {
+        classesToMatch = IOServiceMatching(matching_array[index]);
+        if (classesToMatch == NULL) {
+            error_report("IOServiceMatching returned NULL for %s",
+                         matching_array[index]);
+            continue;
+        }
+        CFDictionarySetValue(classesToMatch, CFSTR(kIOMediaEjectableKey),
+                             kCFBooleanTrue);
+        kernResult = IOServiceGetMatchingServices(masterPort, classesToMatch,
+                                                  mediaIterator);
+        if (kernResult != KERN_SUCCESS) {
+            error_report("Note: IOServiceGetMatchingServices returned %d",
+                         kernResult);
+            continue;
+        }
+
+        /* If a match was found, leave the loop */
+        if (*mediaIterator != 0) {
+            DPRINTF("Matching using %s\n", matching_array[index]);
+            mediaType = g_strdup(matching_array[index]);
+            break;
+        }
+    }
+    return mediaType;
+}
+
+kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
+                         CFIndex maxPathSize, int flags)
+{
+    io_object_t     nextMedia;
+    kern_return_t   kernResult = KERN_FAILURE;
+    *bsdPath = '\0';
+    nextMedia = IOIteratorNext( mediaIterator );
+    if ( nextMedia )
+    {
+        CFTypeRef   bsdPathAsCFString;
+        bsdPathAsCFString = IORegistryEntryCreateCFProperty( nextMedia, CFSTR( kIOBSDNameKey ), kCFAllocatorDefault, 0 );
+        if ( bsdPathAsCFString ) {
+            size_t devPathLength;
+            strcpy( bsdPath, _PATH_DEV );
+            if (flags & BDRV_O_NOCACHE) {
+                strcat(bsdPath, "r");
+            }
+            devPathLength = strlen( bsdPath );
+            if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) {
+                kernResult = KERN_SUCCESS;
+            }
+            CFRelease( bsdPathAsCFString );
+        }
+        IOObjectRelease( nextMedia );
+    }
+
+    return kernResult;
+}
+
+/* Sets up a real cdrom for use in QEMU */
+static bool setup_cdrom(char *bsd_path, Error **errp)
+{
+    int index, num_of_test_partitions = 2, fd;
+    char test_partition[MAXPATHLEN];
+    bool partition_found = false;
+
+    /* look for a working partition */
+    for (index = 0; index < num_of_test_partitions; index++) {
+        snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
+                 index);
+        fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
+        if (fd >= 0) {
+            partition_found = true;
+            qemu_close(fd);
+            break;
+        }
+    }
+
+    /* if a working partition on the device was not found */
+    if (partition_found == false) {
+        error_setg(errp, "Failed to find a working partition on disc");
+    } else {
+        DPRINTF("Using %s as optical disc\n", test_partition);
+        pstrcpy(bsd_path, MAXPATHLEN, test_partition);
+    }
+    return partition_found;
+}
+
+/* Prints directions on mounting and unmounting a device */
+static void print_unmounting_directions(const char *file_name)
+{
+    error_report("If device %s is mounted on the desktop, unmount"
+                 " it first before using it in QEMU", file_name);
+    error_report("Command to unmount device: diskutil unmountDisk %s",
+                 file_name);
+    error_report("Command to mount device: diskutil mountDisk %s", file_name);
+}
+
+#endif /* defined(__APPLE__) && defined(__MACH__) */
+
+static int hdev_probe_device(const char *filename)
+{
+    struct stat st;
+
+    /* allow a dedicated CD-ROM driver to match with a higher priority */
+    if (strstart(filename, "/dev/cdrom", NULL))
+        return 50;
+
+    if (stat(filename, &st) >= 0 &&
+            (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
+        return 100;
+    }
+
+    return 0;
+}
+
+static int check_hdev_writable(BDRVRawState *s)
+{
+#if defined(BLKROGET)
+    /* Linux block devices can be configured "read-only" using blockdev(8).
+     * This is independent of device node permissions and therefore open(2)
+     * with O_RDWR succeeds.  Actual writes fail with EPERM.
+     *
+     * bdrv_open() is supposed to fail if the disk is read-only.  Explicitly
+     * check for read-only block devices so that Linux block devices behave
+     * properly.
+     */
+    struct stat st;
+    int readonly = 0;
+
+    if (fstat(s->fd, &st)) {
+        return -errno;
+    }
+
+    if (!S_ISBLK(st.st_mode)) {
+        return 0;
+    }
+
+    if (ioctl(s->fd, BLKROGET, &readonly) < 0) {
+        return -errno;
+    }
+
+    if (readonly) {
+        return -EACCES;
+    }
+#endif /* defined(BLKROGET) */
+    return 0;
+}
+
+static void hdev_parse_filename(const char *filename, QDict *options,
+                                Error **errp)
+{
+    /* The prefix is optional, just as for "file". */
+    strstart(filename, "host_device:", &filename);
+
+    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
+}
+
+static bool hdev_is_sg(BlockDriverState *bs)
+{
+
+#if defined(__linux__)
+
+    BDRVRawState *s = bs->opaque;
+    struct stat st;
+    struct sg_scsi_id scsiid;
+    int sg_version;
+    int ret;
+
+    if (stat(bs->filename, &st) < 0 || !S_ISCHR(st.st_mode)) {
+        return false;
+    }
+
+    ret = ioctl(s->fd, SG_GET_VERSION_NUM, &sg_version);
+    if (ret < 0) {
+        return false;
+    }
+
+    ret = ioctl(s->fd, SG_GET_SCSI_ID, &scsiid);
+    if (ret >= 0) {
+        DPRINTF("SG device found: type=%d, version=%d\n",
+            scsiid.scsi_type, sg_version);
+        return true;
+    }
+
+#endif
+
+    return false;
+}
+
+static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
+                     Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    Error *local_err = NULL;
+    int ret;
+
+#if defined(__APPLE__) && defined(__MACH__)
+    const char *filename = qdict_get_str(options, "filename");
+    char bsd_path[MAXPATHLEN] = "";
+    bool error_occurred = false;
+
+    /* If using a real cdrom */
+    if (strcmp(filename, "/dev/cdrom") == 0) {
+        char *mediaType = NULL;
+        kern_return_t ret_val;
+        io_iterator_t mediaIterator = 0;
+
+        mediaType = FindEjectableOpticalMedia(&mediaIterator);
+        if (mediaType == NULL) {
+            error_setg(errp, "Please make sure your CD/DVD is in the optical"
+                       " drive");
+            error_occurred = true;
+            goto hdev_open_Mac_error;
+        }
+
+        ret_val = GetBSDPath(mediaIterator, bsd_path, sizeof(bsd_path), flags);
+        if (ret_val != KERN_SUCCESS) {
+            error_setg(errp, "Could not get BSD path for optical drive");
+            error_occurred = true;
+            goto hdev_open_Mac_error;
+        }
+
+        /* If a real optical drive was not found */
+        if (bsd_path[0] == '\0') {
+            error_setg(errp, "Failed to obtain bsd path for optical drive");
+            error_occurred = true;
+            goto hdev_open_Mac_error;
+        }
+
+        /* If using a cdrom disc and finding a partition on the disc failed */
+        if (strncmp(mediaType, kIOCDMediaClass, 9) == 0 &&
+            setup_cdrom(bsd_path, errp) == false) {
+            print_unmounting_directions(bsd_path);
+            error_occurred = true;
+            goto hdev_open_Mac_error;
+        }
+
+        qdict_put(options, "filename", qstring_from_str(bsd_path));
+
+hdev_open_Mac_error:
+        g_free(mediaType);
+        if (mediaIterator) {
+            IOObjectRelease(mediaIterator);
+        }
+        if (error_occurred) {
+            return -ENOENT;
+        }
+    }
+#endif /* defined(__APPLE__) && defined(__MACH__) */
+
+    s->type = FTYPE_FILE;
+
+    ret = raw_open_common(bs, options, flags, 0, &local_err);
+    if (ret < 0) {
+        error_propagate(errp, local_err);
+#if defined(__APPLE__) && defined(__MACH__)
+        if (*bsd_path) {
+            filename = bsd_path;
+        }
+        /* if a physical device experienced an error while being opened */
+        if (strncmp(filename, "/dev/", 5) == 0) {
+            print_unmounting_directions(filename);
+        }
+#endif /* defined(__APPLE__) && defined(__MACH__) */
+        return ret;
+    }
+
+    /* Since this does ioctl the device must be already opened */
+    bs->sg = hdev_is_sg(bs);
+
+    if (flags & BDRV_O_RDWR) {
+        ret = check_hdev_writable(s);
+        if (ret < 0) {
+            raw_close(bs);
+            error_setg_errno(errp, -ret, "The device is not writable");
+            return ret;
+        }
+    }
+
+    return ret;
+}
+
+#if defined(__linux__)
+
+static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
+        unsigned long int req, void *buf,
+        BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+    RawPosixAIOData *acb;
+    ThreadPool *pool;
+
+    if (fd_open(bs) < 0)
+        return NULL;
+
+    acb = g_new(RawPosixAIOData, 1);
+    acb->bs = bs;
+    acb->aio_type = QEMU_AIO_IOCTL;
+    acb->aio_fildes = s->fd;
+    acb->aio_offset = 0;
+    acb->aio_ioctl_buf = buf;
+    acb->aio_ioctl_cmd = req;
+    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
+}
+#endif /* linux */
+
+static int fd_open(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    /* this is just to ensure s->fd is sane (it's called by I/O ops) */
+    if (s->fd >= 0)
+        return 0;
+    return -EIO;
+}
+
+static coroutine_fn BlockAIOCB *hdev_aio_pdiscard(BlockDriverState *bs,
+    int64_t offset, int count,
+    BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (fd_open(bs) < 0) {
+        return NULL;
+    }
+    return paio_submit(bs, s->fd, offset, NULL, count,
+                       cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
+}
+
+static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs,
+    int64_t offset, int count, BdrvRequestFlags flags)
+{
+    BDRVRawState *s = bs->opaque;
+    int rc;
+
+    rc = fd_open(bs);
+    if (rc < 0) {
+        return rc;
+    }
+    if (!(flags & BDRV_REQ_MAY_UNMAP)) {
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
+                              QEMU_AIO_WRITE_ZEROES|QEMU_AIO_BLKDEV);
+    } else if (s->discard_zeroes) {
+        return paio_submit_co(bs, s->fd, offset, NULL, count,
+                              QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
+    }
+    return -ENOTSUP;
+}
+
+static int hdev_create(const char *filename, QemuOpts *opts,
+                       Error **errp)
+{
+    int fd;
+    int ret = 0;
+    struct stat stat_buf;
+    int64_t total_size = 0;
+    bool has_prefix;
+
+    /* This function is used by both protocol block drivers and therefore either
+     * of these prefixes may be given.
+     * The return value has to be stored somewhere, otherwise this is an error
+     * due to -Werror=unused-value. */
+    has_prefix =
+        strstart(filename, "host_device:", &filename) ||
+        strstart(filename, "host_cdrom:" , &filename);
+
+    (void)has_prefix;
+
+    ret = raw_normalize_devicepath(&filename);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Could not normalize device path");
+        return ret;
+    }
+
+    /* Read out options */
+    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                          BDRV_SECTOR_SIZE);
+
+    fd = qemu_open(filename, O_WRONLY | O_BINARY);
+    if (fd < 0) {
+        ret = -errno;
+        error_setg_errno(errp, -ret, "Could not open device");
+        return ret;
+    }
+
+    if (fstat(fd, &stat_buf) < 0) {
+        ret = -errno;
+        error_setg_errno(errp, -ret, "Could not stat device");
+    } else if (!S_ISBLK(stat_buf.st_mode) && !S_ISCHR(stat_buf.st_mode)) {
+        error_setg(errp,
+                   "The given file is neither a block nor a character device");
+        ret = -ENODEV;
+    } else if (lseek(fd, 0, SEEK_END) < total_size) {
+        error_setg(errp, "Device is too small");
+        ret = -ENOSPC;
+    }
+
+    qemu_close(fd);
+    return ret;
+}
+
+static BlockDriver bdrv_host_device = {
+    .format_name        = "host_device",
+    .protocol_name      = "host_device",
+    .instance_size      = sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_probe_device  = hdev_probe_device,
+    .bdrv_parse_filename = hdev_parse_filename,
+    .bdrv_file_open     = hdev_open,
+    .bdrv_close         = raw_close,
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit  = raw_reopen_commit,
+    .bdrv_reopen_abort   = raw_reopen_abort,
+    .bdrv_create         = hdev_create,
+    .create_opts         = &raw_create_opts,
+    .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
+
+    .bdrv_co_preadv         = raw_co_preadv,
+    .bdrv_co_pwritev        = raw_co_pwritev,
+    .bdrv_aio_flush         = raw_aio_flush,
+    .bdrv_aio_pdiscard   = hdev_aio_pdiscard,
+    .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+
+    .bdrv_truncate      = raw_truncate,
+    .bdrv_getlength     = raw_getlength,
+    .bdrv_get_info = raw_get_info,
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+    .bdrv_probe_blocksizes = hdev_probe_blocksizes,
+    .bdrv_probe_geometry = hdev_probe_geometry,
+
+    /* generic scsi device */
+#ifdef __linux__
+    .bdrv_aio_ioctl     = hdev_aio_ioctl,
+#endif
+};
+
+#if defined(__linux__) || defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+static void cdrom_parse_filename(const char *filename, QDict *options,
+                                 Error **errp)
+{
+    /* The prefix is optional, just as for "file". */
+    strstart(filename, "host_cdrom:", &filename);
+
+    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
+}
+#endif
+
+#ifdef __linux__
+static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
+                      Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+
+    s->type = FTYPE_CD;
+
+    /* open will not fail even if no CD is inserted, so add O_NONBLOCK */
+    return raw_open_common(bs, options, flags, O_NONBLOCK, errp);
+}
+
+static int cdrom_probe_device(const char *filename)
+{
+    int fd, ret;
+    int prio = 0;
+    struct stat st;
+
+    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
+    if (fd < 0) {
+        goto out;
+    }
+    ret = fstat(fd, &st);
+    if (ret == -1 || !S_ISBLK(st.st_mode)) {
+        goto outc;
+    }
+
+    /* Attempt to detect via a CDROM specific ioctl */
+    ret = ioctl(fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
+    if (ret >= 0)
+        prio = 100;
+
+outc:
+    qemu_close(fd);
+out:
+    return prio;
+}
+
+static bool cdrom_is_inserted(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int ret;
+
+    ret = ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
+    return ret == CDS_DISC_OK;
+}
+
+static void cdrom_eject(BlockDriverState *bs, bool eject_flag)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (eject_flag) {
+        if (ioctl(s->fd, CDROMEJECT, NULL) < 0)
+            perror("CDROMEJECT");
+    } else {
+        if (ioctl(s->fd, CDROMCLOSETRAY, NULL) < 0)
+            perror("CDROMCLOSETRAY");
+    }
+}
+
+static void cdrom_lock_medium(BlockDriverState *bs, bool locked)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (ioctl(s->fd, CDROM_LOCKDOOR, locked) < 0) {
+        /*
+         * Note: an error can happen if the distribution automatically
+         * mounts the CD-ROM
+         */
+        /* perror("CDROM_LOCKDOOR"); */
+    }
+}
+
+static BlockDriver bdrv_host_cdrom = {
+    .format_name        = "host_cdrom",
+    .protocol_name      = "host_cdrom",
+    .instance_size      = sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_probe_device  = cdrom_probe_device,
+    .bdrv_parse_filename = cdrom_parse_filename,
+    .bdrv_file_open     = cdrom_open,
+    .bdrv_close         = raw_close,
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit  = raw_reopen_commit,
+    .bdrv_reopen_abort   = raw_reopen_abort,
+    .bdrv_create         = hdev_create,
+    .create_opts         = &raw_create_opts,
+
+
+    .bdrv_co_preadv         = raw_co_preadv,
+    .bdrv_co_pwritev        = raw_co_pwritev,
+    .bdrv_aio_flush         = raw_aio_flush,
+    .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+
+    .bdrv_truncate      = raw_truncate,
+    .bdrv_getlength      = raw_getlength,
+    .has_variable_length = true,
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+
+    /* removable device support */
+    .bdrv_is_inserted   = cdrom_is_inserted,
+    .bdrv_eject         = cdrom_eject,
+    .bdrv_lock_medium   = cdrom_lock_medium,
+
+    /* generic scsi device */
+    .bdrv_aio_ioctl     = hdev_aio_ioctl,
+};
+#endif /* __linux__ */
+
+#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
+static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
+                      Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    Error *local_err = NULL;
+    int ret;
+
+    s->type = FTYPE_CD;
+
+    ret = raw_open_common(bs, options, flags, 0, &local_err);
+    if (ret) {
+        error_propagate(errp, local_err);
+        return ret;
+    }
+
+    /* make sure the door isn't locked at this time */
+    ioctl(s->fd, CDIOCALLOW);
+    return 0;
+}
+
+static int cdrom_probe_device(const char *filename)
+{
+    if (strstart(filename, "/dev/cd", NULL) ||
+            strstart(filename, "/dev/acd", NULL))
+        return 100;
+    return 0;
+}
+
+static int cdrom_reopen(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    int fd;
+
+    /*
+     * Force reread of possibly changed/newly loaded disc,
+     * FreeBSD seems to not notice sometimes...
+     */
+    if (s->fd >= 0)
+        qemu_close(s->fd);
+    fd = qemu_open(bs->filename, s->open_flags, 0644);
+    if (fd < 0) {
+        s->fd = -1;
+        return -EIO;
+    }
+    s->fd = fd;
+
+    /* make sure the door isn't locked at this time */
+    ioctl(s->fd, CDIOCALLOW);
+    return 0;
+}
+
+static bool cdrom_is_inserted(BlockDriverState *bs)
+{
+    return raw_getlength(bs) > 0;
+}
+
+static void cdrom_eject(BlockDriverState *bs, bool eject_flag)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->fd < 0)
+        return;
+
+    (void) ioctl(s->fd, CDIOCALLOW);
+
+    if (eject_flag) {
+        if (ioctl(s->fd, CDIOCEJECT) < 0)
+            perror("CDIOCEJECT");
+    } else {
+        if (ioctl(s->fd, CDIOCCLOSE) < 0)
+            perror("CDIOCCLOSE");
+    }
+
+    cdrom_reopen(bs);
+}
+
+static void cdrom_lock_medium(BlockDriverState *bs, bool locked)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->fd < 0)
+        return;
+    if (ioctl(s->fd, (locked ? CDIOCPREVENT : CDIOCALLOW)) < 0) {
+        /*
+         * Note: an error can happen if the distribution automatically
+         * mounts the CD-ROM
+         */
+        /* perror("CDROM_LOCKDOOR"); */
+    }
+}
+
+static BlockDriver bdrv_host_cdrom = {
+    .format_name        = "host_cdrom",
+    .protocol_name      = "host_cdrom",
+    .instance_size      = sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_probe_device  = cdrom_probe_device,
+    .bdrv_parse_filename = cdrom_parse_filename,
+    .bdrv_file_open     = cdrom_open,
+    .bdrv_close         = raw_close,
+    .bdrv_reopen_prepare = raw_reopen_prepare,
+    .bdrv_reopen_commit  = raw_reopen_commit,
+    .bdrv_reopen_abort   = raw_reopen_abort,
+    .bdrv_create        = hdev_create,
+    .create_opts        = &raw_create_opts,
+
+    .bdrv_co_preadv         = raw_co_preadv,
+    .bdrv_co_pwritev        = raw_co_pwritev,
+    .bdrv_aio_flush         = raw_aio_flush,
+    .bdrv_refresh_limits = raw_refresh_limits,
+    .bdrv_io_plug = raw_aio_plug,
+    .bdrv_io_unplug = raw_aio_unplug,
+
+    .bdrv_truncate      = raw_truncate,
+    .bdrv_getlength      = raw_getlength,
+    .has_variable_length = true,
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+
+    /* removable device support */
+    .bdrv_is_inserted   = cdrom_is_inserted,
+    .bdrv_eject         = cdrom_eject,
+    .bdrv_lock_medium   = cdrom_lock_medium,
+};
+#endif /* __FreeBSD__ */
+
+static void bdrv_file_init(void)
+{
+    /*
+     * Register all the drivers.  Note that order is important, the driver
+     * registered last will get probed first.
+     */
+    bdrv_register(&bdrv_file);
+    bdrv_register(&bdrv_host_device);
+#ifdef __linux__
+    bdrv_register(&bdrv_host_cdrom);
+#endif
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+    bdrv_register(&bdrv_host_cdrom);
+#endif
+}
+
+block_init(bdrv_file_init);
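
Note on raw_co_get_block_status() in the file-posix.c hunk above: the way it turns the find_allocation() result into a status and a sector count can be sketched in isolation. The following is only an illustration, not code from this series. The names classify_extent, extent_kind and SECTOR_SIZE are hypothetical stand-ins (SECTOR_SIZE for BDRV_SECTOR_SIZE, the enum for BDRV_BLOCK_DATA/BDRV_BLOCK_ZERO), and the clamping of nb_sectors against the image length is assumed to have happened already.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE 512 /* assumption: stands in for BDRV_SECTOR_SIZE */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-ins for BDRV_BLOCK_DATA / BDRV_BLOCK_ZERO */
enum extent_kind { EXTENT_DATA, EXTENT_ZERO };

/*
 * Hypothetical helper mirroring the four branches of the function above.
 * 'ret' is the lookup result, 'data' and 'hole' are the byte offsets it
 * reported, 'start' is the byte offset that was queried.
 */
static enum extent_kind classify_extent(int ret, int64_t start,
                                        int64_t data, int64_t hole,
                                        int nb_sectors, int *pnum)
{
    if (ret == -ENXIO) {
        /* Trailing hole: everything up to the clamp reads as zeroes */
        *pnum = nb_sectors;
        return EXTENT_ZERO;
    }
    if (ret < 0) {
        /* No information available, so pretend there are no holes */
        *pnum = nb_sectors;
        return EXTENT_DATA;
    }
    if (data == start) {
        /* In a data extent: round up to cover a partial sector at EOF */
        *pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, SECTOR_SIZE));
        return EXTENT_DATA;
    }
    /* In a hole: round down so no sector containing data is reported */
    assert(hole == start);
    *pnum = MIN(nb_sectors, (data - start) / SECTOR_SIZE);
    return EXTENT_ZERO;
}
```

The asymmetry is the point of the sketch: a data extent is rounded up so that a partial sector at EOF still counts as data, while a hole is rounded down so that a sector containing any data is not reported as zero.
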
diff --git a/block/file-win32.c b/block/file-win32.c
new file mode 100644
index 0000000..800fabd
--- /dev/null
+++ b/block/file-win32.c
@@ -0,0 +1,781 @@
+/*
+ * Block driver for RAW files (win32)
+ *
+ * Copyright (c) 2006 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "qemu/timer.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+#include "block/raw-aio.h"
+#include "trace.h"
+#include "block/thread-pool.h"
+#include "qemu/iov.h"
+#include "qapi/qmp/qstring.h"
+#include "qapi/util.h"
+#include <windows.h>
+#include <winioctl.h>
+
+#define FTYPE_FILE 0
+#define FTYPE_CD     1
+#define FTYPE_HARDDISK 2
+
+typedef struct RawWin32AIOData {
+    BlockDriverState *bs;
+    HANDLE hfile;
+    struct iovec *aio_iov;
+    int aio_niov;
+    size_t aio_nbytes;
+    off64_t aio_offset;
+    int aio_type;
+} RawWin32AIOData;
+
+typedef struct BDRVRawState {
+    HANDLE hfile;
+    int type;
+    char drive_path[16]; /* format: "d:\" */
+    QEMUWin32AIOState *aio;
+} BDRVRawState;
+
+/*
+ * Reads/writes the data to/from a given linear buffer.
+ *
+ * Returns the number of bytes handled or -errno in case of an error. Short
+ * reads are only returned if the end of the file is reached.
+ */
+static size_t handle_aiocb_rw(RawWin32AIOData *aiocb)
+{
+    size_t offset = 0;
+    int i;
+
+    for (i = 0; i < aiocb->aio_niov; i++) {
+        OVERLAPPED ov;
+        DWORD ret, ret_count, len;
+
+        memset(&ov, 0, sizeof(ov));
+        ov.Offset = (aiocb->aio_offset + offset);
+        ov.OffsetHigh = (aiocb->aio_offset + offset) >> 32;
+        len = aiocb->aio_iov[i].iov_len;
+        if (aiocb->aio_type & QEMU_AIO_WRITE) {
+            ret = WriteFile(aiocb->hfile, aiocb->aio_iov[i].iov_base,
+                            len, &ret_count, &ov);
+        } else {
+            ret = ReadFile(aiocb->hfile, aiocb->aio_iov[i].iov_base,
+                           len, &ret_count, &ov);
+        }
+        if (!ret) {
+            ret_count = 0;
+        }
+        if (ret_count != len) {
+            offset += ret_count;
+            break;
+        }
+        offset += len;
+    }
+
+    return offset;
+}
+
+static int aio_worker(void *arg)
+{
+    RawWin32AIOData *aiocb = arg;
+    ssize_t ret = 0;
+    size_t count;
+
+    switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
+    case QEMU_AIO_READ:
+        count = handle_aiocb_rw(aiocb);
+        if (count < aiocb->aio_nbytes) {
+            /* A short read means that we have reached EOF. Pad the buffer
+             * with zeros for bytes after EOF. */
+            iov_memset(aiocb->aio_iov, aiocb->aio_niov, count,
+                      0, aiocb->aio_nbytes - count);
+
+            count = aiocb->aio_nbytes;
+        }
+        if (count == aiocb->aio_nbytes) {
+            ret = 0;
+        } else {
+            ret = -EINVAL;
+        }
+        break;
+    case QEMU_AIO_WRITE:
+        count = handle_aiocb_rw(aiocb);
+        if (count == aiocb->aio_nbytes) {
+            ret = 0;
+        } else {
+            ret = -EINVAL;
+        }
+        break;
+    case QEMU_AIO_FLUSH:
+        if (!FlushFileBuffers(aiocb->hfile)) {
+            ret = -EIO; /* no early return: aiocb is freed below */
+        }
+        break;
+    default:
+        fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
+        ret = -EINVAL;
+        break;
+    }
+
+    g_free(aiocb);
+    return ret;
+}
+
+static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
+        int64_t offset, QEMUIOVector *qiov, int count,
+        BlockCompletionFunc *cb, void *opaque, int type)
+{
+    RawWin32AIOData *acb = g_new(RawWin32AIOData, 1);
+    ThreadPool *pool;
+
+    acb->bs = bs;
+    acb->hfile = hfile;
+    acb->aio_type = type;
+
+    if (qiov) {
+        acb->aio_iov = qiov->iov;
+        acb->aio_niov = qiov->niov;
+        assert(qiov->size == count);
+    }
+    acb->aio_nbytes = count;
+    acb->aio_offset = offset;
+
+    trace_paio_submit(acb, opaque, offset, count, type);
+    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
+    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
+}
+
+int qemu_ftruncate64(int fd, int64_t length)
+{
+    LARGE_INTEGER li;
+    DWORD dw;
+    LONG high;
+    HANDLE h;
+    BOOL res;
+
+    if ((GetVersion() & 0x80000000UL) && (length >> 32) != 0)
+        return -1;
+
+    h = (HANDLE)_get_osfhandle(fd);
+
+    /* get current position; ftruncate does not change the position */
+    li.HighPart = 0;
+    li.LowPart = SetFilePointer (h, 0, &li.HighPart, FILE_CURRENT);
+    if (li.LowPart == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
+        return -1;
+    }
+
+    high = length >> 32;
+    dw = SetFilePointer(h, (DWORD) length, &high, FILE_BEGIN);
+    if (dw == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
+        return -1;
+    }
+    res = SetEndOfFile(h);
+
+    /* back to old position */
+    SetFilePointer(h, li.LowPart, &li.HighPart, FILE_BEGIN);
+    return res ? 0 : -1;
+}
+
+static int set_sparse(int fd)
+{
+    DWORD returned;
+    return (int) DeviceIoControl((HANDLE)_get_osfhandle(fd), FSCTL_SET_SPARSE,
+                                 NULL, 0, NULL, 0, &returned, NULL);
+}
+
+static void raw_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->aio) {
+        win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
+    }
+}
+
+static void raw_attach_aio_context(BlockDriverState *bs,
+                                   AioContext *new_context)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->aio) {
+        win32_aio_attach_aio_context(s->aio, new_context);
+    }
+}
+
+static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    DWORD sectorsPerCluster, freeClusters, totalClusters, count;
+    DISK_GEOMETRY_EX dg;
+    BOOL status;
+
+    if (s->type == FTYPE_CD) {
+        bs->bl.request_alignment = 2048;
+        return;
+    }
+    if (s->type == FTYPE_HARDDISK) {
+        status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
+                                 NULL, 0, &dg, sizeof(dg), &count, NULL);
+        if (status != 0) {
+            bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+            return;
+        }
+        /* try GetDiskFreeSpace too */
+    }
+
+    if (s->drive_path[0]) {
+        GetDiskFreeSpace(s->drive_path, &sectorsPerCluster,
+                         &dg.Geometry.BytesPerSector,
+                         &freeClusters, &totalClusters);
+        bs->bl.request_alignment = dg.Geometry.BytesPerSector;
+    }
+}
+
+static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
+                            DWORD *overlapped)
+{
+    assert(access_flags != NULL);
+    assert(overlapped != NULL);
+
+    if (flags & BDRV_O_RDWR) {
+        *access_flags = GENERIC_READ | GENERIC_WRITE;
+    } else {
+        *access_flags = GENERIC_READ;
+    }
+
+    *overlapped = FILE_ATTRIBUTE_NORMAL;
+    if (use_aio) {
+        *overlapped |= FILE_FLAG_OVERLAPPED;
+    }
+    if (flags & BDRV_O_NOCACHE) {
+        *overlapped |= FILE_FLAG_NO_BUFFERING;
+    }
+}
+
+static void raw_parse_filename(const char *filename, QDict *options,
+                               Error **errp)
+{
+    /* The filename does not have to be prefixed by the protocol name, since
+     * "file" is the default protocol; therefore, the return value of this
+     * function call can be ignored. */
+    strstart(filename, "file:", &filename);
+
+    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
+}
+
+static QemuOptsList raw_runtime_opts = {
+    .name = "raw",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
+    .desc = {
+        {
+            .name = "filename",
+            .type = QEMU_OPT_STRING,
+            .help = "File name of the image",
+        },
+        {
+            .name = "aio",
+            .type = QEMU_OPT_STRING,
+            .help = "host AIO implementation (threads, native)",
+        },
+        { /* end of list */ }
+    },
+};
+
+static bool get_aio_option(QemuOpts *opts, int flags, Error **errp)
+{
+    BlockdevAioOptions aio, aio_default;
+
+    aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE
+                                              : BLOCKDEV_AIO_OPTIONS_THREADS;
+    aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
+                          BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp);
+
+    switch (aio) {
+    case BLOCKDEV_AIO_OPTIONS_NATIVE:
+        return true;
+    case BLOCKDEV_AIO_OPTIONS_THREADS:
+        return false;
+    default:
+        error_setg(errp, "Invalid AIO option");
+    }
+    return false;
+}
+
+static int raw_open(BlockDriverState *bs, QDict *options, int flags,
+                    Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    int access_flags;
+    DWORD overlapped;
+    QemuOpts *opts;
+    Error *local_err = NULL;
+    const char *filename;
+    bool use_aio;
+    int ret;
+
+    s->type = FTYPE_FILE;
+
+    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    filename = qemu_opt_get(opts, "filename");
+
+    use_aio = get_aio_option(opts, flags, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto fail;
+    }
+
+    raw_parse_flags(flags, use_aio, &access_flags, &overlapped);
+
+    if (filename[0] && filename[1] == ':') {
+        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", filename[0]);
+    } else if (filename[0] == '\\' && filename[1] == '\\') {
+        s->drive_path[0] = 0;
+    } else {
+        /* Relative path.  */
+        char buf[MAX_PATH];
+        GetCurrentDirectory(MAX_PATH, buf);
+        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", buf[0]);
+    }
+
+    s->hfile = CreateFile(filename, access_flags,
+                          FILE_SHARE_READ, NULL,
+                          OPEN_EXISTING, overlapped, NULL);
+    if (s->hfile == INVALID_HANDLE_VALUE) {
+        int err = GetLastError();
+
+        error_setg_win32(errp, err, "Could not open '%s'", filename);
+        if (err == ERROR_ACCESS_DENIED) {
+            ret = -EACCES;
+        } else {
+            ret = -EINVAL;
+        }
+        goto fail;
+    }
+
+    if (use_aio) {
+        s->aio = win32_aio_init();
+        if (s->aio == NULL) {
+            CloseHandle(s->hfile);
+            error_setg(errp, "Could not initialize AIO");
+            ret = -EINVAL;
+            goto fail;
+        }
+
+        ret = win32_aio_attach(s->aio, s->hfile);
+        if (ret < 0) {
+            win32_aio_cleanup(s->aio);
+            CloseHandle(s->hfile);
+            error_setg_errno(errp, -ret, "Could not enable AIO");
+            goto fail;
+        }
+
+        win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
+    }
+
+    ret = 0;
+fail:
+    qemu_opts_del(opts);
+    return ret;
+}
+
+static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
+                         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+                         BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+    if (s->aio) {
+        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
+                                nb_sectors, cb, opaque, QEMU_AIO_READ);
+    } else {
+        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
+                           nb_sectors << BDRV_SECTOR_BITS,
+                           cb, opaque, QEMU_AIO_READ);
+    }
+}
+
+static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
+                          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
+                          BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+    if (s->aio) {
+        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
+                                nb_sectors, cb, opaque, QEMU_AIO_WRITE);
+    } else {
+        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
+                           nb_sectors << BDRV_SECTOR_BITS,
+                           cb, opaque, QEMU_AIO_WRITE);
+    }
+}
+
+static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
+                         BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVRawState *s = bs->opaque;
+    return paio_submit(bs, s->hfile, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
+}
+
+static void raw_close(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    if (s->aio) {
+        win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
+        win32_aio_cleanup(s->aio);
+        s->aio = NULL;
+    }
+
+    CloseHandle(s->hfile);
+    if (bs->open_flags & BDRV_O_TEMPORARY) {
+        unlink(bs->filename);
+    }
+}
+
+static int raw_truncate(BlockDriverState *bs, int64_t offset)
+{
+    BDRVRawState *s = bs->opaque;
+    LONG low, high;
+    DWORD dwPtrLow;
+
+    low = offset;
+    high = offset >> 32;
+
+    /*
+     * An error has occurred if the return value is INVALID_SET_FILE_POINTER
+     * and GetLastError doesn't return NO_ERROR.
+     */
+    dwPtrLow = SetFilePointer(s->hfile, low, &high, FILE_BEGIN);
+    if (dwPtrLow == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
+        fprintf(stderr, "SetFilePointer error: %lu\n", GetLastError());
+        return -EIO;
+    }
+    if (SetEndOfFile(s->hfile) == 0) {
+        fprintf(stderr, "SetEndOfFile error: %lu\n", GetLastError());
+        return -EIO;
+    }
+    return 0;
+}
+
+static int64_t raw_getlength(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+    LARGE_INTEGER l;
+    ULARGE_INTEGER available, total, total_free;
+    DISK_GEOMETRY_EX dg;
+    DWORD count;
+    BOOL status;
+
+    switch(s->type) {
+    case FTYPE_FILE:
+        l.LowPart = GetFileSize(s->hfile, (PDWORD)&l.HighPart);
+        if (l.LowPart == 0xffffffffUL && GetLastError() != NO_ERROR)
+            return -EIO;
+        break;
+    case FTYPE_CD:
+        if (!GetDiskFreeSpaceEx(s->drive_path, &available, &total, &total_free))
+            return -EIO;
+        l.QuadPart = total.QuadPart;
+        break;
+    case FTYPE_HARDDISK:
+        status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
+                                 NULL, 0, &dg, sizeof(dg), &count, NULL);
+        if (status != 0) {
+            l = dg.DiskSize;
+        }
+        break;
+    default:
+        return -EIO;
+    }
+    return l.QuadPart;
+}
+
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+    typedef DWORD (WINAPI * get_compressed_t)(const char *filename,
+                                              DWORD * high);
+    get_compressed_t get_compressed;
+    struct _stati64 st;
+    const char *filename = bs->filename;
+    /* WinNT supports GetCompressedFileSize to determine the allocated size */
+    get_compressed =
+        (get_compressed_t) GetProcAddress(GetModuleHandle("kernel32"),
+                                            "GetCompressedFileSizeA");
+    if (get_compressed) {
+        DWORD high, low;
+        low = get_compressed(filename, &high);
+        if (low != 0xFFFFFFFFlu || GetLastError() == NO_ERROR) {
+            return (((int64_t) high) << 32) + low;
+        }
+    }
+
+    if (_stati64(filename, &st) < 0) {
+        return -1;
+    }
+    return st.st_size;
+}
+
+static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
+{
+    int fd;
+    int64_t total_size = 0;
+
+    strstart(filename, "file:", &filename);
+
+    /* Read out options */
+    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
+                          BDRV_SECTOR_SIZE);
+
+    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
+                   0644);
+    if (fd < 0) {
+        error_setg_errno(errp, errno, "Could not create file");
+        return -EIO;
+    }
+    set_sparse(fd);
+    ftruncate(fd, total_size);
+    qemu_close(fd);
+    return 0;
+}
+
+
+static QemuOptsList raw_create_opts = {
+    .name = "raw-create-opts",
+    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
+    .desc = {
+        {
+            .name = BLOCK_OPT_SIZE,
+            .type = QEMU_OPT_SIZE,
+            .help = "Virtual disk size"
+        },
+        { /* end of list */ }
+    }
+};
+
+BlockDriver bdrv_file = {
+    .format_name	= "file",
+    .protocol_name	= "file",
+    .instance_size	= sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_parse_filename = raw_parse_filename,
+    .bdrv_file_open     = raw_open,
+    .bdrv_refresh_limits = raw_probe_alignment,
+    .bdrv_close         = raw_close,
+    .bdrv_create        = raw_create,
+    .bdrv_has_zero_init = bdrv_has_zero_init_1,
+
+    .bdrv_aio_readv     = raw_aio_readv,
+    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_flush     = raw_aio_flush,
+
+    .bdrv_truncate	= raw_truncate,
+    .bdrv_getlength	= raw_getlength,
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+
+    .create_opts        = &raw_create_opts,
+};
+
+/***********************************************/
+/* host device */
+
+static int find_cdrom(char *cdrom_name, int cdrom_name_size)
+{
+    char drives[256], *pdrv = drives;
+    UINT type;
+
+    memset(drives, 0, sizeof(drives));
+    GetLogicalDriveStrings(sizeof(drives), drives);
+    while(pdrv[0] != '\0') {
+        type = GetDriveType(pdrv);
+        switch(type) {
+        case DRIVE_CDROM:
+            snprintf(cdrom_name, cdrom_name_size, "\\\\.\\%c:", pdrv[0]);
+            return 0;
+            break;
+        }
+        pdrv += lstrlen(pdrv) + 1;
+    }
+    return -1;
+}
+
+static int find_device_type(BlockDriverState *bs, const char *filename)
+{
+    BDRVRawState *s = bs->opaque;
+    UINT type;
+    const char *p;
+
+    if (strstart(filename, "\\\\.\\", &p) ||
+        strstart(filename, "//./", &p)) {
+        if (stristart(p, "PhysicalDrive", NULL))
+            return FTYPE_HARDDISK;
+        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", p[0]);
+        type = GetDriveType(s->drive_path);
+        switch (type) {
+        case DRIVE_REMOVABLE:
+        case DRIVE_FIXED:
+            return FTYPE_HARDDISK;
+        case DRIVE_CDROM:
+            return FTYPE_CD;
+        default:
+            return FTYPE_FILE;
+        }
+    } else {
+        return FTYPE_FILE;
+    }
+}
+
+static int hdev_probe_device(const char *filename)
+{
+    if (strstart(filename, "/dev/cdrom", NULL))
+        return 100;
+    if (is_windows_drive(filename))
+        return 100;
+    return 0;
+}
+
+static void hdev_parse_filename(const char *filename, QDict *options,
+                                Error **errp)
+{
+    /* The prefix is optional, just as for "file". */
+    strstart(filename, "host_device:", &filename);
+
+    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
+}
+
+static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
+                     Error **errp)
+{
+    BDRVRawState *s = bs->opaque;
+    int access_flags, create_flags;
+    int ret = 0;
+    DWORD overlapped;
+    char device_name[64];
+
+    Error *local_err = NULL;
+    const char *filename;
+    bool use_aio;
+
+    QemuOpts *opts = qemu_opts_create(&raw_runtime_opts, NULL, 0,
+                                      &error_abort);
+    qemu_opts_absorb_qdict(opts, options, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto done;
+    }
+
+    filename = qemu_opt_get(opts, "filename");
+
+    use_aio = get_aio_option(opts, flags, &local_err);
+    if (!local_err && use_aio) {
+        error_setg(&local_err, "AIO is not supported on Windows host devices");
+    }
+    if (local_err) {
+        error_propagate(errp, local_err);
+        ret = -EINVAL;
+        goto done;
+    }
+
+    if (strstart(filename, "/dev/cdrom", NULL)) {
+        if (find_cdrom(device_name, sizeof(device_name)) < 0) {
+            error_setg(errp, "Could not open CD-ROM drive");
+            ret = -ENOENT;
+            goto done;
+        }
+        filename = device_name;
+    } else {
+        /* transform drive letters into device name */
+        if (((filename[0] >= 'a' && filename[0] <= 'z') ||
+             (filename[0] >= 'A' && filename[0] <= 'Z')) &&
+            filename[1] == ':' && filename[2] == '\0') {
+            snprintf(device_name, sizeof(device_name), "\\\\.\\%c:", filename[0]);
+            filename = device_name;
+        }
+    }
+    s->type = find_device_type(bs, filename);
+
+    raw_parse_flags(flags, use_aio, &access_flags, &overlapped);
+
+    create_flags = OPEN_EXISTING;
+
+    s->hfile = CreateFile(filename, access_flags,
+                          FILE_SHARE_READ, NULL,
+                          create_flags, overlapped, NULL);
+    if (s->hfile == INVALID_HANDLE_VALUE) {
+        int err = GetLastError();
+
+        if (err == ERROR_ACCESS_DENIED) {
+            ret = -EACCES;
+        } else {
+            ret = -EINVAL;
+        }
+        error_setg_errno(errp, -ret, "Could not open device");
+        goto done;
+    }
+
+done:
+    qemu_opts_del(opts);
+    return ret;
+}
+
+static BlockDriver bdrv_host_device = {
+    .format_name	= "host_device",
+    .protocol_name	= "host_device",
+    .instance_size	= sizeof(BDRVRawState),
+    .bdrv_needs_filename = true,
+    .bdrv_parse_filename = hdev_parse_filename,
+    .bdrv_probe_device	= hdev_probe_device,
+    .bdrv_file_open	= hdev_open,
+    .bdrv_close		= raw_close,
+
+    .bdrv_aio_readv     = raw_aio_readv,
+    .bdrv_aio_writev    = raw_aio_writev,
+    .bdrv_aio_flush     = raw_aio_flush,
+
+    .bdrv_detach_aio_context = raw_detach_aio_context,
+    .bdrv_attach_aio_context = raw_attach_aio_context,
+
+    .bdrv_getlength      = raw_getlength,
+    .has_variable_length = true,
+
+    .bdrv_get_allocated_file_size
+                        = raw_get_allocated_file_size,
+};
+
+static void bdrv_file_init(void)
+{
+    bdrv_register(&bdrv_file);
+    bdrv_register(&bdrv_host_device);
+}
+
+block_init(bdrv_file_init);
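Note on the 64-bit offset handling in the file above: qemu_ftruncate64() and raw_truncate() pass a 64-bit offset to SetFilePointer() as separate low and high 32-bit halves, and raw_get_allocated_file_size() rebuilds a 64-bit size from two such halves. The sketch below shows only that arithmetic in portable C. The helper names split_offset() and join_size() are hypothetical and are not part of this patch, and offsets are assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): split a non-negative 64-bit
 * offset into the low and high 32-bit halves that SetFilePointer()
 * takes as separate arguments. */
static void split_offset(int64_t offset, uint32_t *low, int32_t *high)
{
    *low = (uint32_t)((uint64_t)offset & 0xffffffffu);
    *high = (int32_t)(offset >> 32);
}

/* Hypothetical helper (not in the patch): rebuild a 64-bit size from
 * two 32-bit halves, as done with the GetCompressedFileSize() result. */
static int64_t join_size(uint32_t low, uint32_t high)
{
    return (int64_t)(((uint64_t)high << 32) | low);
}
```

A valid low half can itself be 0xffffffff, the same value as INVALID_SET_FILE_POINTER, which is why the code above treats that return value as an error only when GetLastError() also reports something other than NO_ERROR.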
diff --git a/block/gluster.c b/block/gluster.c
index a0a74e4..1a22f29 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1253,7 +1253,7 @@ static int qemu_gluster_has_zero_init(BlockDriverState *bs)
  * If @start is in a trailing hole or beyond EOF, return -ENXIO.
  * If we can't find out, return a negative errno other than -ENXIO.
  *
- * (Shamefully copied from raw-posix.c, only miniscule adaptions.)
+ * (Shamefully copied from file-posix.c, only miniscule adaptions.)
  */
 static int find_allocation(BlockDriverState *bs, off_t start,
                            off_t *data, off_t *hole)
@@ -1349,7 +1349,7 @@ exit:
  * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
  * beyond the end of the disk image it will be clamped.
  *
- * (Based on raw_co_get_block_status() from raw-posix.c.)
+ * (Based on raw_co_get_block_status() from file-posix.c.)
  */
 static int64_t coroutine_fn qemu_gluster_co_get_block_status(
         BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
diff --git a/block/raw-posix.c b/block/raw-posix.c
deleted file mode 100644
index 28b47d9..0000000
--- a/block/raw-posix.c
+++ /dev/null
@@ -1,2616 +0,0 @@
-/*
- * Block driver for RAW files (posix)
- *
- * Copyright (c) 2006 Fabrice Bellard
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to deal
- * in the Software without restriction, including without limitation the rights
- * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- * copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- * THE SOFTWARE.
- */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qemu/cutils.h"
-#include "qemu/error-report.h"
-#include "qemu/timer.h"
-#include "qemu/log.h"
-#include "block/block_int.h"
-#include "qemu/module.h"
-#include "trace.h"
-#include "block/thread-pool.h"
-#include "qemu/iov.h"
-#include "block/raw-aio.h"
-#include "qapi/util.h"
-#include "qapi/qmp/qstring.h"
-
-#if defined(__APPLE__) && (__MACH__)
-#include <paths.h>
-#include <sys/param.h>
-#include <IOKit/IOKitLib.h>
-#include <IOKit/IOBSD.h>
-#include <IOKit/storage/IOMediaBSDClient.h>
-#include <IOKit/storage/IOMedia.h>
-#include <IOKit/storage/IOCDMedia.h>
-//#include <IOKit/storage/IOCDTypes.h>
-#include <IOKit/storage/IODVDMedia.h>
-#include <CoreFoundation/CoreFoundation.h>
-#endif
-
-#ifdef __sun__
-#define _POSIX_PTHREAD_SEMANTICS 1
-#include <sys/dkio.h>
-#endif
-#ifdef __linux__
-#include <sys/ioctl.h>
-#include <sys/param.h>
-#include <linux/cdrom.h>
-#include <linux/fd.h>
-#include <linux/fs.h>
-#include <linux/hdreg.h>
-#include <scsi/sg.h>
-#ifdef __s390__
-#include <asm/dasd.h>
-#endif
-#ifndef FS_NOCOW_FL
-#define FS_NOCOW_FL                     0x00800000 /* Do not cow file */
-#endif
-#endif
-#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
-#include <linux/falloc.h>
-#endif
-#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
-#include <sys/disk.h>
-#include <sys/cdio.h>
-#endif
-
-#ifdef __OpenBSD__
-#include <sys/ioctl.h>
-#include <sys/disklabel.h>
-#include <sys/dkio.h>
-#endif
-
-#ifdef __NetBSD__
-#include <sys/ioctl.h>
-#include <sys/disklabel.h>
-#include <sys/dkio.h>
-#include <sys/disk.h>
-#endif
-
-#ifdef __DragonFly__
-#include <sys/ioctl.h>
-#include <sys/diskslice.h>
-#endif
-
-#ifdef CONFIG_XFS
-#include <xfs/xfs.h>
-#endif
-
-//#define DEBUG_BLOCK
-
-#ifdef DEBUG_BLOCK
-# define DEBUG_BLOCK_PRINT 1
-#else
-# define DEBUG_BLOCK_PRINT 0
-#endif
-#define DPRINTF(fmt, ...) \
-do { \
-    if (DEBUG_BLOCK_PRINT) { \
-        printf(fmt, ## __VA_ARGS__); \
-    } \
-} while (0)
-
-/* OS X does not have O_DSYNC */
-#ifndef O_DSYNC
-#ifdef O_SYNC
-#define O_DSYNC O_SYNC
-#elif defined(O_FSYNC)
-#define O_DSYNC O_FSYNC
-#endif
-#endif
-
-/* Approximate O_DIRECT with O_DSYNC if O_DIRECT isn't available */
-#ifndef O_DIRECT
-#define O_DIRECT O_DSYNC
-#endif
-
-#define FTYPE_FILE   0
-#define FTYPE_CD     1
-
-#define MAX_BLOCKSIZE	4096
-
-typedef struct BDRVRawState {
-    int fd;
-    int type;
-    int open_flags;
-    size_t buf_align;
-
-#ifdef CONFIG_XFS
-    bool is_xfs:1;
-#endif
-    bool has_discard:1;
-    bool has_write_zeroes:1;
-    bool discard_zeroes:1;
-    bool use_linux_aio:1;
-    bool has_fallocate;
-    bool needs_alignment;
-} BDRVRawState;
-
-typedef struct BDRVRawReopenState {
-    int fd;
-    int open_flags;
-} BDRVRawReopenState;
-
-static int fd_open(BlockDriverState *bs);
-static int64_t raw_getlength(BlockDriverState *bs);
-
-typedef struct RawPosixAIOData {
-    BlockDriverState *bs;
-    int aio_fildes;
-    union {
-        struct iovec *aio_iov;
-        void *aio_ioctl_buf;
-    };
-    int aio_niov;
-    uint64_t aio_nbytes;
-#define aio_ioctl_cmd   aio_nbytes /* for QEMU_AIO_IOCTL */
-    off_t aio_offset;
-    int aio_type;
-} RawPosixAIOData;
-
-#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-static int cdrom_reopen(BlockDriverState *bs);
-#endif
-
-#if defined(__NetBSD__)
-static int raw_normalize_devicepath(const char **filename)
-{
-    static char namebuf[PATH_MAX];
-    const char *dp, *fname;
-    struct stat sb;
-
-    fname = *filename;
-    dp = strrchr(fname, '/');
-    if (lstat(fname, &sb) < 0) {
-        fprintf(stderr, "%s: stat failed: %s\n",
-            fname, strerror(errno));
-        return -errno;
-    }
-
-    if (!S_ISBLK(sb.st_mode)) {
-        return 0;
-    }
-
-    if (dp == NULL) {
-        snprintf(namebuf, PATH_MAX, "r%s", fname);
-    } else {
-        snprintf(namebuf, PATH_MAX, "%.*s/r%s",
-            (int)(dp - fname), fname, dp + 1);
-    }
-    fprintf(stderr, "%s is a block device", fname);
-    *filename = namebuf;
-    fprintf(stderr, ", using %s\n", *filename);
-
-    return 0;
-}
-#else
-static int raw_normalize_devicepath(const char **filename)
-{
-    return 0;
-}
-#endif
-
-/*
- * Get logical block size via ioctl. On success store it in @sector_size_p.
- */
-static int probe_logical_blocksize(int fd, unsigned int *sector_size_p)
-{
-    unsigned int sector_size;
-    bool success = false;
-
-    errno = ENOTSUP;
-
-    /* Try a few ioctls to get the right size */
-#ifdef BLKSSZGET
-    if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
-        *sector_size_p = sector_size;
-        success = true;
-    }
-#endif
-#ifdef DKIOCGETBLOCKSIZE
-    if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
-        *sector_size_p = sector_size;
-        success = true;
-    }
-#endif
-#ifdef DIOCGSECTORSIZE
-    if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
-        *sector_size_p = sector_size;
-        success = true;
-    }
-#endif
-
-    return success ? 0 : -errno;
-}
-
-/**
- * Get physical block size of @fd.
- * On success, store it in @blk_size and return 0.
- * On failure, return -errno.
- */
-static int probe_physical_blocksize(int fd, unsigned int *blk_size)
-{
-#ifdef BLKPBSZGET
-    if (ioctl(fd, BLKPBSZGET, blk_size) < 0) {
-        return -errno;
-    }
-    return 0;
-#else
-    return -ENOTSUP;
-#endif
-}
-
-/* Check if read is allowed with given memory buffer and length.
- *
- * This function is used to check O_DIRECT memory buffer and request alignment.
- */
-static bool raw_is_io_aligned(int fd, void *buf, size_t len)
-{
-    ssize_t ret = pread(fd, buf, len, 0);
-
-    if (ret >= 0) {
-        return true;
-    }
-
-#ifdef __linux__
-    /* The Linux kernel returns EINVAL for misaligned O_DIRECT reads.  Ignore
-     * other errors (e.g. real I/O error), which could happen on a failed
-     * drive, since we only care about probing alignment.
-     */
-    if (errno != EINVAL) {
-        return true;
-    }
-#endif
-
-    return false;
-}
-
-static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    char *buf;
-    size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
-
-    /* For SCSI generic devices the alignment is not really used.
-       With buffered I/O, we don't have any restrictions. */
-    if (bdrv_is_sg(bs) || !s->needs_alignment) {
-        bs->bl.request_alignment = 1;
-        s->buf_align = 1;
-        return;
-    }
-
-    bs->bl.request_alignment = 0;
-    s->buf_align = 0;
-    /* Let's try to use the logical blocksize for the alignment. */
-    if (probe_logical_blocksize(fd, &bs->bl.request_alignment) < 0) {
-        bs->bl.request_alignment = 0;
-    }
-#ifdef CONFIG_XFS
-    if (s->is_xfs) {
-        struct dioattr da;
-        if (xfsctl(NULL, fd, XFS_IOC_DIOINFO, &da) >= 0) {
-            bs->bl.request_alignment = da.d_miniosz;
-            /* The kernel returns wrong information for d_mem */
-            /* s->buf_align = da.d_mem; */
-        }
-    }
-#endif
-
-    /* If we could not get the sizes so far, we can only guess them */
-    if (!s->buf_align) {
-        size_t align;
-        buf = qemu_memalign(max_align, 2 * max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf + align, max_align)) {
-                s->buf_align = align;
-                break;
-            }
-        }
-        qemu_vfree(buf);
-    }
-
-    if (!bs->bl.request_alignment) {
-        size_t align;
-        buf = qemu_memalign(s->buf_align, max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf, align)) {
-                bs->bl.request_alignment = align;
-                break;
-            }
-        }
-        qemu_vfree(buf);
-    }
-
-    if (!s->buf_align || !bs->bl.request_alignment) {
-        error_setg(errp, "Could not find working O_DIRECT alignment");
-        error_append_hint(errp, "Try cache.direct=off\n");
-    }
-}
-
-static void raw_parse_flags(int bdrv_flags, int *open_flags)
-{
-    assert(open_flags != NULL);
-
-    *open_flags |= O_BINARY;
-    *open_flags &= ~O_ACCMODE;
-    if (bdrv_flags & BDRV_O_RDWR) {
-        *open_flags |= O_RDWR;
-    } else {
-        *open_flags |= O_RDONLY;
-    }
-
-    /* Use O_DSYNC for write-through caching, no flags for write-back caching,
-     * and O_DIRECT for no caching. */
-    if ((bdrv_flags & BDRV_O_NOCACHE)) {
-        *open_flags |= O_DIRECT;
-    }
-}
-
-static void raw_parse_filename(const char *filename, QDict *options,
-                               Error **errp)
-{
-    /* The filename does not have to be prefixed by the protocol name, since
-     * "file" is the default protocol; therefore, the return value of this
-     * function call can be ignored. */
-    strstart(filename, "file:", &filename);
-
-    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
-}
-
-static QemuOptsList raw_runtime_opts = {
-    .name = "raw",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
-    .desc = {
-        {
-            .name = "filename",
-            .type = QEMU_OPT_STRING,
-            .help = "File name of the image",
-        },
-        {
-            .name = "aio",
-            .type = QEMU_OPT_STRING,
-            .help = "host AIO implementation (threads, native)",
-        },
-        { /* end of list */ }
-    },
-};
-
-static int raw_open_common(BlockDriverState *bs, QDict *options,
-                           int bdrv_flags, int open_flags, Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    QemuOpts *opts;
-    Error *local_err = NULL;
-    const char *filename = NULL;
-    BlockdevAioOptions aio, aio_default;
-    int fd, ret;
-    struct stat st;
-
-    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto fail;
-    }
-
-    filename = qemu_opt_get(opts, "filename");
-
-    ret = raw_normalize_devicepath(&filename);
-    if (ret != 0) {
-        error_setg_errno(errp, -ret, "Could not normalize device path");
-        goto fail;
-    }
-
-    aio_default = (bdrv_flags & BDRV_O_NATIVE_AIO)
-                  ? BLOCKDEV_AIO_OPTIONS_NATIVE
-                  : BLOCKDEV_AIO_OPTIONS_THREADS;
-    aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
-                          BLOCKDEV_AIO_OPTIONS__MAX, aio_default, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto fail;
-    }
-    s->use_linux_aio = (aio == BLOCKDEV_AIO_OPTIONS_NATIVE);
-
-    s->open_flags = open_flags;
-    raw_parse_flags(bdrv_flags, &s->open_flags);
-
-    s->fd = -1;
-    fd = qemu_open(filename, s->open_flags, 0644);
-    if (fd < 0) {
-        ret = -errno;
-        error_setg_errno(errp, errno, "Could not open '%s'", filename);
-        if (ret == -EROFS) {
-            ret = -EACCES;
-        }
-        goto fail;
-    }
-    s->fd = fd;
-
-#ifdef CONFIG_LINUX_AIO
-     /* Currently Linux does AIO only for files opened with O_DIRECT */
-    if (s->use_linux_aio && !(s->open_flags & O_DIRECT)) {
-        error_setg(errp, "aio=native was specified, but it requires "
-                         "cache.direct=on, which was not specified.");
-        ret = -EINVAL;
-        goto fail;
-    }
-#else
-    if (s->use_linux_aio) {
-        error_setg(errp, "aio=native was specified, but is not supported "
-                         "in this build.");
-        ret = -EINVAL;
-        goto fail;
-    }
-#endif /* !defined(CONFIG_LINUX_AIO) */
-
-    s->has_discard = true;
-    s->has_write_zeroes = true;
-    bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
-    if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
-        s->needs_alignment = true;
-    }
-
-    if (fstat(s->fd, &st) < 0) {
-        ret = -errno;
-        error_setg_errno(errp, errno, "Could not stat file");
-        goto fail;
-    }
-    if (S_ISREG(st.st_mode)) {
-        s->discard_zeroes = true;
-        s->has_fallocate = true;
-    }
-    if (S_ISBLK(st.st_mode)) {
-#ifdef BLKDISCARDZEROES
-        unsigned int arg;
-        if (ioctl(s->fd, BLKDISCARDZEROES, &arg) == 0 && arg) {
-            s->discard_zeroes = true;
-        }
-#endif
-#ifdef __linux__
-        /* On Linux 3.10, BLKDISCARD leaves stale data in the page cache.  Do
-         * not rely on the contents of discarded blocks unless using O_DIRECT.
-         * Same for BLKZEROOUT.
-         */
-        if (!(bs->open_flags & BDRV_O_NOCACHE)) {
-            s->discard_zeroes = false;
-            s->has_write_zeroes = false;
-        }
-#endif
-    }
-#ifdef __FreeBSD__
-    if (S_ISCHR(st.st_mode)) {
-        /*
-         * The file is a char device (disk), which on FreeBSD isn't behind
-         * a pager, so force all requests to be aligned. This is needed
-         * so QEMU makes sure all IO operations on the device are aligned
-         * to sector size, or else FreeBSD will reject them with EINVAL.
-         */
-        s->needs_alignment = true;
-    }
-#endif
-
-#ifdef CONFIG_XFS
-    if (platform_test_xfs_fd(s->fd)) {
-        s->is_xfs = true;
-    }
-#endif
-
-    ret = 0;
-fail:
-    if (filename && (bdrv_flags & BDRV_O_TEMPORARY)) {
-        unlink(filename);
-    }
-    qemu_opts_del(opts);
-    return ret;
-}
-
-static int raw_open(BlockDriverState *bs, QDict *options, int flags,
-                    Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-
-    s->type = FTYPE_FILE;
-    return raw_open_common(bs, options, flags, 0, errp);
-}
-
-static int raw_reopen_prepare(BDRVReopenState *state,
-                              BlockReopenQueue *queue, Error **errp)
-{
-    BDRVRawState *s;
-    BDRVRawReopenState *rs;
-    int ret = 0;
-    Error *local_err = NULL;
-
-    assert(state != NULL);
-    assert(state->bs != NULL);
-
-    s = state->bs->opaque;
-
-    state->opaque = g_new0(BDRVRawReopenState, 1);
-    rs = state->opaque;
-
-    if (s->type == FTYPE_CD) {
-        rs->open_flags |= O_NONBLOCK;
-    }
-
-    raw_parse_flags(state->flags, &rs->open_flags);
-
-    rs->fd = -1;
-
-    int fcntl_flags = O_APPEND | O_NONBLOCK;
-#ifdef O_NOATIME
-    fcntl_flags |= O_NOATIME;
-#endif
-
-#ifdef O_ASYNC
-    /* Not all operating systems have O_ASYNC, and those that don't
-     * will not let us track the state into rs->open_flags (typically
-     * you achieve the same effect with an ioctl, for example I_SETSIG
-     * on Solaris). But we do not use O_ASYNC, so that's fine.
-     */
-    assert((s->open_flags & O_ASYNC) == 0);
-#endif
-
-    if ((rs->open_flags & ~fcntl_flags) == (s->open_flags & ~fcntl_flags)) {
-        /* dup the original fd */
-        rs->fd = qemu_dup(s->fd);
-        if (rs->fd >= 0) {
-            ret = fcntl_setfl(rs->fd, rs->open_flags);
-            if (ret) {
-                qemu_close(rs->fd);
-                rs->fd = -1;
-            }
-        }
-    }
-
-    /* If we cannot use fcntl, or fcntl failed, fall back to qemu_open() */
-    if (rs->fd == -1) {
-        const char *normalized_filename = state->bs->filename;
-        ret = raw_normalize_devicepath(&normalized_filename);
-        if (ret < 0) {
-            error_setg_errno(errp, -ret, "Could not normalize device path");
-        } else {
-            assert(!(rs->open_flags & O_CREAT));
-            rs->fd = qemu_open(normalized_filename, rs->open_flags);
-            if (rs->fd == -1) {
-                error_setg_errno(errp, errno, "Could not reopen file");
-                ret = -1;
-            }
-        }
-    }
-
-    /* Fail already reopen_prepare() if we can't get a working O_DIRECT
-     * alignment with the new fd. */
-    if (rs->fd != -1) {
-        raw_probe_alignment(state->bs, rs->fd, &local_err);
-        if (local_err) {
-            qemu_close(rs->fd);
-            rs->fd = -1;
-            error_propagate(errp, local_err);
-            ret = -EINVAL;
-        }
-    }
-
-    return ret;
-}
-
-static void raw_reopen_commit(BDRVReopenState *state)
-{
-    BDRVRawReopenState *rs = state->opaque;
-    BDRVRawState *s = state->bs->opaque;
-
-    s->open_flags = rs->open_flags;
-
-    qemu_close(s->fd);
-    s->fd = rs->fd;
-
-    g_free(state->opaque);
-    state->opaque = NULL;
-}
-
-
-static void raw_reopen_abort(BDRVReopenState *state)
-{
-    BDRVRawReopenState *rs = state->opaque;
-
-     /* nothing to do if NULL, we didn't get far enough */
-    if (rs == NULL) {
-        return;
-    }
-
-    if (rs->fd >= 0) {
-        qemu_close(rs->fd);
-        rs->fd = -1;
-    }
-    g_free(state->opaque);
-    state->opaque = NULL;
-}
-
-static int hdev_get_max_transfer_length(int fd)
-{
-#ifdef BLKSECTGET
-    int max_sectors = 0;
-    if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
-        return max_sectors;
-    } else {
-        return -errno;
-    }
-#else
-    return -ENOSYS;
-#endif
-}
-
-static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    struct stat st;
-
-    if (!fstat(s->fd, &st)) {
-        if (S_ISBLK(st.st_mode)) {
-            int ret = hdev_get_max_transfer_length(s->fd);
-            if (ret > 0 && ret <= BDRV_REQUEST_MAX_SECTORS) {
-                bs->bl.max_transfer = pow2floor(ret << BDRV_SECTOR_BITS);
-            }
-        }
-    }
-
-    raw_probe_alignment(bs, s->fd, errp);
-    bs->bl.min_mem_alignment = s->buf_align;
-    bs->bl.opt_mem_alignment = MAX(s->buf_align, getpagesize());
-}
-
-static int check_for_dasd(int fd)
-{
-#ifdef BIODASDINFO2
-    struct dasd_information2_t info = {0};
-
-    return ioctl(fd, BIODASDINFO2, &info);
-#else
-    return -1;
-#endif
-}
-
-/**
- * Try to get @bs's logical and physical block size.
- * On success, store them in @bsz and return zero.
- * On failure, return negative errno.
- */
-static int hdev_probe_blocksizes(BlockDriverState *bs, BlockSizes *bsz)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-
-    /* If DASD, get blocksizes */
-    if (check_for_dasd(s->fd) < 0) {
-        return -ENOTSUP;
-    }
-    ret = probe_logical_blocksize(s->fd, &bsz->log);
-    if (ret < 0) {
-        return ret;
-    }
-    return probe_physical_blocksize(s->fd, &bsz->phys);
-}
-
-/**
- * Try to get @bs's geometry: cyls, heads, sectors.
- * On success, store them in @geo and return 0.
- * On failure return -errno.
- * (Allows block driver to assign default geometry values that guest sees)
- */
-#ifdef __linux__
-static int hdev_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
-{
-    BDRVRawState *s = bs->opaque;
-    struct hd_geometry ioctl_geo = {0};
-
-    /* If DASD, get its geometry */
-    if (check_for_dasd(s->fd) < 0) {
-        return -ENOTSUP;
-    }
-    if (ioctl(s->fd, HDIO_GETGEO, &ioctl_geo) < 0) {
-        return -errno;
-    }
-    /* HDIO_GETGEO may return success even though geo contains zeros
-       (e.g. certain multipath setups) */
-    if (!ioctl_geo.heads || !ioctl_geo.sectors || !ioctl_geo.cylinders) {
-        return -ENOTSUP;
-    }
-    /* Do not return a geometry for partition */
-    if (ioctl_geo.start != 0) {
-        return -ENOTSUP;
-    }
-    geo->heads = ioctl_geo.heads;
-    geo->sectors = ioctl_geo.sectors;
-    geo->cylinders = ioctl_geo.cylinders;
-
-    return 0;
-}
-#else /* __linux__ */
-static int hdev_probe_geometry(BlockDriverState *bs, HDGeometry *geo)
-{
-    return -ENOTSUP;
-}
-#endif
-
-static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
-{
-    int ret;
-
-    ret = ioctl(aiocb->aio_fildes, aiocb->aio_ioctl_cmd, aiocb->aio_ioctl_buf);
-    if (ret == -1) {
-        return -errno;
-    }
-
-    return 0;
-}
-
-static ssize_t handle_aiocb_flush(RawPosixAIOData *aiocb)
-{
-    int ret;
-
-    ret = qemu_fdatasync(aiocb->aio_fildes);
-    if (ret == -1) {
-        return -errno;
-    }
-    return 0;
-}
-
-#ifdef CONFIG_PREADV
-
-static bool preadv_present = true;
-
-static ssize_t
-qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
-{
-    return preadv(fd, iov, nr_iov, offset);
-}
-
-static ssize_t
-qemu_pwritev(int fd, const struct iovec *iov, int nr_iov, off_t offset)
-{
-    return pwritev(fd, iov, nr_iov, offset);
-}
-
-#else
-
-static bool preadv_present = false;
-
-static ssize_t
-qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
-{
-    return -ENOSYS;
-}
-
-static ssize_t
-qemu_pwritev(int fd, const struct iovec *iov, int nr_iov, off_t offset)
-{
-    return -ENOSYS;
-}
-
-#endif
-
-static ssize_t handle_aiocb_rw_vector(RawPosixAIOData *aiocb)
-{
-    ssize_t len;
-
-    do {
-        if (aiocb->aio_type & QEMU_AIO_WRITE)
-            len = qemu_pwritev(aiocb->aio_fildes,
-                               aiocb->aio_iov,
-                               aiocb->aio_niov,
-                               aiocb->aio_offset);
-         else
-            len = qemu_preadv(aiocb->aio_fildes,
-                              aiocb->aio_iov,
-                              aiocb->aio_niov,
-                              aiocb->aio_offset);
-    } while (len == -1 && errno == EINTR);
-
-    if (len == -1) {
-        return -errno;
-    }
-    return len;
-}
-
-/*
- * Read/writes the data to/from a given linear buffer.
- *
- * Returns the number of bytes handles or -errno in case of an error. Short
- * reads are only returned if the end of the file is reached.
- */
-static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf)
-{
-    ssize_t offset = 0;
-    ssize_t len;
-
-    while (offset < aiocb->aio_nbytes) {
-        if (aiocb->aio_type & QEMU_AIO_WRITE) {
-            len = pwrite(aiocb->aio_fildes,
-                         (const char *)buf + offset,
-                         aiocb->aio_nbytes - offset,
-                         aiocb->aio_offset + offset);
-        } else {
-            len = pread(aiocb->aio_fildes,
-                        buf + offset,
-                        aiocb->aio_nbytes - offset,
-                        aiocb->aio_offset + offset);
-        }
-        if (len == -1 && errno == EINTR) {
-            continue;
-        } else if (len == -1 && errno == EINVAL &&
-                   (aiocb->bs->open_flags & BDRV_O_NOCACHE) &&
-                   !(aiocb->aio_type & QEMU_AIO_WRITE) &&
-                   offset > 0) {
-            /* O_DIRECT pread() may fail with EINVAL when offset is unaligned
-             * after a short read.  Assume that O_DIRECT short reads only occur
-             * at EOF.  Therefore this is a short read, not an I/O error.
-             */
-            break;
-        } else if (len == -1) {
-            offset = -errno;
-            break;
-        } else if (len == 0) {
-            break;
-        }
-        offset += len;
-    }
-
-    return offset;
-}
-
-static ssize_t handle_aiocb_rw(RawPosixAIOData *aiocb)
-{
-    ssize_t nbytes;
-    char *buf;
-
-    if (!(aiocb->aio_type & QEMU_AIO_MISALIGNED)) {
-        /*
-         * If there is just a single buffer, and it is properly aligned
-         * we can just use plain pread/pwrite without any problems.
-         */
-        if (aiocb->aio_niov == 1) {
-             return handle_aiocb_rw_linear(aiocb, aiocb->aio_iov->iov_base);
-        }
-        /*
-         * We have more than one iovec, and all are properly aligned.
-         *
-         * Try preadv/pwritev first and fall back to linearizing the
-         * buffer if it's not supported.
-         */
-        if (preadv_present) {
-            nbytes = handle_aiocb_rw_vector(aiocb);
-            if (nbytes == aiocb->aio_nbytes ||
-                (nbytes < 0 && nbytes != -ENOSYS)) {
-                return nbytes;
-            }
-            preadv_present = false;
-        }
-
-        /*
-         * XXX(hch): short read/write.  no easy way to handle the reminder
-         * using these interfaces.  For now retry using plain
-         * pread/pwrite?
-         */
-    }
-
-    /*
-     * Ok, we have to do it the hard way, copy all segments into
-     * a single aligned buffer.
-     */
-    buf = qemu_try_blockalign(aiocb->bs, aiocb->aio_nbytes);
-    if (buf == NULL) {
-        return -ENOMEM;
-    }
-
-    if (aiocb->aio_type & QEMU_AIO_WRITE) {
-        char *p = buf;
-        int i;
-
-        for (i = 0; i < aiocb->aio_niov; ++i) {
-            memcpy(p, aiocb->aio_iov[i].iov_base, aiocb->aio_iov[i].iov_len);
-            p += aiocb->aio_iov[i].iov_len;
-        }
-        assert(p - buf == aiocb->aio_nbytes);
-    }
-
-    nbytes = handle_aiocb_rw_linear(aiocb, buf);
-    if (!(aiocb->aio_type & QEMU_AIO_WRITE)) {
-        char *p = buf;
-        size_t count = aiocb->aio_nbytes, copy;
-        int i;
-
-        for (i = 0; i < aiocb->aio_niov && count; ++i) {
-            copy = count;
-            if (copy > aiocb->aio_iov[i].iov_len) {
-                copy = aiocb->aio_iov[i].iov_len;
-            }
-            memcpy(aiocb->aio_iov[i].iov_base, p, copy);
-            assert(count >= copy);
-            p     += copy;
-            count -= copy;
-        }
-        assert(count == 0);
-    }
-    qemu_vfree(buf);
-
-    return nbytes;
-}
-
-#ifdef CONFIG_XFS
-static int xfs_write_zeroes(BDRVRawState *s, int64_t offset, uint64_t bytes)
-{
-    struct xfs_flock64 fl;
-    int err;
-
-    memset(&fl, 0, sizeof(fl));
-    fl.l_whence = SEEK_SET;
-    fl.l_start = offset;
-    fl.l_len = bytes;
-
-    if (xfsctl(NULL, s->fd, XFS_IOC_ZERO_RANGE, &fl) < 0) {
-        err = errno;
-        DPRINTF("cannot write zero range (%s)\n", strerror(errno));
-        return -err;
-    }
-
-    return 0;
-}
-
-static int xfs_discard(BDRVRawState *s, int64_t offset, uint64_t bytes)
-{
-    struct xfs_flock64 fl;
-    int err;
-
-    memset(&fl, 0, sizeof(fl));
-    fl.l_whence = SEEK_SET;
-    fl.l_start = offset;
-    fl.l_len = bytes;
-
-    if (xfsctl(NULL, s->fd, XFS_IOC_UNRESVSP64, &fl) < 0) {
-        err = errno;
-        DPRINTF("cannot punch hole (%s)\n", strerror(errno));
-        return -err;
-    }
-
-    return 0;
-}
-#endif
-
-static int translate_err(int err)
-{
-    if (err == -ENODEV || err == -ENOSYS || err == -EOPNOTSUPP ||
-        err == -ENOTTY) {
-        err = -ENOTSUP;
-    }
-    return err;
-}
-
-#ifdef CONFIG_FALLOCATE
-static int do_fallocate(int fd, int mode, off_t offset, off_t len)
-{
-    do {
-        if (fallocate(fd, mode, offset, len) == 0) {
-            return 0;
-        }
-    } while (errno == EINTR);
-    return translate_err(-errno);
-}
-#endif
-
-static ssize_t handle_aiocb_write_zeroes_block(RawPosixAIOData *aiocb)
-{
-    int ret = -ENOTSUP;
-    BDRVRawState *s = aiocb->bs->opaque;
-
-    if (!s->has_write_zeroes) {
-        return -ENOTSUP;
-    }
-
-#ifdef BLKZEROOUT
-    do {
-        uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
-        if (ioctl(aiocb->aio_fildes, BLKZEROOUT, range) == 0) {
-            return 0;
-        }
-    } while (errno == EINTR);
-
-    ret = translate_err(-errno);
-#endif
-
-    if (ret == -ENOTSUP) {
-        s->has_write_zeroes = false;
-    }
-    return ret;
-}
-
-static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
-{
-#if defined(CONFIG_FALLOCATE) || defined(CONFIG_XFS)
-    BDRVRawState *s = aiocb->bs->opaque;
-#endif
-
-    if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
-        return handle_aiocb_write_zeroes_block(aiocb);
-    }
-
-#ifdef CONFIG_XFS
-    if (s->is_xfs) {
-        return xfs_write_zeroes(s, aiocb->aio_offset, aiocb->aio_nbytes);
-    }
-#endif
-
-#ifdef CONFIG_FALLOCATE_ZERO_RANGE
-    if (s->has_write_zeroes) {
-        int ret = do_fallocate(s->fd, FALLOC_FL_ZERO_RANGE,
-                               aiocb->aio_offset, aiocb->aio_nbytes);
-        if (ret == 0 || ret != -ENOTSUP) {
-            return ret;
-        }
-        s->has_write_zeroes = false;
-    }
-#endif
-
-#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
-    if (s->has_discard && s->has_fallocate) {
-        int ret = do_fallocate(s->fd,
-                               FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-                               aiocb->aio_offset, aiocb->aio_nbytes);
-        if (ret == 0) {
-            ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
-            if (ret == 0 || ret != -ENOTSUP) {
-                return ret;
-            }
-            s->has_fallocate = false;
-        } else if (ret != -ENOTSUP) {
-            return ret;
-        } else {
-            s->has_discard = false;
-        }
-    }
-#endif
-
-#ifdef CONFIG_FALLOCATE
-    if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) {
-        int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
-        if (ret == 0 || ret != -ENOTSUP) {
-            return ret;
-        }
-        s->has_fallocate = false;
-    }
-#endif
-
-    return -ENOTSUP;
-}
-
-static ssize_t handle_aiocb_discard(RawPosixAIOData *aiocb)
-{
-    int ret = -EOPNOTSUPP;
-    BDRVRawState *s = aiocb->bs->opaque;
-
-    if (!s->has_discard) {
-        return -ENOTSUP;
-    }
-
-    if (aiocb->aio_type & QEMU_AIO_BLKDEV) {
-#ifdef BLKDISCARD
-        do {
-            uint64_t range[2] = { aiocb->aio_offset, aiocb->aio_nbytes };
-            if (ioctl(aiocb->aio_fildes, BLKDISCARD, range) == 0) {
-                return 0;
-            }
-        } while (errno == EINTR);
-
-        ret = -errno;
-#endif
-    } else {
-#ifdef CONFIG_XFS
-        if (s->is_xfs) {
-            return xfs_discard(s, aiocb->aio_offset, aiocb->aio_nbytes);
-        }
-#endif
-
-#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
-        ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
-                           aiocb->aio_offset, aiocb->aio_nbytes);
-#endif
-    }
-
-    ret = translate_err(ret);
-    if (ret == -ENOTSUP) {
-        s->has_discard = false;
-    }
-    return ret;
-}
-
-static int aio_worker(void *arg)
-{
-    RawPosixAIOData *aiocb = arg;
-    ssize_t ret = 0;
-
-    switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
-    case QEMU_AIO_READ:
-        ret = handle_aiocb_rw(aiocb);
-        if (ret >= 0 && ret < aiocb->aio_nbytes) {
-            iov_memset(aiocb->aio_iov, aiocb->aio_niov, ret,
-                      0, aiocb->aio_nbytes - ret);
-
-            ret = aiocb->aio_nbytes;
-        }
-        if (ret == aiocb->aio_nbytes) {
-            ret = 0;
-        } else if (ret >= 0 && ret < aiocb->aio_nbytes) {
-            ret = -EINVAL;
-        }
-        break;
-    case QEMU_AIO_WRITE:
-        ret = handle_aiocb_rw(aiocb);
-        if (ret == aiocb->aio_nbytes) {
-            ret = 0;
-        } else if (ret >= 0 && ret < aiocb->aio_nbytes) {
-            ret = -EINVAL;
-        }
-        break;
-    case QEMU_AIO_FLUSH:
-        ret = handle_aiocb_flush(aiocb);
-        break;
-    case QEMU_AIO_IOCTL:
-        ret = handle_aiocb_ioctl(aiocb);
-        break;
-    case QEMU_AIO_DISCARD:
-        ret = handle_aiocb_discard(aiocb);
-        break;
-    case QEMU_AIO_WRITE_ZEROES:
-        ret = handle_aiocb_write_zeroes(aiocb);
-        break;
-    default:
-        fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
-        ret = -EINVAL;
-        break;
-    }
-
-    g_free(aiocb);
-    return ret;
-}
-
-static int paio_submit_co(BlockDriverState *bs, int fd,
-                          int64_t offset, QEMUIOVector *qiov,
-                          int count, int type)
-{
-    RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
-    ThreadPool *pool;
-
-    acb->bs = bs;
-    acb->aio_type = type;
-    acb->aio_fildes = fd;
-
-    acb->aio_nbytes = count;
-    acb->aio_offset = offset;
-
-    if (qiov) {
-        acb->aio_iov = qiov->iov;
-        acb->aio_niov = qiov->niov;
-        assert(qiov->size == count);
-    }
-
-    trace_paio_submit_co(offset, count, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    return thread_pool_submit_co(pool, aio_worker, acb);
-}
-
-static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
-        int64_t offset, QEMUIOVector *qiov, int count,
-        BlockCompletionFunc *cb, void *opaque, int type)
-{
-    RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
-    ThreadPool *pool;
-
-    acb->bs = bs;
-    acb->aio_type = type;
-    acb->aio_fildes = fd;
-
-    acb->aio_nbytes = count;
-    acb->aio_offset = offset;
-
-    if (qiov) {
-        acb->aio_iov = qiov->iov;
-        acb->aio_niov = qiov->niov;
-        assert(qiov->size == acb->aio_nbytes);
-    }
-
-    trace_paio_submit(acb, opaque, offset, count, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
-}
-
-static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
-                                   uint64_t bytes, QEMUIOVector *qiov, int type)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (fd_open(bs) < 0)
-        return -EIO;
-
-    /*
-     * Check if the underlying device requires requests to be aligned,
-     * and if the request we are trying to submit is aligned or not.
-     * If this is the case tell the low-level driver that it needs
-     * to copy the buffer.
-     */
-    if (s->needs_alignment) {
-        if (!bdrv_qiov_is_aligned(bs, qiov)) {
-            type |= QEMU_AIO_MISALIGNED;
-#ifdef CONFIG_LINUX_AIO
-        } else if (s->use_linux_aio) {
-            LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
-            assert(qiov->size == bytes);
-            return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
-#endif
-        }
-    }
-
-    return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);
-}
-
-static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
-                                      uint64_t bytes, QEMUIOVector *qiov,
-                                      int flags)
-{
-    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);
-}
-
-static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
-                                       uint64_t bytes, QEMUIOVector *qiov,
-                                       int flags)
-{
-    assert(flags == 0);
-    return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
-}
-
-static void raw_aio_plug(BlockDriverState *bs)
-{
-#ifdef CONFIG_LINUX_AIO
-    BDRVRawState *s = bs->opaque;
-    if (s->use_linux_aio) {
-        LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
-        laio_io_plug(bs, aio);
-    }
-#endif
-}
-
-static void raw_aio_unplug(BlockDriverState *bs)
-{
-#ifdef CONFIG_LINUX_AIO
-    BDRVRawState *s = bs->opaque;
-    if (s->use_linux_aio) {
-        LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
-        laio_io_unplug(bs, aio);
-    }
-#endif
-}
-
-static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
-        BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (fd_open(bs) < 0)
-        return NULL;
-
-    return paio_submit(bs, s->fd, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
-}
-
-static void raw_close(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->fd >= 0) {
-        qemu_close(s->fd);
-        s->fd = -1;
-    }
-}
-
-static int raw_truncate(BlockDriverState *bs, int64_t offset)
-{
-    BDRVRawState *s = bs->opaque;
-    struct stat st;
-
-    if (fstat(s->fd, &st)) {
-        return -errno;
-    }
-
-    if (S_ISREG(st.st_mode)) {
-        if (ftruncate(s->fd, offset) < 0) {
-            return -errno;
-        }
-    } else if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
-       if (offset > raw_getlength(bs)) {
-           return -EINVAL;
-       }
-    } else {
-        return -ENOTSUP;
-    }
-
-    return 0;
-}
-
-#ifdef __OpenBSD__
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int fd = s->fd;
-    struct stat st;
-
-    if (fstat(fd, &st))
-        return -errno;
-    if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
-        struct disklabel dl;
-
-        if (ioctl(fd, DIOCGDINFO, &dl))
-            return -errno;
-        return (uint64_t)dl.d_secsize *
-            dl.d_partitions[DISKPART(st.st_rdev)].p_size;
-    } else
-        return st.st_size;
-}
-#elif defined(__NetBSD__)
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int fd = s->fd;
-    struct stat st;
-
-    if (fstat(fd, &st))
-        return -errno;
-    if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode)) {
-        struct dkwedge_info dkw;
-
-        if (ioctl(fd, DIOCGWEDGEINFO, &dkw) != -1) {
-            return dkw.dkw_size * 512;
-        } else {
-            struct disklabel dl;
-
-            if (ioctl(fd, DIOCGDINFO, &dl))
-                return -errno;
-            return (uint64_t)dl.d_secsize *
-                dl.d_partitions[DISKPART(st.st_rdev)].p_size;
-        }
-    } else
-        return st.st_size;
-}
-#elif defined(__sun__)
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    struct dk_minfo minfo;
-    int ret;
-    int64_t size;
-
-    ret = fd_open(bs);
-    if (ret < 0) {
-        return ret;
-    }
-
-    /*
-     * Use the DKIOCGMEDIAINFO ioctl to read the size.
-     */
-    ret = ioctl(s->fd, DKIOCGMEDIAINFO, &minfo);
-    if (ret != -1) {
-        return minfo.dki_lbsize * minfo.dki_capacity;
-    }
-
-    /*
-     * There are reports that lseek on some devices fails, but
-     * irc discussion said that contingency on contingency was overkill.
-     */
-    size = lseek(s->fd, 0, SEEK_END);
-    if (size < 0) {
-        return -errno;
-    }
-    return size;
-}
-#elif defined(CONFIG_BSD)
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int fd = s->fd;
-    int64_t size;
-    struct stat sb;
-#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
-    int reopened = 0;
-#endif
-    int ret;
-
-    ret = fd_open(bs);
-    if (ret < 0)
-        return ret;
-
-#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
-again:
-#endif
-    if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
-#ifdef DIOCGMEDIASIZE
-	if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
-#elif defined(DIOCGPART)
-        {
-                struct partinfo pi;
-                if (ioctl(fd, DIOCGPART, &pi) == 0)
-                        size = pi.media_size;
-                else
-                        size = 0;
-        }
-        if (size == 0)
-#endif
-#if defined(__APPLE__) && defined(__MACH__)
-        {
-            uint64_t sectors = 0;
-            uint32_t sector_size = 0;
-
-            if (ioctl(fd, DKIOCGETBLOCKCOUNT, &sectors) == 0
-               && ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) == 0) {
-                size = sectors * sector_size;
-            } else {
-                size = lseek(fd, 0LL, SEEK_END);
-                if (size < 0) {
-                    return -errno;
-                }
-            }
-        }
-#else
-        size = lseek(fd, 0LL, SEEK_END);
-        if (size < 0) {
-            return -errno;
-        }
-#endif
-#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-        switch(s->type) {
-        case FTYPE_CD:
-            /* XXX FreeBSD acd returns UINT_MAX sectors for an empty drive */
-            if (size == 2048LL * (unsigned)-1)
-                size = 0;
-            /* XXX no disc?  maybe we need to reopen... */
-            if (size <= 0 && !reopened && cdrom_reopen(bs) >= 0) {
-                reopened = 1;
-                goto again;
-            }
-        }
-#endif
-    } else {
-        size = lseek(fd, 0, SEEK_END);
-        if (size < 0) {
-            return -errno;
-        }
-    }
-    return size;
-}
-#else
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-    int64_t size;
-
-    ret = fd_open(bs);
-    if (ret < 0) {
-        return ret;
-    }
-
-    size = lseek(s->fd, 0, SEEK_END);
-    if (size < 0) {
-        return -errno;
-    }
-    return size;
-}
-#endif
-
-static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
-{
-    struct stat st;
-    BDRVRawState *s = bs->opaque;
-
-    if (fstat(s->fd, &st) < 0) {
-        return -errno;
-    }
-    return (int64_t)st.st_blocks * 512;
-}
-
-static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
-{
-    int fd;
-    int result = 0;
-    int64_t total_size = 0;
-    bool nocow = false;
-    PreallocMode prealloc;
-    char *buf = NULL;
-    Error *local_err = NULL;
-
-    strstart(filename, "file:", &filename);
-
-    /* Read out options */
-    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                          BDRV_SECTOR_SIZE);
-    nocow = qemu_opt_get_bool(opts, BLOCK_OPT_NOCOW, false);
-    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
-    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
-                               PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
-                               &local_err);
-    g_free(buf);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        result = -EINVAL;
-        goto out;
-    }
-
-    fd = qemu_open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY,
-                   0644);
-    if (fd < 0) {
-        result = -errno;
-        error_setg_errno(errp, -result, "Could not create file");
-        goto out;
-    }
-
-    if (nocow) {
-#ifdef __linux__
-        /* Set NOCOW flag to solve performance issue on fs like btrfs.
-         * This is an optimisation. The FS_IOC_SETFLAGS ioctl return value
-         * will be ignored since any failure of this operation should not
-         * block the left work.
-         */
-        int attr;
-        if (ioctl(fd, FS_IOC_GETFLAGS, &attr) == 0) {
-            attr |= FS_NOCOW_FL;
-            ioctl(fd, FS_IOC_SETFLAGS, &attr);
-        }
-#endif
-    }
-
-    if (ftruncate(fd, total_size) != 0) {
-        result = -errno;
-        error_setg_errno(errp, -result, "Could not resize file");
-        goto out_close;
-    }
-
-    switch (prealloc) {
-#ifdef CONFIG_POSIX_FALLOCATE
-    case PREALLOC_MODE_FALLOC:
-        /* posix_fallocate() doesn't set errno. */
-        result = -posix_fallocate(fd, 0, total_size);
-        if (result != 0) {
-            error_setg_errno(errp, -result,
-                             "Could not preallocate data for the new file");
-        }
-        break;
-#endif
-    case PREALLOC_MODE_FULL:
-    {
-        int64_t num = 0, left = total_size;
-        buf = g_malloc0(65536);
-
-        while (left > 0) {
-            num = MIN(left, 65536);
-            result = write(fd, buf, num);
-            if (result < 0) {
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not write to the new file");
-                break;
-            }
-            left -= result;
-        }
-        if (result >= 0) {
-            result = fsync(fd);
-            if (result < 0) {
-                result = -errno;
-                error_setg_errno(errp, -result,
-                                 "Could not flush new file to disk");
-            }
-        }
-        g_free(buf);
-        break;
-    }
-    case PREALLOC_MODE_OFF:
-        break;
-    default:
-        result = -EINVAL;
-        error_setg(errp, "Unsupported preallocation mode: %s",
-                   PreallocMode_lookup[prealloc]);
-        break;
-    }
-
-out_close:
-    if (qemu_close(fd) != 0 && result == 0) {
-        result = -errno;
-        error_setg_errno(errp, -result, "Could not close the new file");
-    }
-out:
-    return result;
-}
-
-/*
- * Find allocation range in @bs around offset @start.
- * May change underlying file descriptor's file offset.
- * If @start is not in a hole, store @start in @data, and the
- * beginning of the next hole in @hole, and return 0.
- * If @start is in a non-trailing hole, store @start in @hole and the
- * beginning of the next non-hole in @data, and return 0.
- * If @start is in a trailing hole or beyond EOF, return -ENXIO.
- * If we can't find out, return a negative errno other than -ENXIO.
- */
-static int find_allocation(BlockDriverState *bs, off_t start,
-                           off_t *data, off_t *hole)
-{
-#if defined SEEK_HOLE && defined SEEK_DATA
-    BDRVRawState *s = bs->opaque;
-    off_t offs;
-
-    /*
-     * SEEK_DATA cases:
-     * D1. offs == start: start is in data
-     * D2. offs > start: start is in a hole, next data at offs
-     * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
-     *                              or start is beyond EOF
-     *     If the latter happens, the file has been truncated behind
-     *     our back since we opened it.  All bets are off then.
-     *     Treating like a trailing hole is simplest.
-     * D4. offs < 0, errno != ENXIO: we learned nothing
-     */
-    offs = lseek(s->fd, start, SEEK_DATA);
-    if (offs < 0) {
-        return -errno;          /* D3 or D4 */
-    }
-    assert(offs >= start);
-
-    if (offs > start) {
-        /* D2: in hole, next data at offs */
-        *hole = start;
-        *data = offs;
-        return 0;
-    }
-
-    /* D1: in data, end not yet known */
-
-    /*
-     * SEEK_HOLE cases:
-     * H1. offs == start: start is in a hole
-     *     If this happens here, a hole has been dug behind our back
-     *     since the previous lseek().
-     * H2. offs > start: either start is in data, next hole at offs,
-     *                   or start is in trailing hole, EOF at offs
-     *     Linux treats trailing holes like any other hole: offs ==
-     *     start.  Solaris seeks to EOF instead: offs > start (blech).
-     *     If that happens here, a hole has been dug behind our back
-     *     since the previous lseek().
-     * H3. offs < 0, errno = ENXIO: start is beyond EOF
-     *     If this happens, the file has been truncated behind our
-     *     back since we opened it.  Treat it like a trailing hole.
-     * H4. offs < 0, errno != ENXIO: we learned nothing
-     *     Pretend we know nothing at all, i.e. "forget" about D1.
-     */
-    offs = lseek(s->fd, start, SEEK_HOLE);
-    if (offs < 0) {
-        return -errno;          /* D1 and (H3 or H4) */
-    }
-    assert(offs >= start);
-
-    if (offs > start) {
-        /*
-         * D1 and H2: either in data, next hole at offs, or it was in
-         * data but is now in a trailing hole.  In the latter case,
-         * all bets are off.  Treating it as if there was data all
-         * the way to EOF is safe, so simply do that.
-         */
-        *data = start;
-        *hole = offs;
-        return 0;
-    }
-
-    /* D1 and H1 */
-    return -EBUSY;
-#else
-    return -ENOTSUP;
-#endif
-}
-
-/*
- * Returns the allocation status of the specified sectors.
- *
- * If 'sector_num' is beyond the end of the disk image the return value is 0
- * and 'pnum' is set to 0.
- *
- * 'pnum' is set to the number of sectors (including and immediately following
- * the specified sector) that are known to be in the same
- * allocated/unallocated state.
- *
- * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
- * beyond the end of the disk image it will be clamped.
- */
-static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
-                                                    int64_t sector_num,
-                                                    int nb_sectors, int *pnum,
-                                                    BlockDriverState **file)
-{
-    off_t start, data = 0, hole = 0;
-    int64_t total_size;
-    int ret;
-
-    ret = fd_open(bs);
-    if (ret < 0) {
-        return ret;
-    }
-
-    start = sector_num * BDRV_SECTOR_SIZE;
-    total_size = bdrv_getlength(bs);
-    if (total_size < 0) {
-        return total_size;
-    } else if (start >= total_size) {
-        *pnum = 0;
-        return 0;
-    } else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
-        nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
-    }
-
-    ret = find_allocation(bs, start, &data, &hole);
-    if (ret == -ENXIO) {
-        /* Trailing hole */
-        *pnum = nb_sectors;
-        ret = BDRV_BLOCK_ZERO;
-    } else if (ret < 0) {
-        /* No info available, so pretend there are no holes */
-        *pnum = nb_sectors;
-        ret = BDRV_BLOCK_DATA;
-    } else if (data == start) {
-        /* On a data extent, compute sectors to the end of the extent,
-         * possibly including a partial sector at EOF. */
-        *pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, BDRV_SECTOR_SIZE));
-        ret = BDRV_BLOCK_DATA;
-    } else {
-        /* On a hole, compute sectors to the beginning of the next extent.  */
-        assert(hole == start);
-        *pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
-        ret = BDRV_BLOCK_ZERO;
-    }
-    *file = bs;
-    return ret | BDRV_BLOCK_OFFSET_VALID | start;
-}
-
-static coroutine_fn BlockAIOCB *raw_aio_pdiscard(BlockDriverState *bs,
-    int64_t offset, int count,
-    BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-
-    return paio_submit(bs, s->fd, offset, NULL, count,
-                       cb, opaque, QEMU_AIO_DISCARD);
-}
-
-static int coroutine_fn raw_co_pwrite_zeroes(
-    BlockDriverState *bs, int64_t offset,
-    int count, BdrvRequestFlags flags)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (!(flags & BDRV_REQ_MAY_UNMAP)) {
-        return paio_submit_co(bs, s->fd, offset, NULL, count,
-                              QEMU_AIO_WRITE_ZEROES);
-    } else if (s->discard_zeroes) {
-        return paio_submit_co(bs, s->fd, offset, NULL, count,
-                              QEMU_AIO_DISCARD);
-    }
-    return -ENOTSUP;
-}
-
-static int raw_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
-{
-    BDRVRawState *s = bs->opaque;
-
-    bdi->unallocated_blocks_are_zero = s->discard_zeroes;
-    bdi->can_write_zeroes_with_unmap = s->discard_zeroes;
-    return 0;
-}
-
-static QemuOptsList raw_create_opts = {
-    .name = "raw-create-opts",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
-    .desc = {
-        {
-            .name = BLOCK_OPT_SIZE,
-            .type = QEMU_OPT_SIZE,
-            .help = "Virtual disk size"
-        },
-        {
-            .name = BLOCK_OPT_NOCOW,
-            .type = QEMU_OPT_BOOL,
-            .help = "Turn off copy-on-write (valid only on btrfs)"
-        },
-        {
-            .name = BLOCK_OPT_PREALLOC,
-            .type = QEMU_OPT_STRING,
-            .help = "Preallocation mode (allowed values: off, falloc, full)"
-        },
-        { /* end of list */ }
-    }
-};
-
-BlockDriver bdrv_file = {
-    .format_name = "file",
-    .protocol_name = "file",
-    .instance_size = sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_probe = NULL, /* no probe for protocols */
-    .bdrv_parse_filename = raw_parse_filename,
-    .bdrv_file_open = raw_open,
-    .bdrv_reopen_prepare = raw_reopen_prepare,
-    .bdrv_reopen_commit = raw_reopen_commit,
-    .bdrv_reopen_abort = raw_reopen_abort,
-    .bdrv_close = raw_close,
-    .bdrv_create = raw_create,
-    .bdrv_has_zero_init = bdrv_has_zero_init_1,
-    .bdrv_co_get_block_status = raw_co_get_block_status,
-    .bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
-
-    .bdrv_co_preadv         = raw_co_preadv,
-    .bdrv_co_pwritev        = raw_co_pwritev,
-    .bdrv_aio_flush = raw_aio_flush,
-    .bdrv_aio_pdiscard = raw_aio_pdiscard,
-    .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_io_plug = raw_aio_plug,
-    .bdrv_io_unplug = raw_aio_unplug,
-
-    .bdrv_truncate = raw_truncate,
-    .bdrv_getlength = raw_getlength,
-    .bdrv_get_info = raw_get_info,
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-
-    .create_opts = &raw_create_opts,
-};
-
-/***********************************************/
-/* host device */
-
-#if defined(__APPLE__) && defined(__MACH__)
-static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
-                                CFIndex maxPathSize, int flags);
-static char *FindEjectableOpticalMedia(io_iterator_t *mediaIterator)
-{
-    kern_return_t kernResult = KERN_FAILURE;
-    mach_port_t     masterPort;
-    CFMutableDictionaryRef  classesToMatch;
-    const char *matching_array[] = {kIODVDMediaClass, kIOCDMediaClass};
-    char *mediaType = NULL;
-
-    kernResult = IOMasterPort( MACH_PORT_NULL, &masterPort );
-    if ( KERN_SUCCESS != kernResult ) {
-        printf( "IOMasterPort returned %d\n", kernResult );
-    }
-
-    int index;
-    for (index = 0; index < ARRAY_SIZE(matching_array); index++) {
-        classesToMatch = IOServiceMatching(matching_array[index]);
-        if (classesToMatch == NULL) {
-            error_report("IOServiceMatching returned NULL for %s",
-                         matching_array[index]);
-            continue;
-        }
-        CFDictionarySetValue(classesToMatch, CFSTR(kIOMediaEjectableKey),
-                             kCFBooleanTrue);
-        kernResult = IOServiceGetMatchingServices(masterPort, classesToMatch,
-                                                  mediaIterator);
-        if (kernResult != KERN_SUCCESS) {
-            error_report("Note: IOServiceGetMatchingServices returned %d",
-                         kernResult);
-            continue;
-        }
-
-        /* If a match was found, leave the loop */
-        if (*mediaIterator != 0) {
-            DPRINTF("Matching using %s\n", matching_array[index]);
-            mediaType = g_strdup(matching_array[index]);
-            break;
-        }
-    }
-    return mediaType;
-}
-
-kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
-                         CFIndex maxPathSize, int flags)
-{
-    io_object_t     nextMedia;
-    kern_return_t   kernResult = KERN_FAILURE;
-    *bsdPath = '\0';
-    nextMedia = IOIteratorNext( mediaIterator );
-    if ( nextMedia )
-    {
-        CFTypeRef   bsdPathAsCFString;
-    bsdPathAsCFString = IORegistryEntryCreateCFProperty( nextMedia, CFSTR( kIOBSDNameKey ), kCFAllocatorDefault, 0 );
-        if ( bsdPathAsCFString ) {
-            size_t devPathLength;
-            strcpy( bsdPath, _PATH_DEV );
-            if (flags & BDRV_O_NOCACHE) {
-                strcat(bsdPath, "r");
-            }
-            devPathLength = strlen( bsdPath );
-            if ( CFStringGetCString( bsdPathAsCFString, bsdPath + devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) {
-                kernResult = KERN_SUCCESS;
-            }
-            CFRelease( bsdPathAsCFString );
-        }
-        IOObjectRelease( nextMedia );
-    }
-
-    return kernResult;
-}
-
-/* Sets up a real cdrom for use in QEMU */
-static bool setup_cdrom(char *bsd_path, Error **errp)
-{
-    int index, num_of_test_partitions = 2, fd;
-    char test_partition[MAXPATHLEN];
-    bool partition_found = false;
-
-    /* look for a working partition */
-    for (index = 0; index < num_of_test_partitions; index++) {
-        snprintf(test_partition, sizeof(test_partition), "%ss%d", bsd_path,
-                 index);
-        fd = qemu_open(test_partition, O_RDONLY | O_BINARY | O_LARGEFILE);
-        if (fd >= 0) {
-            partition_found = true;
-            qemu_close(fd);
-            break;
-        }
-    }
-
-    /* if a working partition on the device was not found */
-    if (partition_found == false) {
-        error_setg(errp, "Failed to find a working partition on disc");
-    } else {
-        DPRINTF("Using %s as optical disc\n", test_partition);
-        pstrcpy(bsd_path, MAXPATHLEN, test_partition);
-    }
-    return partition_found;
-}
-
-/* Prints directions on mounting and unmounting a device */
-static void print_unmounting_directions(const char *file_name)
-{
-    error_report("If device %s is mounted on the desktop, unmount"
-                 " it first before using it in QEMU", file_name);
-    error_report("Command to unmount device: diskutil unmountDisk %s",
-                 file_name);
-    error_report("Command to mount device: diskutil mountDisk %s", file_name);
-}
-
-#endif /* defined(__APPLE__) && defined(__MACH__) */
-
-static int hdev_probe_device(const char *filename)
-{
-    struct stat st;
-
-    /* allow a dedicated CD-ROM driver to match with a higher priority */
-    if (strstart(filename, "/dev/cdrom", NULL))
-        return 50;
-
-    if (stat(filename, &st) >= 0 &&
-            (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
-        return 100;
-    }
-
-    return 0;
-}
-
-static int check_hdev_writable(BDRVRawState *s)
-{
-#if defined(BLKROGET)
-    /* Linux block devices can be configured "read-only" using blockdev(8).
-     * This is independent of device node permissions and therefore open(2)
-     * with O_RDWR succeeds.  Actual writes fail with EPERM.
-     *
-     * bdrv_open() is supposed to fail if the disk is read-only.  Explicitly
-     * check for read-only block devices so that Linux block devices behave
-     * properly.
-     */
-    struct stat st;
-    int readonly = 0;
-
-    if (fstat(s->fd, &st)) {
-        return -errno;
-    }
-
-    if (!S_ISBLK(st.st_mode)) {
-        return 0;
-    }
-
-    if (ioctl(s->fd, BLKROGET, &readonly) < 0) {
-        return -errno;
-    }
-
-    if (readonly) {
-        return -EACCES;
-    }
-#endif /* defined(BLKROGET) */
-    return 0;
-}
-
-static void hdev_parse_filename(const char *filename, QDict *options,
-                                Error **errp)
-{
-    /* The prefix is optional, just as for "file". */
-    strstart(filename, "host_device:", &filename);
-
-    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
-}
-
-static bool hdev_is_sg(BlockDriverState *bs)
-{
-
-#if defined(__linux__)
-
-    BDRVRawState *s = bs->opaque;
-    struct stat st;
-    struct sg_scsi_id scsiid;
-    int sg_version;
-    int ret;
-
-    if (stat(bs->filename, &st) < 0 || !S_ISCHR(st.st_mode)) {
-        return false;
-    }
-
-    ret = ioctl(s->fd, SG_GET_VERSION_NUM, &sg_version);
-    if (ret < 0) {
-        return false;
-    }
-
-    ret = ioctl(s->fd, SG_GET_SCSI_ID, &scsiid);
-    if (ret >= 0) {
-        DPRINTF("SG device found: type=%d, version=%d\n",
-            scsiid.scsi_type, sg_version);
-        return true;
-    }
-
-#endif
-
-    return false;
-}
-
-static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
-                     Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    Error *local_err = NULL;
-    int ret;
-
-#if defined(__APPLE__) && defined(__MACH__)
-    const char *filename = qdict_get_str(options, "filename");
-    char bsd_path[MAXPATHLEN] = "";
-    bool error_occurred = false;
-
-    /* If using a real cdrom */
-    if (strcmp(filename, "/dev/cdrom") == 0) {
-        char *mediaType = NULL;
-        kern_return_t ret_val;
-        io_iterator_t mediaIterator = 0;
-
-        mediaType = FindEjectableOpticalMedia(&mediaIterator);
-        if (mediaType == NULL) {
-            error_setg(errp, "Please make sure your CD/DVD is in the optical"
-                       " drive");
-            error_occurred = true;
-            goto hdev_open_Mac_error;
-        }
-
-        ret_val = GetBSDPath(mediaIterator, bsd_path, sizeof(bsd_path), flags);
-        if (ret_val != KERN_SUCCESS) {
-            error_setg(errp, "Could not get BSD path for optical drive");
-            error_occurred = true;
-            goto hdev_open_Mac_error;
-        }
-
-        /* If a real optical drive was not found */
-        if (bsd_path[0] == '\0') {
-            error_setg(errp, "Failed to obtain bsd path for optical drive");
-            error_occurred = true;
-            goto hdev_open_Mac_error;
-        }
-
-        /* If using a cdrom disc and finding a partition on the disc failed */
-        if (strncmp(mediaType, kIOCDMediaClass, 9) == 0 &&
-            setup_cdrom(bsd_path, errp) == false) {
-            print_unmounting_directions(bsd_path);
-            error_occurred = true;
-            goto hdev_open_Mac_error;
-        }
-
-        qdict_put(options, "filename", qstring_from_str(bsd_path));
-
-hdev_open_Mac_error:
-        g_free(mediaType);
-        if (mediaIterator) {
-            IOObjectRelease(mediaIterator);
-        }
-        if (error_occurred) {
-            return -ENOENT;
-        }
-    }
-#endif /* defined(__APPLE__) && defined(__MACH__) */
-
-    s->type = FTYPE_FILE;
-
-    ret = raw_open_common(bs, options, flags, 0, &local_err);
-    if (ret < 0) {
-        error_propagate(errp, local_err);
-#if defined(__APPLE__) && defined(__MACH__)
-        if (*bsd_path) {
-            filename = bsd_path;
-        }
-        /* if a physical device experienced an error while being opened */
-        if (strncmp(filename, "/dev/", 5) == 0) {
-            print_unmounting_directions(filename);
-        }
-#endif /* defined(__APPLE__) && defined(__MACH__) */
-        return ret;
-    }
-
-    /* Since this does ioctl the device must be already opened */
-    bs->sg = hdev_is_sg(bs);
-
-    if (flags & BDRV_O_RDWR) {
-        ret = check_hdev_writable(s);
-        if (ret < 0) {
-            raw_close(bs);
-            error_setg_errno(errp, -ret, "The device is not writable");
-            return ret;
-        }
-    }
-
-    return ret;
-}
-
-#if defined(__linux__)
-
-static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
-        unsigned long int req, void *buf,
-        BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-    RawPosixAIOData *acb;
-    ThreadPool *pool;
-
-    if (fd_open(bs) < 0)
-        return NULL;
-
-    acb = g_new(RawPosixAIOData, 1);
-    acb->bs = bs;
-    acb->aio_type = QEMU_AIO_IOCTL;
-    acb->aio_fildes = s->fd;
-    acb->aio_offset = 0;
-    acb->aio_ioctl_buf = buf;
-    acb->aio_ioctl_cmd = req;
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
-}
-#endif /* linux */
-
-static int fd_open(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-
-    /* this is just to ensure s->fd is sane (it's called by I/O ops) */
-    if (s->fd >= 0)
-        return 0;
-    return -EIO;
-}
-
-static coroutine_fn BlockAIOCB *hdev_aio_pdiscard(BlockDriverState *bs,
-    int64_t offset, int count,
-    BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (fd_open(bs) < 0) {
-        return NULL;
-    }
-    return paio_submit(bs, s->fd, offset, NULL, count,
-                       cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
-}
-
-static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs,
-    int64_t offset, int count, BdrvRequestFlags flags)
-{
-    BDRVRawState *s = bs->opaque;
-    int rc;
-
-    rc = fd_open(bs);
-    if (rc < 0) {
-        return rc;
-    }
-    if (!(flags & BDRV_REQ_MAY_UNMAP)) {
-        return paio_submit_co(bs, s->fd, offset, NULL, count,
-                              QEMU_AIO_WRITE_ZEROES|QEMU_AIO_BLKDEV);
-    } else if (s->discard_zeroes) {
-        return paio_submit_co(bs, s->fd, offset, NULL, count,
-                              QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
-    }
-    return -ENOTSUP;
-}
-
-static int hdev_create(const char *filename, QemuOpts *opts,
-                       Error **errp)
-{
-    int fd;
-    int ret = 0;
-    struct stat stat_buf;
-    int64_t total_size = 0;
-    bool has_prefix;
-
-    /* This function is used by both protocol block drivers and therefore either
-     * of these prefixes may be given.
-     * The return value has to be stored somewhere, otherwise this is an error
-     * due to -Werror=unused-value. */
-    has_prefix =
-        strstart(filename, "host_device:", &filename) ||
-        strstart(filename, "host_cdrom:" , &filename);
-
-    (void)has_prefix;
-
-    ret = raw_normalize_devicepath(&filename);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "Could not normalize device path");
-        return ret;
-    }
-
-    /* Read out options */
-    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                          BDRV_SECTOR_SIZE);
-
-    fd = qemu_open(filename, O_WRONLY | O_BINARY);
-    if (fd < 0) {
-        ret = -errno;
-        error_setg_errno(errp, -ret, "Could not open device");
-        return ret;
-    }
-
-    if (fstat(fd, &stat_buf) < 0) {
-        ret = -errno;
-        error_setg_errno(errp, -ret, "Could not stat device");
-    } else if (!S_ISBLK(stat_buf.st_mode) && !S_ISCHR(stat_buf.st_mode)) {
-        error_setg(errp,
-                   "The given file is neither a block nor a character device");
-        ret = -ENODEV;
-    } else if (lseek(fd, 0, SEEK_END) < total_size) {
-        error_setg(errp, "Device is too small");
-        ret = -ENOSPC;
-    }
-
-    qemu_close(fd);
-    return ret;
-}
-
-static BlockDriver bdrv_host_device = {
-    .format_name        = "host_device",
-    .protocol_name        = "host_device",
-    .instance_size      = sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_probe_device  = hdev_probe_device,
-    .bdrv_parse_filename = hdev_parse_filename,
-    .bdrv_file_open     = hdev_open,
-    .bdrv_close         = raw_close,
-    .bdrv_reopen_prepare = raw_reopen_prepare,
-    .bdrv_reopen_commit  = raw_reopen_commit,
-    .bdrv_reopen_abort   = raw_reopen_abort,
-    .bdrv_create         = hdev_create,
-    .create_opts         = &raw_create_opts,
-    .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
-
-    .bdrv_co_preadv         = raw_co_preadv,
-    .bdrv_co_pwritev        = raw_co_pwritev,
-    .bdrv_aio_flush	= raw_aio_flush,
-    .bdrv_aio_pdiscard   = hdev_aio_pdiscard,
-    .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_io_plug = raw_aio_plug,
-    .bdrv_io_unplug = raw_aio_unplug,
-
-    .bdrv_truncate      = raw_truncate,
-    .bdrv_getlength	= raw_getlength,
-    .bdrv_get_info = raw_get_info,
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-    .bdrv_probe_blocksizes = hdev_probe_blocksizes,
-    .bdrv_probe_geometry = hdev_probe_geometry,
-
-    /* generic scsi device */
-#ifdef __linux__
-    .bdrv_aio_ioctl     = hdev_aio_ioctl,
-#endif
-};
-
-#if defined(__linux__) || defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-static void cdrom_parse_filename(const char *filename, QDict *options,
-                                 Error **errp)
-{
-    /* The prefix is optional, just as for "file". */
-    strstart(filename, "host_cdrom:", &filename);
-
-    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
-}
-#endif
-
-#ifdef __linux__
-static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
-                      Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-
-    s->type = FTYPE_CD;
-
-    /* open will not fail even if no CD is inserted, so add O_NONBLOCK */
-    return raw_open_common(bs, options, flags, O_NONBLOCK, errp);
-}
-
-static int cdrom_probe_device(const char *filename)
-{
-    int fd, ret;
-    int prio = 0;
-    struct stat st;
-
-    fd = qemu_open(filename, O_RDONLY | O_NONBLOCK);
-    if (fd < 0) {
-        goto out;
-    }
-    ret = fstat(fd, &st);
-    if (ret == -1 || !S_ISBLK(st.st_mode)) {
-        goto outc;
-    }
-
-    /* Attempt to detect via a CDROM specific ioctl */
-    ret = ioctl(fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
-    if (ret >= 0)
-        prio = 100;
-
-outc:
-    qemu_close(fd);
-out:
-    return prio;
-}
-
-static bool cdrom_is_inserted(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int ret;
-
-    ret = ioctl(s->fd, CDROM_DRIVE_STATUS, CDSL_CURRENT);
-    return ret == CDS_DISC_OK;
-}
-
-static void cdrom_eject(BlockDriverState *bs, bool eject_flag)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (eject_flag) {
-        if (ioctl(s->fd, CDROMEJECT, NULL) < 0)
-            perror("CDROMEJECT");
-    } else {
-        if (ioctl(s->fd, CDROMCLOSETRAY, NULL) < 0)
-            perror("CDROMCLOSETRAY");
-    }
-}
-
-static void cdrom_lock_medium(BlockDriverState *bs, bool locked)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (ioctl(s->fd, CDROM_LOCKDOOR, locked) < 0) {
-        /*
-         * Note: an error can happen if the distribution automatically
-         * mounts the CD-ROM
-         */
-        /* perror("CDROM_LOCKDOOR"); */
-    }
-}
-
-static BlockDriver bdrv_host_cdrom = {
-    .format_name        = "host_cdrom",
-    .protocol_name      = "host_cdrom",
-    .instance_size      = sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_probe_device	= cdrom_probe_device,
-    .bdrv_parse_filename = cdrom_parse_filename,
-    .bdrv_file_open     = cdrom_open,
-    .bdrv_close         = raw_close,
-    .bdrv_reopen_prepare = raw_reopen_prepare,
-    .bdrv_reopen_commit  = raw_reopen_commit,
-    .bdrv_reopen_abort   = raw_reopen_abort,
-    .bdrv_create         = hdev_create,
-    .create_opts         = &raw_create_opts,
-
-
-    .bdrv_co_preadv         = raw_co_preadv,
-    .bdrv_co_pwritev        = raw_co_pwritev,
-    .bdrv_aio_flush	= raw_aio_flush,
-    .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_io_plug = raw_aio_plug,
-    .bdrv_io_unplug = raw_aio_unplug,
-
-    .bdrv_truncate      = raw_truncate,
-    .bdrv_getlength      = raw_getlength,
-    .has_variable_length = true,
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-
-    /* removable device support */
-    .bdrv_is_inserted   = cdrom_is_inserted,
-    .bdrv_eject         = cdrom_eject,
-    .bdrv_lock_medium   = cdrom_lock_medium,
-
-    /* generic scsi device */
-    .bdrv_aio_ioctl     = hdev_aio_ioctl,
-};
-#endif /* __linux__ */
-
-#if defined (__FreeBSD__) || defined(__FreeBSD_kernel__)
-static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
-                      Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    Error *local_err = NULL;
-    int ret;
-
-    s->type = FTYPE_CD;
-
-    ret = raw_open_common(bs, options, flags, 0, &local_err);
-    if (ret) {
-        error_propagate(errp, local_err);
-        return ret;
-    }
-
-    /* make sure the door isn't locked at this time */
-    ioctl(s->fd, CDIOCALLOW);
-    return 0;
-}
-
-static int cdrom_probe_device(const char *filename)
-{
-    if (strstart(filename, "/dev/cd", NULL) ||
-            strstart(filename, "/dev/acd", NULL))
-        return 100;
-    return 0;
-}
-
-static int cdrom_reopen(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    int fd;
-
-    /*
-     * Force reread of possibly changed/newly loaded disc,
-     * FreeBSD seems to not notice sometimes...
-     */
-    if (s->fd >= 0)
-        qemu_close(s->fd);
-    fd = qemu_open(bs->filename, s->open_flags, 0644);
-    if (fd < 0) {
-        s->fd = -1;
-        return -EIO;
-    }
-    s->fd = fd;
-
-    /* make sure the door isn't locked at this time */
-    ioctl(s->fd, CDIOCALLOW);
-    return 0;
-}
-
-static bool cdrom_is_inserted(BlockDriverState *bs)
-{
-    return raw_getlength(bs) > 0;
-}
-
-static void cdrom_eject(BlockDriverState *bs, bool eject_flag)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->fd < 0)
-        return;
-
-    (void) ioctl(s->fd, CDIOCALLOW);
-
-    if (eject_flag) {
-        if (ioctl(s->fd, CDIOCEJECT) < 0)
-            perror("CDIOCEJECT");
-    } else {
-        if (ioctl(s->fd, CDIOCCLOSE) < 0)
-            perror("CDIOCCLOSE");
-    }
-
-    cdrom_reopen(bs);
-}
-
-static void cdrom_lock_medium(BlockDriverState *bs, bool locked)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->fd < 0)
-        return;
-    if (ioctl(s->fd, (locked ? CDIOCPREVENT : CDIOCALLOW)) < 0) {
-        /*
-         * Note: an error can happen if the distribution automatically
-         * mounts the CD-ROM
-         */
-        /* perror("CDROM_LOCKDOOR"); */
-    }
-}
-
-static BlockDriver bdrv_host_cdrom = {
-    .format_name        = "host_cdrom",
-    .protocol_name      = "host_cdrom",
-    .instance_size      = sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_probe_device	= cdrom_probe_device,
-    .bdrv_parse_filename = cdrom_parse_filename,
-    .bdrv_file_open     = cdrom_open,
-    .bdrv_close         = raw_close,
-    .bdrv_reopen_prepare = raw_reopen_prepare,
-    .bdrv_reopen_commit  = raw_reopen_commit,
-    .bdrv_reopen_abort   = raw_reopen_abort,
-    .bdrv_create        = hdev_create,
-    .create_opts        = &raw_create_opts,
-
-    .bdrv_co_preadv         = raw_co_preadv,
-    .bdrv_co_pwritev        = raw_co_pwritev,
-    .bdrv_aio_flush	= raw_aio_flush,
-    .bdrv_refresh_limits = raw_refresh_limits,
-    .bdrv_io_plug = raw_aio_plug,
-    .bdrv_io_unplug = raw_aio_unplug,
-
-    .bdrv_truncate      = raw_truncate,
-    .bdrv_getlength      = raw_getlength,
-    .has_variable_length = true,
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-
-    /* removable device support */
-    .bdrv_is_inserted   = cdrom_is_inserted,
-    .bdrv_eject         = cdrom_eject,
-    .bdrv_lock_medium   = cdrom_lock_medium,
-};
-#endif /* __FreeBSD__ */
-
-static void bdrv_file_init(void)
-{
-    /*
-     * Register all the drivers.  Note that order is important, the driver
-     * registered last will get probed first.
-     */
-    bdrv_register(&bdrv_file);
-    bdrv_register(&bdrv_host_device);
-#ifdef __linux__
-    bdrv_register(&bdrv_host_cdrom);
-#endif
-#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-    bdrv_register(&bdrv_host_cdrom);
-#endif
-}
-
-block_init(bdrv_file_init);
diff --git a/block/raw-win32.c b/block/raw-win32.c
deleted file mode 100644
index 800fabd..0000000
--- a/block/raw-win32.c
+++ /dev/null
@@ -1,781 +0,0 @@
-/*
- * Block driver for RAW files (win32)
- *
- * Copyright (c) 2006 Fabrice Bellard
- *
- * Permission is hereby granted, free of charge, to any person obtaining a copy
- * of this software and associated documentation files (the "Software"), to deal
- * in the Software without restriction, including without limitation the rights
- * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- * copies of the Software, and to permit persons to whom the Software is
- * furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- * THE SOFTWARE.
- */
-#include "qemu/osdep.h"
-#include "qapi/error.h"
-#include "qemu/cutils.h"
-#include "qemu/timer.h"
-#include "block/block_int.h"
-#include "qemu/module.h"
-#include "block/raw-aio.h"
-#include "trace.h"
-#include "block/thread-pool.h"
-#include "qemu/iov.h"
-#include "qapi/qmp/qstring.h"
-#include "qapi/util.h"
-#include <windows.h>
-#include <winioctl.h>
-
-#define FTYPE_FILE 0
-#define FTYPE_CD     1
-#define FTYPE_HARDDISK 2
-
-typedef struct RawWin32AIOData {
-    BlockDriverState *bs;
-    HANDLE hfile;
-    struct iovec *aio_iov;
-    int aio_niov;
-    size_t aio_nbytes;
-    off64_t aio_offset;
-    int aio_type;
-} RawWin32AIOData;
-
-typedef struct BDRVRawState {
-    HANDLE hfile;
-    int type;
-    char drive_path[16]; /* format: "d:\" */
-    QEMUWin32AIOState *aio;
-} BDRVRawState;
-
-/*
- * Reads/writes the data to/from a given linear buffer.
- *
- * Returns the number of bytes handled or -errno in case of an error. Short
- * reads are only returned if the end of the file is reached.
- */
-static size_t handle_aiocb_rw(RawWin32AIOData *aiocb)
-{
-    size_t offset = 0;
-    int i;
-
-    for (i = 0; i < aiocb->aio_niov; i++) {
-        OVERLAPPED ov;
-        DWORD ret, ret_count, len;
-
-        memset(&ov, 0, sizeof(ov));
-        ov.Offset = (aiocb->aio_offset + offset);
-        ov.OffsetHigh = (aiocb->aio_offset + offset) >> 32;
-        len = aiocb->aio_iov[i].iov_len;
-        if (aiocb->aio_type & QEMU_AIO_WRITE) {
-            ret = WriteFile(aiocb->hfile, aiocb->aio_iov[i].iov_base,
-                            len, &ret_count, &ov);
-        } else {
-            ret = ReadFile(aiocb->hfile, aiocb->aio_iov[i].iov_base,
-                           len, &ret_count, &ov);
-        }
-        if (!ret) {
-            ret_count = 0;
-        }
-        if (ret_count != len) {
-            offset += ret_count;
-            break;
-        }
-        offset += len;
-    }
-
-    return offset;
-}
-
-static int aio_worker(void *arg)
-{
-    RawWin32AIOData *aiocb = arg;
-    ssize_t ret = 0;
-    size_t count;
-
-    switch (aiocb->aio_type & QEMU_AIO_TYPE_MASK) {
-    case QEMU_AIO_READ:
-        count = handle_aiocb_rw(aiocb);
-        if (count < aiocb->aio_nbytes) {
-            /* A short read means that we have reached EOF. Pad the buffer
-             * with zeros for bytes after EOF. */
-            iov_memset(aiocb->aio_iov, aiocb->aio_niov, count,
-                      0, aiocb->aio_nbytes - count);
-
-            count = aiocb->aio_nbytes;
-        }
-        if (count == aiocb->aio_nbytes) {
-            ret = 0;
-        } else {
-            ret = -EINVAL;
-        }
-        break;
-    case QEMU_AIO_WRITE:
-        count = handle_aiocb_rw(aiocb);
-        if (count == aiocb->aio_nbytes) {
-            ret = 0;
-        } else {
-            ret = -EINVAL;
-        }
-        break;
-    case QEMU_AIO_FLUSH:
-        if (!FlushFileBuffers(aiocb->hfile)) {
-            return -EIO;
-        }
-        break;
-    default:
-        fprintf(stderr, "invalid aio request (0x%x)\n", aiocb->aio_type);
-        ret = -EINVAL;
-        break;
-    }
-
-    g_free(aiocb);
-    return ret;
-}
-
-static BlockAIOCB *paio_submit(BlockDriverState *bs, HANDLE hfile,
-        int64_t offset, QEMUIOVector *qiov, int count,
-        BlockCompletionFunc *cb, void *opaque, int type)
-{
-    RawWin32AIOData *acb = g_new(RawWin32AIOData, 1);
-    ThreadPool *pool;
-
-    acb->bs = bs;
-    acb->hfile = hfile;
-    acb->aio_type = type;
-
-    if (qiov) {
-        acb->aio_iov = qiov->iov;
-        acb->aio_niov = qiov->niov;
-        assert(qiov->size == count);
-    }
-    acb->aio_nbytes = count;
-    acb->aio_offset = offset;
-
-    trace_paio_submit(acb, opaque, offset, count, type);
-    pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
-    return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
-}
-
-int qemu_ftruncate64(int fd, int64_t length)
-{
-    LARGE_INTEGER li;
-    DWORD dw;
-    LONG high;
-    HANDLE h;
-    BOOL res;
-
-    if ((GetVersion() & 0x80000000UL) && (length >> 32) != 0)
-	return -1;
-
-    h = (HANDLE)_get_osfhandle(fd);
-
-    /* get current position, ftruncate do not change position */
-    li.HighPart = 0;
-    li.LowPart = SetFilePointer (h, 0, &li.HighPart, FILE_CURRENT);
-    if (li.LowPart == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
-	return -1;
-    }
-
-    high = length >> 32;
-    dw = SetFilePointer(h, (DWORD) length, &high, FILE_BEGIN);
-    if (dw == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
-	return -1;
-    }
-    res = SetEndOfFile(h);
-
-    /* back to old position */
-    SetFilePointer(h, li.LowPart, &li.HighPart, FILE_BEGIN);
-    return res ? 0 : -1;
-}
-
-static int set_sparse(int fd)
-{
-    DWORD returned;
-    return (int) DeviceIoControl((HANDLE)_get_osfhandle(fd), FSCTL_SET_SPARSE,
-				 NULL, 0, NULL, 0, &returned, NULL);
-}
-
-static void raw_detach_aio_context(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->aio) {
-        win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
-    }
-}
-
-static void raw_attach_aio_context(BlockDriverState *bs,
-                                   AioContext *new_context)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->aio) {
-        win32_aio_attach_aio_context(s->aio, new_context);
-    }
-}
-
-static void raw_probe_alignment(BlockDriverState *bs, Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    DWORD sectorsPerCluster, freeClusters, totalClusters, count;
-    DISK_GEOMETRY_EX dg;
-    BOOL status;
-
-    if (s->type == FTYPE_CD) {
-        bs->bl.request_alignment = 2048;
-        return;
-    }
-    if (s->type == FTYPE_HARDDISK) {
-        status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
-                                 NULL, 0, &dg, sizeof(dg), &count, NULL);
-        if (status != 0) {
-            bs->bl.request_alignment = dg.Geometry.BytesPerSector;
-            return;
-        }
-        /* try GetDiskFreeSpace too */
-    }
-
-    if (s->drive_path[0]) {
-        GetDiskFreeSpace(s->drive_path, &sectorsPerCluster,
-                         &dg.Geometry.BytesPerSector,
-                         &freeClusters, &totalClusters);
-        bs->bl.request_alignment = dg.Geometry.BytesPerSector;
-    }
-}
-
-static void raw_parse_flags(int flags, bool use_aio, int *access_flags,
-                            DWORD *overlapped)
-{
-    assert(access_flags != NULL);
-    assert(overlapped != NULL);
-
-    if (flags & BDRV_O_RDWR) {
-        *access_flags = GENERIC_READ | GENERIC_WRITE;
-    } else {
-        *access_flags = GENERIC_READ;
-    }
-
-    *overlapped = FILE_ATTRIBUTE_NORMAL;
-    if (use_aio) {
-        *overlapped |= FILE_FLAG_OVERLAPPED;
-    }
-    if (flags & BDRV_O_NOCACHE) {
-        *overlapped |= FILE_FLAG_NO_BUFFERING;
-    }
-}
-
-static void raw_parse_filename(const char *filename, QDict *options,
-                               Error **errp)
-{
-    /* The filename does not have to be prefixed by the protocol name, since
-     * "file" is the default protocol; therefore, the return value of this
-     * function call can be ignored. */
-    strstart(filename, "file:", &filename);
-
-    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
-}
-
-static QemuOptsList raw_runtime_opts = {
-    .name = "raw",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_runtime_opts.head),
-    .desc = {
-        {
-            .name = "filename",
-            .type = QEMU_OPT_STRING,
-            .help = "File name of the image",
-        },
-        {
-            .name = "aio",
-            .type = QEMU_OPT_STRING,
-            .help = "host AIO implementation (threads, native)",
-        },
-        { /* end of list */ }
-    },
-};
-
-static bool get_aio_option(QemuOpts *opts, int flags, Error **errp)
-{
-    BlockdevAioOptions aio, aio_default;
-
-    aio_default = (flags & BDRV_O_NATIVE_AIO) ? BLOCKDEV_AIO_OPTIONS_NATIVE
-                                              : BLOCKDEV_AIO_OPTIONS_THREADS;
-    aio = qapi_enum_parse(BlockdevAioOptions_lookup, qemu_opt_get(opts, "aio"),
-                          BLOCKDEV_AIO_OPTIONS__MAX, aio_default, errp);
-
-    switch (aio) {
-    case BLOCKDEV_AIO_OPTIONS_NATIVE:
-        return true;
-    case BLOCKDEV_AIO_OPTIONS_THREADS:
-        return false;
-    default:
-        error_setg(errp, "Invalid AIO option");
-    }
-    return false;
-}
-
-static int raw_open(BlockDriverState *bs, QDict *options, int flags,
-                    Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    int access_flags;
-    DWORD overlapped;
-    QemuOpts *opts;
-    Error *local_err = NULL;
-    const char *filename;
-    bool use_aio;
-    int ret;
-
-    s->type = FTYPE_FILE;
-
-    opts = qemu_opts_create(&raw_runtime_opts, NULL, 0, &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto fail;
-    }
-
-    filename = qemu_opt_get(opts, "filename");
-
-    use_aio = get_aio_option(opts, flags, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto fail;
-    }
-
-    raw_parse_flags(flags, use_aio, &access_flags, &overlapped);
-
-    if (filename[0] && filename[1] == ':') {
-        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", filename[0]);
-    } else if (filename[0] == '\\' && filename[1] == '\\') {
-        s->drive_path[0] = 0;
-    } else {
-        /* Relative path.  */
-        char buf[MAX_PATH];
-        GetCurrentDirectory(MAX_PATH, buf);
-        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", buf[0]);
-    }
-
-    s->hfile = CreateFile(filename, access_flags,
-                          FILE_SHARE_READ, NULL,
-                          OPEN_EXISTING, overlapped, NULL);
-    if (s->hfile == INVALID_HANDLE_VALUE) {
-        int err = GetLastError();
-
-        error_setg_win32(errp, err, "Could not open '%s'", filename);
-        if (err == ERROR_ACCESS_DENIED) {
-            ret = -EACCES;
-        } else {
-            ret = -EINVAL;
-        }
-        goto fail;
-    }
-
-    if (use_aio) {
-        s->aio = win32_aio_init();
-        if (s->aio == NULL) {
-            CloseHandle(s->hfile);
-            error_setg(errp, "Could not initialize AIO");
-            ret = -EINVAL;
-            goto fail;
-        }
-
-        ret = win32_aio_attach(s->aio, s->hfile);
-        if (ret < 0) {
-            win32_aio_cleanup(s->aio);
-            CloseHandle(s->hfile);
-            error_setg_errno(errp, -ret, "Could not enable AIO");
-            goto fail;
-        }
-
-        win32_aio_attach_aio_context(s->aio, bdrv_get_aio_context(bs));
-    }
-
-    ret = 0;
-fail:
-    qemu_opts_del(opts);
-    return ret;
-}
-
-static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
-                         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                         BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-    if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_READ);
-    } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
-                           cb, opaque, QEMU_AIO_READ);
-    }
-}
-
-static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
-                          int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
-                          BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-    if (s->aio) {
-        return win32_aio_submit(bs, s->aio, s->hfile, sector_num, qiov,
-                                nb_sectors, cb, opaque, QEMU_AIO_WRITE);
-    } else {
-        return paio_submit(bs, s->hfile, sector_num << BDRV_SECTOR_BITS, qiov,
-                           nb_sectors << BDRV_SECTOR_BITS,
-                           cb, opaque, QEMU_AIO_WRITE);
-    }
-}
-
-static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
-                         BlockCompletionFunc *cb, void *opaque)
-{
-    BDRVRawState *s = bs->opaque;
-    return paio_submit(bs, s->hfile, 0, NULL, 0, cb, opaque, QEMU_AIO_FLUSH);
-}
-
-static void raw_close(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-
-    if (s->aio) {
-        win32_aio_detach_aio_context(s->aio, bdrv_get_aio_context(bs));
-        win32_aio_cleanup(s->aio);
-        s->aio = NULL;
-    }
-
-    CloseHandle(s->hfile);
-    if (bs->open_flags & BDRV_O_TEMPORARY) {
-        unlink(bs->filename);
-    }
-}
-
-static int raw_truncate(BlockDriverState *bs, int64_t offset)
-{
-    BDRVRawState *s = bs->opaque;
-    LONG low, high;
-    DWORD dwPtrLow;
-
-    low = offset;
-    high = offset >> 32;
-
-    /*
-     * An error has occurred if the return value is INVALID_SET_FILE_POINTER
-     * and GetLastError doesn't return NO_ERROR.
-     */
-    dwPtrLow = SetFilePointer(s->hfile, low, &high, FILE_BEGIN);
-    if (dwPtrLow == INVALID_SET_FILE_POINTER && GetLastError() != NO_ERROR) {
-        fprintf(stderr, "SetFilePointer error: %lu\n", GetLastError());
-        return -EIO;
-    }
-    if (SetEndOfFile(s->hfile) == 0) {
-        fprintf(stderr, "SetEndOfFile error: %lu\n", GetLastError());
-        return -EIO;
-    }
-    return 0;
-}
-
-static int64_t raw_getlength(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-    LARGE_INTEGER l;
-    ULARGE_INTEGER available, total, total_free;
-    DISK_GEOMETRY_EX dg;
-    DWORD count;
-    BOOL status;
-
-    switch(s->type) {
-    case FTYPE_FILE:
-        l.LowPart = GetFileSize(s->hfile, (PDWORD)&l.HighPart);
-        if (l.LowPart == 0xffffffffUL && GetLastError() != NO_ERROR)
-            return -EIO;
-        break;
-    case FTYPE_CD:
-        if (!GetDiskFreeSpaceEx(s->drive_path, &available, &total, &total_free))
-            return -EIO;
-        l.QuadPart = total.QuadPart;
-        break;
-    case FTYPE_HARDDISK:
-        status = DeviceIoControl(s->hfile, IOCTL_DISK_GET_DRIVE_GEOMETRY_EX,
-                                 NULL, 0, &dg, sizeof(dg), &count, NULL);
-        if (status != 0) {
-            l = dg.DiskSize;
-        }
-        break;
-    default:
-        return -EIO;
-    }
-    return l.QuadPart;
-}
-
-static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
-{
-    typedef DWORD (WINAPI * get_compressed_t)(const char *filename,
-                                              DWORD * high);
-    get_compressed_t get_compressed;
-    struct _stati64 st;
-    const char *filename = bs->filename;
-    /* WinNT support GetCompressedFileSize to determine allocate size */
-    get_compressed =
-        (get_compressed_t) GetProcAddress(GetModuleHandle("kernel32"),
-                                            "GetCompressedFileSizeA");
-    if (get_compressed) {
-        DWORD high, low;
-        low = get_compressed(filename, &high);
-        if (low != 0xFFFFFFFFlu || GetLastError() == NO_ERROR) {
-            return (((int64_t) high) << 32) + low;
-        }
-    }
-
-    if (_stati64(filename, &st) < 0) {
-        return -1;
-    }
-    return st.st_size;
-}
-
-static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
-{
-    int fd;
-    int64_t total_size = 0;
-
-    strstart(filename, "file:", &filename);
-
-    /* Read out options */
-    total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-                          BDRV_SECTOR_SIZE);
-
-    fd = qemu_open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
-                   0644);
-    if (fd < 0) {
-        error_setg_errno(errp, errno, "Could not create file");
-        return -EIO;
-    }
-    set_sparse(fd);
-    ftruncate(fd, total_size);
-    qemu_close(fd);
-    return 0;
-}
-
-
-static QemuOptsList raw_create_opts = {
-    .name = "raw-create-opts",
-    .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
-    .desc = {
-        {
-            .name = BLOCK_OPT_SIZE,
-            .type = QEMU_OPT_SIZE,
-            .help = "Virtual disk size"
-        },
-        { /* end of list */ }
-    }
-};
-
-BlockDriver bdrv_file = {
-    .format_name	= "file",
-    .protocol_name	= "file",
-    .instance_size	= sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_parse_filename = raw_parse_filename,
-    .bdrv_file_open     = raw_open,
-    .bdrv_refresh_limits = raw_probe_alignment,
-    .bdrv_close         = raw_close,
-    .bdrv_create        = raw_create,
-    .bdrv_has_zero_init = bdrv_has_zero_init_1,
-
-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
-    .bdrv_aio_flush     = raw_aio_flush,
-
-    .bdrv_truncate	= raw_truncate,
-    .bdrv_getlength	= raw_getlength,
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-
-    .create_opts        = &raw_create_opts,
-};
-
-/***********************************************/
-/* host device */
-
-static int find_cdrom(char *cdrom_name, int cdrom_name_size)
-{
-    char drives[256], *pdrv = drives;
-    UINT type;
-
-    memset(drives, 0, sizeof(drives));
-    GetLogicalDriveStrings(sizeof(drives), drives);
-    while(pdrv[0] != '\0') {
-        type = GetDriveType(pdrv);
-        switch(type) {
-        case DRIVE_CDROM:
-            snprintf(cdrom_name, cdrom_name_size, "\\\\.\\%c:", pdrv[0]);
-            return 0;
-            break;
-        }
-        pdrv += lstrlen(pdrv) + 1;
-    }
-    return -1;
-}
-
-static int find_device_type(BlockDriverState *bs, const char *filename)
-{
-    BDRVRawState *s = bs->opaque;
-    UINT type;
-    const char *p;
-
-    if (strstart(filename, "\\\\.\\", &p) ||
-        strstart(filename, "//./", &p)) {
-        if (stristart(p, "PhysicalDrive", NULL))
-            return FTYPE_HARDDISK;
-        snprintf(s->drive_path, sizeof(s->drive_path), "%c:\\", p[0]);
-        type = GetDriveType(s->drive_path);
-        switch (type) {
-        case DRIVE_REMOVABLE:
-        case DRIVE_FIXED:
-            return FTYPE_HARDDISK;
-        case DRIVE_CDROM:
-            return FTYPE_CD;
-        default:
-            return FTYPE_FILE;
-        }
-    } else {
-        return FTYPE_FILE;
-    }
-}
-
-static int hdev_probe_device(const char *filename)
-{
-    if (strstart(filename, "/dev/cdrom", NULL))
-        return 100;
-    if (is_windows_drive(filename))
-        return 100;
-    return 0;
-}
-
-static void hdev_parse_filename(const char *filename, QDict *options,
-                                Error **errp)
-{
-    /* The prefix is optional, just as for "file". */
-    strstart(filename, "host_device:", &filename);
-
-    qdict_put_obj(options, "filename", QOBJECT(qstring_from_str(filename)));
-}
-
-static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
-                     Error **errp)
-{
-    BDRVRawState *s = bs->opaque;
-    int access_flags, create_flags;
-    int ret = 0;
-    DWORD overlapped;
-    char device_name[64];
-
-    Error *local_err = NULL;
-    const char *filename;
-    bool use_aio;
-
-    QemuOpts *opts = qemu_opts_create(&raw_runtime_opts, NULL, 0,
-                                      &error_abort);
-    qemu_opts_absorb_qdict(opts, options, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto done;
-    }
-
-    filename = qemu_opt_get(opts, "filename");
-
-    use_aio = get_aio_option(opts, flags, &local_err);
-    if (!local_err && use_aio) {
-        error_setg(&local_err, "AIO is not supported on Windows host devices");
-    }
-    if (local_err) {
-        error_propagate(errp, local_err);
-        ret = -EINVAL;
-        goto done;
-    }
-
-    if (strstart(filename, "/dev/cdrom", NULL)) {
-        if (find_cdrom(device_name, sizeof(device_name)) < 0) {
-            error_setg(errp, "Could not open CD-ROM drive");
-            ret = -ENOENT;
-            goto done;
-        }
-        filename = device_name;
-    } else {
-        /* transform drive letters into device name */
-        if (((filename[0] >= 'a' && filename[0] <= 'z') ||
-             (filename[0] >= 'A' && filename[0] <= 'Z')) &&
-            filename[1] == ':' && filename[2] == '\0') {
-            snprintf(device_name, sizeof(device_name), "\\\\.\\%c:", filename[0]);
-            filename = device_name;
-        }
-    }
-    s->type = find_device_type(bs, filename);
-
-    raw_parse_flags(flags, use_aio, &access_flags, &overlapped);
-
-    create_flags = OPEN_EXISTING;
-
-    s->hfile = CreateFile(filename, access_flags,
-                          FILE_SHARE_READ, NULL,
-                          create_flags, overlapped, NULL);
-    if (s->hfile == INVALID_HANDLE_VALUE) {
-        int err = GetLastError();
-
-        if (err == ERROR_ACCESS_DENIED) {
-            ret = -EACCES;
-        } else {
-            ret = -EINVAL;
-        }
-        error_setg_errno(errp, -ret, "Could not open device");
-        goto done;
-    }
-
-done:
-    qemu_opts_del(opts);
-    return ret;
-}
-
-static BlockDriver bdrv_host_device = {
-    .format_name	= "host_device",
-    .protocol_name	= "host_device",
-    .instance_size	= sizeof(BDRVRawState),
-    .bdrv_needs_filename = true,
-    .bdrv_parse_filename = hdev_parse_filename,
-    .bdrv_probe_device	= hdev_probe_device,
-    .bdrv_file_open	= hdev_open,
-    .bdrv_close		= raw_close,
-
-    .bdrv_aio_readv     = raw_aio_readv,
-    .bdrv_aio_writev    = raw_aio_writev,
-    .bdrv_aio_flush     = raw_aio_flush,
-
-    .bdrv_detach_aio_context = raw_detach_aio_context,
-    .bdrv_attach_aio_context = raw_attach_aio_context,
-
-    .bdrv_getlength      = raw_getlength,
-    .has_variable_length = true,
-
-    .bdrv_get_allocated_file_size
-                        = raw_get_allocated_file_size,
-};
-
-static void bdrv_file_init(void)
-{
-    bdrv_register(&bdrv_file);
-    bdrv_register(&bdrv_host_device);
-}
-
-block_init(bdrv_file_init);
diff --git a/block/trace-events b/block/trace-events
index cfc05f2..671a6a8 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -53,8 +53,8 @@ qmp_block_job_resume(void *job) "job %p"
 qmp_block_job_complete(void *job) "job %p"
 qmp_block_stream(void *bs, void *job) "bs %p job %p"
 
-# block/raw-win32.c
-# block/raw-posix.c
+# block/file-win32.c
+# block/file-posix.c
 paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
 paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "acb %p opaque %p offset %"PRId64" count %d type %d"
 
diff --git a/configure b/configure
index 218df87..86f5214 100755
--- a/configure
+++ b/configure
@@ -2750,7 +2750,7 @@ if compile_prog "" "" ; then
 fi
 
 ##########################################
-# xfsctl() probe, used for raw-posix
+# xfsctl() probe, used for file-posix.c
 if test "$xfs" != "no" ; then
   cat > $TMPC << EOF
 #include <stddef.h>  /* NULL */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 83a423c..4e4562d 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -184,7 +184,7 @@ struct BlockDriver {
 
     /*
      * Flushes all data that was already written to the OS all the way down to
-     * the disk (for example raw-posix calls fsync()).
+     * the disk (for example file-posix.c calls fsync()).
      */
     int coroutine_fn (*bdrv_co_flush_to_disk)(BlockDriverState *bs);
 
-- 
1.8.3.1


* Re: [Qemu-devel] [PULL 14/14] block: Rename raw-{posix, win32} to file-*.c
  2017-01-09 13:44 ` [Qemu-devel] [PULL 14/14] block: Rename raw-{posix, win32} to file-*.c Kevin Wolf
@ 2017-01-09 14:32   ` Eric Blake
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Blake @ 2017-01-09 14:32 UTC (permalink / raw)
  To: Kevin Wolf, qemu-block; +Cc: qemu-devel

On 01/09/2017 07:44 AM, Kevin Wolf wrote:
> From: Eric Blake <eblake@redhat.com>
> 
> These files deal with the file protocol, not the raw format (the
> file protocol is often used with other formats, and the raw
> format is not forced to use the file protocol).  Rename things
> to make it a bit easier to follow.
> 
> Suggested-by: Daniel P. Berrange <berrange@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  MAINTAINERS               |    4 +-
>  block/Makefile.objs       |    4 +-
>  block/file-posix.c        | 2616 +++++++++++++++++++++++++++++++++++++++++++++
>  block/file-win32.c        |  781 ++++++++++++++
>  block/gluster.c           |    4 +-
>  block/raw-posix.c         | 2616 ---------------------------------------------
>  block/raw-win32.c         |  781 --------------

Did you forget to configure rename detection?

It doesn't impact the validity of the pull request, but sure makes for a
larger hit on the inbox.
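
For reference, a minimal sketch of what rename detection changes. The scratch repository, the file names (`raw-demo.c`, `file-demo.c`) and the committer identity are made up for illustration and are not part of this series. `-M` and `diff.renames` are standard git options; the assumption here is that the patches were generated with a git old enough (the mails are signed 1.8.3.1) that rename detection was not on by default.

```shell
# Illustrative only: a throwaway repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'int main(void) { return 0; }\n' > raw-demo.c
git add raw-demo.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add file'
git mv raw-demo.c file-demo.c
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'rename file'

# Without rename detection the change is a full delete plus a full create.
git diff --no-renames --summary HEAD~1 HEAD

# With -M the same commit is reported as a single rename.
summary=$(git diff -M --summary HEAD~1 HEAD)
echo "$summary"

# Make rename detection the default for this repository.
git config diff.renames true
```

With `diff.renames` set (or `-M` passed to `git format-patch`), a pure rename should be mailed as a short "rename from/rename to" header instead of thousands of removed and added lines.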

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



* Re: [Qemu-devel] [PULL 00/14] Block layer patches
  2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
                   ` (13 preceding siblings ...)
  2017-01-09 13:44 ` [Qemu-devel] [PULL 14/14] block: Rename raw-{posix, win32} to file-*.c Kevin Wolf
@ 2017-01-09 15:30 ` Peter Maydell
  14 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2017-01-09 15:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Qemu-block, QEMU Developers

On 9 January 2017 at 13:44, Kevin Wolf <kwolf@redhat.com> wrote:
> The following changes since commit ffe22bf51065dd33022cf91f77a821d1f11c250d:
>
>   Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20161224' into staging (2017-01-06 15:18:09 +0000)
>
> are available in the git repository at:
>
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to c1bb86cd8ae67c14f79422b6e544d1e2bf40eeb2:
>
>   block: Rename raw-{posix,win32} to file-*.c (2017-01-09 13:30:53 +0100)
>
> ----------------------------------------------------------------
> Block layer patches
>
> ----------------------------------------------------------------

Applied, thanks.

-- PMM

Thread overview: 17+ messages
2017-01-09 13:44 [Qemu-devel] [PULL 00/14] Block layer patches Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 01/14] qemu-img: fix in-flight count for qemu-img bench Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 02/14] coroutine: Introduce qemu_coroutine_enter_if_inactive() Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 03/14] quorum: Remove s from quorum_aio_get() arguments Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 04/14] quorum: Implement .bdrv_co_readv/writev Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 05/14] quorum: Do cleanup in caller coroutine Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 06/14] quorum: Inline quorum_aio_cb() Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 07/14] quorum: Avoid bdrv_aio_writev() for rewrites Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 08/14] quorum: Implement .bdrv_co_preadv/pwritev() Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 09/14] quorum: Inline quorum_fifo_aio_cb() Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 10/14] quorum: Clean up quorum_aio_get() Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 11/14] blkdebug: Implement bdrv_co_preadv/pwritev/flush Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 12/14] blkverify: " Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 13/14] block: Rename raw_bsd to raw-format.c Kevin Wolf
2017-01-09 13:44 ` [Qemu-devel] [PULL 14/14] block: Rename raw-{posix, win32} to file-*.c Kevin Wolf
2017-01-09 14:32   ` Eric Blake
2017-01-09 15:30 ` [Qemu-devel] [PULL 00/14] Block layer patches Peter Maydell
