qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/12] block: Fix blockdev-create with iothreads
@ 2023-05-25 12:47 Kevin Wolf
  2023-05-25 12:47 ` [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers Kevin Wolf
                   ` (11 more replies)
  0 siblings, 12 replies; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

This series started with the last patch, an iotest that tests the fixes
for .bdrv_co_create() that were made in commit b2ab5f545f ('block:
bdrv/blk_co_unref() for calls in coroutine context').

Unfortunately, it turned out that much more is wrong about creating
images in an iothread. In particular, AioContext locking is messy and in
many cases wrong. (Only one more reason to get rid of them.) This series
fixes all the hangs and crashes I saw while making the final test case
pass.

Note that this series isn't fully bisectable. Some of the existing bugs
cancel each other out, so after fixing one, some test cases will break
until the other one is fixed, too.

Kevin Wolf (12):
  block-coroutine-wrapper: Take AioContext lock in no_co_wrappers
  block: Clarify locking rules for bdrv_open(_inherit)()
  block: Take main AioContext lock when calling bdrv_open()
  block-backend: Fix blk_new_open() for iothreads
  mirror: Hold main AioContext lock for calling bdrv_open_backing_file()
  qcow2: Fix open with 'file' in iothread
  raw-format: Fix open with 'file' in iothread
  copy-before-write: Fix open with child in iothread
  block: Take AioContext lock in bdrv_open_driver()
  block: Fix AioContext locking in bdrv_insert_node()
  iotests: Make verify_virtio_scsi_pci_or_ccw() public
  iotests: Test blockdev-create in iothread

 include/block/block-common.h                  |  3 +
 block.c                                       | 33 ++++++---
 block/block-backend.c                         | 36 ++++++++--
 block/copy-before-write.c                     | 21 ++++--
 block/mirror.c                                |  6 ++
 block/qapi-sysemu.c                           |  3 +
 block/qcow2.c                                 |  8 ++-
 block/raw-format.c                            |  5 ++
 blockdev.c                                    | 29 ++++++--
 qemu-nbd.c                                    |  4 ++
 tests/unit/test-block-iothread.c              |  4 +-
 scripts/block-coroutine-wrapper.py            | 25 ++++---
 tests/qemu-iotests/iotests.py                 |  2 +-
 tests/qemu-iotests/256                        |  2 +-
 tests/qemu-iotests/tests/iothreads-create     | 67 +++++++++++++++++++
 tests/qemu-iotests/tests/iothreads-create.out |  4 ++
 16 files changed, 210 insertions(+), 42 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/iothreads-create
 create mode 100644 tests/qemu-iotests/tests/iothreads-create.out

-- 
2.40.1



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:15   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)() Kevin Wolf
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

All of the functions that currently take a BlockDriverState, BdrvChild
or BlockBackend as their first parameter expect the associated
AioContext to be locked when they are called. In the case of
no_co_wrappers, they are called from bottom halves directly in the main
loop, so no other caller can be expected to take the lock for them. This
can result in assertion failures because a lock that isn't taken is
released in nested event loops.

Looking at the first parameter is already done by co_wrappers to decide
where the coroutine should run, so doing the same in no_co_wrappers is
only consistent. Take the lock in the generated bottom halves to fix the
problem.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/block/block-common.h       |  3 +++
 block/block-backend.c              |  7 ++++++-
 scripts/block-coroutine-wrapper.py | 25 +++++++++++++++----------
 3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index 93196229ac..e15395f2cb 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -65,6 +65,9 @@
  * scheduling a BH in the bottom half that runs the respective non-coroutine
  * function. The coroutine yields after scheduling the BH and is reentered when
  * the wrapped function returns.
+ *
+ * If the first parameter of the function is a BlockDriverState, BdrvChild or
+ * BlockBackend pointer, the AioContext lock for it is taken in the wrapper.
  */
 #define no_co_wrapper
 
diff --git a/block/block-backend.c b/block/block-backend.c
index ca537cd0ad..26447664ab 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2394,9 +2394,14 @@ void blk_op_unblock_all(BlockBackend *blk, Error *reason)
 
 AioContext *blk_get_aio_context(BlockBackend *blk)
 {
-    BlockDriverState *bs = blk_bs(blk);
+    BlockDriverState *bs;
     IO_CODE();
 
+    if (!blk) {
+        return qemu_get_aio_context();
+    }
+
+    bs = blk_bs(blk);
     if (bs) {
         AioContext *ctx = bdrv_get_aio_context(blk_bs(blk));
         assert(ctx == blk->ctx);
diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py
index 60e9b3107c..d4a183db61 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -88,16 +88,7 @@ def __init__(self, wrapper_type: str, return_type: str, name: str,
                 raise ValueError(f"no_co function can't be rdlock: {self.name}")
             self.target_name = f'{subsystem}_{subname}'
 
-        t = self.args[0].type
-        if t == 'BlockDriverState *':
-            ctx = 'bdrv_get_aio_context(bs)'
-        elif t == 'BdrvChild *':
-            ctx = 'bdrv_get_aio_context(child->bs)'
-        elif t == 'BlockBackend *':
-            ctx = 'blk_get_aio_context(blk)'
-        else:
-            ctx = 'qemu_get_aio_context()'
-        self.ctx = ctx
+        self.ctx = self.gen_ctx()
 
         self.get_result = 's->ret = '
         self.ret = 'return s.ret;'
@@ -109,6 +100,17 @@ def __init__(self, wrapper_type: str, return_type: str, name: str,
             self.co_ret = ''
             self.return_field = ''
 
+    def gen_ctx(self, prefix: str = '') -> str:
+        t = self.args[0].type
+        if t == 'BlockDriverState *':
+            return f'bdrv_get_aio_context({prefix}bs)'
+        elif t == 'BdrvChild *':
+            return f'bdrv_get_aio_context({prefix}child->bs)'
+        elif t == 'BlockBackend *':
+            return f'blk_get_aio_context({prefix}blk)'
+        else:
+            return 'qemu_get_aio_context()'
+
     def gen_list(self, format: str) -> str:
         return ', '.join(format.format_map(arg.__dict__) for arg in self.args)
 
@@ -262,8 +264,11 @@ def gen_no_co_wrapper(func: FuncDecl) -> str:
 static void {name}_bh(void *opaque)
 {{
     {struct_name} *s = opaque;
+    AioContext *ctx = {func.gen_ctx('s->')};
 
+    aio_context_acquire(ctx);
     {func.get_result}{name}({ func.gen_list('s->{name}') });
+    aio_context_release(ctx);
 
     aio_co_wake(s->co);
 }}
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)()
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
  2023-05-25 12:47 ` [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:16   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open() Kevin Wolf
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

These functions specify that the caller must hold the "@filename
AioContext lock". This doesn't make sense, file names don't have an
AioContext. New BlockDriverStates always start in the main AioContext,
so this is what we really need here.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index a2f8d5a0c0..6ac47112fe 100644
--- a/block.c
+++ b/block.c
@@ -3810,9 +3810,7 @@ out:
  * should be opened. If specified, neither options nor a filename may be given,
  * nor can an existing BDS be reused (that is, *pbs has to be NULL).
  *
- * The caller must always hold @filename AioContext lock, because this
- * function eventually calls bdrv_refresh_total_sectors() which polls
- * when called from non-coroutine context.
+ * The caller must always hold the main AioContext lock.
  */
 static BlockDriverState * no_coroutine_fn
 bdrv_open_inherit(const char *filename, const char *reference, QDict *options,
@@ -4100,11 +4098,7 @@ close_and_fail:
     return NULL;
 }
 
-/*
- * The caller must always hold @filename AioContext lock, because this
- * function eventually calls bdrv_refresh_total_sectors() which polls
- * when called from non-coroutine context.
- */
+/* The caller must always hold the main AioContext lock. */
 BlockDriverState *bdrv_open(const char *filename, const char *reference,
                             QDict *options, int flags, Error **errp)
 {
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open()
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
  2023-05-25 12:47 ` [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers Kevin Wolf
  2023-05-25 12:47 ` [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)() Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:20   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads Kevin Wolf
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

The function documentation already says that all callers must hold the
main AioContext lock, but not all of them do. This can cause assertion
failures when functions called by bdrv_open() try to drop the lock. Fix
a few more callers to take the lock before calling bdrv_open().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c                          |  3 +++
 block/block-backend.c            |  2 ++
 block/qapi-sysemu.c              |  3 +++
 blockdev.c                       | 29 +++++++++++++++++++++++------
 qemu-nbd.c                       |  4 ++++
 tests/unit/test-block-iothread.c |  3 +++
 6 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 6ac47112fe..79bc9c01de 100644
--- a/block.c
+++ b/block.c
@@ -7037,6 +7037,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
         return;
     }
 
+    aio_context_acquire(qemu_get_aio_context());
+
     /* Create parameter list */
     create_opts = qemu_opts_append(create_opts, drv->create_opts);
     create_opts = qemu_opts_append(create_opts, proto_drv->create_opts);
@@ -7186,6 +7188,7 @@ out:
     qemu_opts_del(opts);
     qemu_opts_free(create_opts);
     error_propagate(errp, local_err);
+    aio_context_release(qemu_get_aio_context());
 }
 
 AioContext *bdrv_get_aio_context(BlockDriverState *bs)
diff --git a/block/block-backend.c b/block/block-backend.c
index 26447664ab..1d89fabd35 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -452,7 +452,9 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
     }
 
     blk = blk_new(qemu_get_aio_context(), perm, shared);
+    aio_context_acquire(qemu_get_aio_context());
     bs = bdrv_open(filename, reference, options, flags, errp);
+    aio_context_release(qemu_get_aio_context());
     if (!bs) {
         blk_unref(blk);
         return NULL;
diff --git a/block/qapi-sysemu.c b/block/qapi-sysemu.c
index cec3c1afb4..ef07151892 100644
--- a/block/qapi-sysemu.c
+++ b/block/qapi-sysemu.c
@@ -362,7 +362,10 @@ void qmp_blockdev_change_medium(const char *device,
         qdict_put_str(options, "driver", format);
     }
 
+    aio_context_acquire(qemu_get_aio_context());
     medium_bs = bdrv_open(filename, NULL, options, bdrv_flags, errp);
+    aio_context_release(qemu_get_aio_context());
+
     if (!medium_bs) {
         goto fail;
     }
diff --git a/blockdev.c b/blockdev.c
index 5d56b79df4..db2725fe74 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -662,6 +662,7 @@ err_no_opts:
 /* Takes the ownership of bs_opts */
 BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
 {
+    BlockDriverState *bs;
     int bdrv_flags = 0;
 
     GLOBAL_STATE_CODE();
@@ -676,7 +677,11 @@ BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
         bdrv_flags |= BDRV_O_INACTIVE;
     }
 
-    return bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
+    aio_context_acquire(qemu_get_aio_context());
+    bs = bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
+    aio_context_release(qemu_get_aio_context());
+
+    return bs;
 }
 
 void blockdev_close_all_bdrv_states(void)
@@ -1480,14 +1485,20 @@ static void external_snapshot_action(TransactionAction *action,
         }
         qdict_put_str(options, "driver", format);
     }
+    aio_context_release(aio_context);
 
+    aio_context_acquire(qemu_get_aio_context());
     state->new_bs = bdrv_open(new_image_file, snapshot_ref, options, flags,
                               errp);
+    aio_context_release(qemu_get_aio_context());
+
     /* We will manually add the backing_hd field to the bs later */
     if (!state->new_bs) {
-        goto out;
+        return;
     }
 
+    aio_context_acquire(aio_context);
+
     /*
      * Allow attaching a backing file to an overlay that's already in use only
      * if the parents don't assume that they are already seeing a valid image.
@@ -1732,15 +1743,18 @@ static void drive_backup_action(DriveBackup *backup,
     if (format) {
         qdict_put_str(options, "driver", format);
     }
+    aio_context_release(aio_context);
 
+    aio_context_acquire(qemu_get_aio_context());
     target_bs = bdrv_open(backup->target, NULL, options, flags, errp);
+    aio_context_release(qemu_get_aio_context());
+
     if (!target_bs) {
-        goto out;
+        return;
     }
 
     /* Honor bdrv_try_change_aio_context() context acquisition requirements. */
     old_context = bdrv_get_aio_context(target_bs);
-    aio_context_release(aio_context);
     aio_context_acquire(old_context);
 
     ret = bdrv_try_change_aio_context(target_bs, aio_context, NULL, errp);
@@ -3066,13 +3080,17 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
     if (format) {
         qdict_put_str(options, "driver", format);
     }
+    aio_context_release(aio_context);
 
     /* Mirroring takes care of copy-on-write using the source's backing
      * file.
      */
+    aio_context_acquire(qemu_get_aio_context());
     target_bs = bdrv_open(arg->target, NULL, options, flags, errp);
+    aio_context_release(qemu_get_aio_context());
+
     if (!target_bs) {
-        goto out;
+        return;
     }
 
     zero_target = (arg->sync == MIRROR_SYNC_MODE_FULL &&
@@ -3082,7 +3100,6 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
 
     /* Honor bdrv_try_change_aio_context() context acquisition requirements. */
     old_context = bdrv_get_aio_context(target_bs);
-    aio_context_release(aio_context);
     aio_context_acquire(old_context);
 
     ret = bdrv_try_change_aio_context(target_bs, aio_context, NULL, errp);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 6ff45308a9..4276163564 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -1071,7 +1071,11 @@ int main(int argc, char **argv)
         qdict_put_str(raw_opts, "driver", "raw");
         qdict_put_str(raw_opts, "file", bs->node_name);
         qdict_put_int(raw_opts, "offset", dev_offset);
+
+        aio_context_acquire(qemu_get_aio_context());
         bs = bdrv_open(NULL, NULL, raw_opts, flags, &error_fatal);
+        aio_context_release(qemu_get_aio_context());
+
         blk_remove_bs(blk);
         blk_insert_bs(blk, bs, &error_fatal);
         bdrv_unref(bs);
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 3a5e1eb2c4..1b40e65bad 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -833,8 +833,11 @@ static void test_attach_second_node(void)
     qdict_put_str(options, "driver", "raw");
     qdict_put_str(options, "file", "base");
 
+    /* FIXME raw_open() should take ctx's lock internally */
     aio_context_acquire(ctx);
+    aio_context_acquire(main_ctx);
     filter = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
+    aio_context_release(main_ctx);
     aio_context_release(ctx);
 
     g_assert(blk_get_aio_context(blk) == ctx);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (2 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open() Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:37   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file() Kevin Wolf
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

This fixes blk_new_open() to not assume that bs is in the main context.

In particular, the BlockBackend must be created with the right
AioContext because it will refuse to move to a different context
afterwards. (blk->allow_aio_context_change is false.)

Use this opportunity to use blk_insert_bs() instead of duplicating the
bdrv_root_attach_child() call. This is consistent with what
blk_new_with_bs() does. Add comments to document the locking rules.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/block-backend.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 1d89fabd35..dde60e0f71 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -389,6 +389,8 @@ BlockBackend *blk_new(AioContext *ctx, uint64_t perm, uint64_t shared_perm)
  * Both sets of permissions can be changed later using blk_set_perm().
  *
  * Return the new BlockBackend on success, null on failure.
+ *
+ * Callers must hold the AioContext lock of @bs.
  */
 BlockBackend *blk_new_with_bs(BlockDriverState *bs, uint64_t perm,
                               uint64_t shared_perm, Error **errp)
@@ -406,11 +408,15 @@ BlockBackend *blk_new_with_bs(BlockDriverState *bs, uint64_t perm,
 
 /*
  * Creates a new BlockBackend, opens a new BlockDriverState, and connects both.
- * The new BlockBackend is in the main AioContext.
+ * By default, the new BlockBackend is in the main AioContext, but if the
+ * parameters connect it with any existing node in a different AioContext, it
+ * may end up there instead.
  *
  * Just as with bdrv_open(), after having called this function the reference to
  * @options belongs to the block layer (even on failure).
  *
+ * Called without holding an AioContext lock.
+ *
  * TODO: Remove @filename and @flags; it should be possible to specify a whole
  * BDS tree just by specifying the @options QDict (or @reference,
  * alternatively). At the time of adding this function, this is not possible,
@@ -422,6 +428,7 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
 {
     BlockBackend *blk;
     BlockDriverState *bs;
+    AioContext *ctx;
     uint64_t perm = 0;
     uint64_t shared = BLK_PERM_ALL;
 
@@ -451,18 +458,24 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
         shared = BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED;
     }
 
-    blk = blk_new(qemu_get_aio_context(), perm, shared);
     aio_context_acquire(qemu_get_aio_context());
     bs = bdrv_open(filename, reference, options, flags, errp);
     aio_context_release(qemu_get_aio_context());
     if (!bs) {
-        blk_unref(blk);
         return NULL;
     }
 
-    blk->root = bdrv_root_attach_child(bs, "root", &child_root,
-                                       BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-                                       perm, shared, blk, errp);
+    /* bdrv_open() could have moved bs to a different AioContext */
+    ctx = bdrv_get_aio_context(bs);
+    blk = blk_new(bdrv_get_aio_context(bs), perm, shared);
+    blk->perm = perm;
+    blk->shared_perm = shared;
+
+    aio_context_acquire(ctx);
+    blk_insert_bs(blk, bs, errp);
+    bdrv_unref(bs);
+    aio_context_release(ctx);
+
     if (!blk->root) {
         blk_unref(blk);
         return NULL;
@@ -903,6 +916,8 @@ void blk_remove_bs(BlockBackend *blk)
 
 /*
  * Associates a new BlockDriverState with @blk.
+ *
+ * Callers must hold the AioContext lock of @bs.
  */
 int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp)
 {
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file()
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (3 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:40   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 06/12] qcow2: Fix open with 'file' in iothread Kevin Wolf
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

bdrv_open_backing_file() calls bdrv_open_inherit(), so all callers must
hold the main AioContext lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c        | 2 ++
 block/mirror.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/block.c b/block.c
index 79bc9c01de..be9ae364fb 100644
--- a/block.c
+++ b/block.c
@@ -3478,6 +3478,8 @@ int bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
  * itself, all options starting with "${bdref_key}." are considered part of the
  * BlockdevRef.
  *
+ * The caller must hold the main AioContext lock.
+ *
  * TODO Can this be unified with bdrv_open_image()?
  */
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
diff --git a/block/mirror.c b/block/mirror.c
index b7d92d1378..d3cacd1708 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -662,11 +662,15 @@ static int mirror_exit_common(Job *job)
     bool abort = job->ret < 0;
     int ret = 0;
 
+    GLOBAL_STATE_CODE();
+
     if (s->prepared) {
         return 0;
     }
     s->prepared = true;
 
+    aio_context_acquire(qemu_get_aio_context());
+
     mirror_top_bs = s->mirror_top_bs;
     bs_opaque = mirror_top_bs->opaque;
     src = mirror_top_bs->backing->bs;
@@ -789,6 +793,8 @@ static int mirror_exit_common(Job *job)
     bdrv_unref(mirror_top_bs);
     bdrv_unref(src);
 
+    aio_context_release(qemu_get_aio_context());
+
     return ret;
 }
 
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/12] qcow2: Fix open with 'file' in iothread
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (4 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file() Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:51   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 07/12] raw-format: " Kevin Wolf
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

qcow2_open() doesn't work correctly when opening the 'file' child moves
bs to an iothread, for several reasons:

- It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
  coroutine, which involves dropping the AioContext lock for bs when it
  is not in the main context - but we don't hold it, so this crashes.

- It runs the qcow2_open_entry() coroutine in the current thread instead
  of the new AioContext of bs.

- qcow2_open_entry() doesn't notify the main loop when it's done.

This patches fixes these issues around delegating work to a coroutine.
Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index b00b4e7575..7f3948360d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1904,6 +1904,8 @@ static void coroutine_fn qcow2_open_entry(void *opaque)
     qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, true,
                              qoc->errp);
     qemu_co_mutex_unlock(&s->lock);
+
+    aio_wait_kick();
 }
 
 static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
@@ -1929,8 +1931,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
 
     assert(!qemu_in_coroutine());
     assert(qemu_get_current_aio_context() == qemu_get_aio_context());
-    qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc));
-    BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
+
+    aio_co_enter(bdrv_get_aio_context(bs),
+                 qemu_coroutine_create(qcow2_open_entry, &qoc));
+    AIO_WAIT_WHILE_UNLOCKED(NULL, qoc.ret == -EINPROGRESS);
 
     return qoc.ret;
 }
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/12] raw-format: Fix open with 'file' in iothread
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (5 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 06/12] qcow2: Fix open with 'file' in iothread Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:52   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 08/12] copy-before-write: Fix open with child " Kevin Wolf
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

When opening the 'file' child moves bs to an iothread, we need to hold
the AioContext lock of it before we can call raw_apply_options() (and
more specifically, bdrv_getlength() inside of it).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/raw-format.c               | 5 +++++
 tests/unit/test-block-iothread.c | 3 ---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/raw-format.c b/block/raw-format.c
index 918fe4fb7e..e4f35268e6 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -468,6 +468,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
                     Error **errp)
 {
     BDRVRawState *s = bs->opaque;
+    AioContext *ctx;
     bool has_size;
     uint64_t offset, size;
     BdrvChildRole file_role;
@@ -515,7 +516,11 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
                 bs->file->bs->filename);
     }
 
+    ctx = bdrv_get_aio_context(bs);
+    aio_context_acquire(ctx);
     ret = raw_apply_options(bs, s, offset, has_size, size, errp);
+    aio_context_release(ctx);
+
     if (ret < 0) {
         return ret;
     }
diff --git a/tests/unit/test-block-iothread.c b/tests/unit/test-block-iothread.c
index 1b40e65bad..f081c09729 100644
--- a/tests/unit/test-block-iothread.c
+++ b/tests/unit/test-block-iothread.c
@@ -833,12 +833,9 @@ static void test_attach_second_node(void)
     qdict_put_str(options, "driver", "raw");
     qdict_put_str(options, "file", "base");
 
-    /* FIXME raw_open() should take ctx's lock internally */
-    aio_context_acquire(ctx);
     aio_context_acquire(main_ctx);
     filter = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
     aio_context_release(main_ctx);
-    aio_context_release(ctx);
 
     g_assert(blk_get_aio_context(blk) == ctx);
     g_assert(bdrv_get_aio_context(bs) == ctx);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/12] copy-before-write: Fix open with child in iothread
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (6 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 07/12] raw-format: " Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:55   ` Stefan Hajnoczi
  2023-05-26 14:30   ` Eric Blake
  2023-05-25 12:47 ` [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver() Kevin Wolf
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

The AioContext lock must not be held for bdrv_open_child(), but it is
necessary for the followig operations, in particular those using nested
event loops in coroutine wrappers.

Temporarily dropping the main AioContext lock is not necessary because
we know we run in the main thread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/copy-before-write.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 646d8227a4..b866e42271 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -412,6 +412,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     int64_t cluster_size;
     g_autoptr(BlockdevOptions) full_opts = NULL;
     BlockdevOptionsCbw *opts;
+    AioContext *ctx;
     int ret;
 
     full_opts = cbw_parse_options(options, errp);
@@ -432,11 +433,15 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
         return -EINVAL;
     }
 
+    ctx = bdrv_get_aio_context(bs);
+    aio_context_acquire(ctx);
+
     if (opts->bitmap) {
         bitmap = block_dirty_bitmap_lookup(opts->bitmap->node,
                                            opts->bitmap->name, NULL, errp);
         if (!bitmap) {
-            return -EINVAL;
+            ret = -EINVAL;
+            goto out;
         }
     }
     s->on_cbw_error = opts->has_on_cbw_error ? opts->on_cbw_error :
@@ -454,21 +459,24 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     s->bcs = block_copy_state_new(bs->file, s->target, bitmap, errp);
     if (!s->bcs) {
         error_prepend(errp, "Cannot create block-copy-state: ");
-        return -EINVAL;
+        ret = -EINVAL;
+        goto out;
     }
 
     cluster_size = block_copy_cluster_size(s->bcs);
 
     s->done_bitmap = bdrv_create_dirty_bitmap(bs, cluster_size, NULL, errp);
     if (!s->done_bitmap) {
-        return -EINVAL;
+        ret = -EINVAL;
+        goto out;
     }
     bdrv_disable_dirty_bitmap(s->done_bitmap);
 
     /* s->access_bitmap starts equal to bcs bitmap */
     s->access_bitmap = bdrv_create_dirty_bitmap(bs, cluster_size, NULL, errp);
     if (!s->access_bitmap) {
-        return -EINVAL;
+        ret = -EINVAL;
+        goto out;
     }
     bdrv_disable_dirty_bitmap(s->access_bitmap);
     bdrv_dirty_bitmap_merge_internal(s->access_bitmap,
@@ -478,7 +486,10 @@ static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
     qemu_co_mutex_init(&s->lock);
     QLIST_INIT(&s->frozen_read_reqs);
 
-    return 0;
+    ret = 0;
+out:
+    aio_context_release(ctx);
+    return ret;
 }
 
 static void cbw_close(BlockDriverState *bs)
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver()
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (7 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 08/12] copy-before-write: Fix open with child " Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:56   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node() Kevin Wolf
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

bdrv_refresh_total_sectors() and bdrv_refresh_limits() expect to be
called under the AioContext lock of the node. Take the lock.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index be9ae364fb..e368a43761 100644
--- a/block.c
+++ b/block.c
@@ -1613,6 +1613,7 @@ static int no_coroutine_fn GRAPH_UNLOCKED
 bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name,
                  QDict *options, int open_flags, Error **errp)
 {
+    AioContext *ctx;
     Error *local_err = NULL;
     int i, ret;
     GLOBAL_STATE_CODE();
@@ -1660,15 +1661,21 @@ bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv, const char *node_name,
     bs->supported_read_flags |= BDRV_REQ_REGISTERED_BUF;
     bs->supported_write_flags |= BDRV_REQ_REGISTERED_BUF;
 
+    /* Get the context after .bdrv_open, it can change the context */
+    ctx = bdrv_get_aio_context(bs);
+    aio_context_acquire(ctx);
+
     ret = bdrv_refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
+        aio_context_release(ctx);
         return ret;
     }
 
     bdrv_graph_rdlock_main_loop();
     bdrv_refresh_limits(bs, NULL, &local_err);
     bdrv_graph_rdunlock_main_loop();
+    aio_context_release(ctx);
 
     if (local_err) {
         error_propagate(errp, local_err);
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node()
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (8 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver() Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:58   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public Kevin Wolf
  2023-05-25 12:47 ` [PATCH 12/12] iotests: Test blockdev-create in iothread Kevin Wolf
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

While calling bdrv_new_open_driver_opts(), the main AioContext lock must
be held, not the lock of the AioContext of the block subtree it will be
added to afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block.c b/block.c
index e368a43761..d8cc145b77 100644
--- a/block.c
+++ b/block.c
@@ -5393,12 +5393,17 @@ static void bdrv_delete(BlockDriverState *bs)
  * empty set of options. The reference to the QDict belongs to the block layer
  * after the call (even on failure), so if the caller intends to reuse the
  * dictionary, it needs to use qobject_ref() before calling bdrv_open.
+ *
+ * The caller holds the AioContext lock for @bs. It must make sure that @bs
+ * stays in the same AioContext, i.e. @options must not refer to nodes in a
+ * different AioContext.
  */
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
                                    int flags, Error **errp)
 {
     ERRP_GUARD();
     int ret;
+    AioContext *ctx = bdrv_get_aio_context(bs);
     BlockDriverState *new_node_bs = NULL;
     const char *drvname, *node_name;
     BlockDriver *drv;
@@ -5419,8 +5424,14 @@ BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *options,
 
     GLOBAL_STATE_CODE();
 
+    aio_context_release(ctx);
+    aio_context_acquire(qemu_get_aio_context());
     new_node_bs = bdrv_new_open_driver_opts(drv, node_name, options, flags,
                                             errp);
+    aio_context_release(qemu_get_aio_context());
+    aio_context_acquire(ctx);
+    assert(bdrv_get_aio_context(bs) == ctx);
+
     options = NULL; /* bdrv_new_open_driver() eats options */
     if (!new_node_bs) {
         error_prepend(errp, "Could not create node: ");
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (9 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node() Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 18:59   ` Stefan Hajnoczi
  2023-05-25 12:47 ` [PATCH 12/12] iotests: Test blockdev-create in iothread Kevin Wolf
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

It has no internal callers, so its only use is being called from
individual test cases. If the name starts with an underscore, it is
considered private and linters warn against calling it. 256 only gets
away with it currently because it's on the exception list for linters.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/iotests.py | 2 +-
 tests/qemu-iotests/256        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 7073579a7d..ef66fbd62b 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -1421,7 +1421,7 @@ def _verify_virtio_blk() -> None:
     if 'virtio-blk' not in out:
         notrun('Missing virtio-blk in QEMU binary')
 
-def _verify_virtio_scsi_pci_or_ccw() -> None:
+def verify_virtio_scsi_pci_or_ccw() -> None:
     out = qemu_pipe('-M', 'none', '-device', 'help')
     if 'virtio-scsi-pci' not in out and 'virtio-scsi-ccw' not in out:
         notrun('Missing virtio-scsi-pci or virtio-scsi-ccw in QEMU binary')
diff --git a/tests/qemu-iotests/256 b/tests/qemu-iotests/256
index 13666813bd..d7e67f4a05 100755
--- a/tests/qemu-iotests/256
+++ b/tests/qemu-iotests/256
@@ -24,7 +24,7 @@ import os
 import iotests
 from iotests import log
 
-iotests._verify_virtio_scsi_pci_or_ccw()
+iotests.verify_virtio_scsi_pci_or_ccw()
 
 iotests.script_initialize(supported_fmts=['qcow2'])
 size = 64 * 1024 * 1024
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/12] iotests: Test blockdev-create in iothread
  2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
                   ` (10 preceding siblings ...)
  2023-05-25 12:47 ` [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public Kevin Wolf
@ 2023-05-25 12:47 ` Kevin Wolf
  2023-05-25 19:01   ` Stefan Hajnoczi
  11 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-25 12:47 UTC (permalink / raw)
  To: qemu-block; +Cc: kwolf, stefanha, hreitz, eblake, qemu-devel

If blockdev-create references an existing node in an iothread (e.g. as
it's 'file' child), then suddenly all of the image creation code must
run in that AioContext, too. Test that this actually works.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 tests/qemu-iotests/tests/iothreads-create     | 67 +++++++++++++++++++
 tests/qemu-iotests/tests/iothreads-create.out |  4 ++
 2 files changed, 71 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/iothreads-create
 create mode 100644 tests/qemu-iotests/tests/iothreads-create.out

diff --git a/tests/qemu-iotests/tests/iothreads-create b/tests/qemu-iotests/tests/iothreads-create
new file mode 100755
index 0000000000..0c862d73f2
--- /dev/null
+++ b/tests/qemu-iotests/tests/iothreads-create
@@ -0,0 +1,67 @@
+#!/usr/bin/env python3
+# group: rw quick
+#
+# Copyright (C) 2023 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# Creator/Owner: Kevin Wolf <kwolf@redhat.com>
+
+import asyncio
+import iotests
+
+iotests.script_initialize(supported_fmts=['qcow2', 'qcow', 'qed', 'vdi',
+                                          'vmdk', 'parallels'])
+iotests.verify_virtio_scsi_pci_or_ccw()
+
+with iotests.FilePath('disk.img') as img_path, \
+     iotests.VM() as vm:
+
+    iotests.qemu_img_create('-f', 'raw', img_path, '0')
+
+    vm.add_object('iothread,id=iothread0')
+    vm.add_blockdev(f'file,node-name=img-file,read-only=on,'
+                    f'filename={img_path}')
+    vm.add_device('virtio-scsi,iothread=iothread0')
+    vm.add_device('scsi-hd,drive=img-file,share-rw=on')
+
+    vm.launch()
+
+    iotests.log(vm.qmp(
+        'blockdev-reopen',
+        options=[{
+            'driver': 'file',
+            'filename': img_path,
+            'node-name': 'img-file',
+            'read-only': False,
+        }],
+    ))
+    iotests.log(vm.qmp(
+        'blockdev-create',
+        job_id='job0',
+        options={
+            'driver': iotests.imgfmt,
+            'file': 'img-file',
+            'size': 1024 * 1024,
+        },
+    ))
+
+    # Should succeed and not time out
+    try:
+        vm.run_job('job0', wait=5.0)
+        vm.shutdown()
+    except asyncio.TimeoutError:
+        # VM may be stuck, kill it
+        vm.kill()
+        raise
diff --git a/tests/qemu-iotests/tests/iothreads-create.out b/tests/qemu-iotests/tests/iothreads-create.out
new file mode 100644
index 0000000000..5c974ff77e
--- /dev/null
+++ b/tests/qemu-iotests/tests/iothreads-create.out
@@ -0,0 +1,4 @@
+{"return": {}}
+{"return": {}}
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
+{"return": {}}
-- 
2.40.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers
  2023-05-25 12:47 ` [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers Kevin Wolf
@ 2023-05-25 18:15   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:15 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1101 bytes --]

On Thu, May 25, 2023 at 02:47:02PM +0200, Kevin Wolf wrote:
> All of the functions that currently take a BlockDriverState, BdrvChild
> or BlockBackend as their first parameter expect the associated
> AioContext to be locked when they are called. In the case of
> no_co_wrappers, they are called from bottom halves directly in the main
> loop, so no other caller can be expected to take the lock for them. This
> can result in assertion failures because a lock that isn't taken is
> released in nested event loops.
> 
> Looking at the first parameter is already done by co_wrappers to decide
> where the coroutine should run, so doing the same in no_co_wrappers is
> only consistent. Take the lock in the generated bottom halves to fix the
> problem.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/block/block-common.h       |  3 +++
>  block/block-backend.c              |  7 ++++++-
>  scripts/block-coroutine-wrapper.py | 25 +++++++++++++++----------
>  3 files changed, 24 insertions(+), 11 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)()
  2023-05-25 12:47 ` [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)() Kevin Wolf
@ 2023-05-25 18:16   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:16 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 507 bytes --]

On Thu, May 25, 2023 at 02:47:03PM +0200, Kevin Wolf wrote:
> These functions specify that the caller must hold the "@filename
> AioContext lock". This doesn't make sense, file names don't have an
> AioContext. New BlockDriverStates always start in the main AioContext,
> so this is what we really need here.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open()
  2023-05-25 12:47 ` [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open() Kevin Wolf
@ 2023-05-25 18:20   ` Stefan Hajnoczi
  2023-05-26  8:27     ` Kevin Wolf
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:20 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 544 bytes --]

On Thu, May 25, 2023 at 02:47:04PM +0200, Kevin Wolf wrote:
> The function documentation already says that all callers must hold the
> main AioContext lock, but not all of them do. This can cause assertion
> failures when functions called by bdrv_open() try to drop the lock. Fix
> a few more callers to take the lock before calling bdrv_open().

Did you audit the code to check that there are no cases where
aio_context_acquire() is now called twice, leading to aio_poll() hangs?

Otherwise:
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads
  2023-05-25 12:47 ` [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads Kevin Wolf
@ 2023-05-25 18:37   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:37 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 768 bytes --]

On Thu, May 25, 2023 at 02:47:05PM +0200, Kevin Wolf wrote:
> This fixes blk_new_open() to not assume that bs is in the main context.
> 
> In particular, the BlockBackend must be created with the right
> AioContext because it will refuse to move to a different context
> afterwards. (blk->allow_aio_context_change is false.)
> 
> Use this opportunity to use blk_insert_bs() instead of duplicating the
> bdrv_root_attach_child() call. This is consistent with what
> blk_new_with_bs() does. Add comments to document the locking rules.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/block-backend.c | 27 +++++++++++++++++++++------
>  1 file changed, 21 insertions(+), 6 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file()
  2023-05-25 12:47 ` [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file() Kevin Wolf
@ 2023-05-25 18:40   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:40 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 376 bytes --]

On Thu, May 25, 2023 at 02:47:06PM +0200, Kevin Wolf wrote:
> bdrv_open_backing_file() calls bdrv_open_inherit(), so all callers must
> hold the main AioContext lock.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c        | 2 ++
>  block/mirror.c | 6 ++++++
>  2 files changed, 8 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/12] qcow2: Fix open with 'file' in iothread
  2023-05-25 12:47 ` [PATCH 06/12] qcow2: Fix open with 'file' in iothread Kevin Wolf
@ 2023-05-25 18:51   ` Stefan Hajnoczi
  2023-05-26  8:40     ` Kevin Wolf
  0 siblings, 1 reply; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:51 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2118 bytes --]

On Thu, May 25, 2023 at 02:47:07PM +0200, Kevin Wolf wrote:
> qcow2_open() doesn't work correctly when opening the 'file' child moves
> bs to an iothread, for several reasons:
> 
> - It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
>   coroutine, which involves dropping the AioContext lock for bs when it
>   is not in the main context - but we don't hold it, so this crashes.
> 
> - It runs the qcow2_open_entry() coroutine in the current thread instead
>   of the new AioContext of bs.
> 
> - qcow2_open_entry() doesn't notify the main loop when it's done.
> 
> This patches fixes these issues around delegating work to a coroutine.
> Temporarily dropping the main AioContext lock is not necessary because
> we know we run in the main thread.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index b00b4e7575..7f3948360d 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -1904,6 +1904,8 @@ static void coroutine_fn qcow2_open_entry(void *opaque)
>      qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, true,
>                               qoc->errp);
>      qemu_co_mutex_unlock(&s->lock);
> +
> +    aio_wait_kick();
>  }
>  
>  static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> @@ -1929,8 +1931,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
>  
>      assert(!qemu_in_coroutine());
>      assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> -    qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc));
> -    BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
> +
> +    aio_co_enter(bdrv_get_aio_context(bs),
> +                 qemu_coroutine_create(qcow2_open_entry, &qoc));
> +    AIO_WAIT_WHILE_UNLOCKED(NULL, qoc.ret == -EINPROGRESS);

Want to update the doc comment for bdrv_open_file_child() with a warning
that @parent's AioContext can change?

Otherwise:
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 07/12] raw-format: Fix open with 'file' in iothread
  2023-05-25 12:47 ` [PATCH 07/12] raw-format: " Kevin Wolf
@ 2023-05-25 18:52   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:52 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]

On Thu, May 25, 2023 at 02:47:08PM +0200, Kevin Wolf wrote:
> When opening the 'file' child moves bs to an iothread, we need to hold
> the AioContext lock of it before we can call raw_apply_options() (and
> more specifically, bdrv_getlength() inside of it).
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/raw-format.c               | 5 +++++
>  tests/unit/test-block-iothread.c | 3 ---
>  2 files changed, 5 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 08/12] copy-before-write: Fix open with child in iothread
  2023-05-25 12:47 ` [PATCH 08/12] copy-before-write: Fix open with child " Kevin Wolf
@ 2023-05-25 18:55   ` Stefan Hajnoczi
  2023-05-26 14:30   ` Eric Blake
  1 sibling, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:55 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 586 bytes --]

On Thu, May 25, 2023 at 02:47:09PM +0200, Kevin Wolf wrote:
> The AioContext lock must not be held for bdrv_open_child(), but it is
> necessary for the followig operations, in particular those using nested
> event loops in coroutine wrappers.
> 
> Temporarily dropping the main AioContext lock is not necessary because
> we know we run in the main thread.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/copy-before-write.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver()
  2023-05-25 12:47 ` [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver() Kevin Wolf
@ 2023-05-25 18:56   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:56 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 369 bytes --]

On Thu, May 25, 2023 at 02:47:10PM +0200, Kevin Wolf wrote:
> bdrv_refresh_total_sectors() and bdrv_refresh_limits() expect to be
> called under the AioContext lock of the node. Take the lock.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 7 +++++++
>  1 file changed, 7 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node()
  2023-05-25 12:47 ` [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node() Kevin Wolf
@ 2023-05-25 18:58   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:58 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 415 bytes --]

On Thu, May 25, 2023 at 02:47:11PM +0200, Kevin Wolf wrote:
> While calling bdrv_new_open_driver_opts(), the main AioContext lock must
> be held, not the lock of the AioContext of the block subtree it will be
> added to afterwards.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public
  2023-05-25 12:47 ` [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public Kevin Wolf
@ 2023-05-25 18:59   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 18:59 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 595 bytes --]

On Thu, May 25, 2023 at 02:47:12PM +0200, Kevin Wolf wrote:
> It has no internal callers, so its only use is being called from
> individual test cases. If the name starts with an underscore, it is
> considered private and linters warn against calling it. 256 only gets
> away with it currently because it's on the exception list for linters.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  tests/qemu-iotests/iotests.py | 2 +-
>  tests/qemu-iotests/256        | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 12/12] iotests: Test blockdev-create in iothread
  2023-05-25 12:47 ` [PATCH 12/12] iotests: Test blockdev-create in iothread Kevin Wolf
@ 2023-05-25 19:01   ` Stefan Hajnoczi
  0 siblings, 0 replies; 29+ messages in thread
From: Stefan Hajnoczi @ 2023-05-25 19:01 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 689 bytes --]

On Thu, May 25, 2023 at 02:47:13PM +0200, Kevin Wolf wrote:
> If blockdev-create references an existing node in an iothread (e.g. as
> it's 'file' child), then suddenly all of the image creation code must
> run in that AioContext, too. Test that this actually works.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  tests/qemu-iotests/tests/iothreads-create     | 67 +++++++++++++++++++
>  tests/qemu-iotests/tests/iothreads-create.out |  4 ++
>  2 files changed, 71 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/iothreads-create
>  create mode 100644 tests/qemu-iotests/tests/iothreads-create.out

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open()
  2023-05-25 18:20   ` Stefan Hajnoczi
@ 2023-05-26  8:27     ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2023-05-26  8:27 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1249 bytes --]

Am 25.05.2023 um 20:20 hat Stefan Hajnoczi geschrieben:
> On Thu, May 25, 2023 at 02:47:04PM +0200, Kevin Wolf wrote:
> > The function documentation already says that all callers must hold the
> > main AioContext lock, but not all of them do. This can cause assertion
> > failures when functions called by bdrv_open() try to drop the lock. Fix
> > a few more callers to take the lock before calling bdrv_open().
> 
> Did you audit the code to check that there are no cases where
> aio_context_acquire() is now called twice, leading to aio_poll() hangs?

Yes, I did go through (hopefully) all callers, so I should have caught
all of them.

I'm pretty sure that at this point in the series there are some callers
that call it while locking the wrong AioContext, but by the end of the
series they should have disappeared. This is where I couldn't find any
patch order that keeps things fully working in all intermediate steps,
but having a single giant patch felt even worse.

The good thing is that locking the main context multiple times is
harmless because we're running in the main thread, so nested event loops
can make progress even if it's still locked. (This patch, and some
others in the series, rely on this.)

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/12] qcow2: Fix open with 'file' in iothread
  2023-05-25 18:51   ` Stefan Hajnoczi
@ 2023-05-26  8:40     ` Kevin Wolf
  2023-05-26  9:04       ` Kevin Wolf
  0 siblings, 1 reply; 29+ messages in thread
From: Kevin Wolf @ 2023-05-26  8:40 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3519 bytes --]

Am 25.05.2023 um 20:51 hat Stefan Hajnoczi geschrieben:
> On Thu, May 25, 2023 at 02:47:07PM +0200, Kevin Wolf wrote:
> > qcow2_open() doesn't work correctly when opening the 'file' child moves
> > bs to an iothread, for several reasons:
> > 
> > - It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
> >   coroutine, which involves dropping the AioContext lock for bs when it
> >   is not in the main context - but we don't hold it, so this crashes.
> > 
> > - It runs the qcow2_open_entry() coroutine in the current thread instead
> >   of the new AioContext of bs.
> > 
> > - qcow2_open_entry() doesn't notify the main loop when it's done.
> > 
> > This patches fixes these issues around delegating work to a coroutine.
> > Temporarily dropping the main AioContext lock is not necessary because
> > we know we run in the main thread.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/qcow2.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index b00b4e7575..7f3948360d 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -1904,6 +1904,8 @@ static void coroutine_fn qcow2_open_entry(void *opaque)
> >      qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, true,
> >                               qoc->errp);
> >      qemu_co_mutex_unlock(&s->lock);
> > +
> > +    aio_wait_kick();
> >  }
> >  
> >  static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> > @@ -1929,8 +1931,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> >  
> >      assert(!qemu_in_coroutine());
> >      assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> > -    qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc));
> > -    BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
> > +
> > +    aio_co_enter(bdrv_get_aio_context(bs),
> > +                 qemu_coroutine_create(qcow2_open_entry, &qoc));
> > +    AIO_WAIT_WHILE_UNLOCKED(NULL, qoc.ret == -EINPROGRESS);
> 
> Want to update the doc comment for bdrv_open_file_child() with a warning
> that @parent's AioContext can change?

Ok, I'll squash in the following.

I seem to remember that bdrv_open_child() is actually wrong, too,
regarding AioContext locking, but I didn't need to fix it for this test
case. We need more test cases to break everything. :-)

And the earlier we can get rid of the AioContext lock the better,
because it seems really hard to get right across the board.

Kevin

diff --git a/block.c b/block.c
index a2f8d5a0c0..263e1e22f3 100644
--- a/block.c
+++ b/block.c
@@ -3644,6 +3644,9 @@ done:
  * BlockdevRef.
  *
  * The BlockdevRef will be removed from the options QDict.
+ *
+ * @parent can move to a different AioContext in this functions. Callers must
+ * make sure that their AioContext locking is still correct after this.
  */
 BdrvChild *bdrv_open_child(const char *filename,
                            QDict *options, const char *bdref_key,
@@ -3668,6 +3671,9 @@ BdrvChild *bdrv_open_child(const char *filename,

 /*
  * Wrapper on bdrv_open_child() for most popular case: open primary child of bs.
+ *
+ * @parent can move to a different AioContext in this functions. Callers must
+ * make sure that their AioContext locking is still correct after this.
  */
 int bdrv_open_file_child(const char *filename,
                          QDict *options, const char *bdref_key,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/12] qcow2: Fix open with 'file' in iothread
  2023-05-26  8:40     ` Kevin Wolf
@ 2023-05-26  9:04       ` Kevin Wolf
  0 siblings, 0 replies; 29+ messages in thread
From: Kevin Wolf @ 2023-05-26  9:04 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-block, hreitz, eblake, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3811 bytes --]

Am 26.05.2023 um 10:40 hat Kevin Wolf geschrieben:
> Am 25.05.2023 um 20:51 hat Stefan Hajnoczi geschrieben:
> > On Thu, May 25, 2023 at 02:47:07PM +0200, Kevin Wolf wrote:
> > > qcow2_open() doesn't work correctly when opening the 'file' child moves
> > > bs to an iothread, for several reasons:
> > > 
> > > - It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry()
> > >   coroutine, which involves dropping the AioContext lock for bs when it
> > >   is not in the main context - but we don't hold it, so this crashes.
> > > 
> > > - It runs the qcow2_open_entry() coroutine in the current thread instead
> > >   of the new AioContext of bs.
> > > 
> > > - qcow2_open_entry() doesn't notify the main loop when it's done.
> > > 
> > > This patches fixes these issues around delegating work to a coroutine.
> > > Temporarily dropping the main AioContext lock is not necessary because
> > > we know we run in the main thread.
> > > 
> > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > ---
> > >  block/qcow2.c | 8 ++++++--
> > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block/qcow2.c b/block/qcow2.c
> > > index b00b4e7575..7f3948360d 100644
> > > --- a/block/qcow2.c
> > > +++ b/block/qcow2.c
> > > @@ -1904,6 +1904,8 @@ static void coroutine_fn qcow2_open_entry(void *opaque)
> > >      qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, true,
> > >                               qoc->errp);
> > >      qemu_co_mutex_unlock(&s->lock);
> > > +
> > > +    aio_wait_kick();
> > >  }
> > >  
> > >  static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> > > @@ -1929,8 +1931,10 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
> > >  
> > >      assert(!qemu_in_coroutine());
> > >      assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> > > -    qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc));
> > > -    BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS);
> > > +
> > > +    aio_co_enter(bdrv_get_aio_context(bs),
> > > +                 qemu_coroutine_create(qcow2_open_entry, &qoc));
> > > +    AIO_WAIT_WHILE_UNLOCKED(NULL, qoc.ret == -EINPROGRESS);
> > 
> > Want to update the doc comment for bdrv_open_file_child() with a warning
> > that @parent's AioContext can change?
> 
> Ok, I'll squash in the following.
> 
> I seem to remember that bdrv_open_child() is actually wrong, too,
> regarding AioContext locking, but I didn't need to fix it for this test
> case. We need more test cases to break everything. :-)
> 
> And the earlier we can get rid of the AioContext lock the better,
> because it seems really hard to get right across the board.
> 
> Kevin
> 
> diff --git a/block.c b/block.c
> index a2f8d5a0c0..263e1e22f3 100644
> --- a/block.c
> +++ b/block.c
> @@ -3644,6 +3644,9 @@ done:
>   * BlockdevRef.
>   *
>   * The BlockdevRef will be removed from the options QDict.
> + *
> + * @parent can move to a different AioContext in this functions. Callers must

s/functions/function/, in fact. :-)

> + * make sure that their AioContext locking is still correct after this.
>   */
>  BdrvChild *bdrv_open_child(const char *filename,
>                             QDict *options, const char *bdref_key,
> @@ -3668,6 +3671,9 @@ BdrvChild *bdrv_open_child(const char *filename,
> 
>  /*
>   * Wrapper on bdrv_open_child() for most popular case: open primary child of bs.
> + *
> + * @parent can move to a different AioContext in this functions. Callers must

And here, too.

> + * make sure that their AioContext locking is still correct after this.
>   */
>  int bdrv_open_file_child(const char *filename,
>                           QDict *options, const char *bdref_key,

Kevin

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 08/12] copy-before-write: Fix open with child in iothread
  2023-05-25 12:47 ` [PATCH 08/12] copy-before-write: Fix open with child " Kevin Wolf
  2023-05-25 18:55   ` Stefan Hajnoczi
@ 2023-05-26 14:30   ` Eric Blake
  1 sibling, 0 replies; 29+ messages in thread
From: Eric Blake @ 2023-05-26 14:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, stefanha, hreitz, qemu-devel

On Thu, May 25, 2023 at 02:47:09PM +0200, Kevin Wolf wrote:
> The AioContext lock must not be held for bdrv_open_child(), but it is
> necessary for the followig operations, in particular those using nested

following

> event loops in coroutine wrappers.
> 
> Temporarily dropping the main AioContext lock is not necessary because
> we know we run in the main thread.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/copy-before-write.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-05-26 14:31 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-25 12:47 [PATCH 00/12] block: Fix blockdev-create with iothreads Kevin Wolf
2023-05-25 12:47 ` [PATCH 01/12] block-coroutine-wrapper: Take AioContext lock in no_co_wrappers Kevin Wolf
2023-05-25 18:15   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 02/12] block: Clarify locking rules for bdrv_open(_inherit)() Kevin Wolf
2023-05-25 18:16   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 03/12] block: Take main AioContext lock when calling bdrv_open() Kevin Wolf
2023-05-25 18:20   ` Stefan Hajnoczi
2023-05-26  8:27     ` Kevin Wolf
2023-05-25 12:47 ` [PATCH 04/12] block-backend: Fix blk_new_open() for iothreads Kevin Wolf
2023-05-25 18:37   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 05/12] mirror: Hold main AioContext lock for calling bdrv_open_backing_file() Kevin Wolf
2023-05-25 18:40   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 06/12] qcow2: Fix open with 'file' in iothread Kevin Wolf
2023-05-25 18:51   ` Stefan Hajnoczi
2023-05-26  8:40     ` Kevin Wolf
2023-05-26  9:04       ` Kevin Wolf
2023-05-25 12:47 ` [PATCH 07/12] raw-format: " Kevin Wolf
2023-05-25 18:52   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 08/12] copy-before-write: Fix open with child " Kevin Wolf
2023-05-25 18:55   ` Stefan Hajnoczi
2023-05-26 14:30   ` Eric Blake
2023-05-25 12:47 ` [PATCH 09/12] block: Take AioContext lock in bdrv_open_driver() Kevin Wolf
2023-05-25 18:56   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 10/12] block: Fix AioContext locking in bdrv_insert_node() Kevin Wolf
2023-05-25 18:58   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 11/12] iotests: Make verify_virtio_scsi_pci_or_ccw() public Kevin Wolf
2023-05-25 18:59   ` Stefan Hajnoczi
2023-05-25 12:47 ` [PATCH 12/12] iotests: Test blockdev-create in iothread Kevin Wolf
2023-05-25 19:01   ` Stefan Hajnoczi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).