qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] blcok: Start/end drain on correct AioContext
@ 2022-09-23 12:52 Hanna Reitz
  2022-09-23 12:52 ` [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS Hanna Reitz
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Hanna Reitz @ 2022-09-23 12:52 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Hanna Reitz, Kevin Wolf, Stefan Hajnoczi

Hi,

bdrv_replace_child_noperm() drains the child via
bdrv_parent_drained_{begin,end}_single().  When it removes a child, the
bdrv_parent_drained_end_single() at its end will be called on an empty
child, making the BDRV_POLL_WHILE() in it poll the main AioContext
(because c->bs is NULL).

That’s wrong, though, because it’s supposed to operate on the parent.
bdrv_parent_drained_end_single_no_poll() will have scheduled any BHs in
the parents’ AioContext, which may be anything, not necessarily the main
context.  Therefore, we must poll the parent’s context.

Patch 3 does this for both bdrv_parent_drained_{begin,end}_single().
Patch 1 ensures that we can legally call
bdrv_child_get_parent_aio_context() from those functions (currently
marked as GLOBAL_STATE_CODE(), which I don’t think it is), and patch 2
fixes blk_do_set_aio_context() to not cause an assertion failure if it
beginning a drain can end up in blk_get_aio_context() before blk->ctx
has been updated.


Hanna Reitz (3):
  block: bdrv_child_get_parent_aio_context is not GS
  block-backend: Update ctx immediately after root
  block: Start/end drain on correct AioContext

 block.c               | 2 +-
 block/block-backend.c | 4 +++-
 block/io.c            | 6 ++++--
 3 files changed, 8 insertions(+), 4 deletions(-)

-- 
2.36.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS
  2022-09-23 12:52 [PATCH 0/3] blcok: Start/end drain on correct AioContext Hanna Reitz
@ 2022-09-23 12:52 ` Hanna Reitz
  2022-10-07  8:44   ` Kevin Wolf
  2022-09-23 12:52 ` [PATCH 2/3] block-backend: Update ctx immediately after root Hanna Reitz
  2022-09-23 12:52 ` [PATCH 3/3] block: Start/end drain on correct AioContext Hanna Reitz
  2 siblings, 1 reply; 7+ messages in thread
From: Hanna Reitz @ 2022-09-23 12:52 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Hanna Reitz, Kevin Wolf, Stefan Hajnoczi

All implementations of bdrv_child_get_parent_aio_context() are IO_CODE
(or do not mark anything in the case of block jobs), so this too can be
IO_CODE.  By the definition of "I/O API functions" in block-io.h, this
is a strict relaxation, as I/O code can be run from both GS and I/O code
arbitrarily.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index bc85f46eed..7f2a9d4df0 100644
--- a/block.c
+++ b/block.c
@@ -1499,7 +1499,7 @@ const BdrvChildClass child_of_bds = {
 
 AioContext *bdrv_child_get_parent_aio_context(BdrvChild *c)
 {
-    GLOBAL_STATE_CODE();
+    IO_CODE();
     return c->klass->get_parent_aio_context(c);
 }
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] block-backend: Update ctx immediately after root
  2022-09-23 12:52 [PATCH 0/3] blcok: Start/end drain on correct AioContext Hanna Reitz
  2022-09-23 12:52 ` [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS Hanna Reitz
@ 2022-09-23 12:52 ` Hanna Reitz
  2022-10-07  8:48   ` Kevin Wolf
  2022-09-23 12:52 ` [PATCH 3/3] block: Start/end drain on correct AioContext Hanna Reitz
  2 siblings, 1 reply; 7+ messages in thread
From: Hanna Reitz @ 2022-09-23 12:52 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Hanna Reitz, Kevin Wolf, Stefan Hajnoczi

blk_get_aio_context() asserts that blk->ctx is always equal to the root
BDS's context (if there is a root BDS).  Therefore,
blk_do_set_aio_context() must update blk->ctx immediately after the root
BDS's context has changed.

Without this patch, the next patch would break iotest 238, because
bdrv_drained_begin() (called by blk_do_set_aio_context()) may then
invoke bdrv_child_get_parent_aio_context() on the root child, i.e.
blk_get_aio_context().  However, by this point, blk->ctx would not have
been updated and thus differ from the root node's context.  This patch
fixes that.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block/block-backend.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index d4a5df2ac2..abdb5ff5af 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2156,6 +2156,7 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
                 return ret;
             }
         }
+        blk->ctx = new_context;
         if (tgm->throttle_state) {
             bdrv_drained_begin(bs);
             throttle_group_detach_aio_context(tgm);
@@ -2164,9 +2165,10 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
         }
 
         bdrv_unref(bs);
+    } else {
+        blk->ctx = new_context;
     }
 
-    blk->ctx = new_context;
     return 0;
 }
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] block: Start/end drain on correct AioContext
  2022-09-23 12:52 [PATCH 0/3] blcok: Start/end drain on correct AioContext Hanna Reitz
  2022-09-23 12:52 ` [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS Hanna Reitz
  2022-09-23 12:52 ` [PATCH 2/3] block-backend: Update ctx immediately after root Hanna Reitz
@ 2022-09-23 12:52 ` Hanna Reitz
  2022-10-07  9:05   ` Kevin Wolf
  2 siblings, 1 reply; 7+ messages in thread
From: Hanna Reitz @ 2022-09-23 12:52 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, Hanna Reitz, Kevin Wolf, Stefan Hajnoczi

bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
parent, not on the child, so they should not attempt to get the context
to poll from the child but the parent instead.  BDRV_POLL_WHILE(c->bs)
does get the context from the child, so we should replace it with
AIO_WAIT_WHILE() on the parent's context instead.

This problem becomes apparent when bdrv_replace_child_noperm() invokes
bdrv_parent_drained_end_single() after removing a child from a subgraph
that is in an I/O thread.  By the time bdrv_parent_drained_end_single()
is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
poll the main loop instead of the I/O thread; but anything that
bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
want to run in the I/O thread, but because we poll the main loop, the
I/O thread is never unpaused, and nothing is run, resulting in a
deadlock.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 block/io.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/io.c b/block/io.c
index 0a8cbefe86..7df1129b3b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -71,9 +71,10 @@ static void bdrv_parent_drained_end_single_no_poll(BdrvChild *c,
 void bdrv_parent_drained_end_single(BdrvChild *c)
 {
     int drained_end_counter = 0;
+    AioContext *ctx = bdrv_child_get_parent_aio_context(c);
     IO_OR_GS_CODE();
     bdrv_parent_drained_end_single_no_poll(c, &drained_end_counter);
-    BDRV_POLL_WHILE(c->bs, qatomic_read(&drained_end_counter) > 0);
+    AIO_WAIT_WHILE(ctx, qatomic_read(&drained_end_counter) > 0);
 }
 
 static void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore,
@@ -116,13 +117,14 @@ static bool bdrv_parent_drained_poll(BlockDriverState *bs, BdrvChild *ignore,
 
 void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll)
 {
+    AioContext *ctx = bdrv_child_get_parent_aio_context(c);
     IO_OR_GS_CODE();
     c->parent_quiesce_counter++;
     if (c->klass->drained_begin) {
         c->klass->drained_begin(c);
     }
     if (poll) {
-        BDRV_POLL_WHILE(c->bs, bdrv_parent_drained_poll_single(c));
+        AIO_WAIT_WHILE(ctx, bdrv_parent_drained_poll_single(c));
     }
 }
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS
  2022-09-23 12:52 ` [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS Hanna Reitz
@ 2022-10-07  8:44   ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-10-07  8:44 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi, eesposit

Am 23.09.2022 um 14:52 hat Hanna Reitz geschrieben:
> All implementations of bdrv_child_get_parent_aio_context() are IO_CODE
> (or do not mark anything in the case of block jobs), so this too can be
> IO_CODE.  By the definition of "I/O API functions" in block-io.h, this
> is a strict relaxation, as I/O code can be run from both GS and I/O code
> arbitrarily.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

There are three implementations of .get_parent_aio_context in the tree:

1. child_of_bds_get_parent_aio_context() in block.c
   This is already IO_CODE(), good.

2. child_job_get_parent_aio_context() in blockjob.c
   This is explicitly marked GLOBAL_STATE_CODE() after Emanuele's series
   to avoid the AioContext lock in jobs. I suppose it could be made
   IO_CODE() if it also used JOB_LOCK_GUARD().

3. blk_root_get_parent_aio_context() in block-backend.c
   This doesn't have any annotation, but it only calls
   blk_get_aio_context(), which is IO_CODE. So this one is good, too.

Seems we just have a semantic merge conflict with Emanuele's series. Can
you rebase on top of my tree?

Kevin



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/3] block-backend: Update ctx immediately after root
  2022-09-23 12:52 ` [PATCH 2/3] block-backend: Update ctx immediately after root Hanna Reitz
@ 2022-10-07  8:48   ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-10-07  8:48 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi

Am 23.09.2022 um 14:52 hat Hanna Reitz geschrieben:
> blk_get_aio_context() asserts that blk->ctx is always equal to the root
> BDS's context (if there is a root BDS).  Therefore,
> blk_do_set_aio_context() must update blk->ctx immediately after the root
> BDS's context has changed.
> 
> Without this patch, the next patch would break iotest 238, because
> bdrv_drained_begin() (called by blk_do_set_aio_context()) may then
> invoke bdrv_child_get_parent_aio_context() on the root child, i.e.
> blk_get_aio_context().  However, by this point, blk->ctx would not have
> been updated and thus differ from the root node's context.  This patch
> fixes that.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
>  block/block-backend.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index d4a5df2ac2..abdb5ff5af 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -2156,6 +2156,7 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
>                  return ret;
>              }
>          }
> +        blk->ctx = new_context;
>          if (tgm->throttle_state) {
>              bdrv_drained_begin(bs);
>              throttle_group_detach_aio_context(tgm);
> @@ -2164,9 +2165,10 @@ static int blk_do_set_aio_context(BlockBackend *blk, AioContext *new_context,
>          }
>  
>          bdrv_unref(bs);
> +    } else {
> +        blk->ctx = new_context;
>      }
>  
> -    blk->ctx = new_context;
>      return 0;
>  }

Makes sense. Maybe in the first branch a comment wouldn't hurt (like
"make blk->ctx consistent with the root node again before any other
operations like drain").

Reviewed-by: Kevin Wolf <kwolf@redhat.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 3/3] block: Start/end drain on correct AioContext
  2022-09-23 12:52 ` [PATCH 3/3] block: Start/end drain on correct AioContext Hanna Reitz
@ 2022-10-07  9:05   ` Kevin Wolf
  0 siblings, 0 replies; 7+ messages in thread
From: Kevin Wolf @ 2022-10-07  9:05 UTC (permalink / raw)
  To: Hanna Reitz; +Cc: qemu-block, qemu-devel, Stefan Hajnoczi

Am 23.09.2022 um 14:52 hat Hanna Reitz geschrieben:
> bdrv_parent_drained_{begin,end}_single() are supposed to operate on the
> parent, not on the child, so they should not attempt to get the context
> to poll from the child but the parent instead.  BDRV_POLL_WHILE(c->bs)
> does get the context from the child, so we should replace it with
> AIO_WAIT_WHILE() on the parent's context instead.
> 
> This problem becomes apparent when bdrv_replace_child_noperm() invokes
> bdrv_parent_drained_end_single() after removing a child from a subgraph
> that is in an I/O thread.  By the time bdrv_parent_drained_end_single()
> is called, child->bs is NULL, and so BDRV_POLL_WHILE(c->bs, ...) will
> poll the main loop instead of the I/O thread; but anything that
> bdrv_parent_drained_end_single_no_poll() may have scheduled is going to
> want to run in the I/O thread, but because we poll the main loop, the
> I/O thread is never unpaused, and nothing is run, resulting in a
> deadlock.
> 
> Closes: https://gitlab.com/qemu-project/qemu/-/issues/1215
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

Hmm... Seeing a BdrvChild with child->bs == NULL outside of functions
directly manipulating it is surprising. I think we need to document that
at least for bdrv_parent_drained_begin/end_single(), that they can get
such BdrvChild objects passed.

Even then, polling with an incomplete BdrvChild in the graph doesn't
sound like the best idea either, but not sure how to avoid that best.
Maybe we would need a different order in bdrv_replace_child_noperm() if
either old_bs or new_bs is NULL.

Anyway, the code change itself looks reasonable:

Reviewed-by: Kevin Wolf <kwolf@redhat.com>



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-10-07  9:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-09-23 12:52 [PATCH 0/3] blcok: Start/end drain on correct AioContext Hanna Reitz
2022-09-23 12:52 ` [PATCH 1/3] block: bdrv_child_get_parent_aio_context is not GS Hanna Reitz
2022-10-07  8:44   ` Kevin Wolf
2022-09-23 12:52 ` [PATCH 2/3] block-backend: Update ctx immediately after root Hanna Reitz
2022-10-07  8:48   ` Kevin Wolf
2022-09-23 12:52 ` [PATCH 3/3] block: Start/end drain on correct AioContext Hanna Reitz
2022-10-07  9:05   ` Kevin Wolf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).