From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: stefanha@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH for-8.2] export/vhost-user-blk: Fix consecutive drains
Date: Mon, 27 Nov 2023 12:55:20 +0100
Message-ID: <ZWSDqCuPEI_MtH6u@redhat.com>
In-Reply-To: <20231124174436.46536-1-kwolf@redhat.com>

On 24.11.2023 at 18:44, Kevin Wolf wrote:
> The vhost-user-blk export implements AioContext switches in its drain
> implementation. This means that on drain_begin, it detaches the server
> from its AioContext and on drain_end, attaches it again and schedules
> the server->co_trip coroutine in the updated AioContext.
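
For context, these hooks are the export's BlockDevOps drained callbacks.
A simplified sketch of how they are wired up in
block/export/vhost-user-blk-server.c (not the full ops table):

    static const BlockDevOps vu_blk_dev_ops = {
        .drained_begin = vu_blk_drained_begin,
        .drained_end   = vu_blk_drained_end,
        .drained_poll  = vu_blk_drained_poll,
    };

    /* Registered on the export's BlockBackend when the export is created: */
    blk_set_dev_ops(vexp->export.blk, &vu_blk_dev_ops, vexp);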
> 
> However, nothing guarantees that server->co_trip is even safe to
> schedule. Not only is it unclear whether the coroutine is in a state
> where it can be reentered externally without causing problems, but with
> two consecutive drains, the scheduled coroutine may not have had a
> chance to run yet, and scheduling an already scheduled coroutine a
> second time crashes with an assertion failure.
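
Spelled out, the crash is just two schedules before the coroutine ever
runs (illustration, not the literal call sites):

    aio_co_schedule(ctx, server->co_trip);   /* first drained_end */
    /* co_trip hasn't run yet when the second drain section ends... */
    aio_co_schedule(ctx, server->co_trip);   /* second drained_end: abort() */

aio_co_schedule() aborts when it finds the coroutine already scheduled.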
> 
> Following the model of NBD, this commit makes the vhost-user-blk export
> shut down server->co_trip during drain so that resuming the export means
> creating and scheduling a new coroutine, which is always safe.
> 
> There is one exception: If the drain call didn't poll (for example, this
> happens in the context of bdrv_graph_wrlock()), then the coroutine
> didn't have a chance to shut down. However, in that case the AioContext
> can't have changed, because changing the AioContext always involves a
> polling drain. So we can simply assert that the AioContext is unchanged
> and leave the coroutine running, or wake it up if it has yielded while
> waiting for the AioContext to be attached again.
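
Putting both cases together, the attach side conceptually becomes the
following (a simplified sketch, not the literal hunk from this patch):

    void vhost_user_server_attach_aio_context(VuServer *server,
                                              AioContext *ctx)
    {
        server->ctx = ctx;
        /* ... re-attach server->ioc handlers to ctx ... */

        if (server->co_trip) {
            /*
             * Non-polling drain: co_trip never terminated. The
             * AioContext cannot have changed, so waking it is safe.
             */
            if (server->wake_on_ctx_attach) {
                server->wake_on_ctx_attach = false;
                aio_co_wake(server->co_trip);
            }
        } else {
            /* Normal case: drain shut co_trip down; start a fresh one. */
            server->co_trip = qemu_coroutine_create(vu_client_trip, server);
            aio_co_schedule(ctx, server->co_trip);
        }
    }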
> 
> Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf
> Fixes: https://issues.redhat.com/browse/RHEL-1708
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/vhost-user-server.h     |  2 ++
>  block/export/vhost-user-blk-server.c |  9 +++++--
>  util/vhost-user-server.c             | 36 +++++++++++++++++++++++-----
>  3 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
> index 64ad701015..ca1713b53e 100644
> --- a/include/qemu/vhost-user-server.h
> +++ b/include/qemu/vhost-user-server.h
> @@ -45,6 +45,8 @@ typedef struct {
>      /* Protected by ctx lock */
>      bool in_qio_channel_yield;
>      bool wait_idle;
> +    bool quiescing;
> +    bool wake_on_ctx_attach;
>      VuDev vu_dev;
>      QIOChannel *ioc; /* The I/O channel with the client */
>      QIOChannelSocket *sioc; /* The underlying data channel with the client */
> diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
> index fe2cee3a78..16f48388d3 100644
> --- a/block/export/vhost-user-blk-server.c
> +++ b/block/export/vhost-user-blk-server.c
> @@ -283,6 +283,7 @@ static void vu_blk_drained_begin(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
>  
> +    vexp->vu_server.quiescing = true;
>      vhost_user_server_detach_aio_context(&vexp->vu_server);
>  }
>  
> @@ -291,19 +292,23 @@ static void vu_blk_drained_end(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
>  
> +    vexp->vu_server.quiescing = false;
>      vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
>  }
>  
>  /*
> - * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
> + * Ensures that bdrv_drained_begin() waits until in-flight requests complete
> + * and the server->co_trip coroutine has terminated. It will be restarted in
> + * vhost_user_server_attach_aio_context().
>   *
>   * Called with vexp->export.ctx acquired.
>   */
>  static bool vu_blk_drained_poll(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
> +    VuServer *server = &vexp->vu_server;
>  
> -    return vhost_user_server_has_in_flight(&vexp->vu_server);
> +    return server->co_trip || vhost_user_server_has_in_flight(server);
>  }
>  
>  static const BlockDevOps vu_blk_dev_ops = {
> diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
> index 5ccc6d24a0..23004d0c62 100644
> --- a/util/vhost-user-server.c
> +++ b/util/vhost-user-server.c
> @@ -133,7 +133,9 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
>                      server->in_qio_channel_yield = false;
>                  } else {
>                      /* Wait until attached to an AioContext again */
> +                    server->wake_on_ctx_attach = true;
>                      qemu_coroutine_yield();
> +                    assert(!server->wake_on_ctx_attach);
>                  }

Yielding here isn't good enough: drained_poll now waits for the
coroutine to terminate, so if the coroutine yields at this point, the
drain will hang. v2 will return instead.
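
Concretely, something in this direction (a sketch of the idea only; the
exact v2 change may look different):

    } else {
        /*
         * We are quiescing and detached from the AioContext: don't
         * yield, because drained_poll waits for co_trip to terminate.
         * Failing the read instead lets vu_client_trip() wind down.
         */
        return false;
    }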

Kevin


