From: Kevin Wolf <kwolf@redhat.com>
To: qemu-block@nongnu.org
Cc: stefanha@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH for-8.2] export/vhost-user-blk: Fix consecutive drains
Date: Mon, 27 Nov 2023 12:55:20 +0100
Message-ID: <ZWSDqCuPEI_MtH6u@redhat.com>
In-Reply-To: <20231124174436.46536-1-kwolf@redhat.com>

On 24.11.2023 at 18:44, Kevin Wolf wrote:
> The vhost-user-blk export implements AioContext switches in its drain
> implementation. This means that on drain_begin, it detaches the server
> from its AioContext and on drain_end, attaches it again and schedules
> the server->co_trip coroutine in the updated AioContext.
> 
> However, nothing guarantees that server->co_trip is even safe to be
> scheduled. Not only is it unclear that the coroutine is actually in a
> state where it can be reentered externally without causing problems, but
> with two consecutive drains, it is possible that the scheduled coroutine
> hasn't had a chance to run yet, and trying to schedule an already
> scheduled coroutine a second time crashes with an assertion failure.
> 
> Following the model of NBD, this commit makes the vhost-user-blk export
> shut down server->co_trip during drain so that resuming the export means
> creating and scheduling a new coroutine, which is always safe.
> 
> There is one exception: If the drain call didn't poll (for example, this
> happens in the context of bdrv_graph_wrlock()), then the coroutine
> didn't have a chance to shut down. However, in this case the AioContext
> can't have changed; changing the AioContext always involves a polling
> drain. So in this case we can simply assert that the AioContext is
> unchanged and just leave the coroutine running or wake it up if it has
> yielded to wait for the AioContext to be attached again.
> 
> Fixes: e1054cd4aad03a493a5d1cded7508f7c348205bf
> Fixes: https://issues.redhat.com/browse/RHEL-1708
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/vhost-user-server.h     |  2 ++
>  block/export/vhost-user-blk-server.c |  9 +++++--
>  util/vhost-user-server.c             | 36 +++++++++++++++++++++++-----
>  3 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h
> index 64ad701015..ca1713b53e 100644
> --- a/include/qemu/vhost-user-server.h
> +++ b/include/qemu/vhost-user-server.h
> @@ -45,6 +45,8 @@ typedef struct {
>      /* Protected by ctx lock */
>      bool in_qio_channel_yield;
>      bool wait_idle;
> +    bool quiescing;
> +    bool wake_on_ctx_attach;
>      VuDev vu_dev;
>      QIOChannel *ioc; /* The I/O channel with the client */
>      QIOChannelSocket *sioc; /* The underlying data channel with the client */
> diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c
> index fe2cee3a78..16f48388d3 100644
> --- a/block/export/vhost-user-blk-server.c
> +++ b/block/export/vhost-user-blk-server.c
> @@ -283,6 +283,7 @@ static void vu_blk_drained_begin(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
>  
> +    vexp->vu_server.quiescing = true;
>      vhost_user_server_detach_aio_context(&vexp->vu_server);
>  }
>  
> @@ -291,19 +292,23 @@ static void vu_blk_drained_end(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
>  
> +    vexp->vu_server.quiescing = false;
>      vhost_user_server_attach_aio_context(&vexp->vu_server, vexp->export.ctx);
>  }
>  
>  /*
> - * Ensures that bdrv_drained_begin() waits until in-flight requests complete.
> + * Ensures that bdrv_drained_begin() waits until in-flight requests complete
> + * and the server->co_trip coroutine has terminated. It will be restarted in
> + * vhost_user_server_attach_aio_context().
>   *
>   * Called with vexp->export.ctx acquired.
>   */
>  static bool vu_blk_drained_poll(void *opaque)
>  {
>      VuBlkExport *vexp = opaque;
> +    VuServer *server = &vexp->vu_server;
>  
> -    return vhost_user_server_has_in_flight(&vexp->vu_server);
> +    return server->co_trip || vhost_user_server_has_in_flight(server);
>  }
>  
>  static const BlockDevOps vu_blk_dev_ops = {
> diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
> index 5ccc6d24a0..23004d0c62 100644
> --- a/util/vhost-user-server.c
> +++ b/util/vhost-user-server.c
> @@ -133,7 +133,9 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg *vmsg)
>                      server->in_qio_channel_yield = false;
>                  } else {
>                      /* Wait until attached to an AioContext again */
> +                    server->wake_on_ctx_attach = true;
>                      qemu_coroutine_yield();
> +                    assert(!server->wake_on_ctx_attach);
>                  }

Yielding here isn't good enough: drained_poll now waits for the coroutine
to terminate, so if the coroutine is parked at this yield, the drain will
hang. v2 will return instead.
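
To illustrate the hang, here is a toy model in plain C. It is not QEMU
code: every name in it (toy_server, toy_drained_poll, toy_co_trip_step,
toy_drain) is hypothetical, and it assumes that a finished coroutine
clears the co_trip pointer, which the quoted hunks do not show. The model
only mirrors the patched poll condition, "co_trip is set or requests are
in flight", and compares a coroutine that yields while detached with one
that returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in QEMU. */
enum trip_state { TRIP_RUNNING, TRIP_PARKED, TRIP_DONE };

struct toy_server {
    bool quiescing;        /* set by drained_begin in the patch */
    bool has_co_trip;      /* stands in for server->co_trip != NULL */
    int in_flight;         /* outstanding requests */
    enum trip_state state; /* what the modelled coroutine is doing */
};

/* Mirrors the patched poll condition: true means "keep polling". */
static bool toy_drained_poll(const struct toy_server *s)
{
    return s->has_co_trip || s->in_flight > 0;
}

/* One step of the coroutine after it notices that it is detached. */
static void toy_co_trip_step(struct toy_server *s, bool return_when_detached)
{
    if (s->state != TRIP_RUNNING) {
        return; /* parked or finished: nothing runs until it is woken */
    }
    assert(s->quiescing);
    if (return_when_detached) {
        /* Assumed v2 behaviour: the coroutine ends and co_trip is cleared. */
        s->state = TRIP_DONE;
        s->has_co_trip = false;
    } else {
        /* v1 behaviour: yield and wait for attach; co_trip stays set. */
        s->state = TRIP_PARKED;
    }
}

/*
 * Runs a bounded drain loop. Returns the number of poll iterations needed
 * to become quiescent, or -1 if the server is still busy after max_iters.
 */
static int toy_drain(bool return_when_detached, int max_iters)
{
    struct toy_server s = {
        .quiescing = true,
        .has_co_trip = true,
        .in_flight = 0,
        .state = TRIP_RUNNING,
    };

    for (int i = 0; i < max_iters; i++) {
        if (!toy_drained_poll(&s)) {
            return i;
        }
        toy_co_trip_step(&s, return_when_detached);
    }
    return toy_drained_poll(&s) ? -1 : max_iters;
}
```

In this model, toy_drain(false, n) never becomes quiescent for any n,
because nothing wakes the parked coroutine while the drain is still
polling. toy_drain(true, n) finishes after a single step.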

Kevin


