qemu-devel.nongnu.org archive mirror
From: Greg Kurz <groug@kaod.org>
To: <qemu-devel@nongnu.org>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	Laurent Vivier <lvivier@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Yajun Wu <yajunw@nvidia.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Parav Pandit <parav@nvidia.com>, <qemu-stable@nongnu.org>,
	Yanghang Liu <yanghliu@redhat.com>
Subject: Re: [PATCH 1/2] Revert "vhost-user: Monitor slave channel in vhost_user_read()"
Date: Thu, 19 Jan 2023 19:39:49 +0100	[thread overview]
Message-ID: <20230119193949.013738a0@bahia> (raw)
In-Reply-To: <20230119172424.478268-2-groug@kaod.org>

For some reason, the cover letter didn't make it, even though
git publish reported it was sent. I'll repost tomorrow if it
is still missing (or possibly craft a message manually with
the appropriate id if I find time).

Sorry for the inconvenience.

Cheers,

--
Greg


On Thu, 19 Jan 2023 18:24:23 +0100
Greg Kurz <groug@kaod.org> wrote:

> This reverts commit db8a3772e300c1a656331a92da0785d81667dc81.
> 
> Motivation: this breaks vhost-user with DPDK, as reported in [0].
> 
> Received unexpected msg type. Expected 22 received 40
> Fail to update device iotlb
> Received unexpected msg type. Expected 40 received 22
> Received unexpected msg type. Expected 22 received 11
> Fail to update device iotlb
> Received unexpected msg type. Expected 11 received 22
> vhost VQ 1 ring restore failed: -71: Protocol error (71)
> Received unexpected msg type. Expected 22 received 11
> Fail to update device iotlb
> Received unexpected msg type. Expected 11 received 22
> vhost VQ 0 ring restore failed: -71: Protocol error (71)
> unable to start vhost net: 71: falling back on userspace virtio
> 
> The failing sequence that leads to the first error is:
> - QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master
>   socket
> - QEMU starts a nested event loop in order to wait for the
>   VHOST_USER_GET_STATUS response and to be able to process messages from
>   the slave channel
> - DPDK sends a couple of legitimate IOTLB miss messages on the slave
>   channel
> - QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22)
>   updates on the master socket
> - QEMU expects the response to its latest VHOST_USER_IOTLB_MSG but
>   gets the VHOST_USER_GET_STATUS response instead
> 
> The subsequent errors have the same root cause: the nested event loop
> breaks message ordering by design. It leads QEMU to expect that the
> response to the latest message sent on the master socket arrives first.
> 
> Since this was only needed for DAX enablement, which is still not
> merged upstream, just drop the code for now. A working solution will
> have to be merged later on: likely protecting the master socket with
> a mutex and servicing the slave channel with a separate thread, as
> discussed with Maxime in the mail thread below.
> 
> [0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/
> 
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/virtio/vhost-user.c | 35 +++--------------------------------
>  1 file changed, 3 insertions(+), 32 deletions(-)
> 
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index d9ce0501b2c7..7fb78af22c56 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -356,35 +356,6 @@ end:
>      return G_SOURCE_REMOVE;
>  }
>  
> -static gboolean slave_read(QIOChannel *ioc, GIOCondition condition,
> -                           gpointer opaque);
> -
> -/*
> - * This updates the read handler to use a new event loop context.
> - * Event sources are removed from the previous context : this ensures
> - * that events detected in the previous context are purged. They will
> - * be re-detected and processed in the new context.
> - */
> -static void slave_update_read_handler(struct vhost_dev *dev,
> -                                      GMainContext *ctxt)
> -{
> -    struct vhost_user *u = dev->opaque;
> -
> -    if (!u->slave_ioc) {
> -        return;
> -    }
> -
> -    if (u->slave_src) {
> -        g_source_destroy(u->slave_src);
> -        g_source_unref(u->slave_src);
> -    }
> -
> -    u->slave_src = qio_channel_add_watch_source(u->slave_ioc,
> -                                                G_IO_IN | G_IO_HUP,
> -                                                slave_read, dev, NULL,
> -                                                ctxt);
> -}
> -
>  static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
>  {
>      struct vhost_user *u = dev->opaque;
> @@ -406,7 +377,6 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
>       * be prepared for re-entrancy. So we create a new one and switch chr
>       * to use it.
>       */
> -    slave_update_read_handler(dev, ctxt);
>      qemu_chr_be_update_read_handlers(chr->chr, ctxt);
>      qemu_chr_fe_add_watch(chr, G_IO_IN | G_IO_HUP, vhost_user_read_cb, &data);
>  
> @@ -418,7 +388,6 @@ static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg)
>       * context that have been processed by the nested loop are purged.
>       */
>      qemu_chr_be_update_read_handlers(chr->chr, prev_ctxt);
> -    slave_update_read_handler(dev, NULL);
>  
>      g_main_loop_unref(loop);
>      g_main_context_unref(ctxt);
> @@ -1807,7 +1776,9 @@ static int vhost_setup_slave_channel(struct vhost_dev *dev)
>          return -ECONNREFUSED;
>      }
>      u->slave_ioc = ioc;
> -    slave_update_read_handler(dev, NULL);
> +    u->slave_src = qio_channel_add_watch_source(u->slave_ioc,
> +                                                G_IO_IN | G_IO_HUP,
> +                                                slave_read, dev, NULL, NULL);
>  
>      if (reply_supported) {
>          msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;




Thread overview: 8+ messages
2023-01-19 17:24 [PATCH 0/2] vhost-user: Remove the nested event loop to unbreak the DPDK use case Greg Kurz
2023-01-19 17:24 ` [PATCH 1/2] Revert "vhost-user: Monitor slave channel in vhost_user_read()" Greg Kurz
2023-01-19 18:39   ` Greg Kurz [this message]
2023-01-19 21:10   ` Stefan Hajnoczi
2023-01-20 15:03   ` Maxime Coquelin
2023-01-19 17:24 ` [PATCH 2/2] Revert "vhost-user: Introduce nested event loop " Greg Kurz
2023-01-20 15:13   ` Maxime Coquelin
2023-01-23  8:21 ` [PATCH 0/2] vhost-user: Remove the nested event loop to unbreak the DPDK use case Greg Kurz
