Re: [PATCH] vhost-user: fix the issue of vhost deadlock in nested virtualization

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Hao Chen <chenh@yusur.tech>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>,
	huangml@yusur.tech, zy@yusur.tech
Subject: Re: [PATCH] vhost-user: fix the issue of vhost deadlock in nested virtualization
Date: Tue, 20 Feb 2024 06:43:28 -0500
Message-ID: <20240220064027-mutt-send-email-mst@kernel.org>
In-Reply-To: <5176a8e4-dbdc-45e0-a1f2-d9cb3b71a6b1@redhat.com>

On Tue, Feb 20, 2024 at 12:26:49PM +0100, Maxime Coquelin wrote:
> 
> 
> On 2/13/24 11:05, Michael S. Tsirkin wrote:
> > On Fri, Jan 26, 2024 at 06:07:37PM +0800, Hao Chen wrote:
> > > I run "dpdk-vdpa" and "qemu-L2" in "qemu-L1".
> > > 
> > > In a nested virtualization environment, "qemu-L2" vhost-user socket sends
> > > a "VHOST_USER_IOTLB_MSG" message to "dpdk-vdpa" and blocks waiting for
> > > "dpdk-vdpa" to process the message.
> > > If "dpdk-vdpa" doesn't complete the processing of the "VHOST_USER_IOTLB_MSG"
> > > message and sends a message that needs to be replied in another thread,
> > > such as "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", "dpdk-vdpa" will also
> > > block and wait for "qemu-L2" to process this message. However, "qemu-L2"
> > > vhost-user's socket is blocking while waiting for a reply from "dpdk-vdpa"
> > > after processing the message "VHOST_USER_IOTLB_MSG", and
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" will not be processed.
> > > In this case, both "dpdk-vdpa" and "qemu-L2" are blocked on the
> > > vhost read, resulting in a deadlock.
> > > 
> > > You can modify "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" or
> > > "VHOST_USER_IOTLB_MSG" to "no need reply" to fix this issue.
> > > There are too many messages in dpdk that are similar to
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", and I would prefer the latter.
> > > 
> > > Fixes: 24e34754eb78 ("vhost-user: factor out msg head and payload")
> > > 
> > > Signed-off-by: Hao Chen <chenh@yusur.tech>
> > 
> > I would be very worried that IOTLB becomes stale and
> > guest memory is corrupted if we just proceed without waiting.
> > 
> > Maxime what do you think? How would you address the issue?
> 
> I agree with you, this is not possible.
> For example, in case of IOTLB invalidate, the frontend relies on the
> backend reply to ensure it is no more accessing the memory before
> proceeding.
> 
> The reply-ack for VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG request is
> less important, if it fails the host notifications won't work but would
> not risk corruption. Maybe on Qemu side we could fail init if processing
> the request fails, as I think that if negotiated, we can expect it to
> succeed.
> 
> What do you think about this proposal?
> 
> Regards,
> Maxime

Fundamentally, I think that if qemu blocks the guest while waiting for a
reply, that is ok, but it really has to process incoming messages meanwhile.
The same should apply to the backend, I think ...
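
To make the idea concrete, here is a minimal sketch of a wait loop that
keeps servicing incoming requests while it waits for a reply. It is not
QEMU or DPDK code. The names wait_reply_servicing_requests,
backend_req_handler and record_last_request are hypothetical, and each
message is assumed to be a single uint32_t rather than a real vhost-user
header plus payload. It only shows the shape of the loop: poll both
channels, handle peer requests as they arrive, and return once the reply
shows up.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical callback invoked for each request the peer sends us. */
typedef void (*backend_req_handler)(uint32_t req, void *opaque);

/* Example handler (hypothetical): remember the last request seen. */
void record_last_request(uint32_t req, void *opaque)
{
    *(uint32_t *)opaque = req;
}

/*
 * Wait for a reply on reply_fd without starving backend_fd.
 * Assumption: every message is exactly one uint32_t.
 * Returns 0 and stores the reply on success, -1 on error or EOF.
 */
int wait_reply_servicing_requests(int reply_fd, int backend_fd,
                                  backend_req_handler handle,
                                  void *opaque, uint32_t *reply)
{
    struct pollfd fds[2] = {
        { .fd = reply_fd,   .events = POLLIN },
        { .fd = backend_fd, .events = POLLIN },
    };

    assert(reply != NULL);

    for (;;) {
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted by a signal, retry */
            }
            return -1;
        }

        /* Service peer-initiated requests first so the peer can progress. */
        if (fds[1].revents & POLLIN) {
            uint32_t req;
            if (read(backend_fd, &req, sizeof(req)) != (ssize_t)sizeof(req)) {
                return -1;
            }
            handle(req, opaque);
        } else if (fds[1].revents & (POLLERR | POLLHUP | POLLNVAL)) {
            return -1;
        }

        /* The reply we were blocked on has arrived. */
        if (fds[0].revents & POLLIN) {
            if (read(reply_fd, reply, sizeof(*reply)) != (ssize_t)sizeof(*reply)) {
                return -1;
            }
            return 0;
        }
        if (fds[0].revents & (POLLERR | POLLHUP | POLLNVAL)) {
            return -1;
        }
    }
}
```

With a loop of this shape on both sides, a request that arrives while one
side is still waiting for its own reply gets handled instead of sitting
unread, which is the condition that produces the deadlock described in the
patch. A real implementation would also have to send the reply-ack for the
handled request and cope with partial reads.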


> > 
> > 
> > > ---
> > >   hw/virtio/vhost-user.c | 10 ++--------
> > >   1 file changed, 2 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > index f214df804b..02caa94b6c 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -2371,20 +2371,14 @@ static int vhost_user_net_set_mtu(struct vhost_dev *dev, uint16_t mtu)
> > >   static int vhost_user_send_device_iotlb_msg(struct vhost_dev *dev,
> > >                                               struct vhost_iotlb_msg *imsg)
> > >   {
> > > -    int ret;
> > >       VhostUserMsg msg = {
> > >           .hdr.request = VHOST_USER_IOTLB_MSG,
> > >           .hdr.size = sizeof(msg.payload.iotlb),
> > > -        .hdr.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY_MASK,
> > > +        .hdr.flags = VHOST_USER_VERSION,
> > >           .payload.iotlb = *imsg,
> > >       };
> > > -    ret = vhost_user_write(dev, &msg, NULL, 0);
> > > -    if (ret < 0) {
> > > -        return ret;
> > > -    }
> > > -
> > > -    return process_message_reply(dev, &msg);
> > > +    return vhost_user_write(dev, &msg, NULL, 0);
> > >   }
> > > -- 
> > > 2.27.0
> >

Thread overview: 6+ messages
2024-01-26 10:07 [PATCH] vhost-user: fix the issue of vhost deadlock in nested virtualization Hao Chen
2024-02-13 10:05 ` Michael S. Tsirkin
2024-02-20 11:26   ` Maxime Coquelin
2024-02-20 11:43     ` Michael S. Tsirkin [this message]
2024-02-21  9:02       ` Maxime Coquelin
2024-02-21  9:19         ` Hao Chen
