qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Hao Chen <chenh@yusur.tech>,
	"open list:All patches CC here" <qemu-devel@nongnu.org>,
	huangml@yusur.tech, zy@yusur.tech
Subject: Re: [PATCH] vhost-user: fix the issue of vhost deadlock in nested virtualization
Date: Tue, 20 Feb 2024 06:43:28 -0500	[thread overview]
Message-ID: <20240220064027-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <5176a8e4-dbdc-45e0-a1f2-d9cb3b71a6b1@redhat.com>

On Tue, Feb 20, 2024 at 12:26:49PM +0100, Maxime Coquelin wrote:
> 
> 
> On 2/13/24 11:05, Michael S. Tsirkin wrote:
> > On Fri, Jan 26, 2024 at 06:07:37PM +0800, Hao Chen wrote:
> > > I run "dpdk-vdpa" and "qemu-L2" inside "qemu-L1".
> > > 
> > > In a nested virtualization environment, "qemu-L2" vhost-user socket sends
> > > a "VHOST_USER_IOTLB_MSG" message to "dpdk-vdpa" and blocks waiting for
> > > "dpdk-vdpa" to process the message.
> > > If "dpdk-vdpa" doesn't complete the processing of the "VHOST_USER_IOTLB_MSG"
> > > message and sends a message that needs to be replied in another thread,
> > > such as "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", "dpdk-vdpa" will also
> > > block and wait for "qemu-L2" to process this message. However, "qemu-L2"
> > > vhost-user's socket is blocking while waiting for a reply from "dpdk-vdpa"
> > > after processing the "VHOST_USER_IOTLB_MSG" message, so
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" will not be processed.
> > > In this case, both "dpdk-vdpa" and "qemu-L2" are blocked on the
> > > vhost read, resulting in a deadlock.
> > > 
> > > Changing either "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG" or
> > > "VHOST_USER_IOTLB_MSG" to not require a reply would fix this issue.
> > > Since dpdk has many messages similar to
> > > "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG", I would prefer the latter.
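The circular wait described above can be sketched as a toy simulation (Python here purely as a stand-in for the two processes; the queues model the reply channel and the backend channel, and none of the names below are actual QEMU or DPDK code):

```python
import queue
import threading

# Channels between the two peers (stand-ins for the vhost-user sockets).
iotlb_requests = queue.Queue()     # qemu-L2 -> dpdk-vdpa
iotlb_replies = queue.Queue()      # dpdk-vdpa -> qemu-L2 (reply channel)
notifier_requests = queue.Queue()  # dpdk-vdpa -> qemu-L2 (backend channel)
notifier_replies = queue.Queue()   # qemu-L2 -> dpdk-vdpa
results = {}

def frontend():  # plays qemu-L2
    iotlb_requests.put("VHOST_USER_IOTLB_MSG")
    try:
        # Blocks here: the backend never gets around to the IOTLB reply.
        results["qemu-L2"] = iotlb_replies.get(timeout=0.5)
    except queue.Empty:
        results["qemu-L2"] = "blocked"

def backend():  # plays dpdk-vdpa
    iotlb_requests.get()  # received IOTLB_MSG, but before replying...
    # ...another thread sends a request that itself needs a reply:
    notifier_requests.put("VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG")
    try:
        # Blocks here: qemu-L2 cannot service the backend channel,
        # because its thread is stuck waiting for the IOTLB reply.
        results["dpdk-vdpa"] = notifier_replies.get(timeout=0.5)
    except queue.Empty:
        results["dpdk-vdpa"] = "blocked"

t1 = threading.Thread(target=frontend)
t2 = threading.Thread(target=backend)
t1.start(); t2.start()
t1.join(); t2.join()
# Both peers end up blocked: a classic circular wait.
```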
> > > 
> > > Fixes: 24e34754eb78 ("vhost-user: factor out msg head and payload")
> > > 
> > > Signed-off-by: Hao Chen <chenh@yusur.tech>
> > 
> > I would be very worried that IOTLB becomes stale and
> > guest memory is corrupted if we just proceed without waiting.
> > 
> > Maxime what do you think? How would you address the issue?
> 
> I agree with you: this is not possible.
> For example, in the case of an IOTLB invalidate, the frontend relies on
> the backend's reply to ensure it is no longer accessing the memory
> before proceeding.
> 
> The reply-ack for the VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG request
> is less important: if it fails, host notifications won't work, but there
> is no risk of memory corruption. Maybe on the QEMU side we could fail
> init if processing the request fails, since if the feature was
> negotiated we can expect it to succeed.
> 
> What do you think about this proposal?
> 
> Regards,
> Maxime

Fundamentally, I think it is OK for qemu to block the guest while waiting
for a reply, but it really has to keep processing incoming messages
meanwhile. The same should apply to the backend, I think ...
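That idea can be sketched minimally (Python as illustration, with a hypothetical message format; this is not QEMU's actual vhost-user implementation): the reply-wait loop keeps reading, and dispatches any request that arrives in the meantime instead of leaving it queued behind the blocked read.

```python
import queue

def wait_for_reply(incoming, expected_type, dispatch, timeout=1.0):
    """Block until the message tagged `expected_type` arrives, but keep
    servicing any other request that shows up in the meantime."""
    while True:
        msg = incoming.get(timeout=timeout)
        if msg["type"] == expected_type:
            return msg            # the reply we were blocked on
        dispatch(msg)             # e.g. handle and ack a notifier request

# Example: a notifier request arrives while we wait for the IOTLB reply.
inbox = queue.Queue()
handled = []
inbox.put({"type": "VHOST_USER_SLAVE_VRING_HOST_NOTIFIER_MSG"})
inbox.put({"type": "IOTLB_REPLY"})
reply = wait_for_reply(inbox, "IOTLB_REPLY", handled.append)
# The notifier message was dispatched instead of deadlocking the peer.
```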


> > 
> > 
> > > ---
> > >   hw/virtio/vhost-user.c | 10 ++--------
> > >   1 file changed, 2 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > index f214df804b..02caa94b6c 100644
> > > --- a/hw/virtio/vhost-user.c
> > > +++ b/hw/virtio/vhost-user.c
> > > @@ -2371,20 +2371,14 @@ static int vhost_user_net_set_mtu(struct vhost_dev *dev, uint16_t mtu)
> > >   static int vhost_user_send_device_iotlb_msg(struct vhost_dev *dev,
> > >                                               struct vhost_iotlb_msg *imsg)
> > >   {
> > > -    int ret;
> > >       VhostUserMsg msg = {
> > >           .hdr.request = VHOST_USER_IOTLB_MSG,
> > >           .hdr.size = sizeof(msg.payload.iotlb),
> > > -        .hdr.flags = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY_MASK,
> > > +        .hdr.flags = VHOST_USER_VERSION,
> > >           .payload.iotlb = *imsg,
> > >       };
> > > -    ret = vhost_user_write(dev, &msg, NULL, 0);
> > > -    if (ret < 0) {
> > > -        return ret;
> > > -    }
> > > -
> > > -    return process_message_reply(dev, &msg);
> > > +    return vhost_user_write(dev, &msg, NULL, 0);
> > >   }
> > > -- 
> > > 2.27.0
> > 




Thread overview: 6+ messages
2024-01-26 10:07 [PATCH] vhost-user: fix the issue of vhost deadlock in nested virtualization Hao Chen
2024-02-13 10:05 ` Michael S. Tsirkin
2024-02-20 11:26   ` Maxime Coquelin
2024-02-20 11:43     ` Michael S. Tsirkin [this message]
2024-02-21  9:02       ` Maxime Coquelin
2024-02-21  9:19         ` Hao Chen