qemu-devel.nongnu.org archive mirror
From: Eugenio Perez Martin <eperezma@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org, si-wei.liu@oracle.com,
	 Liuxiangdong <liuxiangdong5@huawei.com>,
	Zhu Lingshan <lingshan.zhu@intel.com>,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	alvaro.karsz@solid-run.com,  Shannon Nelson <snelson@pensando.io>,
	Laurent Vivier <lvivier@redhat.com>,
	 Harpreet Singh Anand <hanand@xilinx.com>,
	Gautam Dawar <gdawar@xilinx.com>,
	 Stefano Garzarella <sgarzare@redhat.com>,
	Cornelia Huck <cohuck@redhat.com>, Cindy Lu <lulu@redhat.com>,
	 Eli Cohen <eli@mellanox.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 "Michael S. Tsirkin" <mst@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Parav Pandit <parav@mellanox.com>
Subject: Re: [RFC v2 04/13] vdpa: rewind at get_base, not set_base
Date: Mon, 16 Jan 2023 10:53:25 +0100
Message-ID: <CAJaqyWeY30QETgksM2_zrc8xvOABSTAhwFUXRJRHumX0FFrqpw@mail.gmail.com>
In-Reply-To: <68d2c045-e260-140c-9525-2fc265ae9291@redhat.com>

On Mon, Jan 16, 2023 at 4:32 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2023/1/13 15:40, Eugenio Perez Martin wrote:
> > On Fri, Jan 13, 2023 at 5:10 AM Jason Wang <jasowang@redhat.com> wrote:
> >> On Fri, Jan 13, 2023 at 1:24 AM Eugenio Pérez <eperezma@redhat.com> wrote:
> >>> At this moment it is only possible to migrate to a vdpa device running
> >>> with x-svq=on. As a protective measure, the rewind of the inflight
> >>> descriptors was done at the destination. That way if the source sent a
> >>> virtqueue with inuse descriptors they are always discarded.
> >>>
> >>> Since this series allows to migrate also to passthrough devices with no
> >>> SVQ, the right thing to do is to rewind at the source so base of vrings
> >>> are correct.
> >>>
> >>> Support for inflight descriptors may be added in the future.
> >>>
> >>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>> ---
> >>>   include/hw/virtio/vhost-backend.h |  4 +++
> >>>   hw/virtio/vhost-vdpa.c            | 46 +++++++++++++++++++------------
> >>>   hw/virtio/vhost.c                 |  3 ++
> >>>   3 files changed, 36 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
> >>> index c5ab49051e..ec3fbae58d 100644
> >>> --- a/include/hw/virtio/vhost-backend.h
> >>> +++ b/include/hw/virtio/vhost-backend.h
> >>> @@ -130,6 +130,9 @@ typedef bool (*vhost_force_iommu_op)(struct vhost_dev *dev);
> >>>
> >>>   typedef int (*vhost_set_config_call_op)(struct vhost_dev *dev,
> >>>                                          int fd);
> >>> +
> >>> +typedef void (*vhost_reset_status_op)(struct vhost_dev *dev);
> >>> +
> >>>   typedef struct VhostOps {
> >>>       VhostBackendType backend_type;
> >>>       vhost_backend_init vhost_backend_init;
> >>> @@ -177,6 +180,7 @@ typedef struct VhostOps {
> >>>       vhost_get_device_id_op vhost_get_device_id;
> >>>       vhost_force_iommu_op vhost_force_iommu;
> >>>       vhost_set_config_call_op vhost_set_config_call;
> >>> +    vhost_reset_status_op vhost_reset_status;
> >>>   } VhostOps;
> >>>
> >>>   int vhost_backend_update_device_iotlb(struct vhost_dev *dev,
> >>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> >>> index 542e003101..28a52ddc78 100644
> >>> --- a/hw/virtio/vhost-vdpa.c
> >>> +++ b/hw/virtio/vhost-vdpa.c
> >>> @@ -1132,14 +1132,23 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
> >>>       if (started) {
> >>>           memory_listener_register(&v->listener, &address_space_memory);
> >>>           return vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> >>> -    } else {
> >>> -        vhost_vdpa_reset_device(dev);
> >>> -        vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >>> -                                   VIRTIO_CONFIG_S_DRIVER);
> >>> -        memory_listener_unregister(&v->listener);
> >>> +    }
> >>>
> >>> -        return 0;
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static void vhost_vdpa_reset_status(struct vhost_dev *dev)
> >>> +{
> >>> +    struct vhost_vdpa *v = dev->opaque;
> >>> +
> >>> +    if (dev->vq_index + dev->nvqs != dev->vq_index_end) {
> >>> +        return;
> >>>       }
> >>> +
> >>> +    vhost_vdpa_reset_device(dev);
> >>> +    vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> >>> +                                VIRTIO_CONFIG_S_DRIVER);
> >>> +    memory_listener_unregister(&v->listener);
> >>>   }
> >>>
> >>>   static int vhost_vdpa_set_log_base(struct vhost_dev *dev, uint64_t base,
> >>> @@ -1182,18 +1191,7 @@ static int vhost_vdpa_set_vring_base(struct vhost_dev *dev,
> >>>                                          struct vhost_vring_state *ring)
> >>>   {
> >>>       struct vhost_vdpa *v = dev->opaque;
> >>> -    VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
> >>>
> >>> -    /*
> >>> -     * vhost-vdpa devices does not support in-flight requests. Set all of them
> >>> -     * as available.
> >>> -     *
> >>> -     * TODO: This is ok for networking, but other kinds of devices might
> >>> -     * have problems with these retransmissions.
> >>> -     */
> >>> -    while (virtqueue_rewind(vq, 1)) {
> >>> -        continue;
> >>> -    }
> >>>       if (v->shadow_vqs_enabled) {
> >>>           /*
> >>>            * Device vring base was set at device start. SVQ base is handled by
> >>> @@ -1212,6 +1210,19 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
> >>>       int ret;
> >>>
> >>>       if (v->shadow_vqs_enabled) {
> >>> +        VirtQueue *vq = virtio_get_queue(dev->vdev, ring->index);
> >>> +
> >>> +        /*
> >>> +         * vhost-vdpa devices does not support in-flight requests. Set all of
> >>> +         * them as available.
> >>> +         *
> >>> +         * TODO: This is ok for networking, but other kinds of devices might
> >>> +         * have problems with these retransmissions.
> >>> +         */
> >>> +        while (virtqueue_rewind(vq, 1)) {
> >>> +            continue;
> >>> +        }
> >>> +
> >>>           ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
> >>>           return 0;
> >>>       }
> >>> @@ -1326,4 +1337,5 @@ const VhostOps vdpa_ops = {
> >>>           .vhost_vq_get_addr = vhost_vdpa_vq_get_addr,
> >>>           .vhost_force_iommu = vhost_vdpa_force_iommu,
> >>>           .vhost_set_config_call = vhost_vdpa_set_config_call,
> >>> +        .vhost_reset_status = vhost_vdpa_reset_status,
> >> Can we simply use the NetClient stop method here?
> >>
> > Ouch, I squashed two patches by mistake here.
> >
> > All the vhost_reset_status part should be independent of this patch,
> > and I was especially interested in its feedback. It had this message:
> >
> >      vdpa: move vhost reset after get vring base
> >
> >      The function vhost.c:vhost_dev_stop calls vhost operation
> >      vhost_dev_start(false). In the case of vdpa it totally reset and wipes
> >      the device, making the fetching of the vring base (virtqueue state) totally
> >      useless.
> >
> >      The kernel backend does not use vhost_dev_start vhost op callback, but
> >      vhost-user do. A patch to make vhost_user_dev_start more similar to vdpa
> >      is desirable, but it can be added on top.
> >
> > I can resend the series splitting it again but conversation may
> > scatter between versions. Would you prefer me to send a new version?
>
>
> I think it can be done in next version (after we finalize the discussion
> for this version).
>
>
> >
> > Regarding the use of NetClient, it feels weird to call net specific
> > functions in VhostOps, doesn't it?
>
>
> Basically, I meant, the patch call vhost_reset_status() in
> vhost_dev_stop(). But we've already had vhost_dev_start ops where we
> implement per backend start/stop logic.
>
> I think it's better to do things in vhost_dev_start():
>
> For device that can do suspend, we can do suspend. For other we need to
> do reset as a workaround.
>

If the device implements _F_SUSPEND, we can suspend it in
vhost_dev_start(false) and fetch the vq base afterwards. But we cannot
call vhost_dev_reset until we have fetched the vq base: if we reset
first, we will always read zero there.

If we don't reset the device at vhost_vdpa_dev_start(false), we need to
do a proper reset after getting the base, at least in vdpa. So creating
a new vhost_op should be the right thing to do, shouldn't it?

Hopefully with a better name than vhost_vdpa_reset_status, that's for sure :).
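
To make the ordering problem concrete, here is a minimal standalone
sketch. It is not QEMU code: struct toy_dev and every toy_* / stop_*
name are hypothetical, and the rewind is modelled simply as
"last_avail_idx minus the in-flight count". It only illustrates why the
reset has to come after fetching the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of one virtqueue's device-side state. */
struct toy_dev {
    uint16_t last_avail_idx; /* next avail entry the device would fetch */
    uint16_t inuse;          /* descriptors fetched but not yet used */
    bool running;
};

/* Models suspend: the device stops processing but keeps its state. */
static void toy_suspend(struct toy_dev *d)
{
    d->running = false;
}

/* Models a full device reset: all vq state is wiped. */
static void toy_reset(struct toy_dev *d)
{
    d->last_avail_idx = 0;
    d->inuse = 0;
    d->running = false;
}

/*
 * Models get_vring_base at the source: rewind in-flight descriptors so
 * they become available again, then report the resulting base.
 */
static uint16_t toy_get_base(struct toy_dev *d)
{
    assert(!d->running); /* the base is only stable on a stopped device */
    d->last_avail_idx = (uint16_t)(d->last_avail_idx - d->inuse);
    d->inuse = 0;
    return d->last_avail_idx;
}

/* Reset inside the stop path, then fetch the base: always reads zero. */
static uint16_t stop_reset_first(struct toy_dev d)
{
    toy_reset(&d);
    return toy_get_base(&d);
}

/* Suspend, fetch the base, and only then reset (the separate op). */
static uint16_t stop_base_first(struct toy_dev d)
{
    toy_suspend(&d);
    uint16_t base = toy_get_base(&d);
    toy_reset(&d);
    return base;
}
```

With last_avail_idx = 42 and inuse = 2, stop_reset_first() returns 0
while stop_base_first() returns 40, which is the value the destination
needs.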

I'm not sure how vhost-user handles this, or when it resets the
indexes. My bet is that it never resets them at device reinitialization
and instead trusts the VMM's calls to vhost_user_set_base, but I may be
wrong.

Thanks!

> And if necessary, we can call nc client ops for net specific operations
> (if it has any).
>
> Thanks
>
>
> > At the moment vhost ops is
> > specialized in vhost-kernel, vhost-user and vhost-vdpa. If we want to
> > make it specific to the kind of device, that makes vhost-vdpa-net too.
> >
> > Thanks!
> >
> >
> >> Thanks
> >>
> >>>   };
> >>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >>> index eb8c4c378c..a266396576 100644
> >>> --- a/hw/virtio/vhost.c
> >>> +++ b/hw/virtio/vhost.c
> >>> @@ -2049,6 +2049,9 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings)
> >>>                                hdev->vqs + i,
> >>>                                hdev->vq_index + i);
> >>>       }
> >>> +    if (hdev->vhost_ops->vhost_reset_status) {
> >>> +        hdev->vhost_ops->vhost_reset_status(hdev);
> >>> +    }
> >>>
> >>>       if (vhost_dev_has_iommu(hdev)) {
> >>>           if (hdev->vhost_ops->vhost_set_iotlb_callback) {
> >>> --
> >>> 2.31.1
> >>>
>



