From: Jason Wang <jasowang@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kvm@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org, "Daniel Daly" <dandaly0@gmail.com>,
virtualization@lists.linux-foundation.org,
"Liran Alon" <liralon@gmail.com>, "Eli Cohen" <eli@mellanox.com>,
"Nitin Shrivastav" <nitin.shrivastav@broadcom.com>,
"Alex Barba" <alex.barba@broadcom.com>,
"Christophe Fontaine" <cfontain@redhat.com>,
"Lee Ballard" <ballle98@gmail.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Lars Ganrot" <lars.ganrot@gmail.com>,
"Rob Miller" <rob.miller@broadcom.com>,
"Howard Cai" <howard.cai@gmail.com>,
"Parav Pandit" <parav@mellanox.com>, vm <vmireyno@marvell.com>,
"Salil Mehta" <mehta.salil.lnk@gmail.com>,
"Stephen Finucane" <stephenfin@redhat.com>,
"Xiao W Wang" <xiao.w.wang@intel.com>,
"Sean Mooney" <smooney@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Jim Harford" <jim.harford@broadcom.com>,
"Dmytro Kazantsev" <dmytro.kazantsev@gmail.com>,
"Siwei Liu" <loseweigh@gmail.com>,
"Harpreet Singh Anand" <hanand@xilinx.com>,
"Michael Lilja" <ml@napatech.com>,
"Max Gurtovoy" <maxgu14@gmail.com>
Subject: Re: [RFC PATCH 00/27] vDPA software assisted live migration
Date: Wed, 9 Dec 2020 04:26:50 -0500 (EST) [thread overview]
Message-ID: <1410217602.34486578.1607506010536.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <20201208093715.GX203660@stefanha-x1.localdomain>
----- Original Message -----
> On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:
> > This series enables vDPA software assisted live migration for vhost-net
> > devices. This is a new method of vhost device migration: instead of
> > relying on the vDPA device's dirty logging capability, SW assisted LM
> > intercepts the dataplane, forwarding the descriptors between VM and device.
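To restate the mechanism for clarity, a rough sketch of the forwarding
step (helper names below are made up for illustration, they are not the
actual code in this series):

/* Illustrative only: forward one batch of guest descriptors
 * through the shadow virtqueue. */
static void shadow_vq_forward_avail(ShadowVirtQueue *svq)
{
    VirtQueueElement *elem;

    /* Pop descriptors the guest made available, */
    while ((elem = virtqueue_pop_guest(svq->guest_vq))) {
        /* rewrite/translate their addresses, */
        shadow_vq_translate(svq, elem);
        /* and expose them on the vring the device actually sees. */
        shadow_vq_add_to_device(svq, elem);
    }
    /* Finally kick the vhost/vDPA device. */
    event_notifier_set(&svq->device_kick);
}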
>
> Pros:
> + vhost/vDPA devices don't need to implement dirty memory logging
> + Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends
>
> Cons:
> - Not generic, relies on vhost-net-specific ioctls
> - Doesn't support VIRTIO Shared Memory Regions
> https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex
I may be missing something, but my understanding is that it's the
responsibility of the device to migrate this part?
> - Performance (see below)
>
> I think performance will be significantly lower when the shadow vq is
> enabled. Imagine a vDPA device with hardware vq doorbell registers
> mapped into the guest so the guest driver can directly kick the device.
> When the shadow vq is enabled a vmexit is needed to write to the shadow
> vq ioeventfd, then the host kernel scheduler switches to a QEMU thread
> to read the ioeventfd, the descriptors are translated, QEMU writes to
> the vhost hdev kick fd, the host kernel scheduler switches to the vhost
> worker thread, vhost/vDPA notifies the virtqueue, and finally the
> vDPA driver writes to the hardware vq doorbell register. That is a lot
> of overhead compared to writing to an exitless MMIO register!
I think it's a trade-off. E.g. we can poll the virtqueue to have an
exitless doorbell.
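To make the cost concrete, a hedged sketch of the two ways the
guest->device kick could be forwarded in userspace (names are
illustrative, only the event_notifier_* helpers resemble real QEMU
APIs):

/* Illustrative only: forwarding the guest's kick to the device.
 * The eventfd path needs a vmexit per kick; the polling path
 * watches the avail index instead, so the guest side is exitless. */
static void shadow_vq_kick_path(ShadowVirtQueue *svq)
{
    if (svq->poll_mode) {
        /* Busy-poll the avail index: no doorbell write, no vmexit,
         * at the cost of burning a host CPU. */
        while (svq->last_avail_idx == guest_avail_idx(svq)) {
            cpu_relax();
        }
    } else {
        /* Guest doorbell write -> vmexit -> KVM signals the
         * ioeventfd -> QEMU wakes up here. */
        event_notifier_test_and_clear(&svq->guest_kick);
    }

    shadow_vq_forward_avail(svq);           /* translate + enqueue */
    event_notifier_set(&svq->device_kick);  /* write the vhost kick fd */
}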
>
> If the shadow vq was implemented in drivers/vhost/ and QEMU used the
> existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be
> reduced to just one set of ioeventfd/irqfd. In other words, the QEMU
> dirty memory logging happens asynchronously and isn't in the dataplane.
>
> In addition, hardware that supports dirty memory logging as well as
> software vDPA devices could completely eliminate the shadow vq for even
> better performance.
Yes. That's our plan. But the interface might require more thought.
E.g. is the bitmap a good approach? To me, reporting dirty pages via a
virtqueue is better since it has a smaller footprint and is self
throttled. And we need an address space other than the one used by the
guest for either the bitmap or the virtqueue.
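To make the comparison concrete, a hypothetical sketch (not an existing
interface): bitmap logging sets one bit per dirtied page in a log the
consumer has to scan, while a reporting virtqueue carries explicit
records and throttles the producer by how fast the consumer drains it:

/* Hypothetical sketch.  Bitmap logging: one bit per page, and the
 * whole log must be scanned to find what changed. */
static void log_dirty_page(uint8_t *log, uint64_t gpa, uint64_t page_size)
{
    uint64_t page = gpa / page_size;

    log[page / 8] |= 1u << (page % 8);
}

/* Virtqueue-based reporting: self-describing records, footprint
 * proportional to what was actually dirtied, and naturally throttled
 * by how fast the consumer completes the buffers. */
struct dirty_report {
    uint64_t gpa;   /* start of the dirtied range */
    uint64_t len;   /* length in bytes */
};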
>
> But performance is a question of "is it good enough?". Maybe this
> approach is okay and users don't expect good performance while dirty
> memory logging is enabled.
Yes, and such a slowdown may actually help the migration converge.
Note that the whole idea is to have a generic solution for all types
of devices. It's good to consider performance, but for the first stage
it should be sufficient to make it work and then optimize on top.
> I just wanted to share the idea of moving the
> shadow vq into the kernel in case you like that approach better.
My understanding is that we should keep the kernel as simple as
possible and leave policies to userspace as much as possible. E.g. it
would require us to disable doorbell mapping and irq offloading, both
of which are under the control of userspace.
Thanks
>
> Stefan
>