From: Jason Wang <jasowang@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: jianfeng.tan@intel.com, virtio-dev@lists.oasis-open.org,
	mst@redhat.com, cunming.liang@intel.com, qemu-devel@nongnu.org,
	alex.williamson@redhat.com, xiao.w.wang@intel.com,
	stefanha@redhat.com, zhihong.wang@intel.com, pbonzini@redhat.com,
	dan.daly@intel.com
Subject: Re: [Qemu-devel] [virtio-dev] [RFC 0/3] Extend vhost-user to support VFIO based accelerators
Date: Thu, 4 Jan 2018 15:21:55 +0800	[thread overview]
Message-ID: <aca10596-f8d4-417e-26b6-ed09ef766552@redhat.com>
In-Reply-To: <20180104061853.jkdzbueashv2klka@debian-xvivbkq.sh.intel.com>



On 2018-01-04 14:18, Tiwei Bie wrote:
> On Wed, Jan 03, 2018 at 10:34:36PM +0800, Jason Wang wrote:
>> On 2017-12-22 14:41, Tiwei Bie wrote:
>>> This RFC patch set makes some small extensions to the vhost-user protocol
>>> to support VFIO based accelerators, and makes it possible to achieve
>>> performance similar to VFIO passthrough while keeping the virtio device
>>> emulation in QEMU.
>>>
>>> When we have virtio ring compatible devices, it's possible to set up
>>> the device (DMA mapping, PCI config, etc.) based on the existing info
>>> (memory table, features, vring info, etc.) which is already available
>>> on the vhost backend (e.g. the DPDK vhost library). Then, we will be
>>> able to use such devices to accelerate the emulated device for the VM.
>>> We call this vDPA: vhost DataPath Acceleration. The key difference
>>> between VFIO passthrough and vDPA is that in vDPA only the data path
>>> (e.g. ring, notify and queue interrupt) is passed through; the device
>>> control path (e.g. PCI configuration space and MMIO regions) is still
>>> defined and emulated by QEMU.
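
To make the data path / control path split concrete, here is a minimal
sketch of what a vDPA-capable vhost backend could do with the information
it already receives over vhost-user. All structure and function names
below are hypothetical illustrations, not taken from the patches (only
the vhost-user message names in the comments are real):

#include <stdint.h>

/* Hypothetical vendor/driver API for a virtio-ring-compatible device. */
struct vdpa_device;
struct vhost_mem_table;              /* guest memory regions received via
                                        VHOST_USER_SET_MEM_TABLE */

struct vdpa_vring_cfg {
    uint64_t desc_addr;              /* from VHOST_USER_SET_VRING_ADDR */
    uint64_t avail_addr;
    uint64_t used_addr;
    uint16_t num;                    /* from VHOST_USER_SET_VRING_NUM */
    int      kickfd;                 /* from VHOST_USER_SET_VRING_KICK */
    int      callfd;                 /* from VHOST_USER_SET_VRING_CALL */
};

int vdpa_dma_map(struct vdpa_device *dev, const struct vhost_mem_table *mem);
void vdpa_set_features(struct vdpa_device *dev, uint64_t features);
void vdpa_program_vring(struct vdpa_device *dev, int idx,
                        const struct vdpa_vring_cfg *cfg);
int vdpa_start(struct vdpa_device *dev);

/* Program the device's data path from state the vhost backend already
 * has; the control path (PCI config space, MMIO) remains emulated in
 * QEMU as before. */
static int vdpa_setup_datapath(struct vdpa_device *dev,
                               const struct vhost_mem_table *mem,
                               uint64_t features,
                               const struct vdpa_vring_cfg *vrings,
                               int nr_vrings)
{
    int i, ret;

    ret = vdpa_dma_map(dev, mem);    /* DMA-map guest memory via VFIO */
    if (ret) {
        return ret;
    }
    vdpa_set_features(dev, features);
    for (i = 0; i < nr_vrings; i++) {
        vdpa_program_vring(dev, i, &vrings[i]);
    }
    return vdpa_start(dev);
}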
>>>
>>> The benefits of keeping the virtio device emulation in QEMU compared
>>> with virtio device VFIO passthrough include (but are not limited to):
>>>
>>> - a consistent device interface for the guest OS;
>>> - maximum flexibility on the control path and hardware design;
>>> - leveraging the existing virtio live-migration framework;
>>>
>>> But the critical issue with vDPA is that the data path performance is
>>> relatively low and some host threads are needed for the data path,
>>> because the mechanisms needed to support the following are missing:
>>>
>>> 1) the guest driver notifying the device directly;
>>> 2) the device interrupting the guest directly;
>>>
>>> So this patch set makes some small extensions to the vhost-user protocol
>>> to make both of them possible. It leverages the same mechanisms as VFIO
>>> passthrough (e.g. EPT and posted interrupts on Intel platforms) to
>>> achieve data path passthrough.
>>>
>>> A new protocol feature bit is added to negotiate accelerator support.
>>> Two new slave message types are added to enable notify and interrupt
>>> passthrough for each queue. From the point of view of the vhost-user
>>> protocol design, this is very flexible: passthrough can be enabled or
>>> disabled for each queue individually, and it's possible to accelerate
>>> each queue with a different device. More design and implementation
>>> details can be found in the last patch.
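
For illustration only (the real message names and layouts are defined in
the last patch; the names below are made-up placeholders), the per-queue
passthrough setup could carry something like:

#include <stdint.h>

/* Placeholder layout for the two hypothetical slave-channel requests.
 * Each is sent per queue together with a file descriptor (SCM_RIGHTS):
 *  - a "notify" request: QEMU mmap()s the fd and exposes the region to
 *    the guest, so doorbell writes hit the device directly (EPT makes
 *    the guest access go straight to the mapped pages);
 *  - an "interrupt" request: QEMU wires the fd up as a KVM irqfd, so
 *    device interrupts reach the guest without bouncing through QEMU
 *    (posted interrupts can avoid a vmexit entirely). */
typedef struct VhostUserVringAccel {
    uint32_t queue_idx;     /* which virtqueue to accelerate */
    uint32_t flags;         /* e.g. enable/disable the passthrough */
    uint64_t mmap_offset;   /* offset of the doorbell within the fd */
    uint64_t mmap_size;     /* size of the doorbell region */
} VhostUserVringAccel;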
>>>
>>> There are some rough edges in this patch set (which is why it is an RFC
>>> for now), but it's never too early to hear thoughts from the community!
>>> Any comments and suggestions would be really appreciated!
>>>
>>> Tiwei Bie (3):
>>>     vhost-user: support receiving file descriptors in slave_read
>>>     vhost-user: introduce shared vhost-user state
>>>     vhost-user: add VFIO based accelerators support
>>>
>>>    docs/interop/vhost-user.txt     |  57 ++++++
>>>    hw/scsi/vhost-user-scsi.c       |   6 +-
>>>    hw/vfio/common.c                |   2 +-
>>>    hw/virtio/vhost-user.c          | 430 +++++++++++++++++++++++++++++++++++++++-
>>>    hw/virtio/vhost.c               |   3 +-
>>>    hw/virtio/virtio-pci.c          |   8 -
>>>    hw/virtio/virtio-pci.h          |   8 +
>>>    include/hw/vfio/vfio.h          |   2 +
>>>    include/hw/virtio/vhost-user.h  |  43 ++++
>>>    include/hw/virtio/virtio-scsi.h |   6 +-
>>>    net/vhost-user.c                |  30 +--
>>>    11 files changed, 561 insertions(+), 34 deletions(-)
>>>    create mode 100644 include/hw/virtio/vhost-user.h
>>>
>> I may be missing something, but may I ask why you must implement this
>> through vhost-user/DPDK? It looks to me like you could put all of it in
>> QEMU, which could simplify a lot of things (just like the userspace NVMe
>> driver written by Fam).
>>
> Thanks for your comments! :-)
>
> Yeah, you're right. We can also implement everything in QEMU
> like the userspace NVMe driver by Fam. It was also described
> by Cunming at KVM Forum 2017. Below is the link to the
> slides:
>
> https://events.static.linuxfound.org/sites/events/files/slides/KVM17%27-vDPA.pdf

Thanks for the pointer. Looks rather interesting.

>
> We're also working on it (including defining a standard device
> for vhost data path acceleration, based on mdev, to hide
> vendor-specific details).

This is exactly what I mean. From my point of view, there's no need for 
any extension to the vhost protocol; we just need to reuse a QEMU iothread 
to implement a userspace vhost dataplane and handle the mdev inside that thread.
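
A very rough sketch of that direction, using plain POSIX primitives only
to show the shape of such a dataplane thread; a real implementation would
use QEMU's IOThread/AioContext machinery and the mdev device's own API
(all names below are hypothetical):

#include <stdint.h>
#include <unistd.h>
#include <pthread.h>

struct accel_dev;                                  /* hypothetical mdev handle */
void accel_dev_process_vring(struct accel_dev *dev, int queue_idx);

struct dataplane_queue {
    int kick_fd;             /* guest kick eventfd (as with ioeventfd) */
    struct accel_dev *dev;
    int queue_idx;
};

/* Dataplane loop: wait for guest kicks and process the vring in the
 * same process as the device emulation.  In QEMU this would be an
 * event-notifier handler running in an IOThread's AioContext rather
 * than a raw pthread. */
static void *dataplane_thread(void *opaque)
{
    struct dataplane_queue *q = opaque;
    uint64_t cnt;

    while (read(q->kick_fd, &cnt, sizeof(cnt)) == sizeof(cnt)) {
        accel_dev_process_vring(q->dev, q->queue_idx);
    }
    return NULL;                                   /* eventfd closed: stop */
}

Because the loop lives inside QEMU, it can call straight into the memory
API and the emulated device model instead of exchanging messages with an
external process.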

>
> And IMO it's also not a bad idea to extend the vhost-user protocol
> to support the accelerators if possible. And it could be more
> flexible, because it could easily support (for example) the things
> below without introducing any complex command line options or
> monitor commands to QEMU:

Maybe I'm wrong, but I don't think we care about the complexity of 
command line options or monitor commands in this case.

>
> - switching among different accelerators and software versions
>    can be done at runtime in the vhost process;
> - different accelerators can be used to accelerate different queue
>    pairs, or only some (instead of all) queue pairs;

Well, technically, if we want, these could be implemented in QEMU too.

And here are some more advantages of implementing it in QEMU:

1) Avoids an extra dependency like DPDK.
2) More flexible: the mdev could even choose not to use VFIO, or not to 
depend on vDPA.
3) More efficient guest IOMMU integration, especially for dynamic 
mappings (device IOTLB transactions could be done by function calls 
instead of slow socket messages; see the sketch after this list).
4) Zerocopy (for non-Intel vDPA) is easier to implement.
5) Compared to vhost-user, tight coupling with the device emulation can 
simplify lots of things (one example is a programmable flow director/RSS 
implementation). And any future enhancement to virtio would not need to 
introduce new types of vhost-user messages.
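
As a sketch of point 3 (VHOST_USER_SLAVE_IOTLB_MSG and VHOST_USER_IOTLB_MSG
are the real vhost-user messages; the helper below is only illustrative,
using the memory API signature as of QEMU 2.11):

#include "qemu/osdep.h"
#include "exec/memory.h"

/* An external vhost-user backend resolves a device IOTLB miss with a
 * socket round trip: it sends VHOST_USER_SLAVE_IOTLB_MSG to QEMU and
 * waits for a VHOST_USER_IOTLB_MSG update.  A backend living inside
 * QEMU can instead call the memory API directly: */
static uint64_t in_process_iotlb_translate(AddressSpace *dma_as,
                                           hwaddr iova, bool is_write)
{
    IOMMUTLBEntry entry = address_space_get_iotlb_entry(dma_as, iova,
                                                        is_write);

    if (!(entry.perm & (is_write ? IOMMU_WO : IOMMU_RO))) {
        return (uint64_t)-1;                /* no valid mapping */
    }
    return entry.translated_addr | (iova & entry.addr_mask);
}

For dynamic mappings the miss and invalidation path is hot, which is
where the function-call approach pays off.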

I don't object to the vhost-user/DPDK method, but I'd second implementing 
all of this in QEMU.

Thanks

>
> Best regards,
> Tiwei Bie
>


Thread overview: 19+ messages
2017-12-22  6:41 [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators Tiwei Bie
2017-12-22  6:41 ` [Qemu-devel] [RFC 1/3] vhost-user: support receiving file descriptors in slave_read Tiwei Bie
2017-12-22  6:41 ` [Qemu-devel] [RFC 2/3] vhost-user: introduce shared vhost-user state Tiwei Bie
2017-12-22  6:41 ` [Qemu-devel] [RFC 3/3] vhost-user: add VFIO based accelerators support Tiwei Bie
2018-01-16 17:23   ` Alex Williamson
2018-01-17  5:00     ` Tiwei Bie
2018-01-02  2:42 ` [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators Alexey Kardashevskiy
2018-01-02  5:49   ` Liang, Cunming
2018-01-02  6:01     ` Alexey Kardashevskiy
2018-01-02  6:48       ` Liang, Cunming
2018-01-03 14:34 ` [Qemu-devel] [virtio-dev] " Jason Wang
2018-01-04  6:18   ` Tiwei Bie
2018-01-04  7:21     ` Jason Wang [this message]
2018-01-05  6:58       ` Liang, Cunming
2018-01-05  8:38         ` Jason Wang
2018-01-05 10:25           ` Liang, Cunming
2018-01-08  3:23             ` Jason Wang
2018-01-08  8:23               ` [Qemu-devel] [virtio-dev] " Liang, Cunming
2018-01-08 10:06                 ` Jason Wang
