From: "Michael S. Tsirkin" <mst@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
alex.williamson@redhat.com, jasowang@redhat.com,
pbonzini@redhat.com, stefanha@redhat.com,
cunming.liang@intel.com, dan.daly@intel.com,
jianfeng.tan@intel.com, zhihong.wang@intel.com,
xiao.w.wang@intel.com
Subject: Re: [Qemu-devel] [PATCH v2 0/6] Extend vhost-user to support VFIO based accelerators
Date: Thu, 22 Mar 2018 16:55:39 +0200
Message-ID: <20180322165441-mutt-send-email-mst@kernel.org>
In-Reply-To: <20180319071537.28649-1-tiwei.bie@intel.com>
On Mon, Mar 19, 2018 at 03:15:31PM +0800, Tiwei Bie wrote:
> This patch set makes some small extensions to the vhost-user protocol
> to support VFIO based accelerators, making it possible to achieve
> performance similar to VFIO based PCI passthru while keeping the
> virtio device emulation in QEMU.
I love your patches!
Yet there are some things to improve.
Posting comments separately as individual messages.
> How does accelerator accelerate vhost (data path)
> =================================================
>
> Any virtio ring compatible device can potentially be used as a
> vhost data path accelerator. We can set up the accelerator based
> on the information (e.g. memory table, features, ring info, etc.)
> available on the vhost backend, and the accelerator will then be
> able to use the virtio ring provided by the virtio driver in the
> VM directly. The virtio driver in the VM can thus exchange e.g.
> network packets with the accelerator directly via the virtio ring.
> In other words, the accelerator accelerates the vhost data path.
> We call this vDPA: vhost Data Path Acceleration.
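
To make the data path concrete: the setup reduces to the backend
translating the guest's ring addresses through the vhost memory table
and handing the result to the accelerator. A minimal self-contained
sketch (struct layout and all values are made up for illustration):

#include <stdint.h>
#include <stdio.h>

/* Simplified view of one vhost memory region as sent by QEMU. */
struct mem_region {
    uint64_t guest_phys_addr;
    uint64_t size;
    uint64_t backend_va;   /* where the backend mmap()ed this region */
};

/* Translate a guest physical address using the memory table. */
static void *gpa_to_va(struct mem_region *table, int n, uint64_t gpa)
{
    for (int i = 0; i < n; i++) {
        if (gpa >= table[i].guest_phys_addr &&
            gpa < table[i].guest_phys_addr + table[i].size) {
            return (void *)(uintptr_t)(table[i].backend_va +
                                       (gpa - table[i].guest_phys_addr));
        }
    }
    return NULL;
}

int main(void)
{
    /* One fake 1 MiB region; addresses are invented for the example. */
    struct mem_region table[] = {
        { .guest_phys_addr = 0x40000000, .size = 1 << 20,
          .backend_va = 0x7f0000000000 },
    };
    uint64_t desc_gpa = 0x40001000;  /* vring descriptor table GPA */

    /* A vDPA driver would hand this pointer (or its IOVA) to the
     * accelerator so the device DMAs the same ring the guest uses. */
    printf("desc table VA: %p\n", gpa_to_va(table, 1, desc_gpa));
    return 0;
}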
>
> Notice: although the accelerator talks to the virtio driver in the
> VM directly via the virtio ring, the control path events (e.g.
> device start/stop) in the VM are still trapped and handled by QEMU,
> which delivers them to the vhost backend via the standard vhost
> protocol.
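
To spell out that split: a device start in the guest is trapped by
QEMU and replayed to the backend as ordinary vhost-user messages. A
compressed sketch of the flow (the trigger function and the send_msg()
helper are illustrative; the message names are the standard ones from
docs/interop/vhost-user.txt):

#include <stdio.h>

static void send_msg(const char *name) { printf("-> %s\n", name); }

void on_guest_driver_ok(void)   /* illustrative: guest wrote DRIVER_OK */
{
    send_msg("VHOST_USER_SET_FEATURES");    /* negotiated features      */
    send_msg("VHOST_USER_SET_MEM_TABLE");   /* guest memory layout      */
    send_msg("VHOST_USER_SET_VRING_ADDR");  /* ring addresses, per queue */
    send_msg("VHOST_USER_SET_VRING_KICK");  /* notify eventfd           */
    send_msg("VHOST_USER_SET_VRING_CALL");  /* interrupt eventfd        */
    send_msg("VHOST_USER_SET_VRING_ENABLE");/* start the data path      */
}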
>
> The link below shows an example of how to set up such an
> environment using nested VMs. In this case, the virtio device in
> the outer VM is the accelerator and is used to accelerate the
> virtio device in the inner VM. In a real deployment, a virtio ring
> compatible hardware device would serve as the accelerator.
>
> http://dpdk.org/ml/archives/dev/2017-December/085044.html
>
> The above example requires no changes to QEMU, but its performance
> is lower than that of traditional VFIO based PCI passthru. That is
> the problem this patch set aims to solve.
>
> The performance issue of vDPA/vhost-user and solutions
> ======================================================
>
> For a vhost-user backend, the critical issue in vDPA is that the
> data path performance is relatively low and some host threads are
> needed for the data path, because the mechanisms needed to support
> the following are missing:
>
> 1) the guest driver notifying the device directly;
> 2) the device interrupting the guest directly.
>
> This patch set therefore makes some small extensions to the
> vhost-user protocol to make both possible. It leverages the same
> mechanisms (e.g. EPT and Posted-Interrupt on Intel platforms) as
> PCI passthru.
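
On the mechanics: mechanism 1 boils down to QEMU mmap()ing the
accelerator's notify area from an fd the slave sends, then exposing
that mapping over the virtio notify region so guest writes reach the
device through EPT instead of trapping. A minimal sketch, assuming a
plausible fd/offset/size triple carried by the slave message (names
are mine, not the patch's):

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Map the device notify area QEMU received from the slave. */
void *map_notify_area(int fd, uint64_t offset, uint64_t size)
{
    void *addr = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, (off_t)offset);
    if (addr == MAP_FAILED) {
        perror("mmap notify area");
        return NULL;
    }
    /*
     * In QEMU this pointer would back a RAM-device sub-region layered
     * on top of the virtio device's notify MemoryRegion, so KVM maps
     * it into the guest via EPT. Mechanism 2 is the mirror image: an
     * eventfd from the device, wired up as a KVM irqfd, lets device
     * interrupts reach the guest without a userspace exit.
     */
    return addr;
}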
>
> A new protocol feature bit is added to negotiate the accelerator
> feature support, and two new slave message types are added to
> control the notify region and queue interrupt passthru for each
> queue. From the viewpoint of the vhost-user protocol design, this
> is very flexible: passthru can be enabled or disabled for each
> queue individually, and each queue can be accelerated by a
> different device. More design and implementation details can be
> found in the last patch.
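
Reading between the lines, the wire format for such a per-queue slave
message could look roughly like this (the feature bit value, message
IDs and struct name are all invented for illustration; the real
definitions are in the last patch):

#include <stdint.h>

/* Hypothetical values -- see the last patch for the real ones. */
#define VHOST_USER_PROTOCOL_F_ACCEL  8

enum {
    VHOST_USER_SLAVE_SET_VRING_NOTIFY = 100,
    VHOST_USER_SLAVE_SET_VRING_IRQ    = 101,
};

/* Per-queue payload such a message might carry: which queue, and
 * where in the fd (sent as ancillary data) the resource lives. */
struct vhost_user_vring_area {
    uint64_t index;   /* virtqueue index */
    uint64_t size;    /* size of the area */
    uint64_t offset;  /* offset within the passed fd */
};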
>
> Difference between vDPA and PCI passthru
> ========================================
>
> The key difference between PCI passthru and vDPA is that in vDPA
> only the data path of the device (e.g. DMA ring, notify region and
> queue interrupt) is passed through to the VM, while the device
> control path (e.g. PCI configuration space and MMIO regions) is
> still defined and emulated by QEMU.
>
> The benefits of keeping the virtio device emulation in QEMU,
> compared with virtio device PCI passthru, include (but are not
> limited to):
>
> - a consistent device interface for the guest OS in the VM;
> - maximum flexibility in the hardware (i.e. accelerator) design;
> - leveraging the existing virtio live-migration framework.
>
> Why extend vhost-user for vDPA
> ==============================
>
> We have already implemented various virtual switches (e.g. OVS-DPDK)
> based on vhost-user for VMs in the cloud. They are pure software
> running on CPU cores. When accelerators become available for such
> NFVi applications, it is ideal if the applications can keep using
> the original interface (i.e. vhost-user netdev) with QEMU, with the
> infrastructure deciding when and how to switch between CPU and
> accelerators within that interface. The switching can then be done
> flexibly and quickly inside the applications.
>
> More details about this can be found in Cunming's discussion on
> the RFC patch set.
>
> Update notes
> ============
>
> The IOMMU feature bit check has been removed in this version,
> because the IOMMU feature is negotiable: when an accelerator that
> doesn't support the virtual IOMMU is used, its driver simply won't
> advertise this feature bit when the vhost library queries its
> features, and if it does support the virtual IOMMU, its driver can
> advertise the bit. It is therefore not reasonable to add this
> limitation in this patch set.
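
For the record, the driver-side logic this describes amounts to
roughly the following (the helper and flag plumbing are illustrative;
VIRTIO_F_IOMMU_PLATFORM is the standard feature bit 33):

#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_IOMMU_PLATFORM (1ULL << 33)

/* Only advertise the IOMMU feature when the accelerator can honor
 * it; the vhost library's feature query then does the right thing. */
uint64_t backend_features(uint64_t dev_features, bool viommu_ok)
{
    if (!viommu_ok) {
        dev_features &= ~VIRTIO_F_IOMMU_PLATFORM;
    }
    return dev_features;
}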
>
> The previous links:
> RFC: http://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg04844.html
> v1: http://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06028.html
>
> v1 -> v2:
> - Add some explanations about why extend vhost-user in commit log (Paolo);
> - Bug fix in slave_read() according to Stefan's fix in DPDK;
> - Remove IOMMU feature check and related commit log;
> - Some minor refinements;
> - Rebase to the latest QEMU;
>
> RFC -> v1:
> - Add some details about how vDPA works in cover letter (Alexey)
> - Add some details about the OVS offload use-case in cover letter (Jason)
> - Move PCI-specific stuff out of vhost-user (Jason)
> - Handle the virtual IOMMU case (Jason)
> - Move VFIO group management code into vfio/common.c (Alex)
> - Various refinements;
> (approximately sorted by comment posting time)
>
> Tiwei Bie (6):
> vhost-user: support receiving file descriptors in slave_read
> vhost-user: introduce shared vhost-user state
> virtio: support adding sub-regions for notify region
> vfio: support getting VFIOGroup from groupfd
> vfio: remove DPRINTF() definition from vfio-common.h
> vhost-user: add VFIO based accelerators support
>
> Makefile.target | 4 +
> docs/interop/vhost-user.txt | 57 +++++++++
> hw/scsi/vhost-user-scsi.c | 6 +-
> hw/vfio/common.c | 97 +++++++++++++++-
> hw/virtio/vhost-user.c | 248 +++++++++++++++++++++++++++++++++++++++-
> hw/virtio/virtio-pci.c | 48 ++++++++
> hw/virtio/virtio-pci.h | 5 +
> hw/virtio/virtio.c | 39 +++++++
> include/hw/vfio/vfio-common.h | 11 +-
> include/hw/virtio/vhost-user.h | 34 ++++++
> include/hw/virtio/virtio-scsi.h | 6 +-
> include/hw/virtio/virtio.h | 5 +
> include/qemu/osdep.h | 1 +
> net/vhost-user.c | 30 ++---
> scripts/create_config | 3 +
> 15 files changed, 561 insertions(+), 33 deletions(-)
> create mode 100644 include/hw/virtio/vhost-user.h
>
> --
> 2.11.0