netdev.vger.kernel.org archive mirror
From: Tiwei Bie <tiwei.bie@intel.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, alex.williamson@redhat.com,
	maxime.coquelin@redhat.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, dan.daly@intel.com,
	cunming.liang@intel.com, zhihong.wang@intel.com
Subject: Re: [RFC v2] vhost: introduce mdev based hardware vhost backend
Date: Wed, 3 Jul 2019 19:52:45 +0800
Message-ID: <20190703115245.GA22374@___>
In-Reply-To: <7b8279b2-aa7e-7adc-eeff-20dfaf4400d0@redhat.com>

On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote:
> On 2019/7/3 5:13 PM, Tiwei Bie wrote:
> > Details about this can be found here:
> > 
> > https://lwn.net/Articles/750770/
> > 
> > What's new in this version
> > ==========================
> > 
> > A new VFIO device type is introduced - vfio-vhost. This addresses
> > some comments from here: https://patchwork.ozlabs.org/cover/984763/
> > 
> > Below is the updated device interface:
> > 
> > Currently, this device has two regions: 1) CONFIG_REGION
> > (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to set up the
> > device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which
> > can be used to notify the device.
> > 
> > 1. CONFIG_REGION
> > 
> > The region described by CONFIG_REGION is the main control interface.
> > Messages will be written to or read from this region.
> > 
> > The message type is determined by the `request` field in the message
> > header. The message size is encoded in the message header too.
> > The message format looks like this:
> > 
> > struct vhost_vfio_op {
> > 	__u64 request;
> > 	__u32 flags;
> > 	/* Flag values: */
> >   #define VHOST_VFIO_NEED_REPLY 0x1 /* A reply is needed */
> > 	__u32 size;
> > 	union {
> > 		__u64 u64;
> > 		struct vhost_vring_state state;
> > 		struct vhost_vring_addr addr;
> > 	} payload;
> > };
> > 
> > The existing vhost-kernel ioctl cmds are reused as the message
> > requests in the above structure.
> 
> 
> Still the same comment as on V1: what's the advantage of inventing a new
> protocol?

I'm trying to make it work the VFIO way.
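
To make the CONFIG_REGION message format quoted above concrete, here is a
minimal userspace sketch of building one message and writing it at the
region's file offset. It is an illustration, not code from the patch. The
assumptions are: `size` counts payload bytes only (the RFC text only says
the size is encoded in the header), the request value is passed in by the
caller, the region offset comes from elsewhere (for a VFIO device it would
normally be obtained via VFIO_DEVICE_GET_REGION_INFO), and the helper names
and the `*_sketch` structs are hypothetical local mirrors so the example
builds without kernel headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Local mirrors of the kernel's vhost_vring_state / vhost_vring_addr,
 * declared here only so the sketch is self-contained. */
struct vring_state_sketch {
	uint32_t index;
	uint32_t num;
};

struct vring_addr_sketch {
	uint32_t index;
	uint32_t flags;
	uint64_t desc_user_addr;
	uint64_t used_user_addr;
	uint64_t avail_user_addr;
	uint64_t log_guest_addr;
};

#define VHOST_VFIO_NEED_REPLY 0x1

/* Same field order as the struct in the RFC text. */
struct vhost_vfio_op {
	uint64_t request;
	uint32_t flags;
	uint32_t size;
	union {
		uint64_t u64;
		struct vring_state_sketch state;
		struct vring_addr_sketch addr;
	} payload;
};

/* Bytes in front of the payload: request + flags + size. */
#define VHOST_VFIO_OP_HDR_SIZE offsetof(struct vhost_vfio_op, payload)

/* Fill a vring-state message (e.g. a "set vring num" style request).
 * Assumption: `size` is the payload length, not the total length. */
static void vhost_vfio_fill_state(struct vhost_vfio_op *op, uint64_t request,
				  uint32_t index, uint32_t num)
{
	memset(op, 0, sizeof(*op));
	op->request = request;
	op->size = sizeof(op->payload.state);
	op->payload.state.index = index;
	op->payload.state.num = num;
}

/* Total number of bytes to transfer for this message. */
static size_t vhost_vfio_op_len(const struct vhost_vfio_op *op)
{
	assert(op->size <= sizeof(op->payload));
	return VHOST_VFIO_OP_HDR_SIZE + op->size;
}

/* Write header plus payload at the region's offset within the device fd.
 * Returns 0 on success, -1 on error or short write. */
static int vhost_vfio_send(int device_fd, off_t region_offset,
			   const struct vhost_vfio_op *op)
{
	size_t len = vhost_vfio_op_len(op);
	ssize_t n = pwrite(device_fd, op, len, region_offset);

	return n == (ssize_t)len ? 0 : -1;
}
```

A request that sets VHOST_VFIO_NEED_REPLY would presumably be followed by
a read from the same offset to fetch the reply; that part is left out
because the reply layout is not spelled out here.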

> I believe either of the following should be better:
> 
> - using vhost ioctl, we can start from SET_VRING_KICK/SET_VRING_CALL and
> extend it with e.g. a notify region. The advantage is that all existing
> userspace programs could be reused without modification (or with minimal
> modification). And the vhost API hides lots of details that do not need to
> be understood by the application (e.g. in the case of containers).

Do you mean reusing vhost's ioctls on the VFIO device fd directly,
or introducing another mdev driver (i.e. vhost_mdev instead of
using the existing vfio_mdev) for the mdev device?

> 
> - using PCI layout, then you don't even need to re-invent the notify region
> at all and we can pass them through to the guest.

As you said previously, virtio has transports other than PCI,
and a PCI layout would look a bit odd with those transports.

> 
> Personally, I prefer vhost ioctl.

+1

> 
> 
> > 
[...]
> > 
> > 3. VFIO interrupt ioctl API
> > 
> > The VFIO interrupt ioctl API is used to set up device interrupts.
> > IRQ-bypass can also be supported.
> > 
> > Currently, the data path interrupt can be configured via the
> > VFIO_VHOST_VQ_IRQ_INDEX with virtqueue's callfd.
> 
> 
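
For illustration only (not code from the patch), wiring a virtqueue's
callfd through the VFIO interrupt API quoted above could look roughly like
the sketch below. VFIO_DEVICE_SET_IRQS and struct vfio_irq_set are the
existing VFIO uAPI from <linux/vfio.h>. The assumptions are: the IRQ index
(VFIO_VHOST_VQ_IRQ_INDEX in this RFC) is passed in by the caller because
its value is defined by the patch, the sub-index `start` selects the
virtqueue, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Build a SET_IRQS request carrying one eventfd (the virtqueue's callfd).
 * Assumption: `start` is the virtqueue number within the given IRQ index.
 * The caller frees the returned buffer. */
static struct vfio_irq_set *vfio_build_callfd_irq(uint32_t irq_index,
						  uint32_t vq, int32_t callfd)
{
	size_t argsz = sizeof(struct vfio_irq_set) + sizeof(callfd);
	struct vfio_irq_set *set = calloc(1, argsz);

	if (!set)
		return NULL;

	assert(argsz <= UINT32_MAX);
	set->argsz = (uint32_t)argsz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = irq_index;
	set->start = vq;
	set->count = 1;
	/* The eventfd is passed as a 32-bit value in the trailing data. */
	memcpy(set->data, &callfd, sizeof(callfd));
	return set;
}

/* Issue the ioctl on the VFIO device fd. Returns the ioctl result,
 * or -1 if the request could not be allocated. */
static int vfio_set_callfd(int device_fd, uint32_t irq_index,
			   uint32_t vq, int32_t callfd)
{
	struct vfio_irq_set *set = vfio_build_callfd_irq(irq_index, vq, callfd);
	int ret;

	if (!set)
		return -1;
	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set);
	free(set);
	return ret;
}
```
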
> How about the DMA API? Do you expect to use the VFIO IOMMU API or vhost
> SET_MEM_TABLE? The VFIO IOMMU API is more generic for sure, but with
> SET_MEM_TABLE, DMA can be done at the level of the parent device, which
> means it can work for e.g. a card with an on-chip IOMMU.

Agreed. This RFC assumes userspace will use the VFIO IOMMU API
to do the DMA programming. But as you said, that could be
a problem for cards with an on-chip IOMMU.
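
For reference, the DMA programming assumed here is the existing type1
VFIO IOMMU interface, which goes through the VFIO container fd rather than
through the parent device. A minimal sketch, using the existing
VFIO_IOMMU_MAP_DMA ioctl and struct vfio_iommu_type1_dma_map from
<linux/vfio.h>; the helper names are hypothetical and the read/write flag
choice is an assumption:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Describe one mapping: process virtual address -> device-visible IOVA. */
static void vfio_fill_dma_map(struct vfio_iommu_type1_dma_map *map,
			      void *vaddr, uint64_t iova, uint64_t size)
{
	assert(size > 0);
	memset(map, 0, sizeof(*map));
	map->argsz = sizeof(*map);
	/* Assumption: the device may both read and write this memory. */
	map->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	map->vaddr = (uintptr_t)vaddr;
	map->iova = iova;
	map->size = size;
}

/* Program the mapping through the container fd. Note that this goes to
 * the IOMMU backing the container and never reaches the parent device
 * driver, which is the limitation for cards with an on-chip IOMMU. */
static int vfio_dma_map(int container_fd, void *vaddr,
			uint64_t iova, uint64_t size)
{
	struct vfio_iommu_type1_dma_map map;

	vfio_fill_dma_map(&map, vaddr, iova, size);
	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```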

> 
> And what's the plan for vIOMMU?

As this RFC assumes userspace will use the VFIO IOMMU API, userspace
just needs to do what QEMU already does for vfio-pci devices
to support vIOMMU.

> 
> 
> > 
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> >   drivers/vhost/Makefile     |   2 +
> >   drivers/vhost/vdpa.c       | 770 +++++++++++++++++++++++++++++++++++++
> >   include/linux/vdpa_mdev.h  |  72 ++++
> >   include/uapi/linux/vfio.h  |  19 +
> >   include/uapi/linux/vhost.h |  25 ++
> >   5 files changed, 888 insertions(+)
> >   create mode 100644 drivers/vhost/vdpa.c
> >   create mode 100644 include/linux/vdpa_mdev.h
> 
> 
> We probably need a sample parent device implementation. It could be a
> software datapath, e.g. we could start from a virtio-net device in the
> guest or a vhost/tap on the host.

Yeah, something like this would be interesting!

Thanks,
Tiwei

> 
> Thanks
> 
> 
> > 
