All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tiwei Bie <tiwei.bie@intel.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, alex.williamson@redhat.com,
	maxime.coquelin@redhat.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, dan.daly@intel.com,
	cunming.liang@intel.com, zhihong.wang@intel.com,
	lingshan.zhu@intel.com
Subject: Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
Date: Tue, 17 Sep 2019 18:58:02 +0800	[thread overview]
Message-ID: <20190917105801.GA24855@___> (raw)
In-Reply-To: <993841ed-942e-c90b-8016-8e7dc76bf13a@redhat.com>

On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote:
> On 2019/9/17 上午9:02, Tiwei Bie wrote:
> > This RFC is to demonstrate below ideas,
> > 
> > a) Build vhost-mdev on top of the same abstraction defined in
> >     the virtio-mdev series [1];
> > 
> > b) Introduce /dev/vhost-mdev to do vhost ioctls and support
> >     setting mdev device as backend;
> > 
> > Now the userspace API looks like this:
> > 
> > - Userspace generates a compatible mdev device;
> > 
> > - Userspace opens this mdev device with VFIO API (including
> >    doing IOMMU programming for this mdev device with VFIO's
> >    container/group based interface);
> > 
> > - Userspace opens /dev/vhost-mdev and gets vhost fd;
> > 
> > - Userspace uses vhost ioctls to setup vhost (userspace should
> >    do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device
> >    fd first before doing other vhost ioctls);
> > 
> > Only compile test has been done for this series for now.
> 
> 
> Have a hard thought on the architecture:

Thanks a lot! Do appreciate it!

> 
> 1) Create a vhost char device and pass vfio mdev device fd to it as a
> backend and translate vhost-mdev ioctl to virtio mdev transport (e.g
> read/write). DMA was done through the VFIO DMA mapping on the container that
> is attached.

Yeah, that's what we are doing in this series.

> 
> We have two more choices:
> 
> 2) Use vfio-mdev but do not create vhost-mdev device, instead, just
> implement vhost ioctl on vfio_device_ops, and translate them into
> virtio-mdev transport or just pass ioctl to parent.

Yeah. Instead of introducing /dev/vhost-mdev char device, do
vhost ioctls on VFIO device fd directly. That's what we did
in RFC v3.

> 
> 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still
> try to add dev to vfio group and talk to parent with device specific ops

If my understanding is correct, this means we need to introduce
a new VFIO device driver to replace the existing vfio-mdev driver
in our case. Below is a quick draft just to show my understanding:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/vfio.h>
#include <linux/mdev.h>

#include "mdev_private.h"

/* XXX: we need a proper way to include below vhost header. */
#include "../../vhost/vhost.h"

static int vfio_vhost_mdev_open(void *device_data)
{
	if (!try_module_get(THIS_MODULE))
		return -ENODEV;

	/* ... */
	vhost_dev_init(...);

	return 0;
}

static void vfio_vhost_mdev_release(void *device_data)
{
	/* ... */
	module_put(THIS_MODULE);
}

static long vfio_vhost_mdev_unlocked_ioctl(void *device_data,
					   unsigned int cmd, unsigned long arg)
{
	struct mdev_device *mdev = device_data;
	struct mdev_parent *parent = mdev->parent;

	/*
	 * Use vhost ioctls.
	 *
	 * We will have a different parent_ops design.
	 * And potentially, we can share the same parent_ops
	 * with virtio_mdev.
	 */
	switch (cmd) {
	case VHOST_GET_FEATURES:
		parent->ops->get_features(mdev, ...);
		break;
	/* ... */
	}

	return 0;
}

static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf,
				    size_t count, loff_t *ppos)
{
	/* ... */
	return 0;
}

static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf,
				     size_t count, loff_t *ppos)
{
	/* ... */
	return 0;
}

static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma)
{
	/* ... */
	return 0;
}

static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
	.name		= "vfio-vhost-mdev",
	.open		= vfio_vhost_mdev_open,
	.release	= vfio_vhost_mdev_release,
	.ioctl		= vfio_vhost_mdev_unlocked_ioctl,
	.read		= vfio_vhost_mdev_read,
	.write		= vfio_vhost_mdev_write,
	.mmap		= vfio_vhost_mdev_mmap,
};

static int vfio_vhost_mdev_probe(struct device *dev)
{
	struct mdev_device *mdev = to_mdev_device(dev);

	/* ... */
	return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev);
}

static void vfio_vhost_mdev_remove(struct device *dev)
{
	/* ... */
	vfio_del_group_dev(dev);
}

static struct mdev_driver vfio_vhost_mdev_driver = {
	.name	= "vfio_vhost_mdev",
	.probe	= vfio_vhost_mdev_probe,
	.remove	= vfio_vhost_mdev_remove,
};

static int __init vfio_vhost_mdev_init(void)
{
	return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE);
}
module_init(vfio_vhost_mdev_init)

static void __exit vfio_vhost_mdev_exit(void)
{
	mdev_unregister_driver(&vfio_vhost_mdev_driver);
}
module_exit(vfio_vhost_mdev_exit)

> 
> So I have some questions:
> 
> 1) Compared to method 2, what's the advantage of creating a new vhost char
> device? I guess it's for keep the API compatibility?

One benefit is that we can avoid doing vhost ioctls on
VFIO device fd.

> 
> 2) For method 2, is there any easy way for user/admin to distinguish e.g
> ordinary vfio-mdev for vhost from ordinary vfio-mdev?

I think device-api could be a choice.

> I saw you introduce
> ops matching helper but it's not friendly to management.

The ops matching helper is just to check whether a given
vfio-device is based on a mdev device.

> 
> 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that
> assumes the parameter comes from userspace, it prevents support kernel
> virtio drivers.
> 
> 4) So comes the idea of method 3, since it register a new vhost-mdev driver,
> we can use device specific ops instead of VFIO ones, then we can have a
> common API between vDPA parent and vhost-mdev/virtio-mdev drivers.

As the above draft shows, this requires introducing a new
VFIO device driver. I think Alex's opinion matters here.

Thanks,
Tiwei

> 
> What's your thoughts?
> 
> Thanks
> 
> 
> > 
> > RFCv3: https://patchwork.kernel.org/patch/11117785/
> > 
> > [1] https://lkml.org/lkml/2019/9/10/135
> > 
> > Tiwei Bie (3):
> >    vfio: support getting vfio device from device fd
> >    vfio: support checking vfio driver by device ops
> >    vhost: introduce mdev based hardware backend
> > 
> >   drivers/vfio/mdev/vfio_mdev.c    |   3 +-
> >   drivers/vfio/vfio.c              |  32 +++
> >   drivers/vhost/Kconfig            |   9 +
> >   drivers/vhost/Makefile           |   3 +
> >   drivers/vhost/mdev.c             | 462 +++++++++++++++++++++++++++++++
> >   drivers/vhost/vhost.c            |  39 ++-
> >   drivers/vhost/vhost.h            |   6 +
> >   include/linux/vfio.h             |  11 +
> >   include/uapi/linux/vhost.h       |  10 +
> >   include/uapi/linux/vhost_types.h |   5 +
> >   10 files changed, 573 insertions(+), 7 deletions(-)
> >   create mode 100644 drivers/vhost/mdev.c
> > 

  parent reply	other threads:[~2019-09-17 11:00 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-17  1:02 [RFC v4 0/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17  1:02 ` [RFC v4 1/3] vfio: support getting vfio device from device fd Tiwei Bie
2019-09-17  1:02 ` Tiwei Bie
2019-09-17  1:02 ` [RFC v4 2/3] vfio: support checking vfio driver by device ops Tiwei Bie
2019-09-17  1:02 ` Tiwei Bie
2019-09-17  1:02 ` [RFC v4 3/3] vhost: introduce mdev based hardware backend Tiwei Bie
2019-09-17  1:02   ` Tiwei Bie
2019-09-17  7:26   ` Jason Wang
2019-09-20  4:21     ` Tiwei Bie
2019-09-20  4:21     ` Tiwei Bie
2019-09-17  7:26   ` Jason Wang
2019-09-17  1:29 ` [RFC v4 0/3] " Jason Wang
2019-09-17  1:29 ` Jason Wang
2019-09-17  3:32 ` Jason Wang
2019-09-17  3:32 ` Jason Wang
2019-09-17 10:58   ` Tiwei Bie
2019-09-17 10:58   ` Tiwei Bie [this message]
2019-09-18  5:51     ` Jason Wang
2019-09-18  5:51     ` Jason Wang
2019-09-18 14:32       ` Michael S. Tsirkin
2019-09-18 14:32       ` Michael S. Tsirkin
2019-09-19 13:08         ` Jason Wang
2019-09-19 13:08         ` Jason Wang
2019-09-19 15:45           ` Tiwei Bie
2019-09-19 15:45           ` Tiwei Bie
2019-09-20  0:59             ` Jason Wang
2019-09-20  0:59               ` Jason Wang
2019-09-20  1:30             ` Jason Wang
2019-09-20  1:30               ` Jason Wang
2019-09-20  2:16               ` Tiwei Bie
2019-09-20  2:36                 ` Jason Wang
2019-09-20  2:36                   ` Jason Wang
2019-09-20  2:16               ` Tiwei Bie
  -- strict thread matches above, loose matches on Subject: below --
2019-09-17  1:02 Tiwei Bie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190917105801.GA24855@___ \
    --to=tiwei.bie@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=cunming.liang@intel.com \
    --cc=dan.daly@intel.com \
    --cc=jasowang@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=lingshan.zhu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.