From: Alex Williamson <alex.williamson@redhat.com>
To: "Liu, Yi L" <yi.l.liu@intel.com>
Cc: "jgg@nvidia.com" <jgg@nvidia.com>,
"Tian, Kevin" <kevin.tian@intel.com>,
"joro@8bytes.org" <joro@8bytes.org>,
"robin.murphy@arm.com" <robin.murphy@arm.com>,
"cohuck@redhat.com" <cohuck@redhat.com>,
"eric.auger@redhat.com" <eric.auger@redhat.com>,
"nicolinc@nvidia.com" <nicolinc@nvidia.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"mjrosato@linux.ibm.com" <mjrosato@linux.ibm.com>,
"chao.p.peng@linux.intel.com" <chao.p.peng@linux.intel.com>,
"yi.y.sun@linux.intel.com" <yi.y.sun@linux.intel.com>,
"peterx@redhat.com" <peterx@redhat.com>,
"jasowang@redhat.com" <jasowang@redhat.com>,
"shameerali.kolothum.thodi@huawei.com"
<shameerali.kolothum.thodi@huawei.com>,
"lulu@redhat.com" <lulu@redhat.com>,
"suravee.suthikulpanit@amd.com" <suravee.suthikulpanit@amd.com>,
"intel-gvt-dev@lists.freedesktop.org"
<intel-gvt-dev@lists.freedesktop.org>,
"intel-gfx@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>,
"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
"Hao, Xudong" <xudong.hao@intel.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"Xu, Terrence" <terrence.xu@intel.com>,
"Jiang, Yanting" <yanting.jiang@intel.com>,
"Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
"clegoate@redhat.com" <clegoate@redhat.com>
Subject: Re: [PATCH v11 20/23] vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT
Date: Wed, 24 May 2023 09:31:42 -0600
Message-ID: <20230524093142.3cac798e.alex.williamson@redhat.com>
In-Reply-To: <DS0PR11MB75292161F081F27C0650EFB3C3419@DS0PR11MB7529.namprd11.prod.outlook.com>

On Wed, 24 May 2023 02:12:14 +0000
"Liu, Yi L" <yi.l.liu@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, May 23, 2023 11:50 PM
> >
> > On Tue, 23 May 2023 01:20:17 +0000
> > "Liu, Yi L" <yi.l.liu@intel.com> wrote:
> >
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, May 23, 2023 6:16 AM
> > > >
> > > > On Sat, 13 May 2023 06:28:24 -0700
> > > > Yi Liu <yi.l.liu@intel.com> wrote:
> > > >
> > > > > This adds ioctl for userspace to attach device cdev fd to and detach
> > > > > from IOAS/hw_pagetable managed by iommufd.
> > > > >
> > > > > VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach vfio device to IOAS, hw_pagetable
> > > > > managed by iommufd. Attach can be
> > > > > undo by VFIO_DEVICE_DETACH_IOMMUFD_PT
> > > > > or device fd close.
> > > > > VFIO_DEVICE_DETACH_IOMMUFD_PT: detach vfio device from the current attached
> > > > > IOAS or hw_pagetable managed by iommufd.
> > > > >
> > > > > Tested-by: Yanting Jiang <yanting.jiang@intel.com>
> > > > > Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > > > > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > > > > ---
> > > > > drivers/vfio/device_cdev.c | 66 ++++++++++++++++++++++++++++++++++++++
> > > > > drivers/vfio/iommufd.c | 18 +++++++++++
> > > > > drivers/vfio/vfio.h | 18 +++++++++++
> > > > > drivers/vfio/vfio_main.c | 8 +++++
> > > > > include/uapi/linux/vfio.h | 52 ++++++++++++++++++++++++++++++
> > > > > 5 files changed, 162 insertions(+)
> > > > >
> > > > > diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> > > > > index 291cc678a18b..3f14edb80a93 100644
> > > > > --- a/drivers/vfio/device_cdev.c
> > > > > +++ b/drivers/vfio/device_cdev.c
> > > > > @@ -174,6 +174,72 @@ long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df,
> > > > > return ret;
> > > > > }
> > > > >
> > > > > +int vfio_ioctl_device_attach(struct vfio_device_file *df,
> > > > > + struct vfio_device_attach_iommufd_pt __user *arg)
> > > > > +{
> > > > > + struct vfio_device *device = df->device;
> > > > > + struct vfio_device_attach_iommufd_pt attach;
> > > > > + unsigned long minsz;
> > > > > + int ret;
> > > > > +
> > > > > + minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
> > > > > +
> > > > > + if (copy_from_user(&attach, arg, minsz))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + if (attach.argsz < minsz || attach.flags)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* ATTACH only allowed for cdev fds */
> > > > > + if (df->group)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + mutex_lock(&device->dev_set->lock);
> > > > > + ret = vfio_iommufd_attach(device, &attach.pt_id);
> > > > > + if (ret)
> > > > > + goto out_unlock;
> > > > > +
> > > > > + ret = copy_to_user(&arg->pt_id, &attach.pt_id,
> > > > > + sizeof(attach.pt_id)) ? -EFAULT : 0;
> > > > > + if (ret)
> > > > > + goto out_detach;
> > > > > + mutex_unlock(&device->dev_set->lock);
> > > > > +
> > > > > + return 0;
> > > > > +
> > > > > +out_detach:
> > > > > + vfio_iommufd_detach(device);
> > > > > +out_unlock:
> > > > > + mutex_unlock(&device->dev_set->lock);
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > > +int vfio_ioctl_device_detach(struct vfio_device_file *df,
> > > > > + struct vfio_device_detach_iommufd_pt __user *arg)
> > > > > +{
> > > > > + struct vfio_device *device = df->device;
> > > > > + struct vfio_device_detach_iommufd_pt detach;
> > > > > + unsigned long minsz;
> > > > > +
> > > > > + minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
> > > > > +
> > > > > + if (copy_from_user(&detach, arg, minsz))
> > > > > + return -EFAULT;
> > > > > +
> > > > > + if (detach.argsz < minsz || detach.flags)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* DETACH only allowed for cdev fds */
> > > > > + if (df->group)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + mutex_lock(&device->dev_set->lock);
> > > > > + vfio_iommufd_detach(device);
> > > > > + mutex_unlock(&device->dev_set->lock);
> > > > > +
> > > > > + return 0;
> > > > > +}
> > > > > +
> > > > > static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
> > > > > {
> > > > > return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
> > > > > diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
> > > > > index 83575b65ea01..799ea322a7d4 100644
> > > > > --- a/drivers/vfio/iommufd.c
> > > > > +++ b/drivers/vfio/iommufd.c
> > > > > @@ -112,6 +112,24 @@ void vfio_iommufd_unbind(struct vfio_device_file *df)
> > > > > vdev->ops->unbind_iommufd(vdev);
> > > > > }
> > > > >
> > > > > +int vfio_iommufd_attach(struct vfio_device *vdev, u32 *pt_id)
> > > > > +{
> > > > > + lockdep_assert_held(&vdev->dev_set->lock);
> > > > > +
> > > > > + if (vfio_device_is_noiommu(vdev))
> > > > > + return 0;
> > > >
> > > > Isn't this an invalid operation for a noiommu cdev, ie. -EINVAL? We
> > > > return success and copy back the provided pt_id, why would a user not
> > > > consider it a bug that they can't use whatever value was there with
> > > > iommufd?
> > >
> > > Yes, this is the question I asked in [1]. At that time, it appears to me
> > > that better to allow it [2]. Maybe it's more suitable to ask it here.
> >
> > From an API perspective it seems wrong. We return success without
> > doing anything. A user would be right to consider it a bug that the
> > attach operation works but there's not actually any association to the
> > IOAS. Thanks,
>
> The current version is kind of tradeoff based on prior remarks when
> I asked the question. As prior comment[2], it appears to me the attach
> shall success for noiommu devices as well, but per your remark it seems
> not in plan. So anyway, we may just fail the attach/detach for noiommu
> devices. Is it?

If a user creates an ioas within an iommufd, attaches a device to that
ioas and populates it with mappings, wouldn't the user expect the
device to have access to and honor those mappings? I think that's the
path we're headed down if we report a successful attach of a noiommu
device to an ioas.
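
To make that expectation concrete, here is a minimal userspace sketch of
the alternative, assuming the noiommu check simply fails the attach
rather than returning 0. The mock_* names are hypothetical stand-ins
for illustration only; they are not the kernel's structures and this is
not a proposed patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct vfio_device; only the field this sketch needs. */
struct mock_vfio_device {
	bool noiommu;
};

/*
 * Models the behavior under discussion: refuse the attach for a noiommu
 * device instead of reporting success without touching the IOAS.
 */
static int mock_iommufd_attach(struct mock_vfio_device *vdev, uint32_t *pt_id)
{
	assert(vdev && pt_id);

	if (vdev->noiommu)
		return -EINVAL;	/* no IOMMU behind the device, nothing to associate */

	/* A real implementation would attach to *pt_id here; the mock only succeeds. */
	return 0;
}
```

With this shape, a caller holding a noiommu device sees -EINVAL and the
pt_id it passed in is left untouched, so there is no successful result
to misread as an association with the ioas.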

We need to keep in mind that noiommu was meant to be a minimally
intrusive mechanism to provide a dummy vfio IOMMU backend and satisfy
the group requirements, solely for the purpose of making use of the
vfio device interface and without providing any DMA mapping services or
expectations. IMO, an argument that we need the attach op to succeed in
order to avoid too much disruption in userspace code is nonsense. On
the contrary, userspace needs to be very aware of this difference and
we shouldn't invest effort trying to make noiommu more convenient to
use. It's inherently unsafe.

I'm not fond of what a mess noiommu has become with cdev; we're well
beyond the minimal code trickery of the legacy implementation. I hate
to ask, but could we reiterate our requirements for noiommu as a part of
the native iommufd interface for vfio? The nested userspace requirement
is gone now that hypervisors have vIOMMU support, so my assumption is
that this is only for bare metal systems without an IOMMU, which
ideally are less and less prevalent. Are there any noiommu userspaces
that are actually going to adopt the noiommu cdev interface? What
terrible things happen if noiommu only exists in the vfio group compat
interface to iommufd and at some distant point in the future dies when
that gets disabled?

> btw. Should we document it somewhere as well? E.g. noiommu userspace
> does not support attach/detach? Userspace should know it is opening
> noiommu devices.

Documentation never hurts. This is such a specialized use case I'm not
sure we've bothered to do much documentation for noiommu previously.
Thanks,

Alex