From: Yi Liu <yi.l.liu@intel.com>
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: joro@8bytes.org, robin.murphy@arm.com, cohuck@redhat.com,
eric.auger@redhat.com, nicolinc@nvidia.com, kvm@vger.kernel.org,
mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com,
jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com,
lulu@redhat.com, suravee.suthikulpanit@amd.com,
intel-gvt-dev@lists.freedesktop.org,
intel-gfx@lists.freedesktop.org, linux-s390@vger.kernel.org,
xudong.hao@intel.com, yan.y.zhao@intel.com,
terrence.xu@intel.com, yanting.jiang@intel.com
Subject: [PATCH v2 10/10] vfio/pci: Add VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO
Date: Mon, 27 Mar 2023 02:34:58 -0700 [thread overview]
Message-ID: <20230327093458.44939-11-yi.l.liu@intel.com> (raw)
In-Reply-To: <20230327093458.44939-1-yi.l.liu@intel.com>
This is a preparation for vfio device cdev as cdev gives userspace the
capability to open device cdev fd and management stack (e.g. libvirt)
could pass the device fd to the actual user (e.g. QEMU). As a result,
the actual user has no idea about the device's bus:devfn information.
This is a problem when user uses VFIO_DEVICE_GET_PCI_HOT_RESET_INFO to
know the hot reset affected device scope as this ioctl returns bus:devfn
info. For the fd passing usage, the acutal user cannot map the bus:devfn
to the devices it has opened via the fd passed from management stack. So
a new ioctl is required.
This new ioctl reports the list of iommufd dev_id that is opened by the
user. If there is affected device that is not bound to vfio driver or
opened by another user, this command shall fail with -EPERM. For the
noiommu mode in the vfio device cdev path, this shall fail as no dev_id
would be generated, hence nothing to report.
This ioctl is useless to the users that open vfio group as such users
have no idea about the iommufd dev_id and it can use the existing
VFIO_DEVICE_GET_PCI_HOT_RESET_INFO. The user that uses the traditional
mode vfio group/container would be failed if invoking this ioctl. But
the user that uses the iommufd compat mode vfio group/container shall
succeed. This is harmless as long as user cannot make use of it and
should use VFIO_DEVICE_GET_PCI_HOT_RESET_INFO.
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/vfio/pci/vfio_pci_core.c | 98 ++++++++++++++++++++++++++++++++
include/uapi/linux/vfio.h | 47 +++++++++++++++
2 files changed, 145 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 19f5b075d70a..45edf4e9b98b 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1181,6 +1181,102 @@ static int vfio_pci_ioctl_reset(struct vfio_pci_core_device *vdev,
return ret;
}
+static struct pci_dev *
+vfio_pci_dev_set_resettable(struct vfio_device_set *dev_set);
+
+static int vfio_pci_ioctl_get_pci_hot_reset_group_info(
+ struct vfio_pci_core_device *vdev,
+ struct vfio_pci_hot_reset_group_info __user *arg)
+{
+ unsigned long minsz =
+ offsetofend(struct vfio_pci_hot_reset_group_info, count);
+ struct vfio_pci_hot_reset_group_info hdr;
+ struct iommufd_ctx *iommufd, *cur_iommufd;
+ u32 count = 0, index = 0, *devices = NULL;
+ struct vfio_pci_core_device *cur;
+ bool slot = false;
+ int ret = 0;
+
+ if (copy_from_user(&hdr, arg, minsz))
+ return -EFAULT;
+
+ if (hdr.argsz < minsz)
+ return -EINVAL;
+
+ hdr.flags = 0;
+
+ /* Can we do a slot or bus reset or neither? */
+ if (!pci_probe_reset_slot(vdev->pdev->slot))
+ slot = true;
+ else if (pci_probe_reset_bus(vdev->pdev->bus))
+ return -ENODEV;
+
+ mutex_lock(&vdev->vdev.dev_set->lock);
+ if (!vfio_pci_dev_set_resettable(vdev->vdev.dev_set)) {
+ ret = -EPERM;
+ goto out_unlock;
+ }
+
+ iommufd = vfio_iommufd_physical_ictx(&vdev->vdev);
+ if (!iommufd) {
+ ret = -EPERM;
+ goto out_unlock;
+ }
+
+ /* How many devices are affected? */
+ ret = vfio_pci_for_each_slot_or_bus(vdev->pdev, vfio_pci_count_devs,
+ &count, slot);
+ if (ret)
+ goto out_unlock;
+
+ WARN_ON(!count); /* Should always be at least one */
+
+ /*
+ * If there's enough space, fill it now, otherwise return -ENOSPC and
+ * the number of devices affected.
+ */
+ if (hdr.argsz < sizeof(hdr) + (count * sizeof(*devices))) {
+ ret = -ENOSPC;
+ hdr.count = count;
+ goto reset_info_exit;
+ }
+
+ devices = kcalloc(count, sizeof(*devices), GFP_KERNEL);
+ if (!devices) {
+ ret = -ENOMEM;
+ goto reset_info_exit;
+ }
+
+ list_for_each_entry(cur, &vdev->vdev.dev_set->device_list, vdev.dev_set_list) {
+ cur_iommufd = vfio_iommufd_physical_ictx(&cur->vdev);
+ if (cur->vdev.open_count) {
+ if (cur_iommufd != iommufd) {
+ ret = -EPERM;
+ break;
+ }
+ ret = vfio_iommufd_physical_devid(&cur->vdev, &devices[index]);
+ if (ret)
+ break;
+ index++;
+ }
+ }
+
+reset_info_exit:
+ if (copy_to_user(arg, &hdr, minsz))
+ ret = -EFAULT;
+
+ if (!ret) {
+ if (copy_to_user(&arg->devices, devices,
+ hdr.count * sizeof(*devices)))
+ ret = -EFAULT;
+ }
+
+ kfree(devices);
+out_unlock:
+ mutex_unlock(&vdev->vdev.dev_set->lock);
+ return ret;
+}
+
static int vfio_pci_ioctl_get_pci_hot_reset_info(
struct vfio_pci_core_device *vdev,
struct vfio_pci_hot_reset_info __user *arg)
@@ -1404,6 +1500,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
return vfio_pci_ioctl_get_irq_info(vdev, uarg);
case VFIO_DEVICE_GET_PCI_HOT_RESET_INFO:
return vfio_pci_ioctl_get_pci_hot_reset_info(vdev, uarg);
+ case VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO:
+ return vfio_pci_ioctl_get_pci_hot_reset_group_info(vdev, uarg);
case VFIO_DEVICE_GET_REGION_INFO:
return vfio_pci_ioctl_get_region_info(vdev, uarg);
case VFIO_DEVICE_IOEVENTFD:
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 25432ef213ee..61b801dfd40b 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -669,6 +669,53 @@ struct vfio_pci_hot_reset_info {
#define VFIO_DEVICE_GET_PCI_HOT_RESET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
+/**
+ * VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 12,
+ * struct vfio_pci_hot_reset_group_info)
+ *
+ * This is used in the vfio device cdev mode. It returns the list of
+ * affected devices (represented by iommufd dev_id) when hot reset is
+ * issued on the current device with which this ioctl is invoked. It
+ * only includes the devices that are opened by the current user in the
+ * time of this command is invoked. This list may change when user opens
+ * new device or close opened device, hence user should re-invoke it
+ * after open/close devices. This command has no guarantee on the result
+ * of VFIO_DEVICE_PCI_HOT_RESET since the not-opened affected device can
+ * be by other users in the window between the two ioctls. If the affected
+ * devices are opened by multiple users, the VFIO_DEVICE_PCI_HOT_RESET
+ * shall fail, detail can check the description of VFIO_DEVICE_PCI_HOT_RESET.
+ *
+ * For the users that open vfio group/container, this ioctl is useless as
+ * they have no idea about the iommufd dev_id returned by this ioctl. For
+ * the users of the traditional mode vfio group/container, this ioctl will
+ * fail as this mode does not use iommufd hence no dev_id to report back.
+ * For the users of the iommufd compat mode vfio group/container, this ioctl
+ * would succeed as this mode uses iommufd as container fd. But such users
+ * still have no idea about the iommufd dev_id as the dev_id is only stored
+ * in kernel in this mode. For the users of the vfio group/container, the
+ * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO should be used to know the hot reset
+ * affected devices.
+ *
+ * Return: 0 on success, -errno on failure:
+ * -enospc = insufficient buffer;
+ * -enodev = unsupported for device;
+ * -eperm = no permission for device, this error comes:
+ * - when there are affected devices that are opened but
+ * not bound to the same iommufd with the current device
+ * with which this ioctl is invoked,
+ * - there are affected devices that are not bound to vfio
+ * driver yet.
+ * - no valid iommufd is bound (e.g. noiommu mode)
+ */
+struct vfio_pci_hot_reset_group_info {
+ __u32 argsz;
+ __u32 flags;
+ __u32 count;
+ __u32 devices[];
+};
+
+#define VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO _IO(VFIO_TYPE, VFIO_BASE + 18)
+
/**
* VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
* struct vfio_pci_hot_reset)
--
2.34.1
next prev parent reply other threads:[~2023-03-27 9:35 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-27 9:34 [PATCH v2 00/10] Introduce new methods for verifying ownership in vfio PCI hot reset Yi Liu
2023-03-27 9:34 ` [PATCH v2 01/10] vfio/pci: Update comment around group_fd get in vfio_pci_ioctl_pci_hot_reset() Yi Liu
2023-03-27 9:34 ` [PATCH v2 02/10] vfio/pci: Only check ownership of opened devices in hot reset Yi Liu
2023-03-27 9:34 ` [PATCH v2 03/10] vfio/pci: Move the existing hot reset logic to be a helper Yi Liu
2023-03-30 23:39 ` Jason Gunthorpe
2023-03-30 23:44 ` Jason Gunthorpe
2023-03-27 9:34 ` [PATCH v2 04/10] vfio-iommufd: Add helper to retrieve iommufd_ctx and devid for vfio_device Yi Liu
2023-03-30 23:44 ` Jason Gunthorpe
2023-03-27 9:34 ` [PATCH v2 05/10] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET Yi Liu
2023-03-30 23:47 ` Jason Gunthorpe
2023-03-27 9:34 ` [PATCH v2 06/10] vfio: Refine vfio file kAPIs for vfio PCI hot reset Yi Liu
2023-03-30 23:48 ` Jason Gunthorpe
2023-03-27 9:34 ` [PATCH v2 07/10] vfio: Accpet device file from vfio PCI hot reset path Yi Liu
2023-03-30 23:49 ` Jason Gunthorpe
2023-03-27 9:34 ` [PATCH v2 08/10] vfio/pci: Renaming for accepting device fd in " Yi Liu
2023-03-27 9:34 ` [PATCH v2 09/10] vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl Yi Liu
2023-03-30 23:50 ` Jason Gunthorpe
2023-03-27 9:34 ` Yi Liu [this message]
2023-03-27 19:26 ` [PATCH v2 10/10] vfio/pci: Add VFIO_DEVICE_GET_PCI_HOT_RESET_GROUP_INFO Alex Williamson
2023-03-27 20:40 ` Alex Williamson
2023-03-28 3:45 ` Liu, Yi L
2023-03-28 3:32 ` Liu, Yi L
2023-03-28 6:19 ` Tian, Kevin
2023-03-28 14:25 ` Alex Williamson
2023-03-28 14:38 ` Liu, Yi L
2023-03-28 14:46 ` Alex Williamson
2023-03-28 15:00 ` Liu, Yi L
2023-03-28 15:18 ` Alex Williamson
2023-03-28 15:45 ` Liu, Yi L
2023-03-28 16:00 ` Alex Williamson
2023-03-29 3:13 ` Liu, Yi L
2023-03-29 9:41 ` Tian, Kevin
2023-03-29 15:49 ` Alex Williamson
2023-03-29 15:57 ` Jason Gunthorpe
2023-03-30 1:17 ` Tian, Kevin
2023-03-30 22:38 ` Jason Gunthorpe
2023-03-30 12:48 ` Liu, Yi L
2023-03-30 12:56 ` Liu, Yi L
2023-03-30 22:44 ` Jason Gunthorpe
2023-03-30 23:05 ` Alex Williamson
2023-03-30 23:18 ` Jason Gunthorpe
2023-03-29 15:50 ` Jason Gunthorpe
2023-03-30 1:10 ` Tian, Kevin
2023-03-30 1:33 ` Tian, Kevin
2023-03-28 16:29 ` Jason Gunthorpe
2023-03-28 19:09 ` Alex Williamson
2023-03-28 19:22 ` Jason Gunthorpe
2023-03-28 12:40 ` Jason Gunthorpe
2023-03-28 14:45 ` Liu, Yi L
2023-03-31 3:14 ` [PATCH v2 00/10] Introduce new methods for verifying ownership in vfio PCI hot reset Jiang, Yanting
2023-03-31 13:24 ` Alex Williamson
2023-04-03 2:04 ` Jiang, Yanting
2023-03-31 5:01 ` Jiang, Yanting
2023-03-31 17:27 ` Xu, Terrence
2023-03-31 17:49 ` Alex Williamson
2023-04-01 9:15 ` Xu, Terrence
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230327093458.44939-11-yi.l.liu@intel.com \
--to=yi.l.liu@intel.com \
--cc=alex.williamson@redhat.com \
--cc=chao.p.peng@linux.intel.com \
--cc=cohuck@redhat.com \
--cc=eric.auger@redhat.com \
--cc=intel-gfx@lists.freedesktop.org \
--cc=intel-gvt-dev@lists.freedesktop.org \
--cc=jasowang@redhat.com \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=lulu@redhat.com \
--cc=mjrosato@linux.ibm.com \
--cc=nicolinc@nvidia.com \
--cc=peterx@redhat.com \
--cc=robin.murphy@arm.com \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=suravee.suthikulpanit@amd.com \
--cc=terrence.xu@intel.com \
--cc=xudong.hao@intel.com \
--cc=yan.y.zhao@intel.com \
--cc=yanting.jiang@intel.com \
--cc=yi.y.sun@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox