public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/5] vfio-pci support pasid attach/detach
@ 2024-11-08 12:17 Yi Liu
  2024-11-08 12:17 ` [PATCH v5 1/5] ida: Add ida_find_first_range() Yi Liu
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will

This adds the pasid attach/detach uAPIs for userspace to attach/detach
a PASID of a device to/from a given ioas/hwpt. Only vfio-pci driver is
enabled in this series. After this series, PASID-capable devices bound
with vfio-pci can report PASID capability to userspace and VM to enable
PASID usages like Shared Virtual Addressing (SVA).

Based on the discussion about reporting the vPASID to VM [1], it's agreed
that we will let the userspace VMM to synthesize the vPASID capability.
The VMM needs to figure out a hole to put the vPASID cap. This includes
the hidden bits handling for some devices. While, it's up to the userspace,
it's not the focus of this series.

This series first adds the helpers for pasid attach in vfio core and then
extends the device cdev attach/detach ioctls for pasid attach/detach. In the
end of this series, the IOMMU_GET_HW_INFO ioctl is extended to report the
PCI PASID capability to the userspace. Userspace should check this before
using any PASID related uAPIs provided by VFIO, which is the agreement in [2].
This series depends on the iommufd pasid attach/detach series [3].

The completed code can be found at [4], tested with a hacky Qemu branch [5].

[1] https://lore.kernel.org/kvm/BN9PR11MB5276318969A212AD0649C7BE8CBE2@BN9PR11MB5276.namprd11.prod.outlook.com/
[2] https://lore.kernel.org/kvm/4f2daf50-a5ad-4599-ab59-bcfc008688d8@intel.com/
[3] https://lore.kernel.org/linux-iommu/20241104132513.15890-1-yi.l.liu@intel.com/
[4] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
[5] https://github.com/yiliu1765/qemu/tree/wip/zhenzhong/iommufd_nesting_rfcv2-test-pasid

Change log:

v5:
 - Fix a wrong return value (Alex)
 - Fix the polic of setting the xend array per flag extension (Alex)
 - A separate patch to generalize the code of copy user data (Alex)

v4: https://lore.kernel.org/kvm/20241104132732.16759-1-yi.l.liu@intel.com/
 - Add acked-by for the ida patch from Matthew
 - Add r-b from Kevin and Jason on patch 01, 02 and 04 of v3
 - Add common code to copy user data for the user struct with new fields
 - Extend the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT to support pasid, patch 03
   is updated per this change. Hence drop r-b of it. (Kevin, Alex)
 - Add t-b from Zhangfei for patch 4 of v3
 - Nits from Vasant

v3: https://lore.kernel.org/linux-iommu/20240912131729.14951-1-yi.l.liu@intel.com/
 - Misc enhancement on patch 01 of v2 (Alex, Jason)
 - Add Jason's r-b to patch 03 of v2
 - Drop the logic that report PASID via VFIO_DEVICE_FEATURE ioctl
 - Extend IOMMU_GET_HW_INFO to report PASID support (Kevin, Jason, Alex)

v2: https://lore.kernel.org/kvm/20240412082121.33382-1-yi.l.liu@intel.com/
 - Use IDA to track if PASID is attached or not in VFIO. (Jason)
 - Fix the issue of calling pasid_at[de]tach_ioas callback unconditionally (Alex)
 - Fix the wrong data copy in vfio_df_ioctl_pasid_detach_pt() (Zhenzhong)
 - Minor tweaks in comments (Kevin)

v1: https://lore.kernel.org/kvm/20231127063909.129153-1-yi.l.liu@intel.com/
 - Report PASID capability via VFIO_DEVICE_FEATURE (Alex)

rfc: https://lore.kernel.org/linux-iommu/20230926093121.18676-1-yi.l.liu@intel.com/

Regards,
	Yi Liu

Yi Liu (5):
  ida: Add ida_find_first_range()
  vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
  vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
  vfio: Generalize the logic of copying user data for struct with
    extension
  iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability

 drivers/iommu/iommufd/device.c | 24 +++++++++++-
 drivers/pci/ats.c              | 33 ++++++++++++++++
 drivers/vfio/device_cdev.c     | 62 +++++++++++++++++++++---------
 drivers/vfio/iommufd.c         | 50 ++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci.c    |  2 +
 drivers/vfio/vfio.h            | 18 +++++++++
 drivers/vfio/vfio_main.c       | 55 ++++++++++++++++++++++++++
 include/linux/idr.h            | 11 ++++++
 include/linux/pci-ats.h        |  3 ++
 include/linux/vfio.h           | 11 ++++++
 include/uapi/linux/iommufd.h   | 14 ++++++-
 include/uapi/linux/vfio.h      | 29 +++++++++-----
 lib/idr.c                      | 67 ++++++++++++++++++++++++++++++++
 lib/test_ida.c                 | 70 ++++++++++++++++++++++++++++++++++
 14 files changed, 419 insertions(+), 30 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 1/5] ida: Add ida_find_first_range()
  2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
@ 2024-11-08 12:17 ` Yi Liu
  2024-11-08 12:17 ` [PATCH v5 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Yi Liu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will, Matthew Wilcox

There is no helpers for user to check if a given ID is allocated or not,
neither a helper to loop all the allocated IDs in an IDA and do something
for cleanup. With the two needs, a helper to get the lowest allocated ID
of a range and two variants based on it.

Caller can check if a given ID is allocated or not by:

	bool ida_exists(struct ida *ida, unsigned int id)

Caller can iterate all allocated IDs by:

	int id;
	while ((id = ida_find_first(&pasid_ida)) >= 0) {
		//anything to do with the allocated ID
		ida_free(pasid_ida, pasid);
	}

Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/idr.h | 11 +++++++
 lib/idr.c           | 67 +++++++++++++++++++++++++++++++++++++++++++
 lib/test_ida.c      | 70 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 148 insertions(+)

diff --git a/include/linux/idr.h b/include/linux/idr.h
index da5f5fa4a3a6..718f9b1b91af 100644
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -257,6 +257,7 @@ struct ida {
 int ida_alloc_range(struct ida *, unsigned int min, unsigned int max, gfp_t);
 void ida_free(struct ida *, unsigned int id);
 void ida_destroy(struct ida *ida);
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max);
 
 /**
  * ida_alloc() - Allocate an unused ID.
@@ -328,4 +329,14 @@ static inline bool ida_is_empty(const struct ida *ida)
 {
 	return xa_empty(&ida->xa);
 }
+
+static inline bool ida_exists(struct ida *ida, unsigned int id)
+{
+	return ida_find_first_range(ida, id, id) == id;
+}
+
+static inline int ida_find_first(struct ida *ida)
+{
+	return ida_find_first_range(ida, 0, ~0);
+}
 #endif /* __IDR_H__ */
diff --git a/lib/idr.c b/lib/idr.c
index da36054c3ca0..e2adc457abb4 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -476,6 +476,73 @@ int ida_alloc_range(struct ida *ida, unsigned int min, unsigned int max,
 }
 EXPORT_SYMBOL(ida_alloc_range);
 
+/**
+ * ida_find_first_range - Get the lowest used ID.
+ * @ida: IDA handle.
+ * @min: Lowest ID to get.
+ * @max: Highest ID to get.
+ *
+ * Get the lowest used ID between @min and @max, inclusive.  The returned
+ * ID will not exceed %INT_MAX, even if @max is larger.
+ *
+ * Context: Any context. Takes and releases the xa_lock.
+ * Return: The lowest used ID, or errno if no used ID is found.
+ */
+int ida_find_first_range(struct ida *ida, unsigned int min, unsigned int max)
+{
+	unsigned long index = min / IDA_BITMAP_BITS;
+	unsigned int offset = min % IDA_BITMAP_BITS;
+	unsigned long *addr, size, bit;
+	unsigned long tmp = 0;
+	unsigned long flags;
+	void *entry;
+	int ret;
+
+	if ((int)min < 0)
+		return -EINVAL;
+	if ((int)max < 0)
+		max = INT_MAX;
+
+	xa_lock_irqsave(&ida->xa, flags);
+
+	entry = xa_find(&ida->xa, &index, max / IDA_BITMAP_BITS, XA_PRESENT);
+	if (!entry) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (index > min / IDA_BITMAP_BITS)
+		offset = 0;
+	if (index * IDA_BITMAP_BITS + offset > max) {
+		ret = -ENOENT;
+		goto err_unlock;
+	}
+
+	if (xa_is_value(entry)) {
+		tmp = xa_to_value(entry);
+		addr = &tmp;
+		size = BITS_PER_XA_VALUE;
+	} else {
+		addr = ((struct ida_bitmap *)entry)->bitmap;
+		size = IDA_BITMAP_BITS;
+	}
+
+	bit = find_next_bit(addr, size, offset);
+
+	xa_unlock_irqrestore(&ida->xa, flags);
+
+	if (bit == size ||
+	    index * IDA_BITMAP_BITS + bit > max)
+		return -ENOENT;
+
+	return index * IDA_BITMAP_BITS + bit;
+
+err_unlock:
+	xa_unlock_irqrestore(&ida->xa, flags);
+	return ret;
+}
+EXPORT_SYMBOL(ida_find_first_range);
+
 /**
  * ida_free() - Release an allocated ID.
  * @ida: IDA handle.
diff --git a/lib/test_ida.c b/lib/test_ida.c
index c80155a1956d..63078f8dc13f 100644
--- a/lib/test_ida.c
+++ b/lib/test_ida.c
@@ -189,6 +189,75 @@ static void ida_check_bad_free(struct ida *ida)
 	IDA_BUG_ON(ida, !ida_is_empty(ida));
 }
 
+/*
+ * Check ida_find_first_range() and varriants.
+ */
+static void ida_check_find_first(struct ida *ida)
+{
+	/* IDA is empty; all of the below should be not exist */
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, ida_exists(ida, 3));
+	IDA_BUG_ON(ida, ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a single value entry */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 3, GFP_KERNEL) != 3);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 63, GFP_KERNEL) != 63);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a single bitmap */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, 1023, GFP_KERNEL) != 1023);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, ida_exists(ida, (1 << 20) - 1));
+
+	/* IDA contains a tree */
+	IDA_BUG_ON(ida, ida_alloc_min(ida, (1 << 20) - 1, GFP_KERNEL) != (1 << 20) - 1);
+	IDA_BUG_ON(ida, ida_exists(ida, 0));
+	IDA_BUG_ON(ida, !ida_exists(ida, 3));
+	IDA_BUG_ON(ida, !ida_exists(ida, 63));
+	IDA_BUG_ON(ida, !ida_exists(ida, 1023));
+	IDA_BUG_ON(ida, !ida_exists(ida, (1 << 20) - 1));
+
+	/* Now try to find first */
+	IDA_BUG_ON(ida, ida_find_first(ida) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, -1, 2) != -EINVAL);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 2) != -ENOENT); // no used ID
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 0, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 3, 3) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 2, 4) != 3);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 3) != -ENOENT); // min > max, fail
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 60) != -ENOENT); // no used ID
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 4, 64) != 63);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 63, 63) != 63);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 64, 1026) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, 1023) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1023, (1 << 20) - 1) != 1023);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, 1024, (1 << 20) - 1) != (1 << 20) - 1);
+	IDA_BUG_ON(ida, ida_find_first_range(ida, (1 << 20), INT_MAX) != -ENOENT);
+
+	ida_free(ida, 3);
+	ida_free(ida, 63);
+	ida_free(ida, 1023);
+	ida_free(ida, (1 << 20) - 1);
+
+	IDA_BUG_ON(ida, !ida_is_empty(ida));
+}
+
 static DEFINE_IDA(ida);
 
 static int ida_checks(void)
@@ -202,6 +271,7 @@ static int ida_checks(void)
 	ida_check_max(&ida);
 	ida_check_conv(&ida);
 	ida_check_bad_free(&ida);
+	ida_check_find_first(&ida);
 
 	printk("IDA: %u of %u tests passed\n", tests_passed, tests_run);
 	return (tests_run != tests_passed) ? 0 : -EINVAL;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices
  2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
  2024-11-08 12:17 ` [PATCH v5 1/5] ida: Add ida_find_first_range() Yi Liu
@ 2024-11-08 12:17 ` Yi Liu
  2024-11-08 12:17 ` [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid Yi Liu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will

This adds pasid_at|de]tach_ioas ops for attaching hwpt to pasid of a
device and the helpers for it. For now, only vfio-pci supports pasid
attach/detach.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/iommufd.c      | 50 +++++++++++++++++++++++++++++++++++++
 drivers/vfio/pci/vfio_pci.c |  2 ++
 include/linux/vfio.h        | 11 ++++++++
 3 files changed, 63 insertions(+)

diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 82eba6966fa5..039c0157ac1c 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -119,14 +119,22 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 	if (IS_ERR(idev))
 		return PTR_ERR(idev);
 	vdev->iommufd_device = idev;
+	ida_init(&vdev->pasids);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_bind);
 
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev)
 {
+	int pasid;
+
 	lockdep_assert_held(&vdev->dev_set->lock);
 
+	while ((pasid = ida_find_first(&vdev->pasids)) >= 0) {
+		iommufd_device_pasid_detach(vdev->iommufd_device, pasid);
+		ida_free(&vdev->pasids, pasid);
+	}
+
 	if (vdev->iommufd_attached) {
 		iommufd_device_detach(vdev->iommufd_device);
 		vdev->iommufd_attached = false;
@@ -168,6 +176,48 @@ void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev)
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas);
 
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev,
+					    u32 pasid, u32 *pt_id)
+{
+	int rc;
+
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return -EINVAL;
+
+	if (ida_exists(&vdev->pasids, pasid))
+		return iommufd_device_pasid_replace(vdev->iommufd_device,
+						    pasid, pt_id);
+
+	rc = ida_alloc_range(&vdev->pasids, pasid, pasid, GFP_KERNEL);
+	if (rc < 0)
+		return rc;
+
+	rc = iommufd_device_pasid_attach(vdev->iommufd_device, pasid, pt_id);
+	if (rc)
+		ida_free(&vdev->pasids, pasid);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_attach_ioas);
+
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev,
+					     u32 pasid)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return;
+
+	if (!ida_exists(&vdev->pasids, pasid))
+		return;
+
+	iommufd_device_pasid_detach(vdev->iommufd_device, pasid);
+	ida_free(&vdev->pasids, pasid);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_physical_pasid_detach_ioas);
+
 /*
  * The emulated standard ops mean that vfio_device is going to use the
  * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index e727941f589d..6f7ae7e5b7b0 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -144,6 +144,8 @@ static const struct vfio_device_ops vfio_pci_ops = {
 	.unbind_iommufd	= vfio_iommufd_physical_unbind,
 	.attach_ioas	= vfio_iommufd_physical_attach_ioas,
 	.detach_ioas	= vfio_iommufd_physical_detach_ioas,
+	.pasid_attach_ioas	= vfio_iommufd_physical_pasid_attach_ioas,
+	.pasid_detach_ioas	= vfio_iommufd_physical_pasid_detach_ioas,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 000a6cab2d31..11b3b453752e 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -67,6 +67,7 @@ struct vfio_device {
 	struct inode *inode;
 #if IS_ENABLED(CONFIG_IOMMUFD)
 	struct iommufd_device *iommufd_device;
+	struct ida pasids;
 	u8 iommufd_attached:1;
 #endif
 	u8 cdev_opened:1;
@@ -91,6 +92,8 @@ struct vfio_device {
  *		 bound iommufd. Undo in unbind_iommufd if @detach_ioas is not
  *		 called.
  * @detach_ioas: Opposite of attach_ioas
+ * @pasid_attach_ioas: The pasid variation of attach_ioas
+ * @pasid_detach_ioas: Opposite of pasid_attach_ioas
  * @open_device: Called when the first file descriptor is opened for this device
  * @close_device: Opposite of open_device
  * @read: Perform read(2) on device file descriptor
@@ -115,6 +118,8 @@ struct vfio_device_ops {
 	void	(*unbind_iommufd)(struct vfio_device *vdev);
 	int	(*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
 	void	(*detach_ioas)(struct vfio_device *vdev);
+	int	(*pasid_attach_ioas)(struct vfio_device *vdev, u32 pasid, u32 *pt_id);
+	void	(*pasid_detach_ioas)(struct vfio_device *vdev, u32 pasid);
 	int	(*open_device)(struct vfio_device *vdev);
 	void	(*close_device)(struct vfio_device *vdev);
 	ssize_t	(*read)(struct vfio_device *vdev, char __user *buf,
@@ -139,6 +144,8 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 void vfio_iommufd_physical_unbind(struct vfio_device *vdev);
 int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev);
+int vfio_iommufd_physical_pasid_attach_ioas(struct vfio_device *vdev, u32 pasid, u32 *pt_id);
+void vfio_iommufd_physical_pasid_detach_ioas(struct vfio_device *vdev, u32 pasid);
 int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 			       struct iommufd_ctx *ictx, u32 *out_device_id);
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
@@ -166,6 +173,10 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #define vfio_iommufd_physical_detach_ioas \
 	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_physical_pasid_attach_ioas \
+	((int (*)(struct vfio_device *vdev, u32 pasid, u32 *pt_id)) NULL)
+#define vfio_iommufd_physical_pasid_detach_ioas \
+	((void (*)(struct vfio_device *vdev, u32 pasid)) NULL)
 #define vfio_iommufd_emulated_bind                                      \
 	((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx,   \
 		  u32 *out_device_id)) NULL)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
  2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
  2024-11-08 12:17 ` [PATCH v5 1/5] ida: Add ida_find_first_range() Yi Liu
  2024-11-08 12:17 ` [PATCH v5 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Yi Liu
@ 2024-11-08 12:17 ` Yi Liu
  2024-11-12  0:02   ` Alex Williamson
  2024-11-08 12:17 ` [PATCH v5 4/5] vfio: Add vfio_copy_user_data() Yi Liu
  2024-11-08 12:17 ` [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability Yi Liu
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will

This extends the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT ioctls to attach/detach
a given pasid of a vfio device to/from an IOAS/HWPT.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/device_cdev.c | 69 +++++++++++++++++++++++++++++++++-----
 include/uapi/linux/vfio.h  | 29 ++++++++++------
 2 files changed, 80 insertions(+), 18 deletions(-)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index bb1817bd4ff3..4519f482e212 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -162,9 +162,9 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
 int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 			    struct vfio_device_attach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_attach_iommufd_pt attach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;
 	int ret;
 
 	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
@@ -172,11 +172,38 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 	if (copy_from_user(&attach, arg, minsz))
 		return -EFAULT;
 
-	if (attach.argsz < minsz || attach.flags)
+	if (attach.argsz < minsz)
 		return -EINVAL;
 
+	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
+		return -EINVAL;
+
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
+		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
+
+	/*
+	 * xend may be equal to minsz if a flag is defined for reusing a
+	 * reserved field or a special usage of an existing field.
+	 */
+	if (xend > minsz) {
+		if (attach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&attach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
+	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
+	    !device->ops->pasid_attach_ioas)
+		return -EOPNOTSUPP;
+
 	mutex_lock(&device->dev_set->lock);
-	ret = device->ops->attach_ioas(device, &attach.pt_id);
+	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
+		ret = device->ops->pasid_attach_ioas(device, attach.pasid,
+						     &attach.pt_id);
+	else
+		ret = device->ops->attach_ioas(device, &attach.pt_id);
 	if (ret)
 		goto out_unlock;
 
@@ -198,20 +225,46 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
 			    struct vfio_device_detach_iommufd_pt __user *arg)
 {
-	struct vfio_device *device = df->device;
 	struct vfio_device_detach_iommufd_pt detach;
-	unsigned long minsz;
+	struct vfio_device *device = df->device;
+	unsigned long minsz, xend = 0;
 
 	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
 
 	if (copy_from_user(&detach, arg, minsz))
 		return -EFAULT;
 
-	if (detach.argsz < minsz || detach.flags)
+	if (detach.argsz < minsz)
+		return -EINVAL;
+
+	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
 		return -EINVAL;
 
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
+		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
+
+	/*
+	 * xend may be equal to minsz if a flag is defined for reusing a
+	 * reserved field or a special usage of an existing field.
+	 */
+	if (xend > minsz) {
+		if (detach.argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user((void *)&detach + minsz,
+				   (void __user *)arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
+	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
+	    !device->ops->pasid_detach_ioas)
+		return -EOPNOTSUPP;
+
 	mutex_lock(&device->dev_set->lock);
-	device->ops->detach_ioas(device);
+	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
+		device->ops->pasid_detach_ioas(device, detach.pasid);
+	else
+		device->ops->detach_ioas(device);
 	mutex_unlock(&device->dev_set->lock);
 
 	return 0;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index c8dbf8219c4f..6899da70b929 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -931,29 +931,34 @@ struct vfio_device_bind_iommufd {
  * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19,
  *					struct vfio_device_attach_iommufd_pt)
  * @argsz:	User filled size of this data.
- * @flags:	Must be 0.
+ * @flags:	Flags for attach.
  * @pt_id:	Input the target id which can represent an ioas or a hwpt
  *		allocated via iommufd subsystem.
  *		Output the input ioas id or the attached hwpt id which could
  *		be the specified hwpt itself or a hwpt automatically created
  *		for the specified ioas by kernel during the attachment.
+ * @pasid:	The pasid to be attached, only meaningful when
+ *		VFIO_DEVICE_ATTACH_PASID is set in @flags
  *
  * Associate the device with an address space within the bound iommufd.
  * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close.  This is only
  * allowed on cdev fds.
  *
- * If a vfio device is currently attached to a valid hw_pagetable, without doing
- * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl
- * passing in another hw_pagetable (hwpt) id is allowed. This action, also known
- * as a hw_pagetable replacement, will replace the device's currently attached
- * hw_pagetable with a new hw_pagetable corresponding to the given pt_id.
+ * If a vfio device or a pasid of this device is currently attached to a valid
+ * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed.
+ * This action, also known as a hw_pagetable replacement, will replace the
+ * currently attached hwpt of the device or the pasid of this device with a new
+ * hwpt corresponding to the given pt_id.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_attach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_ATTACH_PASID	(1 << 0)
 	__u32	pt_id;
+	__u32	pasid;
 };
 
 #define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 19)
@@ -962,17 +967,21 @@ struct vfio_device_attach_iommufd_pt {
  * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
  *					struct vfio_device_detach_iommufd_pt)
  * @argsz:	User filled size of this data.
- * @flags:	Must be 0.
+ * @flags:	Flags for detach.
+ * @pasid:	The pasid to be detached, only meaningful when
+ *		VFIO_DEVICE_DETACH_PASID is set in @flags
  *
- * Remove the association of the device and its current associated address
- * space.  After it, the device should be in a blocking DMA state.  This is only
- * allowed on cdev fds.
+ * Remove the association of the device or a pasid of the device and its current
+ * associated address space.  After it, the device or the pasid should be in a
+ * blocking DMA state.  This is only allowed on cdev fds.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_detach_iommufd_pt {
 	__u32	argsz;
 	__u32	flags;
+#define VFIO_DEVICE_DETACH_PASID	(1 << 0)
+	__u32	pasid;
 };
 
 #define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
                   ` (2 preceding siblings ...)
  2024-11-08 12:17 ` [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid Yi Liu
@ 2024-11-08 12:17 ` Yi Liu
  2024-11-12  0:03   ` Alex Williamson
  2024-11-08 12:17 ` [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability Yi Liu
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will

This generalizes the logic of copying user data when the user struct
Have new fields introduced. The helpers can be used by the vfio uapis
that have the argsz and flags fields in the beginning 8 bytes.

As an example, the vfio_device_{at|de}tach_iommufd_pt paths are updated
to use the helpers.

The flags may be defined to mark a new field in the structure, reuse
reserved fields, or special handling of an existing field. The extended
size would differ for different flags. Each user API that wants to use
the generalized helpers should define an array to store the corresponding
extended sizes for each defined flag.

For example, we start out with the below, minsz is 12.

  struct vfio_foo_struct {
  	__u32   argsz;
  	__u32   flags;
  	__u32   pt_id;
  };

And then here it becomes:

  struct vfio_foo_struct {
  	__u32   argsz;
  	__u32   flags;
  #define VFIO_FOO_STRUCT_PASID   (1 << 0)
  	__u32   pt_id;
  	__u32   pasid;
  };

The array is { 16 }.

If the next flag is simply related to the processing of @pt_id and
doesn't require @pasid, then the extended size of the new flag is
12. The array become { 16, 12 }

  struct vfio_foo_struct {
  	__u32   argsz;
  	__u32   flags;
  #define VFIO_FOO_STRUCT_PASID   (1 << 0)
  #define VFIO_FOO_STRUCT_SPECICAL_PTID   (1 << 1)
  	__u32   pt_id;
  	__u32   pasid;
  };

Similarly, rather than adding new field, we might have reused a previously
reserved field, for instance what if we already expanded the structure
as the below, array is already { 24 }.

  struct vfio_foo_struct {
  	__u32   argsz;
  	__u32   flags;
  #define VFIO_FOO_STRUCT_XXX     (1 << 0)
  	__u32   pt_id;
  	__u32   reserved;
  	__u64   xxx;
  };

If we then want to add @pasid, we might really prefer to take advantage
of that reserved field and the array becomes { 24, 16 }.

  struct vfio_foo_struct {
  	__u32   argsz;
  	__u32   flags;
  #define VFIO_FOO_STRUCT_XXX     (1 << 0)
  #define VFIO_FOO_STRUCT_PASID   (1 << 1)
  	__u32   pt_id;
  	__u32   reserved;
  	__u64   xxx;
  };

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/device_cdev.c | 81 +++++++++++++-------------------------
 drivers/vfio/vfio.h        | 18 +++++++++
 drivers/vfio/vfio_main.c   | 55 ++++++++++++++++++++++++++
 3 files changed, 100 insertions(+), 54 deletions(-)

diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
index 4519f482e212..35c7664b9a97 100644
--- a/drivers/vfio/device_cdev.c
+++ b/drivers/vfio/device_cdev.c
@@ -159,40 +159,33 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
 	vfio_device_unblock_group(device);
 }
 
+#define VFIO_ATTACH_FLAGS_MASK VFIO_DEVICE_ATTACH_PASID
+static unsigned long
+vfio_attach_xends[ilog2(VFIO_ATTACH_FLAGS_MASK) + 1] = {
+	XEND_SIZE(VFIO_DEVICE_ATTACH_PASID,
+		  struct vfio_device_attach_iommufd_pt, pasid),
+};
+
+#define VFIO_DETACH_FLAGS_MASK VFIO_DEVICE_DETACH_PASID
+static unsigned long
+vfio_detach_xends[ilog2(VFIO_DETACH_FLAGS_MASK) + 1] = {
+	XEND_SIZE(VFIO_DEVICE_DETACH_PASID,
+		  struct vfio_device_detach_iommufd_pt, pasid),
+};
+
 int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
 			    struct vfio_device_attach_iommufd_pt __user *arg)
 {
 	struct vfio_device_attach_iommufd_pt attach;
 	struct vfio_device *device = df->device;
-	unsigned long minsz, xend = 0;
 	int ret;
 
-	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
-
-	if (copy_from_user(&attach, arg, minsz))
-		return -EFAULT;
-
-	if (attach.argsz < minsz)
-		return -EINVAL;
-
-	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
-		return -EINVAL;
-
-	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
-		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
-
-	/*
-	 * xend may be equal to minsz if a flag is defined for reusing a
-	 * reserved field or a special usage of an existing field.
-	 */
-	if (xend > minsz) {
-		if (attach.argsz < xend)
-			return -EINVAL;
-
-		if (copy_from_user((void *)&attach + minsz,
-				   (void __user *)arg + minsz, xend - minsz))
-			return -EFAULT;
-	}
+	ret = vfio_copy_user_data((void __user *)arg, &attach,
+				  struct vfio_device_attach_iommufd_pt,
+				  pt_id, VFIO_ATTACH_FLAGS_MASK,
+				  vfio_attach_xends);
+	if (ret)
+		return ret;
 
 	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
 	    !device->ops->pasid_attach_ioas)
@@ -227,34 +220,14 @@ int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
 {
 	struct vfio_device_detach_iommufd_pt detach;
 	struct vfio_device *device = df->device;
-	unsigned long minsz, xend = 0;
-
-	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
-
-	if (copy_from_user(&detach, arg, minsz))
-		return -EFAULT;
-
-	if (detach.argsz < minsz)
-		return -EINVAL;
-
-	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
-		return -EINVAL;
-
-	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
-		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
-
-	/*
-	 * xend may be equal to minsz if a flag is defined for reusing a
-	 * reserved field or a special usage of an existing field.
-	 */
-	if (xend > minsz) {
-		if (detach.argsz < xend)
-			return -EINVAL;
+	int ret;
 
-		if (copy_from_user((void *)&detach + minsz,
-				   (void __user *)arg + minsz, xend - minsz))
-			return -EFAULT;
-	}
+	ret = vfio_copy_user_data((void __user *)arg, &detach,
+				  struct vfio_device_detach_iommufd_pt,
+				  flags, VFIO_DETACH_FLAGS_MASK,
+				  vfio_detach_xends);
+	if (ret)
+		return ret;
 
 	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
 	    !device->ops->pasid_detach_ioas)
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 50128da18bca..87bed550c46e 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -34,6 +34,24 @@ void vfio_df_close(struct vfio_device_file *df);
 struct vfio_device_file *
 vfio_allocate_device_file(struct vfio_device *device);
 
+int vfio_copy_from_user(void *buffer, void __user *arg,
+			unsigned long minsz, u32 flags_mask,
+			unsigned long *xend_array);
+
+#define vfio_copy_user_data(_arg, _local_buffer, _struct, _min_last,          \
+			    _flags_mask, _xend_array)                         \
+	vfio_copy_from_user(_local_buffer, _arg,                              \
+			    offsetofend(_struct, _min_last) +                \
+			    BUILD_BUG_ON_ZERO(offsetof(_struct, argsz) !=     \
+					      0) +                            \
+			    BUILD_BUG_ON_ZERO(offsetof(_struct, flags) !=     \
+					      sizeof(u32)),                   \
+			    _flags_mask, _xend_array)
+
+#define XEND_SIZE(_flag, _struct, _xlast)                                    \
+	[ilog2(_flag)] = offsetofend(_struct, _xlast) +                      \
+			 BUILD_BUG_ON_ZERO(_flag == 0)                       \
+
 extern const struct file_operations vfio_device_fops;
 
 #ifdef CONFIG_VFIO_NOIOMMU
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index a5a62d9d963f..c61336ea5123 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -1694,6 +1694,61 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
 }
 EXPORT_SYMBOL(vfio_dma_rw);
 
+/**
+ * vfio_copy_from_user - Copy the user struct that may have extended fields
+ *
+ * @buffer: The local buffer to store the data copied from user
+ * @arg: The user buffer pointer
+ * @minsz: The minimum size of the user struct
+ * @flags_mask: The combination of all the falgs defined
+ * @xend_array: The array that stores the xend size for set flags.
+ *
+ * This helper requires the user struct put the argsz and flags fields in
+ * the first 8 bytes.
+ *
+ * Return 0 for success, otherwise -errno
+ */
+int vfio_copy_from_user(void *buffer, void __user *arg,
+			unsigned long minsz, u32 flags_mask,
+			unsigned long *xend_array)
+{
+	unsigned long xend = minsz;
+	struct user_header {
+		u32 argsz;
+		u32 flags;
+	} *header;
+	unsigned long flags;
+	u32 flag;
+
+	if (copy_from_user(buffer, arg, minsz))
+		return -EFAULT;
+
+	header = (struct user_header *)buffer;
+	if (header->argsz < minsz)
+		return -EINVAL;
+
+	if (header->flags & ~flags_mask)
+		return -EINVAL;
+
+	/* Loop each set flag to decide the xend */
+	flags = header->flags;
+	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
+		if (xend_array[flag] > xend)
+			xend = xend_array[flag];
+	}
+
+	if (xend > minsz) {
+		if (header->argsz < xend)
+			return -EINVAL;
+
+		if (copy_from_user(buffer + minsz,
+				   arg + minsz, xend - minsz))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
 /*
  * Module/class support
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
  2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
                   ` (3 preceding siblings ...)
  2024-11-08 12:17 ` [PATCH v5 4/5] vfio: Add vfio_copy_user_data() Yi Liu
@ 2024-11-08 12:17 ` Yi Liu
  2024-12-11  2:43   ` Zhangfei Gao
  4 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2024-11-08 12:17 UTC (permalink / raw)
  To: alex.williamson, jgg, kevin.tian, baolu.lu
  Cc: joro, eric.auger, nicolinc, kvm, chao.p.peng, yi.l.liu, iommu,
	zhenzhong.duan, vasant.hegde, will, Zhangfei Gao

PASID usage requires PASID support in both device and IOMMU. Since the
iommu drivers always enable the PASID capability for the device if it
is supported, so it is reasonable to extend the IOMMU_GET_HW_INFO to
report the PASID capability to userspace.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> #aarch64 platform
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c | 24 +++++++++++++++++++++++-
 drivers/pci/ats.c              | 33 +++++++++++++++++++++++++++++++++
 include/linux/pci-ats.h        |  3 +++
 include/uapi/linux/iommufd.h   | 14 +++++++++++++-
 4 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index a52937fba366..3620a039d3fa 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -3,6 +3,8 @@
  */
 #include <linux/iommu.h>
 #include <linux/iommufd.h>
+#include <linux/pci.h>
+#include <linux/pci-ats.h>
 #include <linux/slab.h>
 #include <uapi/linux/iommufd.h>
 
@@ -1248,7 +1250,8 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 	void *data;
 	int rc;
 
-	if (cmd->flags || cmd->__reserved)
+	if (cmd->flags || cmd->__reserved[0] || cmd->__reserved[1] ||
+	    cmd->__reserved[2])
 		return -EOPNOTSUPP;
 
 	idev = iommufd_get_device(ucmd, cmd->dev_id);
@@ -1305,6 +1308,25 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 	if (device_iommu_capable(idev->dev, IOMMU_CAP_DIRTY_TRACKING))
 		cmd->out_capabilities |= IOMMU_HW_CAP_DIRTY_TRACKING;
 
+	cmd->out_max_pasid_log2 = 0;
+
+	if (dev_is_pci(idev->dev)) {
+		struct pci_dev *pdev = to_pci_dev(idev->dev);
+		int ctrl;
+
+		ctrl = pci_pasid_ctrl_status(pdev);
+		if (ctrl >= 0 && (ctrl & PCI_PASID_CTRL_ENABLE)) {
+			cmd->out_max_pasid_log2 =
+					ilog2(idev->dev->iommu->max_pasids);
+			if (ctrl & PCI_PASID_CTRL_EXEC)
+				cmd->out_capabilities |=
+						IOMMU_HW_CAP_PCI_PASID_EXEC;
+			if (ctrl & PCI_PASID_CTRL_PRIV)
+				cmd->out_capabilities |=
+						IOMMU_HW_CAP_PCI_PASID_PRIV;
+		}
+	}
+
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 out_free:
 	kfree(data);
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index 6afff1f1b143..c35465120329 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -538,4 +538,37 @@ int pci_max_pasids(struct pci_dev *pdev)
 	return (1 << FIELD_GET(PCI_PASID_CAP_WIDTH, supported));
 }
 EXPORT_SYMBOL_GPL(pci_max_pasids);
+
+/**
+ * pci_pasid_ctrl_status - Check the PASID status
+ * @pdev: PCI device structure
+ *
+ * Returns a negative value when no PASID capability is present.
+ * Otherwise the value of the control register is returned.
+ * Status reported are:
+ *
+ * PCI_PASID_CTRL_ENABLE - PASID enabled
+ * PCI_PASID_CTRL_EXEC - Execute permission enabled
+ * PCI_PASID_CTRL_PRIV - Privileged mode enabled
+ */
+int pci_pasid_ctrl_status(struct pci_dev *pdev)
+{
+	int pasid;
+	u16 ctrl;
+
+	if (pdev->is_virtfn)
+		pdev = pci_physfn(pdev);
+
+	pasid = pdev->pasid_cap;
+	if (!pasid)
+		return -EINVAL;
+
+	pci_read_config_word(pdev, pasid + PCI_PASID_CTRL, &ctrl);
+
+	ctrl &= PCI_PASID_CTRL_ENABLE | PCI_PASID_CTRL_EXEC |
+		PCI_PASID_CTRL_PRIV;
+
+	return ctrl;
+}
+EXPORT_SYMBOL_GPL(pci_pasid_ctrl_status);
 #endif /* CONFIG_PCI_PASID */
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index 0e8b74e63767..f4e1d2010287 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -42,6 +42,7 @@ int pci_enable_pasid(struct pci_dev *pdev, int features);
 void pci_disable_pasid(struct pci_dev *pdev);
 int pci_pasid_features(struct pci_dev *pdev);
 int pci_max_pasids(struct pci_dev *pdev);
+int pci_pasid_ctrl_status(struct pci_dev *pdev);
 #else /* CONFIG_PCI_PASID */
 static inline int pci_enable_pasid(struct pci_dev *pdev, int features)
 { return -EINVAL; }
@@ -50,6 +51,8 @@ static inline int pci_pasid_features(struct pci_dev *pdev)
 { return -EINVAL; }
 static inline int pci_max_pasids(struct pci_dev *pdev)
 { return -EINVAL; }
+static inline int pci_pasid_ctrl_status(struct pci_dev *pdev)
+{ return -EINVAL; }
 #endif /* CONFIG_PCI_PASID */
 
 #endif /* LINUX_PCI_ATS_H */
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index cc1b94f43538..e5709502d8e4 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -547,9 +547,17 @@ enum iommu_hw_info_type {
  *                                   IOMMU_HWPT_GET_DIRTY_BITMAP
  *                                   IOMMU_HWPT_SET_DIRTY_TRACKING
  *
+ * @IOMMU_HW_CAP_PASID_EXEC: Execute Permission Supported, user ignores it
+ *                           when the struct iommu_hw_info::out_max_pasid_log2
+ *                           is zero.
+ * @IOMMU_HW_CAP_PASID_PRIV: Privileged Mode Supported, user ignores it
+ *                           when the struct iommu_hw_info::out_max_pasid_log2
+ *                           is zero.
  */
 enum iommufd_hw_capabilities {
 	IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0,
+	IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1,
+	IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2,
 };
 
 /**
@@ -565,6 +573,9 @@ enum iommufd_hw_capabilities {
  *                 iommu_hw_info_type.
  * @out_capabilities: Output the generic iommu capability info type as defined
  *                    in the enum iommu_hw_capabilities.
+ * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support.
+ *                      PCI devices turn to out_capabilities to check if the
+ *                      specific capabilities is supported or not.
  * @__reserved: Must be 0
  *
  * Query an iommu type specific hardware information data from an iommu behind
@@ -588,7 +599,8 @@ struct iommu_hw_info {
 	__u32 data_len;
 	__aligned_u64 data_uptr;
 	__u32 out_data_type;
-	__u32 __reserved;
+	__u8 out_max_pasid_log2;
+	__u8 __reserved[3];
 	__aligned_u64 out_capabilities;
 };
 #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
  2024-11-08 12:17 ` [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid Yi Liu
@ 2024-11-12  0:02   ` Alex Williamson
  2024-11-12  1:53     ` Yi Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2024-11-12  0:02 UTC (permalink / raw)
  To: Yi Liu
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On Fri,  8 Nov 2024 04:17:40 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> This extends the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT ioctls to attach/detach
> a given pasid of a vfio device to/from an IOAS/HWPT.
> 
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/vfio/device_cdev.c | 69 +++++++++++++++++++++++++++++++++-----
>  include/uapi/linux/vfio.h  | 29 ++++++++++------
>  2 files changed, 80 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> index bb1817bd4ff3..4519f482e212 100644
> --- a/drivers/vfio/device_cdev.c
> +++ b/drivers/vfio/device_cdev.c
> @@ -162,9 +162,9 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
>  int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>  			    struct vfio_device_attach_iommufd_pt __user *arg)
>  {
> -	struct vfio_device *device = df->device;
>  	struct vfio_device_attach_iommufd_pt attach;
> -	unsigned long minsz;
> +	struct vfio_device *device = df->device;
> +	unsigned long minsz, xend = 0;
>  	int ret;
>  
>  	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
> @@ -172,11 +172,38 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>  	if (copy_from_user(&attach, arg, minsz))
>  		return -EFAULT;
>  
> -	if (attach.argsz < minsz || attach.flags)
> +	if (attach.argsz < minsz)
>  		return -EINVAL;
>  
> +	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
> +		return -EINVAL;
> +
> +	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
> +		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
> +
> +	/*
> +	 * xend may be equal to minsz if a flag is defined for reusing a
> +	 * reserved field or a special usage of an existing field.
> +	 */
> +	if (xend > minsz) {
> +		if (attach.argsz < xend)
> +			return -EINVAL;
> +
> +		if (copy_from_user((void *)&attach + minsz,
> +				   (void __user *)arg + minsz, xend - minsz))
> +			return -EFAULT;
> +	}
> +
> +	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
> +	    !device->ops->pasid_attach_ioas)
> +		return -EOPNOTSUPP;
> +
>  	mutex_lock(&device->dev_set->lock);
> -	ret = device->ops->attach_ioas(device, &attach.pt_id);
> +	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)

I'd just do the ops test here:
							{
		if (!device->ops->pasid_attach_ios)
			ret = -EOPNOTSUPP;
		else...

> +		ret = device->ops->pasid_attach_ioas(device, attach.pasid,
> +						     &attach.pt_id);

	} else {

(Obviously if we weren't about to generalize the prior chunk of code,
we'd test ops before the 2nd copy_from_user)  Thanks,

Alex

> +	else
> +		ret = device->ops->attach_ioas(device, &attach.pt_id);
>  	if (ret)
>  		goto out_unlock;
>  
> @@ -198,20 +225,46 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>  int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
>  			    struct vfio_device_detach_iommufd_pt __user *arg)
>  {
> -	struct vfio_device *device = df->device;
>  	struct vfio_device_detach_iommufd_pt detach;
> -	unsigned long minsz;
> +	struct vfio_device *device = df->device;
> +	unsigned long minsz, xend = 0;
>  
>  	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
>  
>  	if (copy_from_user(&detach, arg, minsz))
>  		return -EFAULT;
>  
> -	if (detach.argsz < minsz || detach.flags)
> +	if (detach.argsz < minsz)
> +		return -EINVAL;
> +
> +	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
>  		return -EINVAL;
>  
> +	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
> +		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
> +
> +	/*
> +	 * xend may be equal to minsz if a flag is defined for reusing a
> +	 * reserved field or a special usage of an existing field.
> +	 */
> +	if (xend > minsz) {
> +		if (detach.argsz < xend)
> +			return -EINVAL;
> +
> +		if (copy_from_user((void *)&detach + minsz,
> +				   (void __user *)arg + minsz, xend - minsz))
> +			return -EFAULT;
> +	}
> +
> +	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
> +	    !device->ops->pasid_detach_ioas)
> +		return -EOPNOTSUPP;
> +
>  	mutex_lock(&device->dev_set->lock);
> -	device->ops->detach_ioas(device);
> +	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
> +		device->ops->pasid_detach_ioas(device, detach.pasid);
> +	else
> +		device->ops->detach_ioas(device);
>  	mutex_unlock(&device->dev_set->lock);
>  
>  	return 0;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index c8dbf8219c4f..6899da70b929 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -931,29 +931,34 @@ struct vfio_device_bind_iommufd {
>   * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19,
>   *					struct vfio_device_attach_iommufd_pt)
>   * @argsz:	User filled size of this data.
> - * @flags:	Must be 0.
> + * @flags:	Flags for attach.
>   * @pt_id:	Input the target id which can represent an ioas or a hwpt
>   *		allocated via iommufd subsystem.
>   *		Output the input ioas id or the attached hwpt id which could
>   *		be the specified hwpt itself or a hwpt automatically created
>   *		for the specified ioas by kernel during the attachment.
> + * @pasid:	The pasid to be attached, only meaningful when
> + *		VFIO_DEVICE_ATTACH_PASID is set in @flags
>   *
>   * Associate the device with an address space within the bound iommufd.
>   * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close.  This is only
>   * allowed on cdev fds.
>   *
> - * If a vfio device is currently attached to a valid hw_pagetable, without doing
> - * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl
> - * passing in another hw_pagetable (hwpt) id is allowed. This action, also known
> - * as a hw_pagetable replacement, will replace the device's currently attached
> - * hw_pagetable with a new hw_pagetable corresponding to the given pt_id.
> + * If a vfio device or a pasid of this device is currently attached to a valid
> + * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second
> + * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed.
> + * This action, also known as a hw_pagetable replacement, will replace the
> + * currently attached hwpt of the device or the pasid of this device with a new
> + * hwpt corresponding to the given pt_id.
>   *
>   * Return: 0 on success, -errno on failure.
>   */
>  struct vfio_device_attach_iommufd_pt {
>  	__u32	argsz;
>  	__u32	flags;
> +#define VFIO_DEVICE_ATTACH_PASID	(1 << 0)
>  	__u32	pt_id;
> +	__u32	pasid;
>  };
>  
>  #define VFIO_DEVICE_ATTACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 19)
> @@ -962,17 +967,21 @@ struct vfio_device_attach_iommufd_pt {
>   * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
>   *					struct vfio_device_detach_iommufd_pt)
>   * @argsz:	User filled size of this data.
> - * @flags:	Must be 0.
> + * @flags:	Flags for detach.
> + * @pasid:	The pasid to be detached, only meaningful when
> + *		VFIO_DEVICE_DETACH_PASID is set in @flags
>   *
> - * Remove the association of the device and its current associated address
> - * space.  After it, the device should be in a blocking DMA state.  This is only
> - * allowed on cdev fds.
> + * Remove the association of the device or a pasid of the device and its current
> + * associated address space.  After it, the device or the pasid should be in a
> + * blocking DMA state.  This is only allowed on cdev fds.
>   *
>   * Return: 0 on success, -errno on failure.
>   */
>  struct vfio_device_detach_iommufd_pt {
>  	__u32	argsz;
>  	__u32	flags;
> +#define VFIO_DEVICE_DETACH_PASID	(1 << 0)
> +	__u32	pasid;
>  };
>  
>  #define VFIO_DEVICE_DETACH_IOMMUFD_PT		_IO(VFIO_TYPE, VFIO_BASE + 20)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-08 12:17 ` [PATCH v5 4/5] vfio: Add vfio_copy_user_data() Yi Liu
@ 2024-11-12  0:03   ` Alex Williamson
  2024-11-12  9:18     ` Yi Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2024-11-12  0:03 UTC (permalink / raw)
  To: Yi Liu
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On Fri,  8 Nov 2024 04:17:41 -0800
Yi Liu <yi.l.liu@intel.com> wrote:

> This generalizes the logic of copying user data when the user struct
> Have new fields introduced. The helpers can be used by the vfio uapis
> that have the argsz and flags fields in the beginning 8 bytes.
> 
> As an example, the vfio_device_{at|de}tach_iommufd_pt paths are updated
> to use the helpers.
> 
> The flags may be defined to mark a new field in the structure, reuse
> reserved fields, or special handling of an existing field. The extended
> size would differ for different flags. Each user API that wants to use
> the generalized helpers should define an array to store the corresponding
> extended sizes for each defined flag.
> 
> For example, we start out with the below, minsz is 12.
> 
>   struct vfio_foo_struct {
>   	__u32   argsz;
>   	__u32   flags;
>   	__u32   pt_id;
>   };
> 
> And then here it becomes:
> 
>   struct vfio_foo_struct {
>   	__u32   argsz;
>   	__u32   flags;
>   #define VFIO_FOO_STRUCT_PASID   (1 << 0)
>   	__u32   pt_id;
>   	__u32   pasid;
>   };
> 
> The array is { 16 }.
> 
> If the next flag is simply related to the processing of @pt_id and
> doesn't require @pasid, then the extended size of the new flag is
> 12. The array become { 16, 12 }
> 
>   struct vfio_foo_struct {
>   	__u32   argsz;
>   	__u32   flags;
>   #define VFIO_FOO_STRUCT_PASID   (1 << 0)
>   #define VFIO_FOO_STRUCT_SPECICAL_PTID   (1 << 1)
>   	__u32   pt_id;
>   	__u32   pasid;
>   };
> 
> Similarly, rather than adding new field, we might have reused a previously
> reserved field, for instance what if we already expanded the structure
> as the below, array is already { 24 }.
> 
>   struct vfio_foo_struct {
>   	__u32   argsz;
>   	__u32   flags;
>   #define VFIO_FOO_STRUCT_XXX     (1 << 0)
>   	__u32   pt_id;
>   	__u32   reserved;
>   	__u64   xxx;
>   };
> 
> If we then want to add @pasid, we might really prefer to take advantage
> of that reserved field and the array becomes { 24, 16 }.
> 
>   struct vfio_foo_struct {
>   	__u32   argsz;
>   	__u32   flags;
>   #define VFIO_FOO_STRUCT_XXX     (1 << 0)
>   #define VFIO_FOO_STRUCT_PASID   (1 << 1)
>   	__u32   pt_id;
>   	__u32   reserved;

I think this was supposed to be s/reserved/pasid/

>   	__u64   xxx;
>   };
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/vfio/device_cdev.c | 81 +++++++++++++-------------------------
>  drivers/vfio/vfio.h        | 18 +++++++++
>  drivers/vfio/vfio_main.c   | 55 ++++++++++++++++++++++++++
>  3 files changed, 100 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> index 4519f482e212..35c7664b9a97 100644
> --- a/drivers/vfio/device_cdev.c
> +++ b/drivers/vfio/device_cdev.c
> @@ -159,40 +159,33 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
>  	vfio_device_unblock_group(device);
>  }
>  
> +#define VFIO_ATTACH_FLAGS_MASK VFIO_DEVICE_ATTACH_PASID
> +static unsigned long
> +vfio_attach_xends[ilog2(VFIO_ATTACH_FLAGS_MASK) + 1] = {
> +	XEND_SIZE(VFIO_DEVICE_ATTACH_PASID,
> +		  struct vfio_device_attach_iommufd_pt, pasid),
> +};
> +
> +#define VFIO_DETACH_FLAGS_MASK VFIO_DEVICE_DETACH_PASID
> +static unsigned long
> +vfio_detach_xends[ilog2(VFIO_DETACH_FLAGS_MASK) + 1] = {
> +	XEND_SIZE(VFIO_DEVICE_DETACH_PASID,
> +		  struct vfio_device_detach_iommufd_pt, pasid),
> +};
> +
>  int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>  			    struct vfio_device_attach_iommufd_pt __user *arg)
>  {
>  	struct vfio_device_attach_iommufd_pt attach;
>  	struct vfio_device *device = df->device;
> -	unsigned long minsz, xend = 0;
>  	int ret;
>  
> -	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
> -
> -	if (copy_from_user(&attach, arg, minsz))
> -		return -EFAULT;
> -
> -	if (attach.argsz < minsz)
> -		return -EINVAL;
> -
> -	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
> -		return -EINVAL;
> -
> -	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
> -		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
> -
> -	/*
> -	 * xend may be equal to minsz if a flag is defined for reusing a
> -	 * reserved field or a special usage of an existing field.
> -	 */
> -	if (xend > minsz) {
> -		if (attach.argsz < xend)
> -			return -EINVAL;
> -
> -		if (copy_from_user((void *)&attach + minsz,
> -				   (void __user *)arg + minsz, xend - minsz))
> -			return -EFAULT;
> -	}
> +	ret = vfio_copy_user_data((void __user *)arg, &attach,
> +				  struct vfio_device_attach_iommufd_pt,
> +				  pt_id, VFIO_ATTACH_FLAGS_MASK,
> +				  vfio_attach_xends);
> +	if (ret)
> +		return ret;
>  
>  	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
>  	    !device->ops->pasid_attach_ioas)
> @@ -227,34 +220,14 @@ int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
>  {
>  	struct vfio_device_detach_iommufd_pt detach;
>  	struct vfio_device *device = df->device;
> -	unsigned long minsz, xend = 0;
> -
> -	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
> -
> -	if (copy_from_user(&detach, arg, minsz))
> -		return -EFAULT;
> -
> -	if (detach.argsz < minsz)
> -		return -EINVAL;
> -
> -	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
> -		return -EINVAL;
> -
> -	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
> -		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
> -
> -	/*
> -	 * xend may be equal to minsz if a flag is defined for reusing a
> -	 * reserved field or a special usage of an existing field.
> -	 */
> -	if (xend > minsz) {
> -		if (detach.argsz < xend)
> -			return -EINVAL;
> +	int ret;
>  
> -		if (copy_from_user((void *)&detach + minsz,
> -				   (void __user *)arg + minsz, xend - minsz))
> -			return -EFAULT;
> -	}
> +	ret = vfio_copy_user_data((void __user *)arg, &detach,
> +				  struct vfio_device_detach_iommufd_pt,
> +				  flags, VFIO_DETACH_FLAGS_MASK,
> +				  vfio_detach_xends);
> +	if (ret)
> +		return ret;
>  
>  	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
>  	    !device->ops->pasid_detach_ioas)
> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> index 50128da18bca..87bed550c46e 100644
> --- a/drivers/vfio/vfio.h
> +++ b/drivers/vfio/vfio.h
> @@ -34,6 +34,24 @@ void vfio_df_close(struct vfio_device_file *df);
>  struct vfio_device_file *
>  vfio_allocate_device_file(struct vfio_device *device);
>  
> +int vfio_copy_from_user(void *buffer, void __user *arg,
> +			unsigned long minsz, u32 flags_mask,
> +			unsigned long *xend_array);
> +
> +#define vfio_copy_user_data(_arg, _local_buffer, _struct, _min_last,          \
> +			    _flags_mask, _xend_array)                         \
> +	vfio_copy_from_user(_local_buffer, _arg,                              \
> +			    offsetofend(_struct, _min_last) +                \
> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, argsz) !=     \
> +					      0) +                            \
> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, flags) !=     \
> +					      sizeof(u32)),                   \
> +			    _flags_mask, _xend_array)
> +
> +#define XEND_SIZE(_flag, _struct, _xlast)                                    \
> +	[ilog2(_flag)] = offsetofend(_struct, _xlast) +                      \
> +			 BUILD_BUG_ON_ZERO(_flag == 0)                       \
> +
>  extern const struct file_operations vfio_device_fops;
>  
>  #ifdef CONFIG_VFIO_NOIOMMU
> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> index a5a62d9d963f..c61336ea5123 100644
> --- a/drivers/vfio/vfio_main.c
> +++ b/drivers/vfio/vfio_main.c
> @@ -1694,6 +1694,61 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
>  }
>  EXPORT_SYMBOL(vfio_dma_rw);
>  
> +/**
> + * vfio_copy_from_user - Copy the user struct that may have extended fields
> + *
> + * @buffer: The local buffer to store the data copied from user
> + * @arg: The user buffer pointer
> + * @minsz: The minimum size of the user struct
> + * @flags_mask: The combination of all the falgs defined
> + * @xend_array: The array that stores the xend size for set flags.
> + *
> + * This helper requires the user struct put the argsz and flags fields in
> + * the first 8 bytes.
> + *
> + * Return 0 for success, otherwise -errno
> + */
> +int vfio_copy_from_user(void *buffer, void __user *arg,

This should probably be prefixed with an underscore and note that
callers should use the wrapper function to impose the parameter
checking.

> +			unsigned long minsz, u32 flags_mask,
> +			unsigned long *xend_array)
> +{
> +	unsigned long xend = minsz;
> +	struct user_header {
> +		u32 argsz;
> +		u32 flags;
> +	} *header;
> +	unsigned long flags;
> +	u32 flag;
> +
> +	if (copy_from_user(buffer, arg, minsz))
> +		return -EFAULT;
> +
> +	header = (struct user_header *)buffer;
> +	if (header->argsz < minsz)
> +		return -EINVAL;
> +
> +	if (header->flags & ~flags_mask)
> +		return -EINVAL;

I'm already wrestling with whether this is an over engineered solution
to remove a couple dozen lines of mostly duplicate logic between attach
and detach, but a couple points that could make it more versatile:

(1) Test xend_array here:

	if (!xend_array)
		return 0;

(2) Return ssize_t/-errno for the caller to know the resulting copy
size.

> +
> +	/* Loop each set flag to decide the xend */
> +	flags = header->flags;
> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
> +		if (xend_array[flag] > xend)
> +			xend = xend_array[flag];

Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
least long enough to match the highest bit in flags?  Thanks,

Alex

> +	}
> +
> +	if (xend > minsz) {
> +		if (header->argsz < xend)
> +			return -EINVAL;
> +
> +		if (copy_from_user(buffer + minsz,
> +				   arg + minsz, xend - minsz))
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * Module/class support
>   */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid
  2024-11-12  0:02   ` Alex Williamson
@ 2024-11-12  1:53     ` Yi Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Yi Liu @ 2024-11-12  1:53 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On 2024/11/12 08:02, Alex Williamson wrote:
> On Fri,  8 Nov 2024 04:17:40 -0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
>> This extends the VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT ioctls to attach/detach
>> a given pasid of a vfio device to/from an IOAS/HWPT.
>>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>>   drivers/vfio/device_cdev.c | 69 +++++++++++++++++++++++++++++++++-----
>>   include/uapi/linux/vfio.h  | 29 ++++++++++------
>>   2 files changed, 80 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
>> index bb1817bd4ff3..4519f482e212 100644
>> --- a/drivers/vfio/device_cdev.c
>> +++ b/drivers/vfio/device_cdev.c
>> @@ -162,9 +162,9 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
>>   int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>>   			    struct vfio_device_attach_iommufd_pt __user *arg)
>>   {
>> -	struct vfio_device *device = df->device;
>>   	struct vfio_device_attach_iommufd_pt attach;
>> -	unsigned long minsz;
>> +	struct vfio_device *device = df->device;
>> +	unsigned long minsz, xend = 0;
>>   	int ret;
>>   
>>   	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
>> @@ -172,11 +172,38 @@ int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>>   	if (copy_from_user(&attach, arg, minsz))
>>   		return -EFAULT;
>>   
>> -	if (attach.argsz < minsz || attach.flags)
>> +	if (attach.argsz < minsz)
>>   		return -EINVAL;
>>   
>> +	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
>> +		return -EINVAL;
>> +
>> +	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
>> +		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
>> +
>> +	/*
>> +	 * xend may be equal to minsz if a flag is defined for reusing a
>> +	 * reserved field or a special usage of an existing field.
>> +	 */
>> +	if (xend > minsz) {
>> +		if (attach.argsz < xend)
>> +			return -EINVAL;
>> +
>> +		if (copy_from_user((void *)&attach + minsz,
>> +				   (void __user *)arg + minsz, xend - minsz))
>> +			return -EFAULT;
>> +	}
>> +
>> +	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
>> +	    !device->ops->pasid_attach_ioas)
>> +		return -EOPNOTSUPP;
>> +
>>   	mutex_lock(&device->dev_set->lock);
>> -	ret = device->ops->attach_ioas(device, &attach.pt_id);
>> +	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
> 
> I'd just do the ops test here:
> 							{
> 		if (!device->ops->pasid_attach_ios)
> 			ret = -EOPNOTSUPP;
> 		else...
> 
>> +		ret = device->ops->pasid_attach_ioas(device, attach.pasid,
>> +						     &attach.pt_id);

got it.

> 	} else {
> 
> (Obviously if we weren't about to generalize the prior chunk of code,
> we'd test ops before the 2nd copy_from_user)  Thanks,

yes. that's the trade-off for the generalization. :)

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-12  0:03   ` Alex Williamson
@ 2024-11-12  9:18     ` Yi Liu
  2024-11-12 13:52       ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2024-11-12  9:18 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On 2024/11/12 08:03, Alex Williamson wrote:
> On Fri,  8 Nov 2024 04:17:41 -0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
>> This generalizes the logic of copying user data when the user struct
>> Have new fields introduced. The helpers can be used by the vfio uapis
>> that have the argsz and flags fields in the beginning 8 bytes.
>>
>> As an example, the vfio_device_{at|de}tach_iommufd_pt paths are updated
>> to use the helpers.
>>
>> The flags may be defined to mark a new field in the structure, reuse
>> reserved fields, or special handling of an existing field. The extended
>> size would differ for different flags. Each user API that wants to use
>> the generalized helpers should define an array to store the corresponding
>> extended sizes for each defined flag.
>>
>> For example, we start out with the below, minsz is 12.
>>
>>    struct vfio_foo_struct {
>>    	__u32   argsz;
>>    	__u32   flags;
>>    	__u32   pt_id;
>>    };
>>
>> And then here it becomes:
>>
>>    struct vfio_foo_struct {
>>    	__u32   argsz;
>>    	__u32   flags;
>>    #define VFIO_FOO_STRUCT_PASID   (1 << 0)
>>    	__u32   pt_id;
>>    	__u32   pasid;
>>    };
>>
>> The array is { 16 }.
>>
>> If the next flag is simply related to the processing of @pt_id and
>> doesn't require @pasid, then the extended size of the new flag is
>> 12. The array become { 16, 12 }
>>
>>    struct vfio_foo_struct {
>>    	__u32   argsz;
>>    	__u32   flags;
>>    #define VFIO_FOO_STRUCT_PASID   (1 << 0)
>>    #define VFIO_FOO_STRUCT_SPECICAL_PTID   (1 << 1)
>>    	__u32   pt_id;
>>    	__u32   pasid;
>>    };
>>
>> Similarly, rather than adding new field, we might have reused a previously
>> reserved field, for instance what if we already expanded the structure
>> as the below, array is already { 24 }.
>>
>>    struct vfio_foo_struct {
>>    	__u32   argsz;
>>    	__u32   flags;
>>    #define VFIO_FOO_STRUCT_XXX     (1 << 0)
>>    	__u32   pt_id;
>>    	__u32   reserved;
>>    	__u64   xxx;
>>    };
>>
>> If we then want to add @pasid, we might really prefer to take advantage
>> of that reserved field and the array becomes { 24, 16 }.
>>
>>    struct vfio_foo_struct {
>>    	__u32   argsz;
>>    	__u32   flags;
>>    #define VFIO_FOO_STRUCT_XXX     (1 << 0)
>>    #define VFIO_FOO_STRUCT_PASID   (1 << 1)
>>    	__u32   pt_id;
>>    	__u32   reserved;
> 
> I think this was supposed to be s/reserved/pasid/

you are right.

>>    	__u64   xxx;
>>    };
>>
>> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>>   drivers/vfio/device_cdev.c | 81 +++++++++++++-------------------------
>>   drivers/vfio/vfio.h        | 18 +++++++++
>>   drivers/vfio/vfio_main.c   | 55 ++++++++++++++++++++++++++
>>   3 files changed, 100 insertions(+), 54 deletions(-)
>>
>> diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
>> index 4519f482e212..35c7664b9a97 100644
>> --- a/drivers/vfio/device_cdev.c
>> +++ b/drivers/vfio/device_cdev.c
>> @@ -159,40 +159,33 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
>>   	vfio_device_unblock_group(device);
>>   }
>>   
>> +#define VFIO_ATTACH_FLAGS_MASK VFIO_DEVICE_ATTACH_PASID
>> +static unsigned long
>> +vfio_attach_xends[ilog2(VFIO_ATTACH_FLAGS_MASK) + 1] = {
>> +	XEND_SIZE(VFIO_DEVICE_ATTACH_PASID,
>> +		  struct vfio_device_attach_iommufd_pt, pasid),
>> +};
>> +
>> +#define VFIO_DETACH_FLAGS_MASK VFIO_DEVICE_DETACH_PASID
>> +static unsigned long
>> +vfio_detach_xends[ilog2(VFIO_DETACH_FLAGS_MASK) + 1] = {
>> +	XEND_SIZE(VFIO_DEVICE_DETACH_PASID,
>> +		  struct vfio_device_detach_iommufd_pt, pasid),
>> +};
>> +
>>   int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
>>   			    struct vfio_device_attach_iommufd_pt __user *arg)
>>   {
>>   	struct vfio_device_attach_iommufd_pt attach;
>>   	struct vfio_device *device = df->device;
>> -	unsigned long minsz, xend = 0;
>>   	int ret;
>>   
>> -	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
>> -
>> -	if (copy_from_user(&attach, arg, minsz))
>> -		return -EFAULT;
>> -
>> -	if (attach.argsz < minsz)
>> -		return -EINVAL;
>> -
>> -	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
>> -		return -EINVAL;
>> -
>> -	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
>> -		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
>> -
>> -	/*
>> -	 * xend may be equal to minsz if a flag is defined for reusing a
>> -	 * reserved field or a special usage of an existing field.
>> -	 */
>> -	if (xend > minsz) {
>> -		if (attach.argsz < xend)
>> -			return -EINVAL;
>> -
>> -		if (copy_from_user((void *)&attach + minsz,
>> -				   (void __user *)arg + minsz, xend - minsz))
>> -			return -EFAULT;
>> -	}
>> +	ret = vfio_copy_user_data((void __user *)arg, &attach,
>> +				  struct vfio_device_attach_iommufd_pt,
>> +				  pt_id, VFIO_ATTACH_FLAGS_MASK,
>> +				  vfio_attach_xends);
>> +	if (ret)
>> +		return ret;
>>   
>>   	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
>>   	    !device->ops->pasid_attach_ioas)
>> @@ -227,34 +220,14 @@ int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
>>   {
>>   	struct vfio_device_detach_iommufd_pt detach;
>>   	struct vfio_device *device = df->device;
>> -	unsigned long minsz, xend = 0;
>> -
>> -	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
>> -
>> -	if (copy_from_user(&detach, arg, minsz))
>> -		return -EFAULT;
>> -
>> -	if (detach.argsz < minsz)
>> -		return -EINVAL;
>> -
>> -	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
>> -		return -EINVAL;
>> -
>> -	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
>> -		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
>> -
>> -	/*
>> -	 * xend may be equal to minsz if a flag is defined for reusing a
>> -	 * reserved field or a special usage of an existing field.
>> -	 */
>> -	if (xend > minsz) {
>> -		if (detach.argsz < xend)
>> -			return -EINVAL;
>> +	int ret;
>>   
>> -		if (copy_from_user((void *)&detach + minsz,
>> -				   (void __user *)arg + minsz, xend - minsz))
>> -			return -EFAULT;
>> -	}
>> +	ret = vfio_copy_user_data((void __user *)arg, &detach,
>> +				  struct vfio_device_detach_iommufd_pt,
>> +				  flags, VFIO_DETACH_FLAGS_MASK,
>> +				  vfio_detach_xends);
>> +	if (ret)
>> +		return ret;
>>   
>>   	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
>>   	    !device->ops->pasid_detach_ioas)
>> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
>> index 50128da18bca..87bed550c46e 100644
>> --- a/drivers/vfio/vfio.h
>> +++ b/drivers/vfio/vfio.h
>> @@ -34,6 +34,24 @@ void vfio_df_close(struct vfio_device_file *df);
>>   struct vfio_device_file *
>>   vfio_allocate_device_file(struct vfio_device *device);
>>   
>> +int vfio_copy_from_user(void *buffer, void __user *arg,
>> +			unsigned long minsz, u32 flags_mask,
>> +			unsigned long *xend_array);
>> +
>> +#define vfio_copy_user_data(_arg, _local_buffer, _struct, _min_last,          \
>> +			    _flags_mask, _xend_array)                         \
>> +	vfio_copy_from_user(_local_buffer, _arg,                              \
>> +			    offsetofend(_struct, _min_last) +                \
>> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, argsz) !=     \
>> +					      0) +                            \
>> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, flags) !=     \
>> +					      sizeof(u32)),                   \
>> +			    _flags_mask, _xend_array)
>> +
>> +#define XEND_SIZE(_flag, _struct, _xlast)                                    \
>> +	[ilog2(_flag)] = offsetofend(_struct, _xlast) +                      \
>> +			 BUILD_BUG_ON_ZERO(_flag == 0)                       \
>> +
>>   extern const struct file_operations vfio_device_fops;
>>   
>>   #ifdef CONFIG_VFIO_NOIOMMU
>> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
>> index a5a62d9d963f..c61336ea5123 100644
>> --- a/drivers/vfio/vfio_main.c
>> +++ b/drivers/vfio/vfio_main.c
>> @@ -1694,6 +1694,61 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
>>   }
>>   EXPORT_SYMBOL(vfio_dma_rw);
>>   
>> +/**
>> + * vfio_copy_from_user - Copy the user struct that may have extended fields
>> + *
>> + * @buffer: The local buffer to store the data copied from user
>> + * @arg: The user buffer pointer
>> + * @minsz: The minimum size of the user struct
>> + * @flags_mask: The combination of all the falgs defined
>> + * @xend_array: The array that stores the xend size for set flags.
>> + *
>> + * This helper requires the user struct put the argsz and flags fields in
>> + * the first 8 bytes.
>> + *
>> + * Return 0 for success, otherwise -errno
>> + */
>> +int vfio_copy_from_user(void *buffer, void __user *arg,
> 
> This should probably be prefixed with an underscore and note that
> callers should use the wrapper function to impose the parameter
> checking.

got it.

> 
>> +			unsigned long minsz, u32 flags_mask,
>> +			unsigned long *xend_array)
>> +{
>> +	unsigned long xend = minsz;
>> +	struct user_header {
>> +		u32 argsz;
>> +		u32 flags;
>> +	} *header;
>> +	unsigned long flags;
>> +	u32 flag;
>> +
>> +	if (copy_from_user(buffer, arg, minsz))
>> +		return -EFAULT;
>> +
>> +	header = (struct user_header *)buffer;
>> +	if (header->argsz < minsz)
>> +		return -EINVAL;
>> +
>> +	if (header->flags & ~flags_mask)
>> +		return -EINVAL;
> 
> I'm already wrestling with whether this is an over engineered solution
> to remove a couple dozen lines of mostly duplicate logic between attach
> and detach, but a couple points that could make it more versatile:
> 
> (1) Test xend_array here:
> 
> 	if (!xend_array)
> 		return 0;

Perhaps we should return error if the header->flags has any bit set. Such
cases require a valid xend_array.

> 
> (2) Return ssize_t/-errno for the caller to know the resulting copy
> size.
> 
>> +
>> +	/* Loop each set flag to decide the xend */
>> +	flags = header->flags;
>> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
>> +		if (xend_array[flag] > xend)
>> +			xend = xend_array[flag];
> 
> Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
> least long enough to match the highest bit in flags?  Thanks,

yes. I would add a BUILD_BUG like the below.

BUILD_BUG_ON(ARRAY_SIZE(_xend_array) < ilog2(_flags_mask));

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-12  9:18     ` Yi Liu
@ 2024-11-12 13:52       ` Alex Williamson
  2024-11-13  7:22         ` Yi Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2024-11-12 13:52 UTC (permalink / raw)
  To: Yi Liu
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On Tue, 12 Nov 2024 17:18:02 +0800
Yi Liu <yi.l.liu@intel.com> wrote:

> On 2024/11/12 08:03, Alex Williamson wrote:
> > On Fri,  8 Nov 2024 04:17:41 -0800
> > Yi Liu <yi.l.liu@intel.com> wrote:
> >   
> >> This generalizes the logic of copying user data when the user struct
> >> Have new fields introduced. The helpers can be used by the vfio uapis
> >> that have the argsz and flags fields in the beginning 8 bytes.
> >>
> >> As an example, the vfio_device_{at|de}tach_iommufd_pt paths are updated
> >> to use the helpers.
> >>
> >> The flags may be defined to mark a new field in the structure, reuse
> >> reserved fields, or special handling of an existing field. The extended
> >> size would differ for different flags. Each user API that wants to use
> >> the generalized helpers should define an array to store the corresponding
> >> extended sizes for each defined flag.
> >>
> >> For example, we start out with the below, minsz is 12.
> >>
> >>    struct vfio_foo_struct {
> >>    	__u32   argsz;
> >>    	__u32   flags;
> >>    	__u32   pt_id;
> >>    };
> >>
> >> And then here it becomes:
> >>
> >>    struct vfio_foo_struct {
> >>    	__u32   argsz;
> >>    	__u32   flags;
> >>    #define VFIO_FOO_STRUCT_PASID   (1 << 0)
> >>    	__u32   pt_id;
> >>    	__u32   pasid;
> >>    };
> >>
> >> The array is { 16 }.
> >>
> >> If the next flag is simply related to the processing of @pt_id and
> >> doesn't require @pasid, then the extended size of the new flag is
> >> 12. The array become { 16, 12 }
> >>
> >>    struct vfio_foo_struct {
> >>    	__u32   argsz;
> >>    	__u32   flags;
> >>    #define VFIO_FOO_STRUCT_PASID   (1 << 0)
> >>    #define VFIO_FOO_STRUCT_SPECICAL_PTID   (1 << 1)
> >>    	__u32   pt_id;
> >>    	__u32   pasid;
> >>    };
> >>
> >> Similarly, rather than adding new field, we might have reused a previously
> >> reserved field, for instance what if we already expanded the structure
> >> as the below, array is already { 24 }.
> >>
> >>    struct vfio_foo_struct {
> >>    	__u32   argsz;
> >>    	__u32   flags;
> >>    #define VFIO_FOO_STRUCT_XXX     (1 << 0)
> >>    	__u32   pt_id;
> >>    	__u32   reserved;
> >>    	__u64   xxx;
> >>    };
> >>
> >> If we then want to add @pasid, we might really prefer to take advantage
> >> of that reserved field and the array becomes { 24, 16 }.
> >>
> >>    struct vfio_foo_struct {
> >>    	__u32   argsz;
> >>    	__u32   flags;
> >>    #define VFIO_FOO_STRUCT_XXX     (1 << 0)
> >>    #define VFIO_FOO_STRUCT_PASID   (1 << 1)
> >>    	__u32   pt_id;
> >>    	__u32   reserved;  
> > 
> > I think this was supposed to be s/reserved/pasid/  
> 
> you are right.
> 
> >>    	__u64   xxx;
> >>    };
> >>
> >> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> >> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> >> ---
> >>   drivers/vfio/device_cdev.c | 81 +++++++++++++-------------------------
> >>   drivers/vfio/vfio.h        | 18 +++++++++
> >>   drivers/vfio/vfio_main.c   | 55 ++++++++++++++++++++++++++
> >>   3 files changed, 100 insertions(+), 54 deletions(-)
> >>
> >> diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
> >> index 4519f482e212..35c7664b9a97 100644
> >> --- a/drivers/vfio/device_cdev.c
> >> +++ b/drivers/vfio/device_cdev.c
> >> @@ -159,40 +159,33 @@ void vfio_df_unbind_iommufd(struct vfio_device_file *df)
> >>   	vfio_device_unblock_group(device);
> >>   }
> >>   
> >> +#define VFIO_ATTACH_FLAGS_MASK VFIO_DEVICE_ATTACH_PASID
> >> +static unsigned long
> >> +vfio_attach_xends[ilog2(VFIO_ATTACH_FLAGS_MASK) + 1] = {
> >> +	XEND_SIZE(VFIO_DEVICE_ATTACH_PASID,
> >> +		  struct vfio_device_attach_iommufd_pt, pasid),
> >> +};
> >> +
> >> +#define VFIO_DETACH_FLAGS_MASK VFIO_DEVICE_DETACH_PASID
> >> +static unsigned long
> >> +vfio_detach_xends[ilog2(VFIO_DETACH_FLAGS_MASK) + 1] = {
> >> +	XEND_SIZE(VFIO_DEVICE_DETACH_PASID,
> >> +		  struct vfio_device_detach_iommufd_pt, pasid),
> >> +};
> >> +
> >>   int vfio_df_ioctl_attach_pt(struct vfio_device_file *df,
> >>   			    struct vfio_device_attach_iommufd_pt __user *arg)
> >>   {
> >>   	struct vfio_device_attach_iommufd_pt attach;
> >>   	struct vfio_device *device = df->device;
> >> -	unsigned long minsz, xend = 0;
> >>   	int ret;
> >>   
> >> -	minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id);
> >> -
> >> -	if (copy_from_user(&attach, arg, minsz))
> >> -		return -EFAULT;
> >> -
> >> -	if (attach.argsz < minsz)
> >> -		return -EINVAL;
> >> -
> >> -	if (attach.flags & (~VFIO_DEVICE_ATTACH_PASID))
> >> -		return -EINVAL;
> >> -
> >> -	if (attach.flags & VFIO_DEVICE_ATTACH_PASID)
> >> -		xend = offsetofend(struct vfio_device_attach_iommufd_pt, pasid);
> >> -
> >> -	/*
> >> -	 * xend may be equal to minsz if a flag is defined for reusing a
> >> -	 * reserved field or a special usage of an existing field.
> >> -	 */
> >> -	if (xend > minsz) {
> >> -		if (attach.argsz < xend)
> >> -			return -EINVAL;
> >> -
> >> -		if (copy_from_user((void *)&attach + minsz,
> >> -				   (void __user *)arg + minsz, xend - minsz))
> >> -			return -EFAULT;
> >> -	}
> >> +	ret = vfio_copy_user_data((void __user *)arg, &attach,
> >> +				  struct vfio_device_attach_iommufd_pt,
> >> +				  pt_id, VFIO_ATTACH_FLAGS_MASK,
> >> +				  vfio_attach_xends);
> >> +	if (ret)
> >> +		return ret;
> >>   
> >>   	if ((attach.flags & VFIO_DEVICE_ATTACH_PASID) &&
> >>   	    !device->ops->pasid_attach_ioas)
> >> @@ -227,34 +220,14 @@ int vfio_df_ioctl_detach_pt(struct vfio_device_file *df,
> >>   {
> >>   	struct vfio_device_detach_iommufd_pt detach;
> >>   	struct vfio_device *device = df->device;
> >> -	unsigned long minsz, xend = 0;
> >> -
> >> -	minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags);
> >> -
> >> -	if (copy_from_user(&detach, arg, minsz))
> >> -		return -EFAULT;
> >> -
> >> -	if (detach.argsz < minsz)
> >> -		return -EINVAL;
> >> -
> >> -	if (detach.flags & (~VFIO_DEVICE_DETACH_PASID))
> >> -		return -EINVAL;
> >> -
> >> -	if (detach.flags & VFIO_DEVICE_DETACH_PASID)
> >> -		xend = offsetofend(struct vfio_device_detach_iommufd_pt, pasid);
> >> -
> >> -	/*
> >> -	 * xend may be equal to minsz if a flag is defined for reusing a
> >> -	 * reserved field or a special usage of an existing field.
> >> -	 */
> >> -	if (xend > minsz) {
> >> -		if (detach.argsz < xend)
> >> -			return -EINVAL;
> >> +	int ret;
> >>   
> >> -		if (copy_from_user((void *)&detach + minsz,
> >> -				   (void __user *)arg + minsz, xend - minsz))
> >> -			return -EFAULT;
> >> -	}
> >> +	ret = vfio_copy_user_data((void __user *)arg, &detach,
> >> +				  struct vfio_device_detach_iommufd_pt,
> >> +				  flags, VFIO_DETACH_FLAGS_MASK,
> >> +				  vfio_detach_xends);
> >> +	if (ret)
> >> +		return ret;
> >>   
> >>   	if ((detach.flags & VFIO_DEVICE_DETACH_PASID) &&
> >>   	    !device->ops->pasid_detach_ioas)
> >> diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
> >> index 50128da18bca..87bed550c46e 100644
> >> --- a/drivers/vfio/vfio.h
> >> +++ b/drivers/vfio/vfio.h
> >> @@ -34,6 +34,24 @@ void vfio_df_close(struct vfio_device_file *df);
> >>   struct vfio_device_file *
> >>   vfio_allocate_device_file(struct vfio_device *device);
> >>   
> >> +int vfio_copy_from_user(void *buffer, void __user *arg,
> >> +			unsigned long minsz, u32 flags_mask,
> >> +			unsigned long *xend_array);
> >> +
> >> +#define vfio_copy_user_data(_arg, _local_buffer, _struct, _min_last,          \
> >> +			    _flags_mask, _xend_array)                         \
> >> +	vfio_copy_from_user(_local_buffer, _arg,                              \
> >> +			    offsetofend(_struct, _min_last) +                \
> >> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, argsz) !=     \
> >> +					      0) +                            \
> >> +			    BUILD_BUG_ON_ZERO(offsetof(_struct, flags) !=     \
> >> +					      sizeof(u32)),                   \
> >> +			    _flags_mask, _xend_array)
> >> +
> >> +#define XEND_SIZE(_flag, _struct, _xlast)                                    \
> >> +	[ilog2(_flag)] = offsetofend(_struct, _xlast) +                      \
> >> +			 BUILD_BUG_ON_ZERO(_flag == 0)                       \
> >> +
> >>   extern const struct file_operations vfio_device_fops;
> >>   
> >>   #ifdef CONFIG_VFIO_NOIOMMU
> >> diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
> >> index a5a62d9d963f..c61336ea5123 100644
> >> --- a/drivers/vfio/vfio_main.c
> >> +++ b/drivers/vfio/vfio_main.c
> >> @@ -1694,6 +1694,61 @@ int vfio_dma_rw(struct vfio_device *device, dma_addr_t iova, void *data,
> >>   }
> >>   EXPORT_SYMBOL(vfio_dma_rw);
> >>   
> >> +/**
> >> + * vfio_copy_from_user - Copy the user struct that may have extended fields
> >> + *
> >> + * @buffer: The local buffer to store the data copied from user
> >> + * @arg: The user buffer pointer
> >> + * @minsz: The minimum size of the user struct
> >> + * @flags_mask: The combination of all the falgs defined
> >> + * @xend_array: The array that stores the xend size for set flags.
> >> + *
> >> + * This helper requires the user struct put the argsz and flags fields in
> >> + * the first 8 bytes.
> >> + *
> >> + * Return 0 for success, otherwise -errno
> >> + */
> >> +int vfio_copy_from_user(void *buffer, void __user *arg,  
> > 
> > This should probably be prefixed with an underscore and note that
> > callers should use the wrapper function to impose the parameter
> > checking.  
> 
> got it.
> 
> >   
> >> +			unsigned long minsz, u32 flags_mask,
> >> +			unsigned long *xend_array)
> >> +{
> >> +	unsigned long xend = minsz;
> >> +	struct user_header {
> >> +		u32 argsz;
> >> +		u32 flags;
> >> +	} *header;
> >> +	unsigned long flags;
> >> +	u32 flag;
> >> +
> >> +	if (copy_from_user(buffer, arg, minsz))
> >> +		return -EFAULT;
> >> +
> >> +	header = (struct user_header *)buffer;
> >> +	if (header->argsz < minsz)
> >> +		return -EINVAL;
> >> +
> >> +	if (header->flags & ~flags_mask)
> >> +		return -EINVAL;  
> > 
> > I'm already wrestling with whether this is an over engineered solution
> > to remove a couple dozen lines of mostly duplicate logic between attach
> > and detach, but a couple points that could make it more versatile:
> > 
> > (1) Test xend_array here:
> > 
> > 	if (!xend_array)
> > 		return 0;  
> 
> Perhaps we should return error if the header->flags has any bit set. Such
> cases require a valid xend_array.

I don't think that's true.  For example if we want to drop this into
existing cases where the structure size has not expanded and flags are
used for other things, I don't think we want the overhead of declaring
an xend_array.

> > (2) Return ssize_t/-errno for the caller to know the resulting copy
> > size.
> >   
> >> +
> >> +	/* Loop each set flag to decide the xend */
> >> +	flags = header->flags;
> >> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
> >> +		if (xend_array[flag] > xend)
> >> +			xend = xend_array[flag];  
> > 
> > Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
> > least long enough to match the highest bit in flags?  Thanks,  
> 
> yes. I would add a BUILD_BUG like the below.
> 
> BUILD_BUG_ON(ARRAY_SIZE(_xend_array) < ilog2(_flags_mask));

So this would need to account that _xend_array can be NULL regardless
of _flags_mask.  Thanks,

Alex 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-12 13:52       ` Alex Williamson
@ 2024-11-13  7:22         ` Yi Liu
  2024-11-14 18:19           ` Alex Williamson
  0 siblings, 1 reply; 16+ messages in thread
From: Yi Liu @ 2024-11-13  7:22 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On 2024/11/12 21:52, Alex Williamson wrote:

>>>> +{
>>>> +	unsigned long xend = minsz;
>>>> +	struct user_header {
>>>> +		u32 argsz;
>>>> +		u32 flags;
>>>> +	} *header;
>>>> +	unsigned long flags;
>>>> +	u32 flag;
>>>> +
>>>> +	if (copy_from_user(buffer, arg, minsz))
>>>> +		return -EFAULT;
>>>> +
>>>> +	header = (struct user_header *)buffer;
>>>> +	if (header->argsz < minsz)
>>>> +		return -EINVAL;
>>>> +
>>>> +	if (header->flags & ~flags_mask)
>>>> +		return -EINVAL;
>>>
>>> I'm already wrestling with whether this is an over engineered solution
>>> to remove a couple dozen lines of mostly duplicate logic between attach
>>> and detach, but a couple points that could make it more versatile:
>>>
>>> (1) Test xend_array here:
>>>
>>> 	if (!xend_array)
>>> 		return 0;
>>
>> Perhaps we should return error if the header->flags has any bit set. Such
>> cases require a valid xend_array.
> 
> I don't think that's true.  For example if we want to drop this into
> existing cases where the structure size has not expanded and flags are
> used for other things, I don't think we want the overhead of declaring
> an xend_array.

I see. My thought was sticking with using it in the cases that have
extended fields. Given that would it be better to return minsz as you
suggested to return ssize_t to caller.

> 
>>> (2) Return ssize_t/-errno for the caller to know the resulting copy
>>> size.
>>>    
>>>> +
>>>> +	/* Loop each set flag to decide the xend */
>>>> +	flags = header->flags;
>>>> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
>>>> +		if (xend_array[flag] > xend)
>>>> +			xend = xend_array[flag];
>>>
>>> Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
>>> least long enough to match the highest bit in flags?  Thanks,
>>
>> yes. I would add a BUILD_BUG like the below.
>>
>> BUILD_BUG_ON(ARRAY_SIZE(_xend_array) < ilog2(_flags_mask));
> 
> So this would need to account that _xend_array can be NULL regardless
> of _flags_mask.  Thanks,
yes, but I encounter a problem to account it. The below failed as when
the _xend_array is a null pointer. It's due to the usage of ARRAY_SIZE
macro. If it's not doable, perhaps we can have two wrappers, one for
copying user data with array, this should enforce the array num check
with flags. While, the another one is for copying user data without
array, no array num check. How about your opinion?

BUILD_BUG_ON((_xend_array != NULL) && (ARRAY_SIZE(_xend_array) < 
ilog2(_flags_mask)));

Compiling fail snippet:

In file included from <command-line>:
./include/linux/array_size.h:11:38: warning: division ‘sizeof (long 
unsigned int *) / sizeof (long unsigned int)’ does not compute the number 
of array elements [-Wsizeof-pointer-div]
    11 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
__must_be_array(arr))
       |                                      ^
././include/linux/compiler_types.h:497:23: note: in definition of macro 
‘__compiletime_assert’
   497 |                 if (!(condition)) 
      \
       |                       ^~~~~~~~~
././include/linux/compiler_types.h:517:9: note: in expansion of macro 
‘_compiletime_assert’
   517 |         _compiletime_assert(condition, msg, __compiletime_assert_, 
__COUNTER__)
       |         ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:39:37: note: in expansion of macro 
‘compiletime_assert’
    39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)


-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-13  7:22         ` Yi Liu
@ 2024-11-14 18:19           ` Alex Williamson
  2024-11-15 12:10             ` Yi Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Alex Williamson @ 2024-11-14 18:19 UTC (permalink / raw)
  To: Yi Liu
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On Wed, 13 Nov 2024 15:22:02 +0800
Yi Liu <yi.l.liu@intel.com> wrote:

> On 2024/11/12 21:52, Alex Williamson wrote:
> 
> >>>> +{
> >>>> +	unsigned long xend = minsz;
> >>>> +	struct user_header {
> >>>> +		u32 argsz;
> >>>> +		u32 flags;
> >>>> +	} *header;
> >>>> +	unsigned long flags;
> >>>> +	u32 flag;
> >>>> +
> >>>> +	if (copy_from_user(buffer, arg, minsz))
> >>>> +		return -EFAULT;
> >>>> +
> >>>> +	header = (struct user_header *)buffer;
> >>>> +	if (header->argsz < minsz)
> >>>> +		return -EINVAL;
> >>>> +
> >>>> +	if (header->flags & ~flags_mask)
> >>>> +		return -EINVAL;  
> >>>
> >>> I'm already wrestling with whether this is an over engineered solution
> >>> to remove a couple dozen lines of mostly duplicate logic between attach
> >>> and detach, but a couple points that could make it more versatile:
> >>>
> >>> (1) Test xend_array here:
> >>>
> >>> 	if (!xend_array)
> >>> 		return 0;  
> >>
> >> Perhaps we should return error if the header->flags has any bit set. Such
> >> cases require a valid xend_array.  
> > 
> > I don't think that's true.  For example if we want to drop this into
> > existing cases where the structure size has not expanded and flags are
> > used for other things, I don't think we want the overhead of declaring
> > an xend_array.  
> 
> I see. My thought was sticking with using it in the cases that have
> extended fields. Given that would it be better to return minsz as you
> suggested to return ssize_t to caller.

If the xend_array is NULL, then yes it would do the copy, validate
argsz and flags, and return minsz.
 
> >>> (2) Return ssize_t/-errno for the caller to know the resulting copy
> >>> size.
> >>>      
> >>>> +
> >>>> +	/* Loop each set flag to decide the xend */
> >>>> +	flags = header->flags;
> >>>> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
> >>>> +		if (xend_array[flag] > xend)
> >>>> +			xend = xend_array[flag];  
> >>>
> >>> Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
> >>> least long enough to match the highest bit in flags?  Thanks,  
> >>
> >> yes. I would add a BUILD_BUG like the below.
> >>
> >> BUILD_BUG_ON(ARRAY_SIZE(_xend_array) < ilog2(_flags_mask));  
> > 
> > So this would need to account that _xend_array can be NULL regardless
> > of _flags_mask.  Thanks,  
> yes, but I encounter a problem to account it. The below failed as when
> the _xend_array is a null pointer. It's due to the usage of ARRAY_SIZE
> macro. If it's not doable, perhaps we can have two wrappers, one for
> copying user data with array, this should enforce the array num check
> with flags. While, the another one is for copying user data without
> array, no array num check. How about your opinion?
> 
> BUILD_BUG_ON((_xend_array != NULL) && (ARRAY_SIZE(_xend_array) < 
> ilog2(_flags_mask)));
> 
> Compiling fail snippet:
> 
> In file included from <command-line>:
> ./include/linux/array_size.h:11:38: warning: division ‘sizeof (long 
> unsigned int *) / sizeof (long unsigned int)’ does not compute the number 
> of array elements [-Wsizeof-pointer-div]
>     11 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + 
> __must_be_array(arr))
>        |                                      ^
> ././include/linux/compiler_types.h:497:23: note: in definition of macro 
> ‘__compiletime_assert’
>    497 |                 if (!(condition)) 
>       \
>        |                       ^~~~~~~~~
> ././include/linux/compiler_types.h:517:9: note: in expansion of macro 
> ‘_compiletime_assert’
>    517 |         _compiletime_assert(condition, msg, __compiletime_assert_, 
> __COUNTER__)
>        |         ^~~~~~~~~~~~~~~~~~~
> ./include/linux/build_bug.h:39:37: note: in expansion of macro 
> ‘compiletime_assert’
>     39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)

TBH, I think this whole generalization is stalling the series.  We're
de-duplicating 20-ish lines of code implemented by adjacent functions
with something more complicated, I think mostly to formalize the
methodology of using flags to expand the ioctl data structure, which
has been our plan all along.  If it only addresses the duplication in
these two functions, the added complexity isn't that compelling, but
expanding it to be used more broadly is introducing scope creep.

Given the momentum on the iommufd side, if this series is intended to
be v6.13 material, we should probably defer this generalization to a
follow-on series where we can evaluate a more broadly used helper.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/5] vfio: Add vfio_copy_user_data()
  2024-11-14 18:19           ` Alex Williamson
@ 2024-11-15 12:10             ` Yi Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Yi Liu @ 2024-11-15 12:10 UTC (permalink / raw)
  To: Alex Williamson
  Cc: jgg, kevin.tian, baolu.lu, joro, eric.auger, nicolinc, kvm,
	chao.p.peng, iommu, zhenzhong.duan, vasant.hegde, will

On 2024/11/15 02:19, Alex Williamson wrote:
> On Wed, 13 Nov 2024 15:22:02 +0800
> Yi Liu <yi.l.liu@intel.com> wrote:
> 
>> On 2024/11/12 21:52, Alex Williamson wrote:
>>
>>>>>> +{
>>>>>> +	unsigned long xend = minsz;
>>>>>> +	struct user_header {
>>>>>> +		u32 argsz;
>>>>>> +		u32 flags;
>>>>>> +	} *header;
>>>>>> +	unsigned long flags;
>>>>>> +	u32 flag;
>>>>>> +
>>>>>> +	if (copy_from_user(buffer, arg, minsz))
>>>>>> +		return -EFAULT;
>>>>>> +
>>>>>> +	header = (struct user_header *)buffer;
>>>>>> +	if (header->argsz < minsz)
>>>>>> +		return -EINVAL;
>>>>>> +
>>>>>> +	if (header->flags & ~flags_mask)
>>>>>> +		return -EINVAL;
>>>>>
>>>>> I'm already wrestling with whether this is an over engineered solution
>>>>> to remove a couple dozen lines of mostly duplicate logic between attach
>>>>> and detach, but a couple points that could make it more versatile:
>>>>>
>>>>> (1) Test xend_array here:
>>>>>
>>>>> 	if (!xend_array)
>>>>> 		return 0;
>>>>
>>>> Perhaps we should return error if the header->flags has any bit set. Such
>>>> cases require a valid xend_array.
>>>
>>> I don't think that's true.  For example if we want to drop this into
>>> existing cases where the structure size has not expanded and flags are
>>> used for other things, I don't think we want the overhead of declaring
>>> an xend_array.
>>
>> I see. My thought was sticking with using it in the cases that have
>> extended fields. Given that would it be better to return minsz as you
>> suggested to return ssize_t to caller.
> 
> If the xend_array is NULL, then yes it would do the copy, validate
> argsz and flags, and return minsz.

yes.

>>>>> (2) Return ssize_t/-errno for the caller to know the resulting copy
>>>>> size.
>>>>>       
>>>>>> +
>>>>>> +	/* Loop each set flag to decide the xend */
>>>>>> +	flags = header->flags;
>>>>>> +	for_each_set_bit(flag, &flags, BITS_PER_TYPE(u32)) {
>>>>>> +		if (xend_array[flag] > xend)
>>>>>> +			xend = xend_array[flag];
>>>>>
>>>>> Can we craft a BUILD_BUG in the wrapper to test that xend_array is at
>>>>> least long enough to match the highest bit in flags?  Thanks,
>>>>
>>>> yes. I would add a BUILD_BUG like the below.
>>>>
>>>> BUILD_BUG_ON(ARRAY_SIZE(_xend_array) < ilog2(_flags_mask));
>>>
>>> So this would need to account that _xend_array can be NULL regardless
>>> of _flags_mask.  Thanks,
>> yes, but I encounter a problem to account it. The below failed as when
>> the _xend_array is a null pointer. It's due to the usage of ARRAY_SIZE
>> macro. If it's not doable, perhaps we can have two wrappers, one for
>> copying user data with array, this should enforce the array num check
>> with flags. While, the another one is for copying user data without
>> array, no array num check. How about your opinion?
>>
>> BUILD_BUG_ON((_xend_array != NULL) && (ARRAY_SIZE(_xend_array) <
>> ilog2(_flags_mask)));
>>
>> Compiling fail snippet:
>>
>> In file included from <command-line>:
>> ./include/linux/array_size.h:11:38: warning: division ‘sizeof (long
>> unsigned int *) / sizeof (long unsigned int)’ does not compute the number
>> of array elements [-Wsizeof-pointer-div]
>>      11 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
>> __must_be_array(arr))
>>         |                                      ^
>> ././include/linux/compiler_types.h:497:23: note: in definition of macro
>> ‘__compiletime_assert’
>>     497 |                 if (!(condition))
>>        \
>>         |                       ^~~~~~~~~
>> ././include/linux/compiler_types.h:517:9: note: in expansion of macro
>> ‘_compiletime_assert’
>>     517 |         _compiletime_assert(condition, msg, __compiletime_assert_,
>> __COUNTER__)
>>         |         ^~~~~~~~~~~~~~~~~~~
>> ./include/linux/build_bug.h:39:37: note: in expansion of macro
>> ‘compiletime_assert’
>>      39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
> 
> TBH, I think this whole generalization is stalling the series.  We're
> de-duplicating 20-ish lines of code implemented by adjacent functions
> with something more complicated, I think mostly to formalize the
> methodology of using flags to expand the ioctl data structure, which
> has been our plan all along.  If it only addresses the duplication in
> these two functions, the added complexity isn't that compelling, but
> expanding it to be used more broadly is introducing scope creep.
> 
> Given the momentum on the iommufd side, if this series is intended to
> be v6.13 material, we should probably defer this generalization to a
> follow-on series where we can evaluate a more broadly used helper.

sure. I'm open about it. I may drop it in next series, and make this
generalization as a follow-up patch. :)

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
  2024-11-08 12:17 ` [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability Yi Liu
@ 2024-12-11  2:43   ` Zhangfei Gao
  2024-12-11  3:12     ` Yi Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Zhangfei Gao @ 2024-12-11  2:43 UTC (permalink / raw)
  To: Yi Liu
  Cc: alex.williamson, jgg, kevin.tian, baolu.lu, joro, eric.auger,
	nicolinc, kvm, chao.p.peng, iommu, zhenzhong.duan, vasant.hegde,
	will

On Fri, 8 Nov 2024 at 20:17, Yi Liu <yi.l.liu@intel.com> wrote:
>
> PASID usage requires PASID support in both device and IOMMU. Since the
> iommu drivers always enable the PASID capability for the device if it
> is supported, so it is reasonable to extend the IOMMU_GET_HW_INFO to
> report the PASID capability to userspace.
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> #aarch64 platform
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Hi, Yi

Found this patch still not gets merged in 6.13-rc1
Any plan to respin.

Will this target to 6.14?

Thanks

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability
  2024-12-11  2:43   ` Zhangfei Gao
@ 2024-12-11  3:12     ` Yi Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Yi Liu @ 2024-12-11  3:12 UTC (permalink / raw)
  To: Zhangfei Gao
  Cc: alex.williamson, jgg, kevin.tian, baolu.lu, joro, eric.auger,
	nicolinc, kvm, chao.p.peng, iommu, zhenzhong.duan, vasant.hegde,
	will

On 2024/12/11 10:43, Zhangfei Gao wrote:
> On Fri, 8 Nov 2024 at 20:17, Yi Liu <yi.l.liu@intel.com> wrote:
>>
>> PASID usage requires PASID support in both device and IOMMU. Since the
>> iommu drivers always enable the PASID capability for the device if it
>> is supported, so it is reasonable to extend the IOMMU_GET_HW_INFO to
>> report the PASID capability to userspace.
>>
>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
>> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> #aarch64 platform
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> 
> Hi, Yi
> 
> Found this patch still not gets merged in 6.13-rc1
> Any plan to respin.
> 
> Will this target to 6.14?

yes, due to dependency [1] of this series, we need to target it to 6.14.

[1] 
https://lore.kernel.org/linux-iommu/20241104132513.15890-1-yi.l.liu@intel.com/

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-12-11  3:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-08 12:17 [PATCH v5 0/5] vfio-pci support pasid attach/detach Yi Liu
2024-11-08 12:17 ` [PATCH v5 1/5] ida: Add ida_find_first_range() Yi Liu
2024-11-08 12:17 ` [PATCH v5 2/5] vfio-iommufd: Support pasid [at|de]tach for physical VFIO devices Yi Liu
2024-11-08 12:17 ` [PATCH v5 3/5] vfio: VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT support pasid Yi Liu
2024-11-12  0:02   ` Alex Williamson
2024-11-12  1:53     ` Yi Liu
2024-11-08 12:17 ` [PATCH v5 4/5] vfio: Add vfio_copy_user_data() Yi Liu
2024-11-12  0:03   ` Alex Williamson
2024-11-12  9:18     ` Yi Liu
2024-11-12 13:52       ` Alex Williamson
2024-11-13  7:22         ` Yi Liu
2024-11-14 18:19           ` Alex Williamson
2024-11-15 12:10             ` Yi Liu
2024-11-08 12:17 ` [PATCH v5 5/5] iommufd: Extend IOMMU_GET_HW_INFO to report PASID capability Yi Liu
2024-12-11  2:43   ` Zhangfei Gao
2024-12-11  3:12     ` Yi Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox