kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/4] iommufd: Add iommu hardware info reporting
@ 2023-05-11 14:30 Yi Liu
  2023-05-11 14:30 ` [PATCH v3 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Yi Liu @ 2023-05-11 14:30 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest,
	zhenzhong.duan

iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. So before set up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using guest I/O page table as the stage-1 translation table. This
requires guest I/O page table be compatible with hardware IOMMU.

This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].

This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to user hence user can decode them. This
series only adds the framework for iommu hw info reporting, the complete
reporting path needs vendor specific definition and driver support. The
full picture is available in [1] as well.

base-commit: 35db4f4dac813ffaa987cf633694107fabf3aff5

[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting

Change log:

v3:
 - Add r-b from Baolu
 - Rename IOMMU_HW_INFO_TYPE_DEFAULT to be IOMMU_HW_INFO_TYPE_NONE to
   better suit what it means
 - Let IOMMU_DEVICE_GET_HW_INFO succeed even the underlying iommu driver
   does not have driver-specific data to report per below remark.
   https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/

v2: https://lore.kernel.org/linux-iommu/20230309075358.571567-1-yi.l.liu@intel.com/
 - Drop patch 05 of v1 as it is already covered by other series
 - Rename the capability info to be iommu hardware info

v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.com/

Regards,
	Yi Liu

Lu Baolu (1):
  iommu: Add new iommu op to get iommu hardware information

Nicolin Chen (1):
  iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl

Yi Liu (2):
  iommu: Move dev_iommu_ops() to private header
  iommufd: Add IOMMU_DEVICE_GET_HW_INFO

 drivers/iommu/iommu-priv.h                    | 11 +++
 drivers/iommu/iommu.c                         |  2 +
 drivers/iommu/iommufd/device.c                | 73 +++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h       |  1 +
 drivers/iommu/iommufd/iommufd_test.h          |  9 +++
 drivers/iommu/iommufd/main.c                  |  3 +
 drivers/iommu/iommufd/selftest.c              | 16 ++++
 include/linux/iommu.h                         | 27 ++++---
 include/uapi/linux/iommufd.h                  | 44 +++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
 11 files changed, 217 insertions(+), 12 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/4] iommu: Move dev_iommu_ops() to private header
  2023-05-11 14:30 [PATCH v3 0/4] iommufd: Add iommu hardware info reporting Yi Liu
@ 2023-05-11 14:30 ` Yi Liu
  2023-05-11 14:30 ` [PATCH v3 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Yi Liu @ 2023-05-11 14:30 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest,
	zhenzhong.duan

dev_iommu_ops() is essentially only used in iommu subsystem, so
move to a private header to avoid being abused by other drivers.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommu-priv.h | 11 +++++++++++
 drivers/iommu/iommu.c      |  2 ++
 include/linux/iommu.h      | 11 -----------
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index 7c8011bfd153..a6e694f59f64 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -4,6 +4,17 @@
 
 #include <linux/iommu.h>
 
+static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
+{
+	/*
+	 * Assume that valid ops must be installed if iommu_probe_device()
+	 * has succeeded. The device ops are essentially for internal use
+	 * within the IOMMU subsystem itself, so we should be able to trust
+	 * ourselves not to misuse the helper.
+	 */
+	return dev->iommu->iommu_dev->ops;
+}
+
 int iommu_group_replace_domain(struct iommu_group *group,
 			       struct iommu_domain *new_domain);
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 91573efd9488..b44fd1a76997 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -38,6 +38,8 @@
 
 #include "iommu-sva.h"
 
+#include "iommu-priv.h"
+
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e8c9a7da1060..e7d70548e597 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -444,17 +444,6 @@ static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
 	};
 }
 
-static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
-{
-	/*
-	 * Assume that valid ops must be installed if iommu_probe_device()
-	 * has succeeded. The device ops are essentially for internal use
-	 * within the IOMMU subsystem itself, so we should be able to trust
-	 * ourselves not to misuse the helper.
-	 */
-	return dev->iommu->iommu_dev->ops;
-}
-
 extern int bus_iommu_probe(const struct bus_type *bus);
 extern bool iommu_present(const struct bus_type *bus);
 extern bool device_iommu_capable(struct device *dev, enum iommu_cap cap);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/4] iommu: Add new iommu op to get iommu hardware information
  2023-05-11 14:30 [PATCH v3 0/4] iommufd: Add iommu hardware info reporting Yi Liu
  2023-05-11 14:30 ` [PATCH v3 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
@ 2023-05-11 14:30 ` Yi Liu
  2023-05-11 14:30 ` [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
  2023-05-11 14:30 ` [PATCH v3 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
  3 siblings, 0 replies; 13+ messages in thread
From: Yi Liu @ 2023-05-11 14:30 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest,
	zhenzhong.duan

From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new iommu op to get the IOMMU hardware capabilities for
iommufd. This information will be used by any vIOMMU driver which is
owned by userspace.

This op chooses to make the special parameters opaque to the core. This
suits the current usage model where accessing any of the IOMMU device
special parameters does require a userspace driver that matches the kernel
driver. If a need for common parameters, implemented similarly by several
drivers, arises then there's room in the design to grow a generic parameter
set as well. No wrapper API is added as it is supposed to be used by
iommufd only.

Different IOMMU hardware would have different hardware information. So the
information reported differs as well. To let the external user understand
the difference. enum iommu_hw_info_type is defined. For the iommu drivers
that are capable to report hardware information, it should have a unique
iommu_hw_info_type. The iommu_hw_info_type is stored in struct iommu_ops.
For the driver doesn't report hardware information, just use
IOMMU_HW_INFO_TYPE_NONE.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h        | 16 ++++++++++++++++
 include/uapi/linux/iommufd.h |  7 +++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e7d70548e597..a748d60206e7 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -14,6 +14,7 @@
 #include <linux/err.h>
 #include <linux/of.h>
 #include <uapi/linux/iommu.h>
+#include <uapi/linux/iommufd.h>
 
 #define IOMMU_READ	(1 << 0)
 #define IOMMU_WRITE	(1 << 1)
@@ -222,6 +223,11 @@ struct iommu_iotlb_gather {
 /**
  * struct iommu_ops - iommu ops and capabilities
  * @capable: check capability
+ * @hw_info: IOMMU hardware information. The type of the returned data is
+ *           marked by @hw_info_type. The data buffer returned by this op
+ *           is allocated in the IOMMU driver and the caller should free it
+ *           after use. Return the data buffer if success, or ERR_PTR on
+ *           failure.
  * @domain_alloc: allocate iommu domain
  * @probe_device: Add device to iommu driver handling
  * @release_device: Remove device from iommu driver handling
@@ -246,11 +252,20 @@ struct iommu_iotlb_gather {
  * @remove_dev_pasid: Remove any translation configurations of a specific
  *                    pasid, so that any DMA transactions with this pasid
  *                    will be blocked by the hardware.
+ * @hw_info_type: One of enum iommu_hw_info_type defined in
+ *                include/uapi/linux/iommufd.h. It is used to tag the type
+ *                of data returned by .hw_info callback. The drivers that
+ *                support .hw_info callback should define a unique type
+ *                in include/uapi/linux/iommufd.h. For the drivers that do
+ *                not implement .hw_info callback, this field is
+ *                IOMMU_HW_INFO_TYPE_NONE which is 0. Hence, such drivers
+ *                do not need to care this field.
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  * @owner: Driver module providing these ops
  */
 struct iommu_ops {
 	bool (*capable)(struct device *dev, enum iommu_cap);
+	void *(*hw_info)(struct device *dev, u32 *length);
 
 	/* Domain allocation and freeing by the iommu driver */
 	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
@@ -279,6 +294,7 @@ struct iommu_ops {
 	void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid);
 
 	const struct iommu_domain_ops *default_domain_ops;
+	enum iommu_hw_info_type hw_info_type;
 	unsigned long pgsize_bitmap;
 	struct module *owner;
 };
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 8245c01adca6..99bf0715f545 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -370,4 +370,11 @@ struct iommu_hwpt_alloc {
 	__u32 __reserved;
 };
 #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC)
+
+/**
+ * enum iommu_hw_info_type - IOMMU Hardware Info Types
+ */
+enum iommu_hw_info_type {
+	IOMMU_HW_INFO_TYPE_NONE,
+};
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-11 14:30 [PATCH v3 0/4] iommufd: Add iommu hardware info reporting Yi Liu
  2023-05-11 14:30 ` [PATCH v3 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
  2023-05-11 14:30 ` [PATCH v3 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
@ 2023-05-11 14:30 ` Yi Liu
  2023-05-12  5:38   ` Baolu Lu
                     ` (2 more replies)
  2023-05-11 14:30 ` [PATCH v3 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu
  3 siblings, 3 replies; 13+ messages in thread
From: Yi Liu @ 2023-05-11 14:30 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest,
	zhenzhong.duan

Under nested IOMMU translation, userspace owns the stage-1 translation
table (e.g. the stage-1 page table of Intel VT-d or the context table
of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
and needs to be compatiable with the underlying IOMMU hardware. Hence,
userspace should know the IOMMU hardware capability before creating and
configuring the stage-1 translation table to kernel.

This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware information
for a given device. The returned data is vendor specific, userspace needs
to decode it with the structure mapped by the @out_data_type field.

As only physical devices have IOMMU hardware, so this will return error
if the given device is not a physical device.

Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c          | 72 +++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 drivers/iommu/iommufd/main.c            |  3 ++
 include/uapi/linux/iommufd.h            | 37 +++++++++++++
 4 files changed, 113 insertions(+)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 051bd8e99858..bc99d092de8f 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -263,6 +263,78 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD);
 
+static int iommufd_zero_fill_user(u64 ptr, int bytes)
+{
+	int index = 0;
+
+	for (; index < bytes; index++) {
+		if (put_user(0, (uint8_t __user *)u64_to_user_ptr(ptr + index)))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_info *cmd = ucmd->cmd;
+	unsigned int length = 0, data_len;
+	struct iommufd_device *idev;
+	const struct iommu_ops *ops;
+	void *data = NULL;
+	int rc = 0;
+
+	if (cmd->flags || cmd->__reserved || !cmd->data_len)
+		return -EOPNOTSUPP;
+
+	idev = iommufd_get_device(ucmd, cmd->dev_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+
+	ops = dev_iommu_ops(idev->dev);
+	if (!ops->hw_info)
+		goto done;
+
+	/* driver has hw_info callback should have a unique hw_info_type */
+	if (ops->hw_info_type == IOMMU_HW_INFO_TYPE_NONE) {
+		pr_warn_ratelimited("iommu driver set an invalid type\n");
+		rc = -ENODEV;
+		goto out_err;
+	}
+
+	data = ops->hw_info(idev->dev, &data_len);
+	if (IS_ERR(data)) {
+		rc = PTR_ERR(data);
+		goto out_err;
+	}
+
+	length = min(cmd->data_len, data_len);
+	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
+		rc = -EFAULT;
+		goto out_err;
+	}
+
+	/*
+	 * Zero the trailing bytes if the user buffer is bigger than the
+	 * data size kernel actually has.
+	 */
+	if (length < cmd->data_len) {
+		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
+					    cmd->data_len - length);
+		if (rc)
+			goto out_err;
+	}
+
+done:
+	cmd->data_len = length;
+	cmd->out_data_type = ops->hw_info_type;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
+out_err:
+	kfree(data);
+	iommufd_put_object(&idev->obj);
+	return rc;
+}
+
 static int iommufd_group_setup_msi(struct iommufd_group *igroup,
 				   struct iommufd_hw_pagetable *hwpt)
 {
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index dba730129b8c..69d6bb61d387 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -308,6 +308,7 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
 }
 
 void iommufd_device_destroy(struct iommufd_object *obj);
+int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd);
 
 struct iommufd_access {
 	struct iommufd_object obj;
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 3932fe26522b..5c24e8971f09 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -269,6 +269,7 @@ static int iommufd_option(struct iommufd_ucmd *ucmd)
 union ucmd_buffer {
 	struct iommu_destroy destroy;
 	struct iommu_hwpt_alloc hwpt;
+	struct iommu_hw_info info;
 	struct iommu_ioas_alloc alloc;
 	struct iommu_ioas_allow_iovas allow_iovas;
 	struct iommu_ioas_copy ioas_copy;
@@ -302,6 +303,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
 	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct iommu_destroy, id),
 	IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc,
 		 __reserved),
+	IOCTL_OP(IOMMU_DEVICE_GET_HW_INFO, iommufd_device_get_hw_info,
+		 struct iommu_hw_info, __reserved),
 	IOCTL_OP(IOMMU_IOAS_ALLOC, iommufd_ioas_alloc_ioctl,
 		 struct iommu_ioas_alloc, out_ioas_id),
 	IOCTL_OP(IOMMU_IOAS_ALLOW_IOVAS, iommufd_ioas_allow_iovas,
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 99bf0715f545..e9d42838dcbd 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -46,6 +46,7 @@ enum {
 	IOMMUFD_CMD_OPTION,
 	IOMMUFD_CMD_VFIO_IOAS,
 	IOMMUFD_CMD_HWPT_ALLOC,
+	IOMMUFD_CMD_DEVICE_GET_HW_INFO,
 };
 
 /**
@@ -377,4 +378,40 @@ struct iommu_hwpt_alloc {
 enum iommu_hw_info_type {
 	IOMMU_HW_INFO_TYPE_NONE,
 };
+
+/**
+ * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
+ * @size: sizeof(struct iommu_hw_info)
+ * @flags: Must be 0
+ * @dev_id: The device bound to the iommufd
+ * @data_len: Input the length of the user buffer in bytes. Output the
+ *            length of data filled in the user buffer.
+ * @data_ptr: Pointer to the user buffer
+ * @out_data_type: Output the iommu hardware info type as defined by
+ *                 enum iommu_hw_info_type.
+ * @__reserved: Must be 0
+ *
+ * Query the hardware iommu information for given device which has been
+ * bound to iommufd. @data_len is the size of the buffer which captures
+ * iommu type specific data and the data will be filled. Trailing bytes
+ * are zeroed if the user buffer is larger than the data kernel has.
+ *
+ * The type specific data would be used to sync capability between the
+ * virtual IOMMU and the hardware IOMMU. e.g. nested translation requires
+ * to check the hardware IOMMU capability so guest stage-1 page table
+ * uses a format compatible to the hardware IOMMU.
+ *
+ * The @out_data_type will be filled if the ioctl succeeds. It would
+ * be used to decode the data filled in the buffer pointed by @data_ptr.
+ */
+struct iommu_hw_info {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 data_len;
+	__aligned_u64 data_ptr;
+	__u32 out_data_type;
+	__u32 __reserved;
+};
+#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_DEVICE_GET_HW_INFO)
 #endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl
  2023-05-11 14:30 [PATCH v3 0/4] iommufd: Add iommu hardware info reporting Yi Liu
                   ` (2 preceding siblings ...)
  2023-05-11 14:30 ` [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
@ 2023-05-11 14:30 ` Yi Liu
  3 siblings, 0 replies; 13+ messages in thread
From: Yi Liu @ 2023-05-11 14:30 UTC (permalink / raw)
  To: joro, alex.williamson, jgg, kevin.tian, robin.murphy, baolu.lu
  Cc: cohuck, eric.auger, nicolinc, kvm, mjrosato, chao.p.peng,
	yi.l.liu, yi.y.sun, peterx, jasowang, shameerali.kolothum.thodi,
	lulu, suravee.suthikulpanit, iommu, linux-kernel, linux-kselftest,
	zhenzhong.duan

From: Nicolin Chen <nicolinc@nvidia.com>

Add a mock_domain_hw_info function and an iommu_test_hw_info data
structure. This allows to test the IOMMU_DEVICE_GET_HW_INFO ioctl by
passing the test_reg value for the mock_dev.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c                |  1 +
 drivers/iommu/iommufd/iommufd_test.h          |  9 +++++++
 drivers/iommu/iommufd/selftest.c              | 16 ++++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 +++++++++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++++++++++++++
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index bc99d092de8f..4541d785bfd8 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -8,6 +8,7 @@
 
 #include "io_pagetable.h"
 #include "iommufd_private.h"
+#include "iommufd_test.h"
 
 static bool allow_unsafe_interrupts;
 module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR);
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index 258de2253b61..3f3644375bf1 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -100,4 +100,13 @@ struct iommu_test_cmd {
 };
 #define IOMMU_TEST_CMD _IO(IOMMUFD_TYPE, IOMMUFD_CMD_BASE + 32)
 
+/* Mock structs for IOMMU_DEVICE_GET_HW_INFO ioctl */
+#define IOMMU_HW_INFO_TYPE_SELFTEST	0xfeedbeef
+#define IOMMU_HW_INFO_SELFTEST_REGVAL	0xdeadbeef
+
+struct iommu_test_hw_info {
+	__u32 flags;
+	__u32 test_reg;
+};
+
 #endif
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index bb2cd54ca7b6..af7459e211ad 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -128,6 +128,20 @@ static struct iommu_domain mock_blocking_domain = {
 	.ops = &mock_blocking_ops,
 };
 
+static void *mock_domain_hw_info(struct device *dev, u32 *length)
+{
+	struct iommu_test_hw_info *info;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->test_reg = IOMMU_HW_INFO_SELFTEST_REGVAL;
+	*length = sizeof(*info);
+
+	return info;
+}
+
 static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type)
 {
 	struct mock_iommu_domain *mock;
@@ -279,6 +293,8 @@ static void mock_domain_set_plaform_dma_ops(struct device *dev)
 static const struct iommu_ops mock_ops = {
 	.owner = THIS_MODULE,
 	.pgsize_bitmap = MOCK_IO_PAGE_SIZE,
+	.hw_info_type = IOMMU_HW_INFO_TYPE_SELFTEST,
+	.hw_info = mock_domain_hw_info,
 	.domain_alloc = mock_domain_alloc,
 	.capable = mock_domain_capable,
 	.set_platform_dma_ops = mock_domain_set_plaform_dma_ops,
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 8acd0af37aa5..fa2324741ad2 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -121,6 +121,7 @@ TEST_F(iommufd, cmd_length)
 	TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP);
 	TEST_LENGTH(iommu_option, IOMMU_OPTION);
 	TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS);
+	TEST_LENGTH(iommu_hw_info, IOMMU_DEVICE_GET_HW_INFO);
 #undef TEST_LENGTH
 }
 
@@ -185,6 +186,7 @@ FIXTURE(iommufd_ioas)
 	uint32_t ioas_id;
 	uint32_t stdev_id;
 	uint32_t hwpt_id;
+	uint32_t device_id;
 	uint64_t base_iova;
 };
 
@@ -211,7 +213,7 @@ FIXTURE_SETUP(iommufd_ioas)
 
 	for (i = 0; i != variant->mock_domains; i++) {
 		test_cmd_mock_domain(self->ioas_id, &self->stdev_id,
-				     &self->hwpt_id, NULL);
+				     &self->hwpt_id, &self->device_id);
 		self->base_iova = MOCK_APERTURE_START;
 	}
 }
@@ -290,6 +292,19 @@ TEST_F(iommufd_ioas, ioas_area_auto_destroy)
 	}
 }
 
+TEST_F(iommufd_ioas, device_get_hw_info)
+{
+	struct iommu_test_hw_info info;
+
+	if (self->device_id) {
+		test_cmd_device_get_hw_info(self->device_id, sizeof(info), &info);
+		assert(info.test_reg == IOMMU_HW_INFO_SELFTEST_REGVAL);
+	} else {
+		test_err_device_get_hw_info(ENOENT, self->device_id,
+					    sizeof(info), &info);
+	}
+}
+
 TEST_F(iommufd_ioas, area)
 {
 	int i;
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 70353e68e599..8dced7ef9118 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -348,3 +348,29 @@ static void teardown_iommufd(int fd, struct __test_metadata *_metadata)
 	})
 
 #endif
+
+static int _test_cmd_device_get_hw_info(int fd, __u32 device_id,
+					__u32 data_len, void *data)
+{
+	struct iommu_hw_info cmd = {
+		.size = sizeof(cmd),
+		.dev_id = device_id,
+		.data_len = data_len,
+		.data_ptr = (uint64_t)data,
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_DEVICE_GET_HW_INFO, &cmd);
+	if (ret)
+		return ret;
+	return 0;
+}
+
+#define test_cmd_device_get_hw_info(device_id, data_len, data)         \
+	ASSERT_EQ(0, _test_cmd_device_get_hw_info(self->fd, device_id, \
+						  data_len, data))
+
+#define test_err_device_get_hw_info(_errno, device_id, data_len, data) \
+	EXPECT_ERRNO(_errno,                                           \
+		     _test_cmd_device_get_hw_info(self->fd, device_id, \
+						  data_len, data))
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-11 14:30 ` [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
@ 2023-05-12  5:38   ` Baolu Lu
  2023-05-15  6:14     ` Liu, Yi L
  2023-05-19  8:42   ` Tian, Kevin
       [not found]   ` <BL1PR11MB527177A3860E10818E9761938C7C9@BL1PR11MB5271.namprd11.prod.outlook.com>
  2 siblings, 1 reply; 13+ messages in thread
From: Baolu Lu @ 2023-05-12  5:38 UTC (permalink / raw)
  To: Yi Liu, joro, alex.williamson, jgg, kevin.tian, robin.murphy
  Cc: baolu.lu, cohuck, eric.auger, nicolinc, kvm, mjrosato,
	chao.p.peng, yi.y.sun, peterx, jasowang,
	shameerali.kolothum.thodi, lulu, suravee.suthikulpanit, iommu,
	linux-kernel, linux-kselftest, zhenzhong.duan

On 5/11/23 10:30 PM, Yi Liu wrote:
> Under nested IOMMU translation, userspace owns the stage-1 translation
> table (e.g. the stage-1 page table of Intel VT-d or the context table
> of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
> and needs to be compatiable with the underlying IOMMU hardware. Hence,
> userspace should know the IOMMU hardware capability before creating and
> configuring the stage-1 translation table to kernel.
> 
> This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware information
> for a given device. The returned data is vendor specific, userspace needs
> to decode it with the structure mapped by the @out_data_type field.
> 
> As only physical devices have IOMMU hardware, so this will return error
> if the given device is not a physical device.
> 
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>   drivers/iommu/iommufd/device.c          | 72 +++++++++++++++++++++++++
>   drivers/iommu/iommufd/iommufd_private.h |  1 +
>   drivers/iommu/iommufd/main.c            |  3 ++
>   include/uapi/linux/iommufd.h            | 37 +++++++++++++
>   4 files changed, 113 insertions(+)
> 
> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> index 051bd8e99858..bc99d092de8f 100644
> --- a/drivers/iommu/iommufd/device.c
> +++ b/drivers/iommu/iommufd/device.c
> @@ -263,6 +263,78 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
>   }
>   EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD);
>   
> +static int iommufd_zero_fill_user(u64 ptr, int bytes)
> +{
> +	int index = 0;
> +
> +	for (; index < bytes; index++) {
> +		if (put_user(0, (uint8_t __user *)u64_to_user_ptr(ptr + index)))
> +			return -EFAULT;
> +	}
> +	return 0;
> +}
> +
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hw_info *cmd = ucmd->cmd;
> +	unsigned int length = 0, data_len;
> +	struct iommufd_device *idev;
> +	const struct iommu_ops *ops;
> +	void *data = NULL;
> +	int rc = 0;
> +
> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ops = dev_iommu_ops(idev->dev);
> +	if (!ops->hw_info)
> +		goto done;

If the iommu driver doesn't provide a hw_info callback, it still
returns success?

> +
> +	/* driver has hw_info callback should have a unique hw_info_type */
> +	if (ops->hw_info_type == IOMMU_HW_INFO_TYPE_NONE) {
> +		pr_warn_ratelimited("iommu driver set an invalid type\n");
> +		rc = -ENODEV;
> +		goto out_err;
> +	}
> +
> +	data = ops->hw_info(idev->dev, &data_len);
> +	if (IS_ERR(data)) {
> +		rc = PTR_ERR(data);
> +		goto out_err;
> +	}
> +
> +	length = min(cmd->data_len, data_len);
> +	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
> +		rc = -EFAULT;
> +		goto out_err;
> +	}
> +
> +	/*
> +	 * Zero the trailing bytes if the user buffer is bigger than the
> +	 * data size kernel actually has.
> +	 */
> +	if (length < cmd->data_len) {
> +		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
> +					    cmd->data_len - length);
> +		if (rc)
> +			goto out_err;
> +	}
> +
> +done:
> +	cmd->data_len = length;
> +	cmd->out_data_type = ops->hw_info_type;
> +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> +
> +out_err:
> +	kfree(data);
> +	iommufd_put_object(&idev->obj);
> +	return rc;
> +}
> +
>   static int iommufd_group_setup_msi(struct iommufd_group *igroup,
>   				   struct iommufd_hw_pagetable *hwpt)
>   {

Best regards,
baolu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-12  5:38   ` Baolu Lu
@ 2023-05-15  6:14     ` Liu, Yi L
  2023-05-16  1:49       ` Baolu Lu
  0 siblings, 1 reply; 13+ messages in thread
From: Liu, Yi L @ 2023-05-15  6:14 UTC (permalink / raw)
  To: Baolu Lu, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, Tian, Kevin, robin.murphy@arm.com
  Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
	kvm@vger.kernel.org, mjrosato@linux.ibm.com,
	chao.p.peng@linux.intel.com, yi.y.sun@linux.intel.com,
	peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong



> -----Original Message-----
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Friday, May 12, 2023 1:39 PM
> To: Liu, Yi L <yi.l.liu@intel.com>; joro@8bytes.org; alex.williamson@redhat.com;
> jgg@nvidia.com; Tian, Kevin <kevin.tian@intel.com>; robin.murphy@arm.com
> Cc: baolu.lu@linux.intel.com; cohuck@redhat.com; eric.auger@redhat.com;
> nicolinc@nvidia.com; kvm@vger.kernel.org; mjrosato@linux.ibm.com;
> chao.p.peng@linux.intel.com; yi.y.sun@linux.intel.com; peterx@redhat.com;
> jasowang@redhat.com; shameerali.kolothum.thodi@huawei.com; lulu@redhat.com;
> suravee.suthikulpanit@amd.com; iommu@lists.linux.dev; linux-kernel@vger.kernel.org;
> linux-kselftest@vger.kernel.org; Duan, Zhenzhong <zhenzhong.duan@intel.com>
> Subject: Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
> 
> On 5/11/23 10:30 PM, Yi Liu wrote:
> > Under nested IOMMU translation, userspace owns the stage-1 translation
> > table (e.g. the stage-1 page table of Intel VT-d or the context table
> > of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
> > and needs to be compatiable with the underlying IOMMU hardware. Hence,
> > userspace should know the IOMMU hardware capability before creating and
> > configuring the stage-1 translation table to kernel.
> >
> > This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware
> information
> > for a given device. The returned data is vendor specific, userspace needs
> > to decode it with the structure mapped by the @out_data_type field.
> >
> > As only physical devices have IOMMU hardware, so this will return error
> > if the given device is not a physical device.
> >
> > Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> > Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> > ---
> >   drivers/iommu/iommufd/device.c          | 72 +++++++++++++++++++++++++
> >   drivers/iommu/iommufd/iommufd_private.h |  1 +
> >   drivers/iommu/iommufd/main.c            |  3 ++
> >   include/uapi/linux/iommufd.h            | 37 +++++++++++++
> >   4 files changed, 113 insertions(+)
> >
> > diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> > index 051bd8e99858..bc99d092de8f 100644
> > --- a/drivers/iommu/iommufd/device.c
> > +++ b/drivers/iommu/iommufd/device.c
> > @@ -263,6 +263,78 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
> >   }
> >   EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD);
> >
> > +static int iommufd_zero_fill_user(u64 ptr, int bytes)
> > +{
> > +	int index = 0;
> > +
> > +	for (; index < bytes; index++) {
> > +		if (put_user(0, (uint8_t __user *)u64_to_user_ptr(ptr + index)))
> > +			return -EFAULT;
> > +	}
> > +	return 0;
> > +}
> > +
> > +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_hw_info *cmd = ucmd->cmd;
> > +	unsigned int length = 0, data_len;
> > +	struct iommufd_device *idev;
> > +	const struct iommu_ops *ops;
> > +	void *data = NULL;
> > +	int rc = 0;
> > +
> > +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> > +		return -EOPNOTSUPP;
> > +
> > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > +	if (IS_ERR(idev))
> > +		return PTR_ERR(idev);
> > +
> > +	ops = dev_iommu_ops(idev->dev);
> > +	if (!ops->hw_info)
> > +		goto done;
> 
> If the iommu driver doesn't provide a hw_info callback, it still
> returns success?

Yes, as noted in the cover letter. It's for a remark from Jason. In such
case, the out_data_type is NULL, it means no specific data is filled
in the buffer pointed by cmd->data_ptr.

- Let IOMMU_DEVICE_GET_HW_INFO succeed even the underlying iommu driver
   does not have driver-specific data to report per below remark.
   https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/

Regards,
Yi Liu

> > +
> > +	/* driver has hw_info callback should have a unique hw_info_type */
> > +	if (ops->hw_info_type == IOMMU_HW_INFO_TYPE_NONE) {
> > +		pr_warn_ratelimited("iommu driver set an invalid type\n");
> > +		rc = -ENODEV;
> > +		goto out_err;
> > +	}
> > +
> > +	data = ops->hw_info(idev->dev, &data_len);
> > +	if (IS_ERR(data)) {
> > +		rc = PTR_ERR(data);
> > +		goto out_err;
> > +	}
> > +
> > +	length = min(cmd->data_len, data_len);
> > +	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
> > +		rc = -EFAULT;
> > +		goto out_err;
> > +	}
> > +
> > +	/*
> > +	 * Zero the trailing bytes if the user buffer is bigger than the
> > +	 * data size kernel actually has.
> > +	 */
> > +	if (length < cmd->data_len) {
> > +		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
> > +					    cmd->data_len - length);
> > +		if (rc)
> > +			goto out_err;
> > +	}
> > +
> > +done:
> > +	cmd->data_len = length;
> > +	cmd->out_data_type = ops->hw_info_type;
> > +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> > +
> > +out_err:
> > +	kfree(data);
> > +	iommufd_put_object(&idev->obj);
> > +	return rc;
> > +}
> > +
> >   static int iommufd_group_setup_msi(struct iommufd_group *igroup,
> >   				   struct iommufd_hw_pagetable *hwpt)
> >   {
> 
> Best regards,
> baolu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-15  6:14     ` Liu, Yi L
@ 2023-05-16  1:49       ` Baolu Lu
  0 siblings, 0 replies; 13+ messages in thread
From: Baolu Lu @ 2023-05-16  1:49 UTC (permalink / raw)
  To: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, Tian, Kevin, robin.murphy@arm.com
  Cc: baolu.lu, cohuck@redhat.com, eric.auger@redhat.com,
	nicolinc@nvidia.com, kvm@vger.kernel.org, mjrosato@linux.ibm.com,
	chao.p.peng@linux.intel.com, yi.y.sun@linux.intel.com,
	peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

On 5/15/23 2:14 PM, Liu, Yi L wrote:
>> -----Original Message-----
>> From: Baolu Lu<baolu.lu@linux.intel.com>
>> Sent: Friday, May 12, 2023 1:39 PM
>> To: Liu, Yi L<yi.l.liu@intel.com>;joro@8bytes.org;alex.williamson@redhat.com;
>> jgg@nvidia.com; Tian, Kevin<kevin.tian@intel.com>;robin.murphy@arm.com
>> Cc:baolu.lu@linux.intel.com;cohuck@redhat.com;eric.auger@redhat.com;
>> nicolinc@nvidia.com;kvm@vger.kernel.org;mjrosato@linux.ibm.com;
>> chao.p.peng@linux.intel.com;yi.y.sun@linux.intel.com;peterx@redhat.com;
>> jasowang@redhat.com;shameerali.kolothum.thodi@huawei.com;lulu@redhat.com;
>> suravee.suthikulpanit@amd.com;iommu@lists.linux.dev;linux-kernel@vger.kernel.org;
>> linux-kselftest@vger.kernel.org; Duan, Zhenzhong<zhenzhong.duan@intel.com>
>> Subject: Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
>>
>> On 5/11/23 10:30 PM, Yi Liu wrote:
>>> Under nested IOMMU translation, userspace owns the stage-1 translation
>>> table (e.g. the stage-1 page table of Intel VT-d or the context table
>>> of ARM SMMUv3, and etc.). Stage-1 translation tables are vendor specific,
>>> and needs to be compatiable with the underlying IOMMU hardware. Hence,
>>> userspace should know the IOMMU hardware capability before creating and
>>> configuring the stage-1 translation table to kernel.
>>>
>>> This adds IOMMU_DEVICE_GET_HW_INFO to query the IOMMU hardware
>> information
>>> for a given device. The returned data is vendor specific, userspace needs
>>> to decode it with the structure mapped by the @out_data_type field.
>>>
>>> As only physical devices have IOMMU hardware, so this will return error
>>> if the given device is not a physical device.
>>>
>>> Co-developed-by: Nicolin Chen<nicolinc@nvidia.com>
>>> Signed-off-by: Nicolin Chen<nicolinc@nvidia.com>
>>> Signed-off-by: Yi Liu<yi.l.liu@intel.com>
>>> ---
>>>    drivers/iommu/iommufd/device.c          | 72 +++++++++++++++++++++++++
>>>    drivers/iommu/iommufd/iommufd_private.h |  1 +
>>>    drivers/iommu/iommufd/main.c            |  3 ++
>>>    include/uapi/linux/iommufd.h            | 37 +++++++++++++
>>>    4 files changed, 113 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
>>> index 051bd8e99858..bc99d092de8f 100644
>>> --- a/drivers/iommu/iommufd/device.c
>>> +++ b/drivers/iommu/iommufd/device.c
>>> @@ -263,6 +263,78 @@ u32 iommufd_device_to_id(struct iommufd_device *idev)
>>>    }
>>>    EXPORT_SYMBOL_NS_GPL(iommufd_device_to_id, IOMMUFD);
>>>
>>> +static int iommufd_zero_fill_user(u64 ptr, int bytes)
>>> +{
>>> +	int index = 0;
>>> +
>>> +	for (; index < bytes; index++) {
>>> +		if (put_user(0, (uint8_t __user *)u64_to_user_ptr(ptr + index)))
>>> +			return -EFAULT;
>>> +	}
>>> +	return 0;
>>> +}
>>> +
>>> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
>>> +{
>>> +	struct iommu_hw_info *cmd = ucmd->cmd;
>>> +	unsigned int length = 0, data_len;
>>> +	struct iommufd_device *idev;
>>> +	const struct iommu_ops *ops;
>>> +	void *data = NULL;
>>> +	int rc = 0;
>>> +
>>> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
>>> +	if (IS_ERR(idev))
>>> +		return PTR_ERR(idev);
>>> +
>>> +	ops = dev_iommu_ops(idev->dev);
>>> +	if (!ops->hw_info)
>>> +		goto done;
>> If the iommu driver doesn't provide a hw_info callback, it still
>> returns success?
> Yes, as noted in the cover letter. It's for a remark from Jason. In such
> case, the out_data_type is NULL, it means no specific data is filled
> in the buffer pointed by cmd->data_ptr.
> 
> - Let IOMMU_DEVICE_GET_HW_INFO succeed even the underlying iommu driver
>     does not have driver-specific data to report per below remark.
>     https://lore.kernel.org/kvm/ZAcwJSK%2F9UVI9LXu@nvidia.com/

Oh, I overlooked that. Thanks for the explanation. It's fair enough.

Best regards,
baolu

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-11 14:30 ` [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
  2023-05-12  5:38   ` Baolu Lu
@ 2023-05-19  8:42   ` Tian, Kevin
  2023-05-19 18:30     ` Nicolin Chen
       [not found]   ` <BL1PR11MB527177A3860E10818E9761938C7C9@BL1PR11MB5271.namprd11.prod.outlook.com>
  2 siblings, 1 reply; 13+ messages in thread
From: Tian, Kevin @ 2023-05-19  8:42 UTC (permalink / raw)
  To: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, robin.murphy@arm.com, baolu.lu@linux.intel.com
  Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
	kvm@vger.kernel.org, mjrosato@linux.ibm.com,
	chao.p.peng@linux.intel.com, yi.y.sun@linux.intel.com,
	peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, May 11, 2023 10:30 PM
> +
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hw_info *cmd = ucmd->cmd;
> +	unsigned int length = 0, data_len;
> +	struct iommufd_device *idev;
> +	const struct iommu_ops *ops;
> +	void *data = NULL;
> +	int rc = 0;
> +
> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ops = dev_iommu_ops(idev->dev);
> +	if (!ops->hw_info)
> +		goto done;
> +
> +	/* driver has hw_info callback should have a unique hw_info_type */
> +	if (ops->hw_info_type == IOMMU_HW_INFO_TYPE_NONE) {
> +		pr_warn_ratelimited("iommu driver set an invalid type\n");
> +		rc = -ENODEV;
> +		goto out_err;
> +	}

this should be a WARN_ON_ONCE() since it's a driver bug.

> +
> +	data = ops->hw_info(idev->dev, &data_len);
> +	if (IS_ERR(data)) {
> +		rc = PTR_ERR(data);
> +		goto out_err;
> +	}
> +
> +	length = min(cmd->data_len, data_len);
> +	if (copy_to_user(u64_to_user_ptr(cmd->data_ptr), data, length)) {
> +		rc = -EFAULT;
> +		goto out_err;
> +	}
> +
> +	/*
> +	 * Zero the trailing bytes if the user buffer is bigger than the
> +	 * data size kernel actually has.
> +	 */
> +	if (length < cmd->data_len) {
> +		rc = iommufd_zero_fill_user(cmd->data_ptr + length,
> +					    cmd->data_len - length);
> +		if (rc)
> +			goto out_err;
> +	}
> +
> +done:
> +	cmd->data_len = length;
> +	cmd->out_data_type = ops->hw_info_type;
> +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));

the 'done' label should be moved before above zero_fill. Otherwise
in !ops->hw_info case the user buffer is not cleared.

>  union ucmd_buffer {
>  	struct iommu_destroy destroy;
>  	struct iommu_hwpt_alloc hwpt;
> +	struct iommu_hw_info info;

follow alphabetic order this should be ahead of hwpt.

> @@ -302,6 +303,8 @@ static const struct iommufd_ioctl_op
> iommufd_ioctl_ops[] = {
>  	IOCTL_OP(IOMMU_DESTROY, iommufd_destroy, struct
> iommu_destroy, id),
>  	IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct
> iommu_hwpt_alloc,
>  		 __reserved),
> +	IOCTL_OP(IOMMU_DEVICE_GET_HW_INFO,
> iommufd_device_get_hw_info,
> +		 struct iommu_hw_info, __reserved),

before IOMMU_HWPT_ALLOC

> +
> +/**
> + * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
> + * @size: sizeof(struct iommu_hw_info)
> + * @flags: Must be 0
> + * @dev_id: The device bound to the iommufd
> + * @data_len: Input the length of the user buffer in bytes. Output the
> + *            length of data filled in the user buffer.
> + * @data_ptr: Pointer to the user buffer
> + * @out_data_type: Output the iommu hardware info type as defined by
> + *                 enum iommu_hw_info_type.
> + * @__reserved: Must be 0
> + *
> + * Query the hardware iommu information for given device which has been
> + * bound to iommufd. @data_len is the size of the buffer which captures
> + * iommu type specific data and the data will be filled. Trailing bytes
> + * are zeroed if the user buffer is larger than the data kernel has.
> + *
> + * The type specific data would be used to sync capability between the
> + * virtual IOMMU and the hardware IOMMU. e.g. nested translation
> requires
> + * to check the hardware IOMMU capability so guest stage-1 page table
> + * uses a format compatible to the hardware IOMMU.
> + *
> + * The @out_data_type will be filled if the ioctl succeeds. It would
> + * be used to decode the data filled in the buffer pointed by @data_ptr.
> + */
> +struct iommu_hw_info {
> +	__u32 size;
> +	__u32 flags;
> +	__u32 dev_id;
> +	__u32 data_len;
> +	__aligned_u64 data_ptr;
> +	__u32 out_data_type;
> +	__u32 __reserved;

it's unusual to have reserved field in the end. It makes more sense
to move data_ptr to the end to make it meaningful.

> +};
> +#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE,
> IOMMUFD_CMD_DEVICE_GET_HW_INFO)
>  #endif

Here we have a naming confusion.

'IOMMU' is the prefix of iommufd ioctls.

'DEVICE' is the subjective.

Then "GET_HW_INFO" implies getting hardware info related to
this device. then it should not be restricted to the iommu info.

with that it's clearer to call it IOMMU_DEVICE_GET_IOMMU_INFO.

similarly for struct iommu_hw_info.

'iommu' is the prefix for all iommufd ioctl structures.

then 'hw_info' is too broard.

iommu_device_iommu_info reads better? though having two
iommu's in the name is a little bit annoying...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
       [not found]   ` <BL1PR11MB527177A3860E10818E9761938C7C9@BL1PR11MB5271.namprd11.prod.outlook.com>
@ 2023-05-19  9:15     ` Tian, Kevin
  0 siblings, 0 replies; 13+ messages in thread
From: Tian, Kevin @ 2023-05-19  9:15 UTC (permalink / raw)
  To: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, robin.murphy@arm.com, baolu.lu@linux.intel.com
  Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
	kvm@vger.kernel.org, mjrosato@linux.ibm.com,
	chao.p.peng@linux.intel.com, yi.y.sun@linux.intel.com,
	peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

> From: Tian, Kevin
> Sent: Friday, May 19, 2023 4:42 PM
> > +struct iommu_hw_info {
> > +	__u32 size;
> > +	__u32 flags;
> > +	__u32 dev_id;
> > +	__u32 data_len;
> > +	__aligned_u64 data_ptr;
> > +	__u32 out_data_type;
> > +	__u32 __reserved;
> 
> it's unusual to have reserved field in the end. It makes more sense
> to move data_ptr to the end to make it meaningful.
> 

Please ignore this comment. typed too fast...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-19  8:42   ` Tian, Kevin
@ 2023-05-19 18:30     ` Nicolin Chen
  2023-05-24  5:00       ` Tian, Kevin
  0 siblings, 1 reply; 13+ messages in thread
From: Nicolin Chen @ 2023-05-19 18:30 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, robin.murphy@arm.com, baolu.lu@linux.intel.com,
	cohuck@redhat.com, eric.auger@redhat.com, kvm@vger.kernel.org,
	mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
	yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

Hi Kevin,

On Fri, May 19, 2023 at 08:42:07AM +0000, Tian, Kevin wrote:

> > +};
> > +#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE,
> > IOMMUFD_CMD_DEVICE_GET_HW_INFO)
> >  #endif
> 
> Here we have a naming confusion.
> 
> 'IOMMU' is the prefix of iommufd ioctls.
> 
> 'DEVICE' is the subjective.
> 
> Then "GET_HW_INFO" implies getting hardware info related to
> this device. then it should not be restricted to the iommu info.
> 
> with that it's clearer to call it IOMMU_DEVICE_GET_IOMMU_INFO.

Though the entire ioctl is tied to the input "dev_id", I think
it isn't really about the device corresponding to the dev_id,
similar to the IOMMU_HWPT_ALLOC having a dev_id input too. So,
I think the "IOMMU_DEVICE" here should be interpreted simply
as "an iommu device". We could also highlight this somewhere
in the header.

With that being said, IOMMU_DEVICE_SET/UNSET_DATA should be
renamed to IOMMU_DEVICE_SET/UNSET_DEV_DATA -- "DEVICE" is the
iommu device while the "DEV_DATA" is a given device that's
behind the iommu.

> similarly for struct iommu_hw_info.
> 
> 'iommu' is the prefix for all iommufd ioctl structures.
> 
> then 'hw_info' is too broard.
> 
> iommu_device_iommu_info reads better? though having two
> iommu's in the name is a little bit annoying...

How about:
IOMMU_DEVICE_GET_FEATURES
struct iommu_device_features
?

Thanks
Nic

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-19 18:30     ` Nicolin Chen
@ 2023-05-24  5:00       ` Tian, Kevin
  2023-05-24  5:19         ` Nicolin Chen
  0 siblings, 1 reply; 13+ messages in thread
From: Tian, Kevin @ 2023-05-24  5:00 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, robin.murphy@arm.com, baolu.lu@linux.intel.com,
	cohuck@redhat.com, eric.auger@redhat.com, kvm@vger.kernel.org,
	mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
	yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, May 20, 2023 2:30 AM
> 
> Hi Kevin,
> 
> On Fri, May 19, 2023 at 08:42:07AM +0000, Tian, Kevin wrote:
> 
> > > +};
> > > +#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE,
> > > IOMMUFD_CMD_DEVICE_GET_HW_INFO)
> > >  #endif
> >
> > Here we have a naming confusion.
> >
> > 'IOMMU' is the prefix of iommufd ioctls.
> >
> > 'DEVICE' is the subjective.
> >
> > Then "GET_HW_INFO" implies getting hardware info related to
> > this device. then it should not be restricted to the iommu info.
> >
> > with that it's clearer to call it IOMMU_DEVICE_GET_IOMMU_INFO.
> 
> Though the entire ioctl is tied to the input "dev_id", I think
> it isn't really about the device corresponding to the dev_id,
> similar to the IOMMU_HWPT_ALLOC having a dev_id input too. So,
> I think the "IOMMU_DEVICE" here should be interpreted simply
> as "an iommu device". We could also highlight this somewhere
> in the header.

yes this is a good view of it. with that it's not necessary to have
a 'DEVICE' notation in the name which looks confusing with dev_id.

Just IOMMU_GET_HW_INFO for the iommu behind the specified dev_id.

then keep the structure name as iommu_hw_info.

> 
> With that being said, IOMMU_DEVICE_SET/UNSET_DATA should be
> renamed to IOMMU_DEVICE_SET/UNSET_DEV_DATA -- "DEVICE" is the
> iommu device while the "DEV_DATA" is a given device that's
> behind the iommu.

this then becomes IOMMU_SET/UNSET_DEV_DATA.

> 
> > similarly for struct iommu_hw_info.
> >
> > 'iommu' is the prefix for all iommufd ioctl structures.
> >
> > then 'hw_info' is too broard.
> >
> > iommu_device_iommu_info reads better? though having two
> > iommu's in the name is a little bit annoying...
> 
> How about:
> IOMMU_DEVICE_GET_FEATURES
> struct iommu_device_features
> ?
> 
> Thanks
> Nic

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO
  2023-05-24  5:00       ` Tian, Kevin
@ 2023-05-24  5:19         ` Nicolin Chen
  0 siblings, 0 replies; 13+ messages in thread
From: Nicolin Chen @ 2023-05-24  5:19 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Liu, Yi L, joro@8bytes.org, alex.williamson@redhat.com,
	jgg@nvidia.com, robin.murphy@arm.com, baolu.lu@linux.intel.com,
	cohuck@redhat.com, eric.auger@redhat.com, kvm@vger.kernel.org,
	mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
	yi.y.sun@linux.intel.com, peterx@redhat.com, jasowang@redhat.com,
	shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
	suravee.suthikulpanit@amd.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Duan, Zhenzhong

On Wed, May 24, 2023 at 05:00:40AM +0000, Tian, Kevin wrote:
> > > > +};
> > > > +#define IOMMU_DEVICE_GET_HW_INFO _IO(IOMMUFD_TYPE,
> > > > IOMMUFD_CMD_DEVICE_GET_HW_INFO)
> > > >  #endif
> > >
> > > Here we have a naming confusion.
> > >
> > > 'IOMMU' is the prefix of iommufd ioctls.
> > >
> > > 'DEVICE' is the subjective.
> > >
> > > Then "GET_HW_INFO" implies getting hardware info related to
> > > this device. then it should not be restricted to the iommu info.
> > >
> > > with that it's clearer to call it IOMMU_DEVICE_GET_IOMMU_INFO.
> >
> > Though the entire ioctl is tied to the input "dev_id", I think
> > it isn't really about the device corresponding to the dev_id,
> > similar to the IOMMU_HWPT_ALLOC having a dev_id input too. So,
> > I think the "IOMMU_DEVICE" here should be interpreted simply
> > as "an iommu device". We could also highlight this somewhere
> > in the header.
> 
> yes this is a good view of it. with that it's not necessary to have
> a 'DEVICE' notation in the name which looks confusing with dev_id.
> 
> Just IOMMU_GET_HW_INFO for the iommu behind the specified dev_id.
> 
> then keep the structure name as iommu_hw_info.

That'd be neat.

> > With that being said, IOMMU_DEVICE_SET/UNSET_DATA should be
> > renamed to IOMMU_DEVICE_SET/UNSET_DEV_DATA -- "DEVICE" is the
> > iommu device while the "DEV_DATA" is a given device that's
> > behind the iommu.
> 
> this then becomes IOMMU_SET/UNSET_DEV_DATA.

Ack.

Thanks
Nic

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-05-24  5:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-11 14:30 [PATCH v3 0/4] iommufd: Add iommu hardware info reporting Yi Liu
2023-05-11 14:30 ` [PATCH v3 1/4] iommu: Move dev_iommu_ops() to private header Yi Liu
2023-05-11 14:30 ` [PATCH v3 2/4] iommu: Add new iommu op to get iommu hardware information Yi Liu
2023-05-11 14:30 ` [PATCH v3 3/4] iommufd: Add IOMMU_DEVICE_GET_HW_INFO Yi Liu
2023-05-12  5:38   ` Baolu Lu
2023-05-15  6:14     ` Liu, Yi L
2023-05-16  1:49       ` Baolu Lu
2023-05-19  8:42   ` Tian, Kevin
2023-05-19 18:30     ` Nicolin Chen
2023-05-24  5:00       ` Tian, Kevin
2023-05-24  5:19         ` Nicolin Chen
     [not found]   ` <BL1PR11MB527177A3860E10818E9761938C7C9@BL1PR11MB5271.namprd11.prod.outlook.com>
2023-05-19  9:15     ` Tian, Kevin
2023-05-11 14:30 ` [PATCH v3 4/4] iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl Yi Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).