* [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
@ 2024-10-22 0:19 Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Nicolin Chen
` (11 more replies)
0 siblings, 12 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
This series introduces a new vIOMMU infrastructure and related ioctls.
IOMMUFD has been using the HWPT infrastructure for all cases, including
nested IO page table support. Yet, there are limitations for an HWPT-based
structure to support some advanced HW-accelerated features, such as CMDQV
on NVIDIA Grace and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU
environment, it is not straightforward for nested HWPTs to share the same
parent HWPT (stage-2 IO pagetable) with the HWPT infrastructure alone: a
parent HWPT typically holds one stage-2 IO pagetable and tags it with only
one ID in the cache entries. When sharing one large stage-2 IO pagetable
across physical IOMMU instances, that one ID may not always be available
across all the IOMMU instances. In other words, it is ideal for SW to have
a different container for the stage-2 IO pagetable so it can hold another
ID that is available.
For this "different container", add vIOMMU, an additional layer to hold
extra virtualization information:
_______________________________________________________________________
| iommufd (with vIOMMU) |
| |
| [5] |
| _____________ |
| | | |
| |----------------| vIOMMU | |
| | | | |
| | | | |
| | [1] | | [4] [2] |
| | ______ | | _____________ ________ |
| | | | | [3] | | | | | |
| | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
| | |______| |_____________| |_____________| |________| |
| | | | | | |
|______|________|______________|__________________|_______________|_____|
| | | | |
______v_____ | ______v_____ ______v_____ ___v__
| struct | | PFN | (paging) | | (nested) | |struct|
|iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device|
|____________| storage|____________| |____________| |______|
The vIOMMU object should be seen as a slice of a physical IOMMU instance
that is passed to or shared with a VM. That slice can consist of some HW/SW
resources:
- Security namespace for guest-owned IDs, e.g. guest-controlled cache tags
- Access to a sharable nesting parent pagetable across physical IOMMUs
- Virtualization of various platform IDs, e.g. RIDs and others
- Delivery of paravirtualized invalidation
- Direct assigned invalidation queues
- Direct assigned interrupts
- Non-affiliated event reporting
On a multi-IOMMU system, one vIOMMU object must be instantiated per
physical IOMMU that is passed to (via devices) a guest VM, while all of
them can hold the same shareable parent HWPT. Each vIOMMU then just needs
to allocate its own individual ID to tag its own cache:
----------------------------
---------------- | | paging_hwpt0 |
| hwpt_nested0 |--->| viommu0 ------------------
---------------- | | IDx |
----------------------------
----------------------------
---------------- | | paging_hwpt0 |
| hwpt_nested1 |--->| viommu1 ------------------
---------------- | | IDy |
----------------------------
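For a concrete picture, below is a hedged userspace sketch of this flow
using the IOMMU_VIOMMU_ALLOC ioctl added by this series. The iommufd FD,
device IDs and the nesting parent HWPT ID are placeholders obtained from
earlier bind/alloc steps, and the ARM SMMUv3 type is just one example:

	struct iommu_viommu_alloc cmd = {
		.size = sizeof(cmd),
		.type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3, /* driver-defined type */
		.dev_id = dev_id0,       /* device behind physical IOMMU #0 */
		.hwpt_id = paging_hwpt0, /* shared nesting parent HWPT */
	};
	uint32_t viommu0, viommu1;

	if (!ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &cmd))
		viommu0 = cmd.out_viommu_id;

	cmd.dev_id = dev_id1;            /* device behind physical IOMMU #1 */
	if (!ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &cmd))
		viommu1 = cmd.out_viommu_id;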
As an initial part-1, add an IOMMUFD_CMD_VIOMMU_ALLOC ioctl for the
allocation only, and implement it in the arm-smmu-v3 driver as a real
world use case.
More vIOMMU-based structs and ioctls will be introduced in the follow-up
series to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. The vIOMMU
object is repurposed from an earlier RFC, just for a reference:
https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
This series is on Github:
https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v4
(a pairing QEMU branch for testing will be provided with the part-2 series)
Changelog
v4
* Added "Reviewed-by" from Jason
* Dropped IOMMU_VIOMMU_TYPE_DEFAULT support
* Dropped iommufd_object_alloc_elm renamings
* Renamed iommufd's viommu_api.c to driver.c
* Reworked iommufd_viommu_alloc helper
* Added a separate iommufd_hwpt_nested_alloc_for_viommu function for
hwpt_nested allocations on a vIOMMU, and added comparison between
viommu->iommu_dev->ops and dev_iommu_ops(idev->dev)
* Replaced s2_parent with vsmmu in arm_smmu_nested_domain
* Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in
viommu_ops
* Replaced wait_queue_head_t with a completion, to delay the unplug of
mock_iommu_dev
* Corrected documentation graph that was missing struct iommu_device
* Added an iommufd_verify_unfinalized_object helper to verify driver-
allocated vIOMMU/vDEVICE objects
* Added missing test cases for TEST_LENGTH and fail_nth
v3
https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/
* Rebased on top of Jason's nesting v3 series
https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvidia.com/
* Split the series into smaller parts
* Added Jason's Reviewed-by
* Added back viommu->iommu_dev
* Added support for driver-allocated vIOMMU v.s. core-allocated
* Dropped arm_smmu_cache_invalidate_user
* Added an iommufd_test_wait_for_users() in selftest
* Reworked test code to make viommu an individual FIXTURE
* Added missing TEST_LENGTH case for the new ioctl command
v2
https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
* Limited vdev_id to one per idev
* Added a rw_sem to protect the vdev_id list
* Reworked driver-level APIs with proper lockings
* Added a new viommu_api file for IOMMUFD_DRIVER config
* Dropped useless iommu_dev point from the viommu structure
* Added missing index numbers to new types in the uAPI header
* Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one
* Reworked mock_viommu_cache_invalidate() using the new iommu helper
* Reordered details of set/unset_vdev_id handlers for proper lockings
v1
https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
Thanks!
Nicolin
Nicolin Chen (11):
iommufd: Move struct iommufd_object to public iommufd header
iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
iommufd: Add iommufd_verify_unfinalized_object
iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
iommufd/selftest: Add refcount to mock_iommu_device
iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
Documentation: userspace-api: iommufd: Update vIOMMU
iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
drivers/iommu/iommufd/Makefile | 5 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++---
drivers/iommu/iommufd/iommufd_private.h | 36 ++------
drivers/iommu/iommufd/iommufd_test.h | 2 +
include/linux/iommu.h | 14 +++
include/linux/iommufd.h | 89 +++++++++++++++++++
include/uapi/linux/iommufd.h | 56 ++++++++++--
tools/testing/selftests/iommu/iommufd_utils.h | 28 ++++++
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 79 ++++++++++------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +-
drivers/iommu/iommufd/driver.c | 38 ++++++++
drivers/iommu/iommufd/hw_pagetable.c | 69 +++++++++++++-
drivers/iommu/iommufd/main.c | 58 ++++++------
drivers/iommu/iommufd/selftest.c | 73 +++++++++++++--
drivers/iommu/iommufd/viommu.c | 85 ++++++++++++++++++
tools/testing/selftests/iommu/iommufd.c | 78 ++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 11 +++
Documentation/userspace-api/iommufd.rst | 69 +++++++++++++-
18 files changed, 701 insertions(+), 124 deletions(-)
create mode 100644 drivers/iommu/iommufd/driver.c
create mode 100644 drivers/iommu/iommufd/viommu.c
--
2.43.0
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 8:34 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Nicolin Chen
` (10 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Prepare for an embedded structure design for driver-level iommufd_viommu
objects:
// include/linux/iommufd.h
struct iommufd_viommu {
struct iommufd_object obj;
....
};
// Some IOMMU driver
struct iommu_driver_viommu {
struct iommufd_viommu core;
....
};
This requires exposing struct iommufd_object and enum iommufd_object_type
from the core-level private header in the public iommufd header.
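With such an embedding, a driver can always recover its own structure from
the core object via container_of(); a minimal illustrative sketch (names
are hypothetical, not code from this series):

	static inline struct iommu_driver_viommu *
	to_driver_viommu(struct iommufd_viommu *viommu)
	{
		/* Map from the embedded core struct back to the driver struct */
		return container_of(viommu, struct iommu_driver_viommu, core);
	}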
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/iommufd_private.h | 25 +------------------------
include/linux/iommufd.h | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index f1d865e6fab6..1bb8c0aaecd1 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -5,8 +5,8 @@
#define __IOMMUFD_PRIVATE_H
#include <linux/iommu.h>
+#include <linux/iommufd.h>
#include <linux/iova_bitmap.h>
-#include <linux/refcount.h>
#include <linux/rwsem.h>
#include <linux/uaccess.h>
#include <linux/xarray.h>
@@ -122,29 +122,6 @@ static inline int iommufd_ucmd_respond(struct iommufd_ucmd *ucmd,
return 0;
}
-enum iommufd_object_type {
- IOMMUFD_OBJ_NONE,
- IOMMUFD_OBJ_ANY = IOMMUFD_OBJ_NONE,
- IOMMUFD_OBJ_DEVICE,
- IOMMUFD_OBJ_HWPT_PAGING,
- IOMMUFD_OBJ_HWPT_NESTED,
- IOMMUFD_OBJ_IOAS,
- IOMMUFD_OBJ_ACCESS,
- IOMMUFD_OBJ_FAULT,
-#ifdef CONFIG_IOMMUFD_TEST
- IOMMUFD_OBJ_SELFTEST,
-#endif
- IOMMUFD_OBJ_MAX,
-};
-
-/* Base struct for all objects with a userspace ID handle. */
-struct iommufd_object {
- refcount_t shortterm_users;
- refcount_t users;
- enum iommufd_object_type type;
- unsigned int id;
-};
-
static inline bool iommufd_lock_obj(struct iommufd_object *obj)
{
if (!refcount_inc_not_zero(&obj->users))
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 30f832a60ccb..22948dd03d67 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -8,6 +8,7 @@
#include <linux/err.h>
#include <linux/errno.h>
+#include <linux/refcount.h>
#include <linux/types.h>
struct device;
@@ -18,6 +19,29 @@ struct iommufd_ctx;
struct iommufd_device;
struct page;
+enum iommufd_object_type {
+ IOMMUFD_OBJ_NONE,
+ IOMMUFD_OBJ_ANY = IOMMUFD_OBJ_NONE,
+ IOMMUFD_OBJ_DEVICE,
+ IOMMUFD_OBJ_HWPT_PAGING,
+ IOMMUFD_OBJ_HWPT_NESTED,
+ IOMMUFD_OBJ_IOAS,
+ IOMMUFD_OBJ_ACCESS,
+ IOMMUFD_OBJ_FAULT,
+#ifdef CONFIG_IOMMUFD_TEST
+ IOMMUFD_OBJ_SELFTEST,
+#endif
+ IOMMUFD_OBJ_MAX,
+};
+
+/* Base struct for all objects with a userspace ID handle. */
+struct iommufd_object {
+ refcount_t shortterm_users;
+ refcount_t users;
+ enum iommufd_object_type type;
+ unsigned int id;
+};
+
struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
struct device *dev, u32 *id);
void iommufd_device_unbind(struct iommufd_device *idev);
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-22 2:28 ` Baolu Lu
2024-10-22 0:19 ` [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object Nicolin Chen
` (9 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Add a new IOMMUFD_OBJ_VIOMMU with an iommufd_viommu structure to represent
a slice of a physical IOMMU device passed to or shared with a user space VM.
This slice, now a vIOMMU object, is a group of virtualization resources of
a physical IOMMU, such as:
- Security namespace for guest-owned IDs, e.g. guest-controlled cache tags
- Access to a sharable nesting parent pagetable across physical IOMMUs
- Virtualization of various platform IDs, e.g. RIDs and others
- Delivery of paravirtualized invalidation
- Direct assigned invalidation queues
- Direct assigned interrupts
- Non-affiliated event reporting
Add a new viommu_alloc op in iommu_ops, for drivers to allocate their own
vIOMMU structures. This allocation also needs a free(), so add a struct
iommufd_viommu_ops.
To simplify a vIOMMU allocation, provide an iommufd_viommu_alloc() helper.
It is suggested that a driver embed a core-level viommu structure in its
driver-level viommu struct and call the iommufd_viommu_alloc() helper,
while also implementing its own viommu ops:
struct my_driver_viommu {
struct iommufd_viommu core;
/* driver-owned properties/features */
....
};
static const struct iommufd_viommu_ops my_driver_viommu_ops = {
.free = my_driver_viommu_free,
/* future ops for virtualization features */
....
};
static struct iommufd_viommu *my_driver_viommu_alloc(...)
{
struct my_driver_viommu *my_viommu =
iommufd_viommu_alloc(ictx, my_driver_viommu, core,
&my_driver_viommu_ops);
/* Init my_viommu and related HW feature */
....
return &my_viommu->core;
}
static const struct iommu_ops my_driver_iommu_ops = {
....
.viommu_alloc = my_driver_viommu_alloc,
};
To make the kernel config work between a driver and the iommufd core, move
the _iommufd_object_alloc helper into a new driver.c file that builds with
CONFIG_IOMMUFD_DRIVER.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/Makefile | 2 +-
drivers/iommu/iommufd/iommufd_private.h | 4 --
include/linux/iommu.h | 14 +++++++
include/linux/iommufd.h | 56 +++++++++++++++++++++++++
drivers/iommu/iommufd/driver.c | 38 +++++++++++++++++
drivers/iommu/iommufd/main.c | 32 --------------
6 files changed, 109 insertions(+), 37 deletions(-)
create mode 100644 drivers/iommu/iommufd/driver.c
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index cf4605962bea..435124a8e1f1 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -12,4 +12,4 @@ iommufd-y := \
iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o
obj-$(CONFIG_IOMMUFD) += iommufd.o
-obj-$(CONFIG_IOMMUFD_DRIVER) += iova_bitmap.o
+obj-$(CONFIG_IOMMUFD_DRIVER) += iova_bitmap.o driver.o
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 1bb8c0aaecd1..5bd41257f2ef 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -202,10 +202,6 @@ iommufd_object_put_and_try_destroy(struct iommufd_ctx *ictx,
iommufd_object_remove(ictx, obj, obj->id, 0);
}
-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
- size_t size,
- enum iommufd_object_type type);
-
#define __iommufd_object_alloc(ictx, ptr, type, obj) \
container_of(_iommufd_object_alloc( \
ictx, \
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 4ad9b9ec6c9b..14f24b5cd16f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -42,6 +42,8 @@ struct notifier_block;
struct iommu_sva;
struct iommu_dma_cookie;
struct iommu_fault_param;
+struct iommufd_ctx;
+struct iommufd_viommu;
#define IOMMU_FAULT_PERM_READ (1 << 0) /* read */
#define IOMMU_FAULT_PERM_WRITE (1 << 1) /* write */
@@ -542,6 +544,14 @@ static inline int __iommu_copy_struct_from_user_array(
* @remove_dev_pasid: Remove any translation configurations of a specific
* pasid, so that any DMA transactions with this pasid
* will be blocked by the hardware.
+ * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind
+ * the @dev, as the set of virtualization resources shared/passed
+ * to user space IOMMU instance. And associate it with a nesting
+ * @parent_domain. The @viommu_type must be defined in the header
+ * include/uapi/linux/iommufd.h
+ * It is suggested to call iommufd_viommu_alloc() helper for
+ * a bundled allocation of the core and the driver structures,
+ * using the given @ictx pointer.
* @pgsize_bitmap: bitmap of all possible supported page sizes
* @owner: Driver module providing these ops
* @identity_domain: An always available, always attachable identity
@@ -591,6 +601,10 @@ struct iommu_ops {
void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid,
struct iommu_domain *domain);
+ struct iommufd_viommu *(*viommu_alloc)(
+ struct device *dev, struct iommu_domain *parent_domain,
+ struct iommufd_ctx *ictx, unsigned int viommu_type);
+
const struct iommu_domain_ops *default_domain_ops;
unsigned long pgsize_bitmap;
struct module *owner;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 22948dd03d67..55054fbc793c 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -17,6 +17,7 @@ struct iommu_group;
struct iommufd_access;
struct iommufd_ctx;
struct iommufd_device;
+struct iommufd_viommu_ops;
struct page;
enum iommufd_object_type {
@@ -28,6 +29,7 @@ enum iommufd_object_type {
IOMMUFD_OBJ_IOAS,
IOMMUFD_OBJ_ACCESS,
IOMMUFD_OBJ_FAULT,
+ IOMMUFD_OBJ_VIOMMU,
#ifdef CONFIG_IOMMUFD_TEST
IOMMUFD_OBJ_SELFTEST,
#endif
@@ -78,6 +80,26 @@ void iommufd_access_detach(struct iommufd_access *access);
void iommufd_ctx_get(struct iommufd_ctx *ictx);
+struct iommufd_viommu {
+ struct iommufd_object obj;
+ struct iommufd_ctx *ictx;
+ struct iommu_device *iommu_dev;
+ struct iommufd_hwpt_paging *hwpt;
+
+ const struct iommufd_viommu_ops *ops;
+
+ unsigned int type;
+};
+
+/**
+ * struct iommufd_viommu_ops - vIOMMU specific operations
+ * @free: Free all driver-specific parts of an iommufd_viommu. The memory of the
+ * vIOMMU will be free-ed by iommufd core after calling this free op.
+ */
+struct iommufd_viommu_ops {
+ void (*free)(struct iommufd_viommu *viommu);
+};
+
#if IS_ENABLED(CONFIG_IOMMUFD)
struct iommufd_ctx *iommufd_ctx_from_file(struct file *file);
struct iommufd_ctx *iommufd_ctx_from_fd(int fd);
@@ -135,4 +157,38 @@ static inline int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx)
return -EOPNOTSUPP;
}
#endif /* CONFIG_IOMMUFD */
+
+#if IS_ENABLED(CONFIG_IOMMUFD_DRIVER)
+struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
+ size_t size,
+ enum iommufd_object_type type);
+#else /* !CONFIG_IOMMUFD_DRIVER */
+static inline struct iommufd_object *
+_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size,
+ enum iommufd_object_type type)
+{
+ return ERR_PTR(-EOPNOTSUPP);
+}
+#endif /* CONFIG_IOMMUFD_DRIVER */
+
+/*
+ * Helpers for IOMMU driver to allocate driver structures that will be freed by
+ * the iommufd core. The free op will be called prior to freeing the memory.
+ */
+#define iommufd_viommu_alloc(ictx, drv_struct, member, viommu_ops) \
+ ({ \
+ struct drv_struct *ret; \
+ \
+ static_assert( \
+ __same_type(struct iommufd_viommu, \
+ ((struct drv_struct *)NULL)->member)); \
+ static_assert(offsetof(struct drv_struct, member.obj) == 0); \
+ ret = container_of( \
+ _iommufd_object_alloc(ictx, sizeof(struct drv_struct), \
+ IOMMUFD_OBJ_VIOMMU), \
+ struct drv_struct, member.obj); \
+ if (!IS_ERR(ret)) \
+ ret->member.ops = viommu_ops; \
+ ret; \
+ })
#endif
diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c
new file mode 100644
index 000000000000..c0876d3f91c7
--- /dev/null
+++ b/drivers/iommu/iommufd/driver.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
+ */
+
+#include "iommufd_private.h"
+
+struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
+ size_t size,
+ enum iommufd_object_type type)
+{
+ struct iommufd_object *obj;
+ int rc;
+
+ obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
+ if (!obj)
+ return ERR_PTR(-ENOMEM);
+ obj->type = type;
+ /* Starts out bias'd by 1 until it is removed from the xarray */
+ refcount_set(&obj->shortterm_users, 1);
+ refcount_set(&obj->users, 1);
+
+ /*
+ * Reserve an ID in the xarray but do not publish the pointer yet since
+ * the caller hasn't initialized it yet. Once the pointer is published
+ * in the xarray and visible to other threads we can't reliably destroy
+ * it anymore, so the caller must complete all errorable operations
+ * before calling iommufd_object_finalize().
+ */
+ rc = xa_alloc(&ictx->objects, &obj->id, XA_ZERO_ENTRY, xa_limit_31b,
+ GFP_KERNEL_ACCOUNT);
+ if (rc)
+ goto out_free;
+ return obj;
+out_free:
+ kfree(obj);
+ return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_NS_GPL(_iommufd_object_alloc, IOMMUFD);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index b5f5d27ee963..92bd075108e5 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -29,38 +29,6 @@ struct iommufd_object_ops {
static const struct iommufd_object_ops iommufd_object_ops[];
static struct miscdevice vfio_misc_dev;
-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
- size_t size,
- enum iommufd_object_type type)
-{
- struct iommufd_object *obj;
- int rc;
-
- obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
- if (!obj)
- return ERR_PTR(-ENOMEM);
- obj->type = type;
- /* Starts out bias'd by 1 until it is removed from the xarray */
- refcount_set(&obj->shortterm_users, 1);
- refcount_set(&obj->users, 1);
-
- /*
- * Reserve an ID in the xarray but do not publish the pointer yet since
- * the caller hasn't initialized it yet. Once the pointer is published
- * in the xarray and visible to other threads we can't reliably destroy
- * it anymore, so the caller must complete all errorable operations
- * before calling iommufd_object_finalize().
- */
- rc = xa_alloc(&ictx->objects, &obj->id, XA_ZERO_ENTRY,
- xa_limit_31b, GFP_KERNEL_ACCOUNT);
- if (rc)
- goto out_free;
- return obj;
-out_free:
- kfree(obj);
- return ERR_PTR(rc);
-}
-
/*
* Allow concurrent access to the object.
*
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 8:49 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Nicolin Chen
` (8 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
To support driver-allocated vIOMMU objects, IOMMU drivers are suggested to
call the allocator helper. However, there is no guarantee that all drivers
will use it and allocate objects properly.
Add a helper for the iommufd core to verify that an unfinalized object is
at least reserved in the ictx.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/iommufd_private.h | 3 +++
drivers/iommu/iommufd/main.c | 20 ++++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 5bd41257f2ef..d53c1ca75532 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -152,6 +152,9 @@ static inline void iommufd_put_object(struct iommufd_ctx *ictx,
wake_up_interruptible_all(&ictx->destroy_wait);
}
+int iommufd_verify_unfinalized_object(struct iommufd_ctx *ictx,
+ struct iommufd_object *to_verify);
+
void iommufd_object_abort(struct iommufd_ctx *ictx, struct iommufd_object *obj);
void iommufd_object_abort_and_destroy(struct iommufd_ctx *ictx,
struct iommufd_object *obj);
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 92bd075108e5..e244fed1b7ab 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -89,6 +89,26 @@ struct iommufd_object *iommufd_get_object(struct iommufd_ctx *ictx, u32 id,
return obj;
}
+int iommufd_verify_unfinalized_object(struct iommufd_ctx *ictx,
+ struct iommufd_object *to_verify)
+{
+ XA_STATE(xas, &ictx->objects, 0);
+ struct iommufd_object *obj;
+ int rc = 0;
+
+ if (!to_verify || !to_verify->id)
+ return -EINVAL;
+ xas.xa_index = to_verify->id;
+
+ xa_lock(&ictx->objects);
+ obj = xas_load(&xas);
+ /* Being an unfinalized object, the loaded obj is a reserved space */
+ if (obj != XA_ZERO_ENTRY)
+ rc = -ENOENT;
+ xa_unlock(&ictx->objects);
+ return rc;
+}
+
static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx,
struct iommufd_object *to_destroy)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (2 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 8:59 ` Tian, Kevin
2024-10-25 9:05 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops Nicolin Chen
` (7 subsequent siblings)
11 siblings, 2 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Add a new ioctl for user space to do a vIOMMU allocation. It must be based
on a nesting parent HWPT, so take its refcount.
If an IOMMU driver supports a driver-managed vIOMMU object, it must define
its own IOMMU_VIOMMU_TYPE_ in the uAPI header and implement a viommu_alloc
op in its iommu_ops.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/Makefile | 3 +-
drivers/iommu/iommufd/iommufd_private.h | 3 +
include/uapi/linux/iommufd.h | 40 ++++++++++++
drivers/iommu/iommufd/main.c | 6 ++
drivers/iommu/iommufd/viommu.c | 85 +++++++++++++++++++++++++
5 files changed, 136 insertions(+), 1 deletion(-)
create mode 100644 drivers/iommu/iommufd/viommu.c
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index 435124a8e1f1..7c207c5f1eb6 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -7,7 +7,8 @@ iommufd-y := \
ioas.o \
main.o \
pages.o \
- vfio_compat.o
+ vfio_compat.o \
+ viommu.o
iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index d53c1ca75532..9adf8d616796 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -504,6 +504,9 @@ static inline int iommufd_hwpt_replace_device(struct iommufd_device *idev,
return iommu_group_replace_domain(idev->igroup->group, hwpt->domain);
}
+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd);
+void iommufd_viommu_destroy(struct iommufd_object *obj);
+
#ifdef CONFIG_IOMMUFD_TEST
int iommufd_test(struct iommufd_ucmd *ucmd);
void iommufd_selftest_destroy(struct iommufd_object *obj);
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index cd4920886ad0..d1c99285eda0 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -51,6 +51,7 @@ enum {
IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP = 0x8c,
IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d,
IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e,
+ IOMMUFD_CMD_VIOMMU_ALLOC = 0x8f,
};
/**
@@ -852,4 +853,43 @@ struct iommu_fault_alloc {
__u32 out_fault_fd;
};
#define IOMMU_FAULT_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_FAULT_QUEUE_ALLOC)
+
+/**
+ * enum iommu_viommu_type - Virtual IOMMU Type
+ * @IOMMU_VIOMMU_TYPE_DEFAULT: Reserved for future use
+ */
+enum iommu_viommu_type {
+ IOMMU_VIOMMU_TYPE_DEFAULT = 0,
+};
+
+/**
+ * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
+ * @size: sizeof(struct iommu_viommu_alloc)
+ * @flags: Must be 0
+ * @type: Type of the virtual IOMMU. Must be defined in enum iommu_viommu_type
+ * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU
+ * @hwpt_id: ID of a nesting parent HWPT to associate to
+ * @out_viommu_id: Output virtual IOMMU ID for the allocated object
+ *
+ * Allocate a virtual IOMMU object that represents the underlying physical
+ * IOMMU's virtualization support. The vIOMMU object is a security-isolated
+ * slice of the physical IOMMU HW that is unique to a specific VM. Operations
+ * global to the IOMMU are connected to the vIOMMU, such as:
+ * - Security namespace for guest owned ID, e.g. guest-controlled cache tags
+ * - Access to a sharable nesting parent pagetable across physical IOMMUs
+ * - Virtualization of various platforms IDs, e.g. RIDs and others
+ * - Delivery of paravirtualized invalidation
+ * - Direct assigned invalidation queues
+ * - Direct assigned interrupts
+ * - Non-affiliated event reporting
+ */
+struct iommu_viommu_alloc {
+ __u32 size;
+ __u32 flags;
+ __u32 type;
+ __u32 dev_id;
+ __u32 hwpt_id;
+ __u32 out_viommu_id;
+};
+#define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC)
#endif
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index e244fed1b7ab..ab5ee325d809 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -321,6 +321,7 @@ union ucmd_buffer {
struct iommu_ioas_unmap unmap;
struct iommu_option option;
struct iommu_vfio_ioas vfio_ioas;
+ struct iommu_viommu_alloc viommu;
#ifdef CONFIG_IOMMUFD_TEST
struct iommu_test_cmd test;
#endif
@@ -372,6 +373,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
val64),
IOCTL_OP(IOMMU_VFIO_IOAS, iommufd_vfio_ioas, struct iommu_vfio_ioas,
__reserved),
+ IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl,
+ struct iommu_viommu_alloc, out_viommu_id),
#ifdef CONFIG_IOMMUFD_TEST
IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last),
#endif
@@ -507,6 +510,9 @@ static const struct iommufd_object_ops iommufd_object_ops[] = {
[IOMMUFD_OBJ_FAULT] = {
.destroy = iommufd_fault_destroy,
},
+ [IOMMUFD_OBJ_VIOMMU] = {
+ .destroy = iommufd_viommu_destroy,
+ },
#ifdef CONFIG_IOMMUFD_TEST
[IOMMUFD_OBJ_SELFTEST] = {
.destroy = iommufd_selftest_destroy,
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
new file mode 100644
index 000000000000..e612f3d539b7
--- /dev/null
+++ b/drivers/iommu/iommufd/viommu.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES
+ */
+
+#include "iommufd_private.h"
+
+void iommufd_viommu_destroy(struct iommufd_object *obj)
+{
+ struct iommufd_viommu *viommu =
+ container_of(obj, struct iommufd_viommu, obj);
+
+ if (viommu->ops && viommu->ops->free)
+ viommu->ops->free(viommu);
+ refcount_dec(&viommu->hwpt->common.obj.users);
+}
+
+int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+ struct iommu_viommu_alloc *cmd = ucmd->cmd;
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommufd_viommu *viommu;
+ struct iommufd_device *idev;
+ const struct iommu_ops *ops;
+ int rc;
+
+ if (cmd->flags || cmd->type == IOMMU_VIOMMU_TYPE_DEFAULT)
+ return -EOPNOTSUPP;
+
+ idev = iommufd_get_device(ucmd, cmd->dev_id);
+ if (IS_ERR(idev))
+ return PTR_ERR(idev);
+
+ ops = dev_iommu_ops(idev->dev);
+ if (!ops->viommu_alloc) {
+ rc = -EOPNOTSUPP;
+ goto out_put_idev;
+ }
+
+ hwpt_paging = iommufd_get_hwpt_paging(ucmd, cmd->hwpt_id);
+ if (IS_ERR(hwpt_paging)) {
+ rc = PTR_ERR(hwpt_paging);
+ goto out_put_idev;
+ }
+
+ if (!hwpt_paging->nest_parent) {
+ rc = -EINVAL;
+ goto out_put_hwpt;
+ }
+
+ viommu = ops->viommu_alloc(idev->dev, hwpt_paging->common.domain,
+ ucmd->ictx, cmd->type);
+ if (IS_ERR(viommu)) {
+ rc = PTR_ERR(viommu);
+ goto out_put_hwpt;
+ }
+
+ rc = iommufd_verify_unfinalized_object(ucmd->ictx, &viommu->obj);
+ if (rc) {
+ kfree(viommu);
+ goto out_put_hwpt;
+ }
+
+ viommu->type = cmd->type;
+ viommu->ictx = ucmd->ictx;
+ viommu->hwpt = hwpt_paging;
+ /* Assume physical IOMMUs are unpluggable (the most likely case) */
+ viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
+
+ refcount_inc(&viommu->hwpt->common.obj.users);
+
+ cmd->out_viommu_id = viommu->obj.id;
+ rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+ if (rc)
+ goto out_abort;
+ iommufd_object_finalize(ucmd->ictx, &viommu->obj);
+ goto out_put_hwpt;
+
+out_abort:
+ iommufd_object_abort_and_destroy(ucmd->ictx, &viommu->obj);
+out_put_hwpt:
+ iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj);
+out_put_idev:
+ iommufd_put_object(ucmd->ictx, &idev->obj);
+ return rc;
+}
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (3 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 9:00 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Nicolin Chen
` (6 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Allow an IOMMU driver to use a vIOMMU object, which holds a nesting parent
hwpt/domain, to allocate a nested domain.
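A hedged sketch of what a driver-side implementation might look like; the
my_viommu/my_nested_domain types, their ops and the user-data handling are
illustrative assumptions, not code from this series:

	static struct iommu_domain *
	my_viommu_domain_alloc_nested(struct iommufd_viommu *viommu,
				      const struct iommu_user_data *user_data)
	{
		/* my_viommu embeds 'struct iommufd_viommu core' as its first member */
		struct my_viommu *my = container_of(viommu, struct my_viommu, core);
		struct my_nested_domain *nested;

		nested = kzalloc(sizeof(*nested), GFP_KERNEL);
		if (!nested)
			return ERR_PTR(-ENOMEM);

		/*
		 * Copy and validate the guest stage-1 configuration from
		 * user_data, and wire the new domain to the nesting parent
		 * held by 'my'.
		 */
		nested->domain.type = IOMMU_DOMAIN_NESTED;
		nested->domain.ops = &my_nested_domain_ops;
		return &nested->domain;
	}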
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
include/linux/iommufd.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 55054fbc793c..5c13c35952d8 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -14,6 +14,7 @@
struct device;
struct file;
struct iommu_group;
+struct iommu_user_data;
struct iommufd_access;
struct iommufd_ctx;
struct iommufd_device;
@@ -95,9 +96,17 @@ struct iommufd_viommu {
* struct iommufd_viommu_ops - vIOMMU specific operations
* @free: Free all driver-specific parts of an iommufd_viommu. The memory of the
* vIOMMU will be free-ed by iommufd core after calling this free op.
+ * @domain_alloc_nested: Allocate a IOMMU_DOMAIN_NESTED on a vIOMMU that holds a
+ * nesting parent domain (IOMMU_DOMAIN_PAGING). @user_data
+ * must be defined in include/uapi/linux/iommufd.h.
+ * It must fully initialize the new iommu_domain before
+ * returning. Upon failure, ERR_PTR must be returned.
*/
struct iommufd_viommu_ops {
void (*free)(struct iommufd_viommu *viommu);
+ struct iommu_domain *(*domain_alloc_nested)(
+ struct iommufd_viommu *viommu,
+ const struct iommu_user_data *user_data);
};
#if IS_ENABLED(CONFIG_IOMMUFD)
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (4 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 9:04 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 07/11] iommufd/selftest: Add refcount to mock_iommu_device Nicolin Chen
` (5 subsequent siblings)
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
A vIOMMU now holds a shareable nesting parent HWPT, so it can act in place
of that nesting parent HWPT when allocating a nested HWPT.
Support this in the IOMMU_HWPT_ALLOC ioctl handler, and update its kdoc.
Also, add an iommufd_hwpt_nested_alloc_for_viommu helper to allocate a
nested HWPT for a vIOMMU object. Since the vIOMMU object already holds a
refcount on the parent HWPT, increase the refcount of the vIOMMU only.
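A hedged userspace sketch of the resulting usage, assuming the existing
IOMMU_HWPT_ALLOC uAPI; here viommu_id comes from a prior IOMMU_VIOMMU_ALLOC,
and the selftest data type/payload stand in for whatever the driver defines:

	struct iommu_hwpt_selftest data = { .iotlb = IOMMU_TEST_IOTLB_DEFAULT };
	struct iommu_hwpt_alloc cmd = {
		.size = sizeof(cmd),
		.dev_id = dev_id,
		.pt_id = viommu_id,	/* a vIOMMU ID now works as the parent */
		.data_type = IOMMU_HWPT_DATA_SELFTEST,
		.data_len = sizeof(data),
		.data_uptr = (uintptr_t)&data,
	};

	if (!ioctl(iommufd, IOMMU_HWPT_ALLOC, &cmd))
		nested_hwpt_id = cmd.out_hwpt_id;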
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/iommufd_private.h | 1 +
include/uapi/linux/iommufd.h | 14 ++---
drivers/iommu/iommufd/hw_pagetable.c | 69 ++++++++++++++++++++++++-
3 files changed, 77 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 9adf8d616796..8c9ab35eaea5 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -288,6 +288,7 @@ struct iommufd_hwpt_paging {
struct iommufd_hwpt_nested {
struct iommufd_hw_pagetable common;
struct iommufd_hwpt_paging *parent;
+ struct iommufd_viommu *viommu;
};
static inline bool hwpt_is_paging(struct iommufd_hw_pagetable *hwpt)
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index d1c99285eda0..f835ccf4a494 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -430,7 +430,7 @@ enum iommu_hwpt_data_type {
* @size: sizeof(struct iommu_hwpt_alloc)
* @flags: Combination of enum iommufd_hwpt_alloc_flags
* @dev_id: The device to allocate this HWPT for
- * @pt_id: The IOAS or HWPT to connect this HWPT to
+ * @pt_id: The IOAS or HWPT or vIOMMU to connect this HWPT to
* @out_hwpt_id: The ID of the new HWPT
* @__reserved: Must be 0
* @data_type: One of enum iommu_hwpt_data_type
@@ -449,11 +449,13 @@ enum iommu_hwpt_data_type {
* IOMMU_HWPT_DATA_NONE. The HWPT can be allocated as a parent HWPT for a
* nesting configuration by passing IOMMU_HWPT_ALLOC_NEST_PARENT via @flags.
*
- * A user-managed nested HWPT will be created from a given parent HWPT via
- * @pt_id, in which the parent HWPT must be allocated previously via the
- * same ioctl from a given IOAS (@pt_id). In this case, the @data_type
- * must be set to a pre-defined type corresponding to an I/O page table
- * type supported by the underlying IOMMU hardware.
+ * A user-managed nested HWPT will be created from a given vIOMMU (wrapping a
+ * parent HWPT) or a parent HWPT via @pt_id, in which the parent HWPT must be
+ * allocated previously via the same ioctl from a given IOAS (@pt_id). In this
+ * case, the @data_type must be set to a pre-defined type corresponding to an
+ * I/O page table type supported by the underlying IOMMU hardware. The device
+ * via @dev_id and the vIOMMU via @pt_id must be associated to the same IOMMU
+ * instance.
*
* If the @data_type is set to IOMMU_HWPT_DATA_NONE, @data_len and
* @data_uptr should be zero. Otherwise, both @data_len and @data_uptr
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index d06bf6e6c19f..5314cd486ddb 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -57,7 +57,10 @@ void iommufd_hwpt_nested_destroy(struct iommufd_object *obj)
container_of(obj, struct iommufd_hwpt_nested, common.obj);
__iommufd_hwpt_destroy(&hwpt_nested->common);
- refcount_dec(&hwpt_nested->parent->common.obj.users);
+ if (hwpt_nested->viommu)
+ refcount_dec(&hwpt_nested->viommu->obj.users);
+ else
+ refcount_dec(&hwpt_nested->parent->common.obj.users);
}
void iommufd_hwpt_nested_abort(struct iommufd_object *obj)
@@ -260,6 +263,54 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
return ERR_PTR(rc);
}
+/**
+ * iommufd_hwpt_nested_alloc_for_viommu() - Get a hwpt_nested for a vIOMMU
+ * @viommu: vIOMMU ojbect to associate the hwpt_nested/domain with
+ * @user_data: user_data pointer. Must be valid
+ *
+ * Allocate a new IOMMU_DOMAIN_NESTED for a vIOMMU and return it as a NESTED
+ * hw_pagetable.
+ */
+static struct iommufd_hwpt_nested *
+iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu,
+ const struct iommu_user_data *user_data)
+{
+ struct iommufd_hwpt_nested *hwpt_nested;
+ struct iommufd_hw_pagetable *hwpt;
+ int rc;
+
+ if (!viommu->ops || !viommu->ops->domain_alloc_nested)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ hwpt_nested = __iommufd_object_alloc(
+ viommu->ictx, hwpt_nested, IOMMUFD_OBJ_HWPT_NESTED, common.obj);
+ if (IS_ERR(hwpt_nested))
+ return ERR_CAST(hwpt_nested);
+ hwpt = &hwpt_nested->common;
+
+ hwpt_nested->viommu = viommu;
+ hwpt_nested->parent = viommu->hwpt;
+ refcount_inc(&viommu->obj.users);
+
+ hwpt->domain = viommu->ops->domain_alloc_nested(viommu, user_data);
+ if (IS_ERR(hwpt->domain)) {
+ rc = PTR_ERR(hwpt->domain);
+ hwpt->domain = NULL;
+ goto out_abort;
+ }
+ hwpt->domain->owner = viommu->iommu_dev->ops;
+
+ if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
+ rc = -EINVAL;
+ goto out_abort;
+ }
+ return hwpt_nested;
+
+out_abort:
+ iommufd_object_abort_and_destroy(viommu->ictx, &hwpt->obj);
+ return ERR_PTR(rc);
+}
+
int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
{
struct iommu_hwpt_alloc *cmd = ucmd->cmd;
@@ -316,6 +367,22 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd)
goto out_unlock;
}
hwpt = &hwpt_nested->common;
+ } else if (pt_obj->type == IOMMUFD_OBJ_VIOMMU) {
+ struct iommufd_hwpt_nested *hwpt_nested;
+ struct iommufd_viommu *viommu;
+
+ viommu = container_of(pt_obj, struct iommufd_viommu, obj);
+ if (viommu->iommu_dev != __iommu_get_iommu_dev(idev->dev)) {
+ rc = -EINVAL;
+ goto out_unlock;
+ }
+ hwpt_nested = iommufd_hwpt_nested_alloc_for_viommu(viommu,
+ &user_data);
+ if (IS_ERR(hwpt_nested)) {
+ rc = PTR_ERR(hwpt_nested);
+ goto out_unlock;
+ }
+ hwpt = &hwpt_nested->common;
} else {
rc = -EINVAL;
goto out_put_pt;
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 07/11] iommufd/selftest: Add refcount to mock_iommu_device
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (5 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 08/11] iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST Nicolin Chen
` (4 subsequent siblings)
11 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
For an iommu_dev that can unplug (so far only this selftest does so), there
is no guarantee of the life cycle of the viommu->iommu_dev pointer after it
is copied from idev->dev->iommu->iommu_dev.
Track the user count of the iommu_dev. Postpone the exit routine using a
completion, if the refcount is unbalanced. The refcount inc/dec will be
added in the following patch.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/selftest.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 540437be168a..dbd2a78c1074 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -507,14 +507,17 @@ static bool mock_domain_capable(struct device *dev, enum iommu_cap cap)
static struct iopf_queue *mock_iommu_iopf_queue;
-static struct iommu_device mock_iommu_device = {
-};
+static struct mock_iommu_device {
+ struct iommu_device iommu_dev;
+ struct completion complete;
+ refcount_t users;
+} mock_iommu;
static struct iommu_device *mock_probe_device(struct device *dev)
{
if (dev->bus != &iommufd_mock_bus_type.bus)
return ERR_PTR(-ENODEV);
- return &mock_iommu_device;
+ return &mock_iommu.iommu_dev;
}
static void mock_domain_page_response(struct device *dev, struct iopf_fault *evt,
@@ -1536,24 +1539,27 @@ int __init iommufd_test_init(void)
if (rc)
goto err_platform;
- rc = iommu_device_sysfs_add(&mock_iommu_device,
+ rc = iommu_device_sysfs_add(&mock_iommu.iommu_dev,
&selftest_iommu_dev->dev, NULL, "%s",
dev_name(&selftest_iommu_dev->dev));
if (rc)
goto err_bus;
- rc = iommu_device_register_bus(&mock_iommu_device, &mock_ops,
+ rc = iommu_device_register_bus(&mock_iommu.iommu_dev, &mock_ops,
&iommufd_mock_bus_type.bus,
&iommufd_mock_bus_type.nb);
if (rc)
goto err_sysfs;
+ refcount_set(&mock_iommu.users, 1);
+ init_completion(&mock_iommu.complete);
+
mock_iommu_iopf_queue = iopf_queue_alloc("mock-iopfq");
return 0;
err_sysfs:
- iommu_device_sysfs_remove(&mock_iommu_device);
+ iommu_device_sysfs_remove(&mock_iommu.iommu_dev);
err_bus:
bus_unregister(&iommufd_mock_bus_type.bus);
err_platform:
@@ -1563,6 +1569,15 @@ int __init iommufd_test_init(void)
return rc;
}
+static void iommufd_test_wait_for_users(void)
+{
+ if (refcount_dec_and_test(&mock_iommu.users))
+ return;
+ /* Time out waiting for iommu device user count to become 0 */
+ WARN_ON(!wait_for_completion_timeout(&mock_iommu.complete,
+ msecs_to_jiffies(10000)));
+}
+
void iommufd_test_exit(void)
{
if (mock_iommu_iopf_queue) {
@@ -1570,8 +1585,9 @@ void iommufd_test_exit(void)
mock_iommu_iopf_queue = NULL;
}
- iommu_device_sysfs_remove(&mock_iommu_device);
- iommu_device_unregister_bus(&mock_iommu_device,
+ iommufd_test_wait_for_users();
+ iommu_device_sysfs_remove(&mock_iommu.iommu_dev);
+ iommu_device_unregister_bus(&mock_iommu.iommu_dev,
&iommufd_mock_bus_type.bus,
&iommufd_mock_bus_type.nb);
bus_unregister(&iommufd_mock_bus_type.bus);
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 08/11] iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (6 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 07/11] iommufd/selftest: Add refcount to mock_iommu_device Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 09/11] iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage Nicolin Chen
` (3 subsequent siblings)
11 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Implement the viommu alloc/free functions to increase/decrease the refcount
of the mock iommu device that the vIOMMU depends on. User space can verify
this flow via the IOMMU_VIOMMU_TYPE_SELFTEST.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/iommufd/iommufd_test.h | 2 ++
drivers/iommu/iommufd/selftest.c | 41 ++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index f4bc23a92f9a..edced4ac7cd3 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -180,4 +180,6 @@ struct iommu_hwpt_invalidate_selftest {
__u32 iotlb_id;
};
+#define IOMMU_VIOMMU_TYPE_SELFTEST 0xdeadbeef
+
#endif
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index dbd2a78c1074..04dd95fe24ca 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -132,6 +132,10 @@ struct mock_iommu_domain_nested {
u32 iotlb[MOCK_NESTED_DOMAIN_IOTLB_NUM];
};
+struct mock_viommu {
+ struct iommufd_viommu core;
+};
+
enum selftest_obj_type {
TYPE_IDEV,
};
@@ -543,6 +547,42 @@ static int mock_dev_disable_feat(struct device *dev, enum iommu_dev_features fea
return 0;
}
+static void mock_viommu_free(struct iommufd_viommu *viommu)
+{
+ struct mock_iommu_device *mock_iommu = container_of(
+ viommu->iommu_dev, struct mock_iommu_device, iommu_dev);
+
+ if (refcount_dec_and_test(&mock_iommu->users))
+ complete(&mock_iommu->complete);
+
+ /* iommufd core frees mock_viommu and viommu */
+}
+
+static struct iommufd_viommu_ops mock_viommu_ops = {
+ .free = mock_viommu_free,
+};
+
+static struct iommufd_viommu *mock_viommu_alloc(struct device *dev,
+ struct iommu_domain *domain,
+ struct iommufd_ctx *ictx,
+ unsigned int viommu_type)
+{
+ struct mock_iommu_device *mock_iommu =
+ iommu_get_iommu_dev(dev, struct mock_iommu_device, iommu_dev);
+ struct mock_viommu *mock_viommu;
+
+ if (viommu_type != IOMMU_VIOMMU_TYPE_SELFTEST)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ mock_viommu =
+ iommufd_viommu_alloc(ictx, mock_viommu, core, &mock_viommu_ops);
+ if (IS_ERR(mock_viommu))
+ return ERR_CAST(mock_viommu);
+
+ refcount_inc(&mock_iommu->users);
+ return &mock_viommu->core;
+}
+
static const struct iommu_ops mock_ops = {
/*
* IOMMU_DOMAIN_BLOCKED cannot be returned from def_domain_type()
@@ -562,6 +602,7 @@ static const struct iommu_ops mock_ops = {
.dev_enable_feat = mock_dev_enable_feat,
.dev_disable_feat = mock_dev_disable_feat,
.user_pasid_table = true,
+ .viommu_alloc = mock_viommu_alloc,
.default_domain_ops =
&(struct iommu_domain_ops){
.free = mock_domain_free,
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 09/11] iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (7 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 08/11] iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU Nicolin Chen
` (2 subsequent siblings)
11 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Add a new iommufd_viommu FIXTURE and set it up with a vIOMMU object.
Any new vIOMMU feature will be added as a TEST_F under that.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
tools/testing/selftests/iommu/iommufd_utils.h | 28 +++++++
tools/testing/selftests/iommu/iommufd.c | 78 +++++++++++++++++++
.../selftests/iommu/iommufd_fail_nth.c | 11 +++
3 files changed, 117 insertions(+)
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 40f6f14ce136..ca09308dad6a 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -762,3 +762,31 @@ static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd)
#define test_cmd_trigger_iopf(device_id, fault_fd) \
ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd))
+
+static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id,
+ __u32 type, __u32 flags, __u32 *viommu_id)
+{
+ struct iommu_viommu_alloc cmd = {
+ .size = sizeof(cmd),
+ .flags = flags,
+ .type = type,
+ .dev_id = device_id,
+ .hwpt_id = hwpt_id,
+ };
+ int ret;
+
+ ret = ioctl(fd, IOMMU_VIOMMU_ALLOC, &cmd);
+ if (ret)
+ return ret;
+ if (viommu_id)
+ *viommu_id = cmd.out_viommu_id;
+ return 0;
+}
+
+#define test_cmd_viommu_alloc(device_id, hwpt_id, type, viommu_id) \
+ ASSERT_EQ(0, _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, \
+ type, 0, viommu_id))
+#define test_err_viommu_alloc(_errno, device_id, hwpt_id, type, viommu_id) \
+ EXPECT_ERRNO(_errno, \
+ _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, \
+ type, 0, viommu_id))
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 4927b9add5ad..49ac144cf5a4 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -128,6 +128,7 @@ TEST_F(iommufd, cmd_length)
TEST_LENGTH(iommu_ioas_unmap, IOMMU_IOAS_UNMAP, length);
TEST_LENGTH(iommu_option, IOMMU_OPTION, val64);
TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved);
+ TEST_LENGTH(iommu_viommu_alloc, IOMMU_VIOMMU_ALLOC, out_viommu_id);
#undef TEST_LENGTH
}
@@ -2386,4 +2387,81 @@ TEST_F(vfio_compat_mock_domain, huge_map)
}
}
+FIXTURE(iommufd_viommu)
+{
+ int fd;
+ uint32_t ioas_id;
+ uint32_t stdev_id;
+ uint32_t hwpt_id;
+ uint32_t device_id;
+ uint32_t viommu_id;
+};
+
+FIXTURE_VARIANT(iommufd_viommu)
+{
+ unsigned int viommu;
+};
+
+FIXTURE_SETUP(iommufd_viommu)
+{
+ self->fd = open("/dev/iommu", O_RDWR);
+ ASSERT_NE(-1, self->fd);
+ test_ioctl_ioas_alloc(&self->ioas_id);
+ test_ioctl_set_default_memory_limit();
+
+ if (variant->viommu) {
+ test_cmd_mock_domain(self->ioas_id, &self->stdev_id, NULL,
+ &self->device_id);
+
+ /* Negative test -- invalid hwpt */
+ test_err_viommu_alloc(ENOENT, self->device_id, self->hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+
+ /* Negative test -- not a nesting parent hwpt */
+ test_cmd_hwpt_alloc(self->device_id, self->ioas_id, 0,
+ &self->hwpt_id);
+ test_err_viommu_alloc(EINVAL, self->device_id, self->hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+ test_ioctl_destroy(self->hwpt_id);
+
+ /* Allocate a nesting parent HWPT */
+ test_cmd_hwpt_alloc(self->device_id, self->ioas_id,
+ IOMMU_HWPT_ALLOC_NEST_PARENT,
+ &self->hwpt_id);
+ /* Negative test -- unsupported viommu type */
+ test_err_viommu_alloc(EOPNOTSUPP, self->device_id,
+ self->hwpt_id, 0xdead, NULL);
+
+ test_cmd_viommu_alloc(self->device_id, self->hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST,
+ &self->viommu_id);
+ } else {
+ test_err_viommu_alloc(ENOENT, self->device_id, self->hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+ }
+}
+
+FIXTURE_TEARDOWN(iommufd_viommu)
+{
+ if (variant->viommu) {
+ test_ioctl_destroy(self->viommu_id);
+ test_ioctl_destroy(self->hwpt_id);
+ }
+ teardown_iommufd(self->fd, _metadata);
+}
+
+FIXTURE_VARIANT_ADD(iommufd_viommu, no_viommu)
+{
+ .viommu = 0,
+};
+
+FIXTURE_VARIANT_ADD(iommufd_viommu, mock_viommu)
+{
+ .viommu = 1,
+};
+
+TEST_F(iommufd_viommu, viommu_auto_destroy)
+{
+}
+
TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index c5d5e69452b0..e9a980b7729b 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -582,6 +582,7 @@ TEST_FAIL_NTH(basic_fail_nth, device)
uint32_t stdev_id;
uint32_t idev_id;
uint32_t hwpt_id;
+ uint32_t viommu_id;
__u64 iova;
self->fd = open("/dev/iommu", O_RDWR);
@@ -624,6 +625,16 @@ TEST_FAIL_NTH(basic_fail_nth, device)
if (_test_cmd_mock_domain_replace(self->fd, stdev_id, hwpt_id, NULL))
return -1;
+
+ if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
+ IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id,
+ IOMMU_HWPT_DATA_NONE, 0, 0))
+ return -1;
+
+ if (_test_cmd_viommu_alloc(self->fd, idev_id, hwpt_id,
+ IOMMU_VIOMMU_TYPE_SELFTEST, 0, &viommu_id))
+ return -1;
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (8 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 09/11] iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 9:11 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Nicolin Chen
2024-10-25 8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
With the introduction of the new vIOMMU object and its infrastructure, update
the documentation to reflect that and add a new diagram.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
1 file changed, 68 insertions(+), 1 deletion(-)
diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 2deba93bf159..92d16efad5b0 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
space usually has mappings from guest-level I/O virtual addresses to guest-
level physical addresses.
+ - IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
+ passed to or shared with a VM. It can include some HW-accelerated virtualization
+ features and some SW resources used by the VM. For example:
+ * Security namespace for guest owned ID, e.g. guest-controlled cache tags
+ * Access to a sharable nesting parent pagetable across physical IOMMUs
+ * Virtualization of various platforms IDs, e.g. RIDs and others
+ * Delivery of paravirtualized invalidation
+ * Direct assigned invalidation queues
+ * Direct assigned interrupts
+ * Non-affiliated event reporting
+ Such a vIOMMU object generally has access to a nesting parent pagetable
+ to support some HW-accelerated virtualization features. So, a vIOMMU object
+ must be created given a nesting parent HWPT_PAGING object, and then it would
+ encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
+ to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.
+
+ .. note::
+
+ The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a
+ VM. A VM can have one giant virtualized IOMMU running on a machine having
+ multiple physical IOMMUs, in which case the VMM will dispatch the requests
+ or configurations from this single virtualized IOMMU instance to multiple
+ vIOMMU objects created for individual slices of different physical IOMMUs.
+ In other words, a vIOMMU object is always a representation of one physical
+ IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full
+ virtualization features from physical IOMMUs, it is suggested to build the
+ same number of virtualized IOMMUs as the number of physical IOMMUs, so the
+ passed-through devices would be connected to their own virtualized IOMMUs
+ backed by corresponding vIOMMU objects, in which case a guest OS would do
+ the "dispatch" naturally instead of VMM trapping.
+
All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
The diagrams below show relationships between user-visible objects and kernel
@@ -101,6 +132,28 @@ creating the objects and links::
|------------>|iommu_domain|<----|iommu_domain|<----|device|
|____________| |____________| |______|
+ _______________________________________________________________________
+ | iommufd (with vIOMMU) |
+ | |
+ | [5] |
+ | _____________ |
+ | | | |
+ | |----------------| vIOMMU | |
+ | | | | |
+ | | | | |
+ | | [1] | | [4] [2] |
+ | | ______ | | _____________ ________ |
+ | | | | | [3] | | | | | |
+ | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
+ | | |______| |_____________| |_____________| |________| |
+ | | | | | | |
+ |______|________|______________|__________________|_______________|_____|
+ | | | | |
+ ______v_____ | ______v_____ ______v_____ ___v__
+ | struct | | PFN | (paging) | | (nested) | |struct|
+ |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device|
+ |____________| storage|____________| |____________| |______|
+
1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
hold multiple IOAS objects. IOAS is the most generic object and does not
expose interfaces that are specific to single IOMMU drivers. All operations
@@ -132,7 +185,8 @@ creating the objects and links::
flag is set.
4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
- uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+ uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a
+ nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object
to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
must be a nesting parent manually allocated via the same uAPI previously with
an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
@@ -149,6 +203,18 @@ creating the objects and links::
created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type
of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc.
+5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC
+ uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU)
+ and an hwpt_id (to associate the vIOMMU to a nesting parent HWPT_PAGING). The
+ iommufd core will link the vIOMMU object to the struct iommu_device that the
+ struct device is behind. And an IOMMU driver can implement a viommu_alloc op
+ to allocate its own vIOMMU data structure embedding the core-level structure
+ iommufd_viommu and some driver-specific data. If necessary, the driver can
+ also configure its HW virtualization feature for that vIOMMU (and thus for
+ the VM). Successful completion of this operation sets up the linkages between
+ the vIOMMU object and the HWPT_PAGING, after which this vIOMMU object can be
+ used as a nesting parent to allocate an HWPT_NESTED object as described above.
+
A device can only bind to an iommufd due to DMA ownership claim and attach to at
most one IOAS object (no support of PASID yet).
@@ -161,6 +227,7 @@ User visible objects are backed by following datastructures:
- iommufd_device for IOMMUFD_OBJ_DEVICE.
- iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
- iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
Several terminologies when looking at these datastructures:
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (9 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU Nicolin Chen
@ 2024-10-22 0:19 ` Nicolin Chen
2024-10-25 9:18 ` Tian, Kevin
2024-10-25 8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
11 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 0:19 UTC (permalink / raw)
To: jgg, kevin.tian, will
Cc: joro, suravee.suthikulpanit, robin.murphy, dwmw2, baolu.lu, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type. Implement
an arm_vsmmu_alloc() with its viommu op arm_vsmmu_domain_alloc_nested(),
to replace arm_smmu_domain_alloc_nesting(). As an initial step, copy the
VMID from s2_parent. A later cleanup series is required to move the VMID
allocation out of the stage-2 domain allocation routine into this vIOMMU
allocation routine.
After that, replace nested_domain->s2_parent with nested_domain->vsmmu.
Note that the validating conditions for a nested_domain allocation are
moved from arm_vsmmu_domain_alloc_nested to arm_vsmmu_alloc, since there
is no point in creating a vIOMMU (vsmmu) from the beginning if it would
not support a nested_domain.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++---
include/uapi/linux/iommufd.h | 2 +
.../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 79 ++++++++++++-------
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +--
4 files changed, 70 insertions(+), 46 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 956c12637866..5a025d310dbe 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -10,6 +10,7 @@
#include <linux/bitfield.h>
#include <linux/iommu.h>
+#include <linux/iommufd.h>
#include <linux/kernel.h>
#include <linux/mmzone.h>
#include <linux/sizes.h>
@@ -835,7 +836,7 @@ struct arm_smmu_domain {
struct arm_smmu_nested_domain {
struct iommu_domain domain;
- struct arm_smmu_domain *s2_parent;
+ struct arm_vsmmu *vsmmu;
__le64 ste[2];
};
@@ -1005,21 +1006,22 @@ tegra241_cmdqv_probe(struct arm_smmu_device *smmu)
}
#endif /* CONFIG_TEGRA241_CMDQV */
+struct arm_vsmmu {
+ struct iommufd_viommu core;
+ struct arm_smmu_device *smmu;
+ struct arm_smmu_domain *s2_parent;
+ u16 vmid;
+};
+
#if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type);
-struct iommu_domain *
-arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
- struct iommu_domain *parent,
- const struct iommu_user_data *user_data);
+struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
+ struct iommu_domain *parent,
+ struct iommufd_ctx *ictx,
+ unsigned int viommu_type);
#else
#define arm_smmu_hw_info NULL
-static inline struct iommu_domain *
-arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
- struct iommu_domain *parent,
- const struct iommu_user_data *user_data)
-{
- return ERR_PTR(-EOPNOTSUPP);
-}
+#define arm_vsmmu_alloc NULL
#endif /* CONFIG_ARM_SMMU_V3_IOMMUFD */
#endif /* _ARM_SMMU_V3_H */
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index f835ccf4a494..09c1b4ba46d8 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -859,9 +859,11 @@ struct iommu_fault_alloc {
/**
* enum iommu_viommu_type - Virtual IOMMU Type
* @IOMMU_VIOMMU_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_VIOMMU_TYPE_ARM_SMMUV3: ARM SMMUv3 driver specific type
*/
enum iommu_viommu_type {
IOMMU_VIOMMU_TYPE_DEFAULT = 0,
+ IOMMU_VIOMMU_TYPE_ARM_SMMUV3 = 1,
};
/**
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 44e1b9bef850..70ad857a57b8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -34,7 +34,8 @@ static void arm_smmu_make_nested_cd_table_ste(
struct arm_smmu_ste *target, struct arm_smmu_master *master,
struct arm_smmu_nested_domain *nested_domain, bool ats_enabled)
{
- arm_smmu_make_s2_domain_ste(target, master, nested_domain->s2_parent,
+ arm_smmu_make_s2_domain_ste(target, master,
+ nested_domain->vsmmu->s2_parent,
ats_enabled);
target->data[0] = cpu_to_le64(STRTAB_STE_0_V |
@@ -75,7 +76,8 @@ static void arm_smmu_make_nested_domain_ste(
break;
case STRTAB_STE_0_CFG_BYPASS:
arm_smmu_make_s2_domain_ste(
- target, master, nested_domain->s2_parent, ats_enabled);
+ target, master, nested_domain->vsmmu->s2_parent,
+ ats_enabled);
break;
case STRTAB_STE_0_CFG_ABORT:
default:
@@ -100,7 +102,7 @@ static int arm_smmu_attach_dev_nested(struct iommu_domain *domain,
struct arm_smmu_ste ste;
int ret;
- if (nested_domain->s2_parent->smmu != master->smmu)
+ if (nested_domain->vsmmu->smmu != master->smmu)
return -EINVAL;
if (arm_smmu_ssids_in_use(&master->cd_table))
return -EBUSY;
@@ -151,36 +153,15 @@ static int arm_smmu_validate_vste(struct iommu_hwpt_arm_smmuv3 *arg)
return 0;
}
-struct iommu_domain *
-arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
- struct iommu_domain *parent,
+static struct iommu_domain *
+arm_vsmmu_domain_alloc_nested(struct iommufd_viommu *viommu,
const struct iommu_user_data *user_data)
{
- struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
struct arm_smmu_nested_domain *nested_domain;
- struct arm_smmu_domain *smmu_parent;
struct iommu_hwpt_arm_smmuv3 arg;
int ret;
- if (flags || !(master->smmu->features & ARM_SMMU_FEAT_NESTING))
- return ERR_PTR(-EOPNOTSUPP);
-
- /*
- * Must support some way to prevent the VM from bypassing the cache
- * because VFIO currently does not do any cache maintenance.
- */
- if (!arm_smmu_master_canwbs(master) &&
- !(master->smmu->features & ARM_SMMU_FEAT_S2FWB))
- return ERR_PTR(-EOPNOTSUPP);
-
- /*
- * The core code checks that parent was created with
- * IOMMU_HWPT_ALLOC_NEST_PARENT
- */
- smmu_parent = to_smmu_domain(parent);
- if (smmu_parent->smmu != master->smmu)
- return ERR_PTR(-EINVAL);
-
ret = iommu_copy_struct_from_user(&arg, user_data,
IOMMU_HWPT_DATA_ARM_SMMUV3, ste);
if (ret)
@@ -196,9 +177,51 @@ arm_smmu_domain_alloc_nesting(struct device *dev, u32 flags,
nested_domain->domain.type = IOMMU_DOMAIN_NESTED;
nested_domain->domain.ops = &arm_smmu_nested_ops;
- nested_domain->s2_parent = smmu_parent;
+ nested_domain->vsmmu = vsmmu;
nested_domain->ste[0] = arg.ste[0];
nested_domain->ste[1] = arg.ste[1] & ~cpu_to_le64(STRTAB_STE_1_EATS);
return &nested_domain->domain;
}
+
+
+static const struct iommufd_viommu_ops arm_vsmmu_ops = {
+ .domain_alloc_nested = arm_vsmmu_domain_alloc_nested,
+};
+
+struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
+ struct iommu_domain *parent,
+ struct iommufd_ctx *ictx,
+ unsigned int viommu_type)
+{
+ struct arm_smmu_device *smmu =
+ iommu_get_iommu_dev(dev, struct arm_smmu_device, iommu);
+ struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+ struct arm_smmu_domain *s2_parent = to_smmu_domain(parent);
+ struct arm_vsmmu *vsmmu;
+
+ if (viommu_type != IOMMU_VIOMMU_TYPE_ARM_SMMUV3)
+ return ERR_PTR(-EOPNOTSUPP);
+
+ if (!(smmu->features & ARM_SMMU_FEAT_NESTING))
+ return ERR_PTR(-EOPNOTSUPP);
+
+ /*
+ * Must support some way to prevent the VM from bypassing the cache
+ * because VFIO currently does not do any cache maintenance.
+ */
+ if (!arm_smmu_master_canwbs(master) &&
+ !(smmu->features & ARM_SMMU_FEAT_S2FWB))
+ return ERR_PTR(-EOPNOTSUPP);
+
+ vsmmu = iommufd_viommu_alloc(ictx, arm_vsmmu, core, &arm_vsmmu_ops);
+ if (IS_ERR(vsmmu))
+ return ERR_CAST(vsmmu);
+
+ vsmmu->smmu = smmu;
+ vsmmu->s2_parent = s2_parent;
+ /* FIXME Move VMID allocation from the S2 domain allocation to here */
+ vsmmu->vmid = s2_parent->s2_cfg.vmid;
+
+ return &vsmmu->core;
+}
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c7cc98961019..de598d66b5c2 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2661,7 +2661,7 @@ to_smmu_domain_devices(struct iommu_domain *domain)
domain->type == IOMMU_DOMAIN_SVA)
return to_smmu_domain(domain);
if (domain->type == IOMMU_DOMAIN_NESTED)
- return to_smmu_nested_domain(domain)->s2_parent;
+ return to_smmu_nested_domain(domain)->vsmmu->s2_parent;
return NULL;
}
@@ -3126,13 +3126,9 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
struct arm_smmu_domain *smmu_domain;
int ret;
- if (parent)
- return arm_smmu_domain_alloc_nesting(dev, flags, parent,
- user_data);
-
if (flags & ~PAGING_FLAGS)
return ERR_PTR(-EOPNOTSUPP);
- if (user_data)
+ if (parent || user_data)
return ERR_PTR(-EOPNOTSUPP);
smmu_domain = arm_smmu_domain_alloc();
@@ -3541,6 +3537,7 @@ static struct iommu_ops arm_smmu_ops = {
.dev_disable_feat = arm_smmu_dev_disable_feature,
.page_response = arm_smmu_page_response,
.def_domain_type = arm_smmu_def_domain_type,
+ .viommu_alloc = arm_vsmmu_alloc,
.user_pasid_table = 1,
.pgsize_bitmap = -1UL, /* Restricted during device attach */
.owner = THIS_MODULE,
--
2.43.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 0:19 ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Nicolin Chen
@ 2024-10-22 2:28 ` Baolu Lu
2024-10-22 4:40 ` Nicolin Chen
0 siblings, 1 reply; 36+ messages in thread
From: Baolu Lu @ 2024-10-22 2:28 UTC (permalink / raw)
To: Nicolin Chen, jgg, kevin.tian, will
Cc: baolu.lu, joro, suravee.suthikulpanit, robin.murphy, dwmw2, shuah,
linux-kernel, iommu, linux-arm-kernel, linux-kselftest,
eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
On 2024/10/22 8:19, Nicolin Chen wrote:
> + * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind
> + * the @dev, as the set of virtualization resources shared/passed
> + * to user space IOMMU instance. And associate it with a nesting
> + * @parent_domain. The @viommu_type must be defined in the header
> + * include/uapi/linux/iommufd.h
> + * It is suggested to call iommufd_viommu_alloc() helper for
> + * a bundled allocation of the core and the driver structures,
> + * using the given @ictx pointer.
> * @pgsize_bitmap: bitmap of all possible supported page sizes
> * @owner: Driver module providing these ops
> * @identity_domain: An always available, always attachable identity
> @@ -591,6 +601,10 @@ struct iommu_ops {
> void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid,
> struct iommu_domain *domain);
>
> + struct iommufd_viommu *(*viommu_alloc)(
> + struct device *dev, struct iommu_domain *parent_domain,
> + struct iommufd_ctx *ictx, unsigned int viommu_type);
Is the vIOMMU object limited to a parent domain?
Thanks,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 2:28 ` Baolu Lu
@ 2024-10-22 4:40 ` Nicolin Chen
2024-10-22 8:59 ` Baolu Lu
0 siblings, 1 reply; 36+ messages in thread
From: Nicolin Chen @ 2024-10-22 4:40 UTC (permalink / raw)
To: Baolu Lu
Cc: jgg, kevin.tian, will, joro, suravee.suthikulpanit, robin.murphy,
dwmw2, shuah, linux-kernel, iommu, linux-arm-kernel,
linux-kselftest, eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
On Tue, Oct 22, 2024 at 10:28:30AM +0800, Baolu Lu wrote:
> On 2024/10/22 8:19, Nicolin Chen wrote:
> > + * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind
> > + * the @dev, as the set of virtualization resources shared/passed
> > + * to user space IOMMU instance. And associate it with a nesting
> > + * @parent_domain. The @viommu_type must be defined in the header
> > + * include/uapi/linux/iommufd.h
> > + * It is suggested to call iommufd_viommu_alloc() helper for
> > + * a bundled allocation of the core and the driver structures,
> > + * using the given @ictx pointer.
> > * @pgsize_bitmap: bitmap of all possible supported page sizes
> > * @owner: Driver module providing these ops
> > * @identity_domain: An always available, always attachable identity
> > @@ -591,6 +601,10 @@ struct iommu_ops {
> > void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid,
> > struct iommu_domain *domain);
> >
> > + struct iommufd_viommu *(*viommu_alloc)(
> > + struct device *dev, struct iommu_domain *parent_domain,
> > + struct iommufd_ctx *ictx, unsigned int viommu_type);
>
> Is the vIOMMU object limited to a parent domain?
Yes, for each vIOMMU (a slice of physical IOMMU per VM), one S2
parent domain is enough. Typically, it has the mappings between
gPAs and hPAs. If its format/compatibility allows, this single
parent domain can be shared with other vIOMMUs.
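
FWIW, a rough userspace sketch of that sharing flow, using the uAPI
structs from this series (device/IOAS setup and error handling omitted;
dev_id0/dev_id1 are hypothetical devices behind two different physical
SMMUs):

	struct iommu_hwpt_alloc hwpt = {
		.size = sizeof(hwpt),
		.flags = IOMMU_HWPT_ALLOC_NEST_PARENT,
		.dev_id = dev_id0,
		.pt_id = ioas_id,
	};
	/* One shareable S2 parent holding the gPA->hPA mappings */
	ioctl(iommufd, IOMMU_HWPT_ALLOC, &hwpt);

	struct iommu_viommu_alloc viommu0 = {
		.size = sizeof(viommu0),
		.type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
		.dev_id = dev_id0,
		.hwpt_id = hwpt.out_hwpt_id,
	};
	/* vIOMMU slice of physical SMMU#0 */
	ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &viommu0);

	struct iommu_viommu_alloc viommu1 = {
		.size = sizeof(viommu1),
		.type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
		.dev_id = dev_id1,
		/* Same S2 parent, if its format/compatibility allows */
		.hwpt_id = hwpt.out_hwpt_id,
	};
	/* vIOMMU slice of physical SMMU#1 */
	ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &viommu1);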
Thanks
Nicolin
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 4:40 ` Nicolin Chen
@ 2024-10-22 8:59 ` Baolu Lu
2024-10-22 13:15 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Baolu Lu @ 2024-10-22 8:59 UTC (permalink / raw)
To: Nicolin Chen
Cc: baolu.lu, jgg, kevin.tian, will, joro, suravee.suthikulpanit,
robin.murphy, dwmw2, shuah, linux-kernel, iommu, linux-arm-kernel,
linux-kselftest, eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
On 2024/10/22 12:40, Nicolin Chen wrote:
> On Tue, Oct 22, 2024 at 10:28:30AM +0800, Baolu Lu wrote:
>> On 2024/10/22 8:19, Nicolin Chen wrote:
>>> + * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind
>>> + * the @dev, as the set of virtualization resources shared/passed
>>> + * to user space IOMMU instance. And associate it with a nesting
>>> + * @parent_domain. The @viommu_type must be defined in the header
>>> + * include/uapi/linux/iommufd.h
>>> + * It is suggested to call iommufd_viommu_alloc() helper for
>>> + * a bundled allocation of the core and the driver structures,
>>> + * using the given @ictx pointer.
>>> * @pgsize_bitmap: bitmap of all possible supported page sizes
>>> * @owner: Driver module providing these ops
>>> * @identity_domain: An always available, always attachable identity
>>> @@ -591,6 +601,10 @@ struct iommu_ops {
>>> void (*remove_dev_pasid)(struct device *dev, ioasid_t pasid,
>>> struct iommu_domain *domain);
>>>
>>> + struct iommufd_viommu *(*viommu_alloc)(
>>> + struct device *dev, struct iommu_domain *parent_domain,
>>> + struct iommufd_ctx *ictx, unsigned int viommu_type);
>> Is the vIOMMU object limited to a parent domain?
> Yes, for each vIOMMU (a slice of physical IOMMU per VM), one S2
> parent domain is enough. Typically, it has the mappings between
> gPAs and hPAs. If its format/compatibility allows, this single
> parent domain can be shared with other vIOMMUs.
Is it feasible to make vIOMMU object more generic, rather than strictly
tying it to nested translation? For example, a normal paging domain that
translates gPAs to hPAs could also have a vIOMMU object associated with
it.
While we can only support vIOMMU object allocation uAPI for S2 paging
domains in the context of this series, we could consider leaving the
option open to associate a vIOMMU object with other normal paging
domains that are not a nested parent?
Thanks,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 8:59 ` Baolu Lu
@ 2024-10-22 13:15 ` Jason Gunthorpe
2024-10-23 1:48 ` Baolu Lu
2024-10-25 8:47 ` Tian, Kevin
0 siblings, 2 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-10-22 13:15 UTC (permalink / raw)
To: Baolu Lu
Cc: Nicolin Chen, kevin.tian, will, joro, suravee.suthikulpanit,
robin.murphy, dwmw2, shuah, linux-kernel, iommu, linux-arm-kernel,
linux-kselftest, eric.auger, jean-philippe, mdf, mshavit,
shameerali.kolothum.thodi, smostafa, yi.l.liu, aik, zhangfei.gao,
patches
On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> Is it feasible to make vIOMMU object more generic, rather than strictly
> tying it to nested translation? For example, a normal paging domain that
> translates gPAs to hPAs could also have a vIOMMU object associated with
> it.
>
> While we can only support vIOMMU object allocation uAPI for S2 paging
> domains in the context of this series, we could consider leaving the
> option open to associate a vIOMMU object with other normal paging
> domains that are not a nested parent?
Why? The nested parent flavour of the domain is basically free to
create, what reason would be to not do that?
If the HW doesn't support it, then does the HW really need/support a
VIOMMU?
I suppose it could be made up to the driver, but for now I think we
should leave it as is in the core code requiring it until we have a
reason to relax it.
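
The core-code requirement being referred to would presumably be a check
along these lines in the vIOMMU allocation path (a hedged sketch; the
exact field and label names in this series may differ):

	if (!hwpt_paging->nest_parent) {
		rc = -EINVAL;
		goto out_put_hwpt;
	}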
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 13:15 ` Jason Gunthorpe
@ 2024-10-23 1:48 ` Baolu Lu
2024-10-25 8:47 ` Tian, Kevin
1 sibling, 0 replies; 36+ messages in thread
From: Baolu Lu @ 2024-10-23 1:48 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: baolu.lu, Nicolin Chen, kevin.tian, will, joro,
suravee.suthikulpanit, robin.murphy, dwmw2, shuah, linux-kernel,
iommu, linux-arm-kernel, linux-kselftest, eric.auger,
jean-philippe, mdf, mshavit, shameerali.kolothum.thodi, smostafa,
yi.l.liu, aik, zhangfei.gao, patches
On 2024/10/22 21:15, Jason Gunthorpe wrote:
> On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
>
>> Is it feasible to make vIOMMU object more generic, rather than strictly
>> tying it to nested translation? For example, a normal paging domain that
>> translates gPAs to hPAs could also have a vIOMMU object associated with
>> it.
>>
>> While we can only support vIOMMU object allocation uAPI for S2 paging
>> domains in the context of this series, we could consider leaving the
>> option open to associate a vIOMMU object with other normal paging
>> domains that are not a nested parent?
> Why? The nested parent flavour of the domain is basically free to
> create, what reason would be to not do that?
The above addresses my question. The software using vIOMMU should allocate a
domain of nested parent type.
>
> If the HW doesn't support it, then does the HW really need/support a
> VIOMMU?
>
> I suppose it could be made up to the driver, but for now I think we
> should leave it as is in the core code requiring it until we have a
> reason to relax it.
Yes. In such cases, the iommu driver could always allow nested parent
domain allocation, but fail to allocate a nested domain if the hardware
is not capable of nested translation.
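
As a hedged sketch of that pattern (hypothetical "mydrv" names, not any
real driver; only the rejection path is shown):

	static struct iommu_domain *
	mydrv_domain_alloc_nested(struct iommufd_viommu *viommu,
				  const struct iommu_user_data *user_data)
	{
		struct mydrv_viommu *mv =
			container_of(viommu, struct mydrv_viommu, core);

		/*
		 * Nesting parent allocation was always permitted; reject
		 * only here when the HW can't do nested translation.
		 */
		if (!mv->hw_supports_nesting)
			return ERR_PTR(-EOPNOTSUPP);

		return mydrv_alloc_nested_domain(mv, user_data);
	}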
Thanks,
baolu
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
` (10 preceding siblings ...)
2024-10-22 0:19 ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Nicolin Chen
@ 2024-10-25 8:34 ` Tian, Kevin
2024-10-25 15:42 ` Jason Gunthorpe
2024-10-25 16:28 ` Nicolin Chen
11 siblings, 2 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 8:34 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> This series introduces a new vIOMMU infrastructure and related ioctls.
>
> IOMMUFD has been using the HWPT infrastructure for all cases, including a
> nested IO page table support. Yet, there're limitations for an HWPT-based
> structure to support some advanced HW-accelerated features, such as
> CMDQV
> on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-
> IOMMU
> environment, it is not straightforward for nested HWPTs to share the same
> parent HWPT (stage-2 IO pagetable), with the HWPT infrastructure alone: a
> parent HWPT typically hold one stage-2 IO pagetable and tag it with only
> one ID in the cache entries. When sharing one large stage-2 IO pagetable
> across physical IOMMU instances, that one ID may not always be available
> across all the IOMMU instances. In other word, it's ideal for SW to have
> a different container for the stage-2 IO pagetable so it can hold another
> ID that's available.
Just holding multiple IDs doesn't require a different container. This is
just a side effect when vIOMMU will be required for the other said reasons.
If we have to put more words here I'd prefer adding a bit more about
CMDQV, which is more compelling. Not a big deal though. 😊
>
> For this "different container", add vIOMMU, an additional layer to hold
> extra virtualization information:
>
> ________________________________________________________________
> _______
> | iommufd (with vIOMMU) |
> | |
> | [5] |
> | _____________ |
> | | | |
> | |----------------| vIOMMU | |
> | | | | |
> | | | | |
> | | [1] | | [4] [2] |
> | | ______ | | _____________ ________ |
> | | | | | [3] | | | | | |
> | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
> | | |______| |_____________| |_____________| |________| |
> | | | | | | |
>
> |______|________|______________|__________________|_____________
> __|_____|
> | | | | |
> ______v_____ | ______v_____ ______v_____ ___v__
> | struct | | PFN | (paging) | | (nested) | |struct|
> |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----
> |device|
> |____________| storage|____________| |____________| |______|
>
nit - [1] ... [5] can be removed.
> The vIOMMU object should be seen as a slice of a physical IOMMU instance
> that is passed to or shared with a VM. That can be some HW/SW resources:
> - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> - Access to a sharable nesting parent pagetable across physical IOMMUs
> - Virtualization of various platforms IDs, e.g. RIDs and others
> - Delivery of paravirtualized invalidation
> - Direct assigned invalidation queues
> - Direct assigned interrupts
> - Non-affiliated event reporting
sorry no idea about 'non-affiliated event'. Can you elaborate?
>
> On a multi-IOMMU system, the vIOMMU object must be instanced to the
> number
> of the physical IOMMUs that are passed to (via devices) a guest VM, while
'to the number of the physical IOMMUs that have a slice passed to ..."
> being able to hold the shareable parent HWPT. Each vIOMMU then just
> needs
> to allocate its own individual ID to tag its own cache:
> ----------------------------
> ---------------- | | paging_hwpt0 |
> | hwpt_nested0 |--->| viommu0 ------------------
> ---------------- | | IDx |
> ----------------------------
> ----------------------------
> ---------------- | | paging_hwpt0 |
> | hwpt_nested1 |--->| viommu1 ------------------
> ---------------- | | IDy |
> ----------------------------
>
> As an initial part-1, add IOMMUFD_CMD_VIOMMU_ALLOC ioctl for an
> allocation
> only. And implement it in arm-smmu-v3 driver as a real world use case.
>
> More vIOMMU-based structs and ioctls will be introduced in the follow-up
> series to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. Although we
> repurposed the vIOMMU object from an earlier RFC, just for a referece:
> https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
>
> This series is on Github:
> https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v4
> (paring QEMU branch for testing will be provided with the part2 series)
>
> Changelog
> v4
> * Added "Reviewed-by" from Jason
> * Dropped IOMMU_VIOMMU_TYPE_DEFAULT support
> * Dropped iommufd_object_alloc_elm renamings
> * Renamed iommufd's viommu_api.c to driver.c
> * Reworked iommufd_viommu_alloc helper
> * Added a separate iommufd_hwpt_nested_alloc_for_viommu function for
> hwpt_nested allocations on a vIOMMU, and added comparison between
> viommu->iommu_dev->ops and dev_iommu_ops(idev->dev)
> * Replaced s2_parent with vsmmu in arm_smmu_nested_domain
> * Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in
> viommu_ops
> * Replaced wait_queue_head_t with a completion, to delay the unplug of
> mock_iommu_dev
> * Corrected documentation graph that was missing struct iommu_device
> * Added an iommufd_verify_unfinalized_object helper to verify driver-
> allocated vIOMMU/vDEVICE objects
> * Added missing test cases for TEST_LENGTH and fail_nth
> v3
> https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/
> * Rebased on top of Jason's nesting v3 series
> https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-
> smmuv3_nesting_jgg@nvidia.com/
> * Split the series into smaller parts
> * Added Jason's Reviewed-by
> * Added back viommu->iommu_dev
> * Added support for driver-allocated vIOMMU v.s. core-allocated
> * Dropped arm_smmu_cache_invalidate_user
> * Added an iommufd_test_wait_for_users() in selftest
> * Reworked test code to make viommu an individual FIXTURE
> * Added missing TEST_LENGTH case for the new ioctl command
> v2
> https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
> * Limited vdev_id to one per idev
> * Added a rw_sem to protect the vdev_id list
> * Reworked driver-level APIs with proper lockings
> * Added a new viommu_api file for IOMMUFD_DRIVER config
> * Dropped useless iommu_dev point from the viommu structure
> * Added missing index numnbers to new types in the uAPI header
> * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT
> one
> * Reworked mock_viommu_cache_invalidate() using the new iommu helper
> * Reordered details of set/unset_vdev_id handlers for proper lockings
> v1
> https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
>
> Thanks!
> Nicolin
>
> Nicolin Chen (11):
> iommufd: Move struct iommufd_object to public iommufd header
> iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
> iommufd: Add iommufd_verify_unfinalized_object
> iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
> iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
> iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
> iommufd/selftest: Add refcount to mock_iommu_device
> iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
> iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
> Documentation: userspace-api: iommufd: Update vIOMMU
> iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3
> support
>
> drivers/iommu/iommufd/Makefile | 5 +-
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 26 +++---
> drivers/iommu/iommufd/iommufd_private.h | 36 ++------
> drivers/iommu/iommufd/iommufd_test.h | 2 +
> include/linux/iommu.h | 14 +++
> include/linux/iommufd.h | 89 +++++++++++++++++++
> include/uapi/linux/iommufd.h | 56 ++++++++++--
> tools/testing/selftests/iommu/iommufd_utils.h | 28 ++++++
> .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 79 ++++++++++------
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 9 +-
> drivers/iommu/iommufd/driver.c | 38 ++++++++
> drivers/iommu/iommufd/hw_pagetable.c | 69 +++++++++++++-
> drivers/iommu/iommufd/main.c | 58 ++++++------
> drivers/iommu/iommufd/selftest.c | 73 +++++++++++++--
> drivers/iommu/iommufd/viommu.c | 85 ++++++++++++++++++
> tools/testing/selftests/iommu/iommufd.c | 78 ++++++++++++++++
> .../selftests/iommu/iommufd_fail_nth.c | 11 +++
> Documentation/userspace-api/iommufd.rst | 69 +++++++++++++-
> 18 files changed, 701 insertions(+), 124 deletions(-)
> create mode 100644 drivers/iommu/iommufd/driver.c
> create mode 100644 drivers/iommu/iommufd/viommu.c
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header
2024-10-22 0:19 ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Nicolin Chen
@ 2024-10-25 8:34 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 8:34 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> Prepare for an embedded structure design for driver-level iommufd_viommu
> objects:
> // include/linux/iommufd.h
> struct iommufd_viommu {
> struct iommufd_object obj;
> ....
> };
>
> // Some IOMMU driver
> struct iommu_driver_viommu {
> struct iommufd_viommu core;
> ....
> };
>
> It has to expose struct iommufd_object and enum iommufd_object_type
> from
> the core-level private header to the public iommufd header.
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-22 13:15 ` Jason Gunthorpe
2024-10-23 1:48 ` Baolu Lu
@ 2024-10-25 8:47 ` Tian, Kevin
2024-10-25 15:24 ` Jason Gunthorpe
1 sibling, 1 reply; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 8:47 UTC (permalink / raw)
To: Jason Gunthorpe, Baolu Lu
Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, October 22, 2024 9:16 PM
>
> On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
>
> > Is it feasible to make vIOMMU object more generic, rather than strictly
> > tying it to nested translation? For example, a normal paging domain that
> > translates gPAs to hPAs could also have a vIOMMU object associated with
> > it.
> >
> > While we can only support vIOMMU object allocation uAPI for S2 paging
> > domains in the context of this series, we could consider leaving the
> > option open to associate a vIOMMU object with other normal paging
> > domains that are not a nested parent?
>
> Why? The nested parent flavour of the domain is basically free to
> create, what reason would be to not do that?
>
> If the HW doesn't support it, then does the HW really need/support a
> VIOMMU?
>
Now it's agreed to build trusted I/O on top of this new vIOMMU object.
Format-wise it's probably free to assume that nested parent is supported
on any new platform which will support trusted I/O. But I'm not sure
all the conditions around allowing nesting are the same as for trusted I/O,
e.g. for ARM nesting is allowed only with CANWBS/S2FWB. Are they
always guaranteed in a trusted I/O configuration?
Baolu did raise a good open to confirm given it will be used beyond
nesting. 😊
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object
2024-10-22 0:19 ` [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object Nicolin Chen
@ 2024-10-25 8:49 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 8:49 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> To support driver-allocated vIOMMU objects, it's suggested to call the
> allocator helper in IOMMU dirvers. However, there is no guarantee that
> drivers will all use it and allocate objects properly.
>
> Add a helper for iommufd core to verify if an unfinalized object is at
> least reserved in the ictx.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
2024-10-22 0:19 ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Nicolin Chen
@ 2024-10-25 8:59 ` Tian, Kevin
2024-10-25 16:22 ` Nicolin Chen
2024-10-25 9:05 ` Tian, Kevin
1 sibling, 1 reply; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 8:59 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> Add a new ioctl for user space to do a vIOMMU allocation. It must be based
> on a nesting parent HWPT, so take its refcount.
>
> If an IOMMU driver supports a driver-managed vIOMMU object, it must
> define
why highlight 'driver-managed', implying a core-managed vIOMMU
object some day?
> +/**
> + * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
> + * @size: sizeof(struct iommu_viommu_alloc)
> + * @flags: Must be 0
> + * @type: Type of the virtual IOMMU. Must be defined in enum
> iommu_viommu_type
> + * @dev_id: The device's physical IOMMU will be used to back the virtual
> IOMMU
> + * @hwpt_id: ID of a nesting parent HWPT to associate to
> + * @out_viommu_id: Output virtual IOMMU ID for the allocated object
> + *
> + * Allocate a virtual IOMMU object that represents the underlying physical
> + * IOMMU's virtualization support. The vIOMMU object is a security-isolated
> + * slice of the physical IOMMU HW that is unique to a specific VM.
the object itself is a software abstraction, while a 'slice' is a set of
real hw resources.
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
2024-10-22 0:19 ` [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops Nicolin Chen
@ 2024-10-25 9:00 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 9:00 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> Allow IOMMU driver to use a vIOMMU object that holds a nesting parent
> hwpt/domain to allocate a nested domain.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
2024-10-22 0:19 ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Nicolin Chen
@ 2024-10-25 9:04 ` Tian, Kevin
2024-10-25 16:14 ` Nicolin Chen
0 siblings, 1 reply; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 9:04 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> +static struct iommufd_hwpt_nested *
> +iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu
> *viommu,
> + const struct iommu_user_data *user_data)
probably "_for" can be skipped to reduce the name length
> +{
> + struct iommufd_hwpt_nested *hwpt_nested;
> + struct iommufd_hw_pagetable *hwpt;
> + int rc;
> +
> + if (!viommu->ops || !viommu->ops->domain_alloc_nested)
> + return ERR_PTR(-EOPNOTSUPP);
> +
> + hwpt_nested = __iommufd_object_alloc(
> + viommu->ictx, hwpt_nested, IOMMUFD_OBJ_HWPT_NESTED,
> common.obj);
> + if (IS_ERR(hwpt_nested))
> + return ERR_CAST(hwpt_nested);
> + hwpt = &hwpt_nested->common;
> +
> + hwpt_nested->viommu = viommu;
> + hwpt_nested->parent = viommu->hwpt;
> + refcount_inc(&viommu->obj.users);
> +
> + hwpt->domain = viommu->ops->domain_alloc_nested(viommu,
> user_data);
> + if (IS_ERR(hwpt->domain)) {
> + rc = PTR_ERR(hwpt->domain);
> + hwpt->domain = NULL;
> + goto out_abort;
> + }
> + hwpt->domain->owner = viommu->iommu_dev->ops;
> +
> + if (WARN_ON_ONCE(hwpt->domain->type !=
> IOMMU_DOMAIN_NESTED)) {
> + rc = -EINVAL;
> + goto out_abort;
> + }
> + return hwpt_nested;
> +
> +out_abort:
> + iommufd_object_abort_and_destroy(viommu->ictx, &hwpt->obj);
> + return ERR_PTR(rc);
> +}
> +
Looks like a check on flags is missing in this path.
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
2024-10-22 0:19 ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Nicolin Chen
2024-10-25 8:59 ` Tian, Kevin
@ 2024-10-25 9:05 ` Tian, Kevin
2024-10-25 16:17 ` Nicolin Chen
1 sibling, 1 reply; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 9:05 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> +
> + viommu->type = cmd->type;
> + viommu->ictx = ucmd->ictx;
> + viommu->hwpt = hwpt_paging;
> + /* Assume physical IOMMUs are unpluggable (the most likely case)
> */
> + viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
> +
so what would happen if this assumption breaks?
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU
2024-10-22 0:19 ` [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU Nicolin Chen
@ 2024-10-25 9:11 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 9:11 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:20 AM
>
> With the introduction of the new object and its infrastructure, update the
> doc to reflect that and add a new graph.
>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
2024-10-22 0:19 ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Nicolin Chen
@ 2024-10-25 9:18 ` Tian, Kevin
2024-10-25 15:41 ` Jason Gunthorpe
0 siblings, 1 reply; 36+ messages in thread
From: Tian, Kevin @ 2024-10-25 9:18 UTC (permalink / raw)
To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
robin.murphy@arm.com, dwmw2@infradead.org,
baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:20 AM
>
> Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type.
> Implement
> an arm_vsmmu_alloc() with its viommu op
> arm_vsmmu_domain_alloc_nested(),
> to replace arm_smmu_domain_alloc_nesting(). As an initial step, copy the
> VMID from s2_parent. A later cleanup series is required to move the VMID
> allocation out of the stage-2 domain allocation routine to this.
>
> After that, replace nested_domain->s2_parent with nested_domain->vsmmu.
>
> Note that the validatting conditions for a nested_domain allocation are
> moved from arm_vsmmu_domain_alloc_nested to arm_vsmmu_alloc, since
> there
> is no point in creating a vIOMMU (vsmmu) from the beginning if it would
> not support a nested_domain.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Hmm, I wonder whether this series should be merged together with Jason's
nesting series and directly use vIOMMU to create nesting.
Otherwise it looks a bit weird for one series to first enable a uAPI
which is immediately replaced by another uAPI from the following
series. Even if both are merged in one cycle, logically it doesn't
sound clean when looking at the git history.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-25 8:47 ` Tian, Kevin
@ 2024-10-25 15:24 ` Jason Gunthorpe
2024-10-28 2:30 ` Tian, Kevin
0 siblings, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:24 UTC (permalink / raw)
To: Tian, Kevin
Cc: Baolu Lu, Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 08:47:40AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, October 22, 2024 9:16 PM
> >
> > On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> >
> > > Is it feasible to make vIOMMU object more generic, rather than strictly
> > > tying it to nested translation? For example, a normal paging domain that
> > > translates gPAs to hPAs could also have a vIOMMU object associated with
> > > it.
> > >
> > > While we can only support vIOMMU object allocation uAPI for S2 paging
> > > domains in the context of this series, we could consider leaving the
> > > option open to associate a vIOMMU object with other normal paging
> > > domains that are not a nested parent?
> >
> > Why? The nested parent flavour of the domain is basically free to
> > create, what reason would be to not do that?
> >
> > If the HW doesn't support it, then does the HW really need/support a
> > VIOMMU?
>
> Now it's agreed to build trusted I/O on top of this new vIOMMU object.
> format-wise probably it's free to assume that nested parent is supported
> on any new platform which will support trusted I/O. But I'm not sure
> all the conditions around allowing nested are same as for trusted I/O,
> e.g. for ARM nesting is allowed only for CANWBS/S2FWB. Are they
> always guaranteed in trusted I/O configuration?
ARM is a big ? as to what exactly will come, but I'm expecting that to be
resolved either with continued HW support or with Linux adding the cache
flushing and relaxing the test.
> Baolu did raise a good open to confirm given it will be used beyond
> nesting. 😊
Even CC is "nesting", it is just nested with a fixed Identity S1 in
the baseline case. The S2 translation still exists and still has to be
consistent with whatever the secure world is doing.
So, my feeling is that the S2 nested domain is mandatory for the
viommu, especially for CC, it must exists. In the end there may be
more options than just a nested parent.
For instance if the CC design relies on the secure world sharing the
CPU and IOMMU page table we might need a new HWPT type to represent
that configuration.
From a uapi perspective we seem OK here as the hwpt input could be
anything. We might have to adjust some checks in the kernel someday.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
2024-10-25 9:18 ` Tian, Kevin
@ 2024-10-25 15:41 ` Jason Gunthorpe
0 siblings, 0 replies; 36+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:41 UTC (permalink / raw)
To: Tian, Kevin
Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 09:18:05AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:20 AM
> >
> > Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type.
> > Implement
> > an arm_vsmmu_alloc() with its viommu op
> > arm_vsmmu_domain_alloc_nested(),
> > to replace arm_smmu_domain_alloc_nesting(). As an initial step, copy the
> > VMID from s2_parent. A later cleanup series is required to move the VMID
> > allocation out of the stage-2 domain allocation routine to this.
> >
> > After that, replace nested_domain->s2_parent with nested_domain->vsmmu.
> >
> > Note that the validatting conditions for a nested_domain allocation are
> > moved from arm_vsmmu_domain_alloc_nested to arm_vsmmu_alloc, since
> > there
> > is no point in creating a vIOMMU (vsmmu) from the beginning if it would
> > not support a nested_domain.
> >
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
>
> hmm I wonder whether this series should be merged with Jason's
> nesting series together and directly use vIOMMU to create nesting.
> Otherwise it looks a bit weird for one series to first enable a uAPI
> which is immediately replaced by another uAPI from the following
> series.
It has changed from my original expectation, that's for sure. I've
wondered the same thing.
For now I've been keeping them separate and was going to review when
this is all settled down.
It is troublesome because of all the branches, but if we don't have a
conflict we could take the whole lot through iommufd.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
2024-10-25 8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
@ 2024-10-25 15:42 ` Jason Gunthorpe
2024-10-28 2:35 ` Tian, Kevin
2024-10-25 16:28 ` Nicolin Chen
1 sibling, 1 reply; 36+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:42 UTC (permalink / raw)
To: Tian, Kevin
Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > that is passed to or shared with a VM. That can be some HW/SW resources:
> > - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > - Access to a sharable nesting parent pagetable across physical IOMMUs
> > - Virtualization of various platforms IDs, e.g. RIDs and others
> > - Delivery of paravirtualized invalidation
> > - Direct assigned invalidation queues
> > - Direct assigned interrupts
> > - Non-affiliated event reporting
>
> sorry no idea about 'non-affiliated event'. Can you elaborate?
This would be an event that is not connected to a device.
For instance, a CMDQ experienced a problem.
Jason
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
2024-10-25 9:04 ` Tian, Kevin
@ 2024-10-25 16:14 ` Nicolin Chen
0 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:14 UTC (permalink / raw)
To: Tian, Kevin
Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 09:04:15AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > +static struct iommufd_hwpt_nested *
> > +iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu,
> > +                                     const struct iommu_user_data *user_data)
>
> probably "_for" can be skipped to reduce the name length
That would sound like a hwpt_nested allocating a vIOMMU...
It'd probably be neutral to have iommufd_viommu_alloc_hwpt_nested,
yet we already have iommufd_hwpt_nested_alloc (HWPT-based) to align with..
> Looks like a check on flags is missing in this path.
Oh yes, I missed that. Will pass in the cmd->flags.
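Something along these lines, perhaps (just a sketch of the idea; the exact
flags mask to allow still needs to be decided):

static struct iommufd_hwpt_nested *
iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu, u32 flags,
                                     const struct iommu_user_data *user_data)
{
        /* Reject any flag bits this path does not handle yet */
        if (flags)
                return ERR_PTR(-EOPNOTSUPP);
        /* ... rest unchanged, with the caller forwarding cmd->flags ... */
}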
Thanks
Nicolin
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
2024-10-25 9:05 ` Tian, Kevin
@ 2024-10-25 16:17 ` Nicolin Chen
0 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:17 UTC (permalink / raw)
To: Tian, Kevin
Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 09:05:58AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> > +
> > + viommu->type = cmd->type;
> > + viommu->ictx = ucmd->ictx;
> > + viommu->hwpt = hwpt_paging;
> > + /* Assume physical IOMMUs are unpluggable (the most likely case) */
> > + viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
> > +
>
> so what would happen if this assumption breaks?
I had a very verbose comment there previously that Alexey suggested
optimizing away. Perhaps I should add back the part that mentions
adding a refcount for pluggable ones.
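Something like this maybe (only a draft wording, on top of the code quoted
above):

        /*
         * Assume physical IOMMUs are unpluggable for now (the most likely
         * case). If a hot-pluggable physical IOMMU shows up, this pointer
         * would have to hold a reference (e.g. a per-driver refcount) so
         * that the iommu_dev cannot go away while the vIOMMU object is
         * still alive.
         */
        viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);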
Nicolin
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
2024-10-25 8:59 ` Tian, Kevin
@ 2024-10-25 16:22 ` Nicolin Chen
0 siblings, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:22 UTC (permalink / raw)
To: Tian, Kevin
Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 08:59:11AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > Add a new ioctl for user space to do a vIOMMU allocation. It must be based
> > on a nesting parent HWPT, so take its refcount.
> >
> > If an IOMMU driver supports a driver-managed vIOMMU object, it must define
>
> why highlight 'driver-managed', implying a core-managed vIOMMU
> object some day?
Oh, the core-managed vIOMMU has been gone since this version. I should
have updated the commit message here too.
> > +/**
> > + * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
> > + * @size: sizeof(struct iommu_viommu_alloc)
> > + * @flags: Must be 0
> > + * @type: Type of the virtual IOMMU. Must be defined in enum iommu_viommu_type
> > + * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU
> > + * @hwpt_id: ID of a nesting parent HWPT to associate to
> > + * @out_viommu_id: Output virtual IOMMU ID for the allocated object
> > + *
> > + * Allocate a virtual IOMMU object that represents the underlying physical
> > + * IOMMU's virtualization support. The vIOMMU object is a security-isolated
> > + * slice of the physical IOMMU HW that is unique to a specific VM.
>
> the object itself is a software abstraction, while a 'slice' is a set of
> real hw resources.
Yea, let's do this:
* Allocate a virtual IOMMU object, representing the underlying physical IOMMU's
* virtualization support that is a security-isolated slice of the real IOMMU HW
* that is unique to a specific VM.
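FWIW, the intended userspace flow for this ioctl would look roughly like
the following (illustrative only; iommufd, dev_id and s2_hwpt_id stand for
the file descriptor and object IDs obtained earlier):

        struct iommu_viommu_alloc cmd = {
                .size = sizeof(cmd),
                .flags = 0,
                .type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
                .dev_id = dev_id,       /* device backed by the physical SMMUv3 */
                .hwpt_id = s2_hwpt_id,  /* nesting parent HWPT */
        };

        if (ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &cmd))
                return -errno;
        viommu_id = cmd.out_viommu_id;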
Thanks
Nicolin
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
2024-10-25 8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
2024-10-25 15:42 ` Jason Gunthorpe
@ 2024-10-25 16:28 ` Nicolin Chen
1 sibling, 0 replies; 36+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:28 UTC (permalink / raw)
To: Tian, Kevin
Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > This series introduces a new vIOMMU infrastructure and related ioctls.
> >
> > IOMMUFD has been using the HWPT infrastructure for all cases, including a
> > nested IO page table support. Yet, there're limitations for an HWPT-based
> > structure to support some advanced HW-accelerated features, such as
> > CMDQV
> > on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-
> > IOMMU
> > environment, it is not straightforward for nested HWPTs to share the same
> > parent HWPT (stage-2 IO pagetable), with the HWPT infrastructure alone: a
> > parent HWPT typically hold one stage-2 IO pagetable and tag it with only
> > one ID in the cache entries. When sharing one large stage-2 IO pagetable
> > across physical IOMMU instances, that one ID may not always be available
> > across all the IOMMU instances. In other word, it's ideal for SW to have
> > a different container for the stage-2 IO pagetable so it can hold another
> > ID that's available.
>
> Just holding multiple IDs doesn't require a different container. That is
> just a side effect of vIOMMU being required for the other reasons above.
>
> If we have to put more words here I'd prefer adding a bit more about
> CMDQV, which is more compelling. Not a big deal though. 😊
Ack.
> > For this "different container", add vIOMMU, an additional layer to hold
> > extra virtualization information:
> >
> > ________________________________________________________________
> > _______
> > | iommufd (with vIOMMU) |
> > | |
> > | [5] |
> > | _____________ |
> > | | | |
> > | |----------------| vIOMMU | |
> > | | | | |
> > | | | | |
> > | | [1] | | [4] [2] |
> > | | ______ | | _____________ ________ |
> > | | | | | [3] | | | | | |
> > | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
> > | | |______| |_____________| |_____________| |________| |
> > | | | | | | |
> >
> > |______|________|______________|__________________|_____________
> > __|_____|
> > | | | | |
> > ______v_____ | ______v_____ ______v_____ ___v__
> > | struct | | PFN | (paging) | | (nested) | |struct|
> > |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----
> > |device|
> > |____________| storage|____________| |____________| |______|
> >
>
> nit - [1] ... [5] can be removed.
They are copied from the Documentation, where the numbers are needed.
I will take all the numbers out in the cover letter.
> > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > that is passed to or shared with a VM. That can be some HW/SW resources:
> > - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > - Access to a sharable nesting parent pagetable across physical IOMMUs
> > - Virtualization of various platforms IDs, e.g. RIDs and others
> > - Delivery of paravirtualized invalidation
> > - Direct assigned invalidation queues
> > - Direct assigned interrupts
> > - Non-affiliated event reporting
>
> sorry no idea about 'non-affiliated event'. Can you elaborate?
I'll put an "e.g.".
> > On a multi-IOMMU system, the vIOMMU object must be instanced to the
> > number
> > of the physical IOMMUs that are passed to (via devices) a guest VM, while
>
> 'to the number of the physical IOMMUs that have a slice passed to ..."
Ack.
Thanks
Nicolin
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
2024-10-25 15:24 ` Jason Gunthorpe
@ 2024-10-28 2:30 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-28 2:30 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Baolu Lu, Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 25, 2024 11:24 PM
>
> On Fri, Oct 25, 2024 at 08:47:40AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, October 22, 2024 9:16 PM
> > >
> > > On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> > >
> > > > Is it feasible to make vIOMMU object more generic, rather than strictly
> > > > tying it to nested translation? For example, a normal paging domain that
> > > > translates gPAs to hPAs could also have a vIOMMU object associated with
> > > > it.
> > > >
> > > > While we can only support vIOMMU object allocation uAPI for S2 paging
> > > > domains in the context of this series, we could consider leaving the
> > > > option open to associate a vIOMMU object with other normal paging
> > > > domains that are not a nested parent?
> > >
> > > Why? The nested parent flavour of the domain is basically free to
> > > create, what reason would there be not to do that?
> > >
> > > If the HW doesn't support it, then does the HW really need/support a
> > > VIOMMU?
> >
> > Now it's agreed to build trusted I/O on top of this new vIOMMU object.
> > Format-wise it's probably safe to assume that a nested parent is supported
> > on any new platform which will support trusted I/O. But I'm not sure
> > all the conditions around allowing nesting are the same as for trusted I/O,
> > e.g. for ARM nesting is allowed only with CANWBS/S2FWB. Are those
> > always guaranteed in a trusted I/O configuration?
>
> ARM is a big question mark as to what exactly will come, but I'm expecting
> that to be resolved either with continued HW support or with Linux adding
> the cache flushing and relaxing the test.
>
> > Baolu did raise a good open question to confirm, given it will be used
> > beyond nesting. 😊
>
> Even CC is "nesting"; it is just nested with a fixed Identity S1 in
> the baseline case. The S2 translation still exists and still has to be
> consistent with whatever the secure world is doing.
This is true. That is why I asked more about the conditions around
enabling nesting rather than about the translation/format itself.
>
> So, my feeling is that the S2 nested domain is mandatory for the
> viommu; especially for CC, it must exist. In the end there may be
> more options than just a nested parent.
>
> For instance if the CC design relies on the secure world sharing the
> CPU and IOMMU page table we might need a new HWPT type to represent
> that configuration.
>
> From a uapi perspective we seem OK here as the hwpt input could be
> anything. We might have to adjust some checks in the kernel someday.
>
Yes, that could be extended if the need arises.
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
2024-10-25 15:42 ` Jason Gunthorpe
@ 2024-10-28 2:35 ` Tian, Kevin
0 siblings, 0 replies; 36+ messages in thread
From: Tian, Kevin @ 2024-10-28 2:35 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
patches@lists.linux.dev
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 25, 2024 11:43 PM
>
> On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > > that is passed to or shared with a VM. That can be some HW/SW resources:
> > > - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > > - Access to a sharable nesting parent pagetable across physical IOMMUs
> > > - Virtualization of various platforms IDs, e.g. RIDs and others
> > > - Delivery of paravirtualized invalidation
> > > - Direct assigned invalidation queues
> > > - Direct assigned interrupts
> > > - Non-affiliated event reporting
> >
> > sorry no idea about 'non-affiliated event'. Can you elaborate?
>
> This would be an event that is not connected to a device.
>
> For instance, a CMDQ experienced a problem.
>
Okay, then 'non-device-affiliated' is probably clearer.
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2024-10-28 2:37 UTC | newest]
Thread overview: 36+ messages
2024-10-22 0:19 [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Nicolin Chen
2024-10-25 8:34 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Nicolin Chen
2024-10-22 2:28 ` Baolu Lu
2024-10-22 4:40 ` Nicolin Chen
2024-10-22 8:59 ` Baolu Lu
2024-10-22 13:15 ` Jason Gunthorpe
2024-10-23 1:48 ` Baolu Lu
2024-10-25 8:47 ` Tian, Kevin
2024-10-25 15:24 ` Jason Gunthorpe
2024-10-28 2:30 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object Nicolin Chen
2024-10-25 8:49 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Nicolin Chen
2024-10-25 8:59 ` Tian, Kevin
2024-10-25 16:22 ` Nicolin Chen
2024-10-25 9:05 ` Tian, Kevin
2024-10-25 16:17 ` Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops Nicolin Chen
2024-10-25 9:00 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Nicolin Chen
2024-10-25 9:04 ` Tian, Kevin
2024-10-25 16:14 ` Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 07/11] iommufd/selftest: Add refcount to mock_iommu_device Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 08/11] iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 09/11] iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage Nicolin Chen
2024-10-22 0:19 ` [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU Nicolin Chen
2024-10-25 9:11 ` Tian, Kevin
2024-10-22 0:19 ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Nicolin Chen
2024-10-25 9:18 ` Tian, Kevin
2024-10-25 15:41 ` Jason Gunthorpe
2024-10-25 8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
2024-10-25 15:42 ` Jason Gunthorpe
2024-10-28 2:35 ` Tian, Kevin
2024-10-25 16:28 ` Nicolin Chen