Linux-ARM-Kernel Archive on lore.kernel.org
* RE: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
       [not found] <cover.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  8:34 ` Tian, Kevin
  2024-10-25 15:42   ` Jason Gunthorpe
  2024-10-25 16:28   ` Nicolin Chen
       [not found] ` <1452c535a2e6a6c61f38fa752132db7e88b55770.1729553811.git.nicolinc@nvidia.com>
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  8:34 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> 
> This series introduces a new vIOMMU infrastructure and related ioctls.
> 
> IOMMUFD has been using the HWPT infrastructure for all cases, including
> nested IO page table support. Yet, there are limitations for an
> HWPT-based structure to support some advanced HW-accelerated features,
> such as CMDQV on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even
> for a multi-IOMMU environment, it is not straightforward for nested
> HWPTs to share the same parent HWPT (stage-2 IO pagetable) with the
> HWPT infrastructure alone: a parent HWPT typically holds one stage-2
> IO pagetable and tags it with only one ID in the cache entries. When
> sharing one large stage-2 IO pagetable across physical IOMMU instances,
> that one ID may not always be available across all the IOMMU instances.
> In other words, it is ideal for SW to have a different container for
> the stage-2 IO pagetable so it can hold another ID that's available.

Just holding multiple IDs doesn't require a different container. This is
just a side effect when a vIOMMU is required for the other reasons listed.

If we have to put more words here I'd prefer adding a bit more about
CMDQV, which is more compelling. Not a big deal though. 😊

> 
> For this "different container", add vIOMMU, an additional layer to hold
> extra virtualization information:
> 
> _______________________________________________________________________
>  |                      iommufd (with vIOMMU)                            |
>  |                                                                       |
>  |                             [5]                                       |
>  |                        _____________                                  |
>  |                       |             |                                 |
>  |      |----------------|    vIOMMU   |                                 |
>  |      |                |             |                                 |
>  |      |                |             |                                 |
>  |      |      [1]       |             |          [4]             [2]    |
>  |      |     ______     |             |     _____________     ________  |
>  |      |    |      |    |     [3]     |    |             |   |        | |
>  |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
>  |      |    |______|    |_____________|    |_____________|   |________| |
>  |      |        |              |                  |               |     |
> |______|________|______________|__________________|_______________|_____|
>         |        |              |                  |               |
>   ______v_____   |        ______v_____       ______v_____       ___v__
>  |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
>  |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----|device|
>  |____________|   storage|____________|     |____________|     |______|
> 

nit - [1] ... [5] can be removed.

> The vIOMMU object should be seen as a slice of a physical IOMMU instance
> that is passed to or shared with a VM. That can be some HW/SW resources:
>  - Security namespace for guest owned ID, e.g. guest-controlled cache tags
>  - Access to a sharable nesting parent pagetable across physical IOMMUs
>  - Virtualization of various platform IDs, e.g. RIDs and others
>  - Delivery of paravirtualized invalidation
>  - Direct assigned invalidation queues
>  - Direct assigned interrupts
>  - Non-affiliated event reporting

sorry no idea about 'non-affiliated event'. Can you elaborate?

> 
> On a multi-IOMMU system, the vIOMMU object must be instanced to the
> number of the physical IOMMUs that are passed to (via devices) a guest
> VM, while

"to the number of the physical IOMMUs that have a slice passed to ..."

> being able to hold the shareable parent HWPT. Each vIOMMU then just
> needs to allocate its own individual ID to tag its own cache:
>                      ----------------------------
>  ----------------    |         |  paging_hwpt0  |
>  | hwpt_nested0 |--->| viommu0 ------------------
>  ----------------    |         |      IDx       |
>                      ----------------------------
>                      ----------------------------
>  ----------------    |         |  paging_hwpt0  |
>  | hwpt_nested1 |--->| viommu1 ------------------
>  ----------------    |         |      IDy       |
>                      ----------------------------
> 
> As an initial part-1, add an IOMMUFD_CMD_VIOMMU_ALLOC ioctl for the
> allocation only, and implement it in the arm-smmu-v3 driver as a
> real-world use case.
> 
> More vIOMMU-based structs and ioctls will be introduced in the follow-up
> series to support vDEVICE, vIRQ (vEVENT) and vQUEUE objects. Although we
> repurposed the vIOMMU object from an earlier RFC, just for a reference:
> https://lore.kernel.org/all/cover.1712978212.git.nicolinc@nvidia.com/
> 
> This series is on Github:
> https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p1-v4
> (pairing QEMU branch for testing will be provided with the part-2 series)
> 
> Changelog
> v4
>  * Added "Reviewed-by" from Jason
>  * Dropped IOMMU_VIOMMU_TYPE_DEFAULT support
>  * Dropped iommufd_object_alloc_elm renamings
>  * Renamed iommufd's viommu_api.c to driver.c
>  * Reworked iommufd_viommu_alloc helper
>  * Added a separate iommufd_hwpt_nested_alloc_for_viommu function for
>    hwpt_nested allocations on a vIOMMU, and added comparison between
>    viommu->iommu_dev->ops and dev_iommu_ops(idev->dev)
>  * Replaced s2_parent with vsmmu in arm_smmu_nested_domain
>  * Replaced domain_alloc_user in iommu_ops with domain_alloc_nested in
>    viommu_ops
>  * Replaced wait_queue_head_t with a completion, to delay the unplug of
>    mock_iommu_dev
>  * Corrected documentation graph that was missing struct iommu_device
>  * Added an iommufd_verify_unfinalized_object helper to verify driver-
>    allocated vIOMMU/vDEVICE objects
>  * Added missing test cases for TEST_LENGTH and fail_nth
> v3
>  https://lore.kernel.org/all/cover.1728491453.git.nicolinc@nvidia.com/
>  * Rebased on top of Jason's nesting v3 series
>    https://lore.kernel.org/all/0-v3-e2e16cd7467f+2a6a1-smmuv3_nesting_jgg@nvidia.com/
>  * Split the series into smaller parts
>  * Added Jason's Reviewed-by
>  * Added back viommu->iommu_dev
>  * Added support for driver-allocated vIOMMU vs. core-allocated
>  * Dropped arm_smmu_cache_invalidate_user
>  * Added an iommufd_test_wait_for_users() in selftest
>  * Reworked test code to make viommu an individual FIXTURE
>  * Added missing TEST_LENGTH case for the new ioctl command
> v2
>  https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
>  * Limited vdev_id to one per idev
>  * Added a rw_sem to protect the vdev_id list
>  * Reworked driver-level APIs with proper lockings
>  * Added a new viommu_api file for IOMMUFD_DRIVER config
>  * Dropped useless iommu_dev point from the viommu structure
>  * Added missing index numbers to new types in the uAPI header
>  * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; instead, reuse the HWPT one
>  * Reworked mock_viommu_cache_invalidate() using the new iommu helper
>  * Reordered details of set/unset_vdev_id handlers for proper lockings
> v1
>  https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
> 
> Thanks!
> Nicolin
> 
> Nicolin Chen (11):
>   iommufd: Move struct iommufd_object to public iommufd header
>   iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
>   iommufd: Add iommufd_verify_unfinalized_object
>   iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
>   iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
>   iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
>   iommufd/selftest: Add refcount to mock_iommu_device
>   iommufd/selftest: Add IOMMU_VIOMMU_TYPE_SELFTEST
>   iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverage
>   Documentation: userspace-api: iommufd: Update vIOMMU
>   iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
> 
>  drivers/iommu/iommufd/Makefile                |  5 +-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   | 26 +++---
>  drivers/iommu/iommufd/iommufd_private.h       | 36 ++------
>  drivers/iommu/iommufd/iommufd_test.h          |  2 +
>  include/linux/iommu.h                         | 14 +++
>  include/linux/iommufd.h                       | 89 +++++++++++++++++++
>  include/uapi/linux/iommufd.h                  | 56 ++++++++++--
>  tools/testing/selftests/iommu/iommufd_utils.h | 28 ++++++
>  .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 79 ++++++++++------
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   |  9 +-
>  drivers/iommu/iommufd/driver.c                | 38 ++++++++
>  drivers/iommu/iommufd/hw_pagetable.c          | 69 +++++++++++++-
>  drivers/iommu/iommufd/main.c                  | 58 ++++++------
>  drivers/iommu/iommufd/selftest.c              | 73 +++++++++++++--
>  drivers/iommu/iommufd/viommu.c                | 85 ++++++++++++++++++
>  tools/testing/selftests/iommu/iommufd.c       | 78 ++++++++++++++++
>  .../selftests/iommu/iommufd_fail_nth.c        | 11 +++
>  Documentation/userspace-api/iommufd.rst       | 69 +++++++++++++-
>  18 files changed, 701 insertions(+), 124 deletions(-)
>  create mode 100644 drivers/iommu/iommufd/driver.c
>  create mode 100644 drivers/iommu/iommufd/viommu.c
> 
> --
> 2.43.0
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header
       [not found] ` <1452c535a2e6a6c61f38fa752132db7e88b55770.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  8:34   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  8:34 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> 
> Prepare for an embedded structure design for driver-level iommufd_viommu
> objects:
>     // include/linux/iommufd.h
>     struct iommufd_viommu {
>         struct iommufd_object obj;
>         ....
>     };
> 
>     // Some IOMMU driver
>     struct iommu_driver_viommu {
>         struct iommufd_viommu core;
>         ....
>     };
> 
> It has to expose struct iommufd_object and enum iommufd_object_type
> from the core-level private header to the public iommufd header.
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>



* RE: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
       [not found]         ` <20241022131554.GF13034@nvidia.com>
@ 2024-10-25  8:47           ` Tian, Kevin
  2024-10-25 15:24             ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  8:47 UTC (permalink / raw)
  To: Jason Gunthorpe, Baolu Lu
  Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Tuesday, October 22, 2024 9:16 PM
> 
> On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> 
> > Is it feasible to make vIOMMU object more generic, rather than strictly
> > tying it to nested translation? For example, a normal paging domain that
> > translates gPAs to hPAs could also have a vIOMMU object associated with
> > it.
> >
> > While we can only support vIOMMU object allocation uAPI for S2 paging
> > domains in the context of this series, we could consider leaving the
> > option open to associate a vIOMMU object with other normal paging
> > domains that are not a nested parent?
> 
> Why? The nested parent flavour of the domain is basically free to
> create, what reason would be to not do that?
> 
> If the HW doesn't support it, then does the HW really need/support a
> VIOMMU?
> 

Now it's agreed to build trusted I/O on top of this new vIOMMU object.
Format-wise it's probably fine to assume that a nesting parent is
supported on any new platform which will support trusted I/O. But I'm
not sure all the conditions for allowing nesting are the same as for
trusted I/O, e.g. on ARM nesting is allowed only with CANWBS/S2FWB. Are
those always guaranteed in a trusted I/O configuration?

Baolu did raise a good open question to confirm, given it will be used
beyond nesting. 😊


* RE: [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object
       [not found] ` <b48254df9ee530bedaf436e5aff279d9a882a7ca.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  8:49   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  8:49 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> 
> To support driver-allocated vIOMMU objects, it's suggested to call the
> allocator helper in IOMMU drivers. However, there is no guarantee that
> drivers will all use it and allocate objects properly.
> 
> Add a helper for iommufd core to verify if an unfinalized object is at
> least reserved in the ictx.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>



* RE: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
       [not found] ` <9da2cf334a182ced4d4ffa578b87889e9c0856f3.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  8:59   ` Tian, Kevin
  2024-10-25 16:22     ` Nicolin Chen
  2024-10-25  9:05   ` Tian, Kevin
  1 sibling, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  8:59 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> 
> Add a new ioctl for user space to do a vIOMMU allocation. It must be based
> on a nesting parent HWPT, so take its refcount.
> 
> If an IOMMU driver supports a driver-managed vIOMMU object, it must define

why highlight 'driver-managed', implying a core-managed vIOMMU 
object some day?

> +/**
> + * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
> + * @size: sizeof(struct iommu_viommu_alloc)
> + * @flags: Must be 0
> + * @type: Type of the virtual IOMMU. Must be defined in enum iommu_viommu_type
> + * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU
> + * @hwpt_id: ID of a nesting parent HWPT to associate to
> + * @out_viommu_id: Output virtual IOMMU ID for the allocated object
> + *
> + * Allocate a virtual IOMMU object that represents the underlying physical
> + * IOMMU's virtualization support. The vIOMMU object is a security-isolated
> + * slice of the physical IOMMU HW that is unique to a specific VM.

the object itself is a software abstraction, while a 'slice' is a set of
real hw resources.




* RE: [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops
       [not found] ` <1314e12b76edd84a8c9d0f14a6598538c8eeb50e.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  9:00   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  9:00 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> 
> Allow IOMMU driver to use a vIOMMU object that holds a nesting parent
> hwpt/domain to allocate a nested domain.
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>



* RE: [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
       [not found] ` <ba9ce9b3c8a93c9c790fee52961d075dfbb63ca7.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  9:04   ` Tian, Kevin
  2024-10-25 16:14     ` Nicolin Chen
  0 siblings, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  9:04 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
>
> +static struct iommufd_hwpt_nested *
> +iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu,
> +				     const struct iommu_user_data *user_data)

probably "_for" can be skipped to reduce the name length

> +{
> +	struct iommufd_hwpt_nested *hwpt_nested;
> +	struct iommufd_hw_pagetable *hwpt;
> +	int rc;
> +
> +	if (!viommu->ops || !viommu->ops->domain_alloc_nested)
> +		return ERR_PTR(-EOPNOTSUPP);
> +
> +	hwpt_nested = __iommufd_object_alloc(
> +		viommu->ictx, hwpt_nested, IOMMUFD_OBJ_HWPT_NESTED, common.obj);
> +	if (IS_ERR(hwpt_nested))
> +		return ERR_CAST(hwpt_nested);
> +	hwpt = &hwpt_nested->common;
> +
> +	hwpt_nested->viommu = viommu;
> +	hwpt_nested->parent = viommu->hwpt;
> +	refcount_inc(&viommu->obj.users);
> +
> +	hwpt->domain = viommu->ops->domain_alloc_nested(viommu, user_data);
> +	if (IS_ERR(hwpt->domain)) {
> +		rc = PTR_ERR(hwpt->domain);
> +		hwpt->domain = NULL;
> +		goto out_abort;
> +	}
> +	hwpt->domain->owner = viommu->iommu_dev->ops;
> +
> +	if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
> +		rc = -EINVAL;
> +		goto out_abort;
> +	}
> +	return hwpt_nested;
> +
> +out_abort:
> +	iommufd_object_abort_and_destroy(viommu->ictx, &hwpt->obj);
> +	return ERR_PTR(rc);
> +}
> +

Looks like a check on flags is missed in this path.




* RE: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
       [not found] ` <9da2cf334a182ced4d4ffa578b87889e9c0856f3.1729553811.git.nicolinc@nvidia.com>
  2024-10-25  8:59   ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Tian, Kevin
@ 2024-10-25  9:05   ` Tian, Kevin
  2024-10-25 16:17     ` Nicolin Chen
  1 sibling, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  9:05 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:19 AM
> +
> +	viommu->type = cmd->type;
> +	viommu->ictx = ucmd->ictx;
> +	viommu->hwpt = hwpt_paging;
> +	/* Assume physical IOMMUs are unpluggable (the most likely case) */
> +	viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
> +

so what would happen if this assumption breaks?



* RE: [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU
       [not found] ` <97fa87a511f0132ee0e233cadf09af075e4404c5.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  9:11   ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  9:11 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:20 AM
> 
> With the introduction of the new object and its infrastructure, update the
> doc to reflect that and add a new graph.
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>



* RE: [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
       [not found] ` <2180fdf423d0f2fcc5c031687690100b12c2ba51.1729553811.git.nicolinc@nvidia.com>
@ 2024-10-25  9:18   ` Tian, Kevin
  2024-10-25 15:41     ` Jason Gunthorpe
  0 siblings, 1 reply; 19+ messages in thread
From: Tian, Kevin @ 2024-10-25  9:18 UTC (permalink / raw)
  To: Nicolin Chen, jgg@nvidia.com, will@kernel.org
  Cc: joro@8bytes.org, suravee.suthikulpanit@amd.com,
	robin.murphy@arm.com, dwmw2@infradead.org,
	baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Tuesday, October 22, 2024 8:20 AM
> 
> Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type.
> Implement an arm_vsmmu_alloc() with its viommu op
> arm_vsmmu_domain_alloc_nested(), to replace
> arm_smmu_domain_alloc_nesting(). As an initial step, copy the VMID from
> s2_parent. A later cleanup series is required to move the VMID
> allocation out of the stage-2 domain allocation routine to this.
> 
> After that, replace nested_domain->s2_parent with nested_domain->vsmmu.
> 
> Note that the validating conditions for a nested_domain allocation are
> moved from arm_vsmmu_domain_alloc_nested to arm_vsmmu_alloc, since
> there is no point in creating a vIOMMU (vsmmu) from the beginning if it
> would not support a nested_domain.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

hmm I wonder whether this series should be merged with Jason's
nesting series together and directly use vIOMMU to create nesting.
Otherwise it looks a bit weird for one series to first enable a uAPI
which is immediately replaced by another uAPI from the following
series. Even if both are merged in one cycle, logically it doesn't
sound clean when looking at the git history.






* Re: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
  2024-10-25  8:47           ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Tian, Kevin
@ 2024-10-25 15:24             ` Jason Gunthorpe
  2024-10-28  2:30               ` Tian, Kevin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:24 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Baolu Lu, Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 08:47:40AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, October 22, 2024 9:16 PM
> > 
> > On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> > 
> > > Is it feasible to make vIOMMU object more generic, rather than strictly
> > > tying it to nested translation? For example, a normal paging domain that
> > > translates gPAs to hPAs could also have a vIOMMU object associated with
> > > it.
> > >
> > > While we can only support vIOMMU object allocation uAPI for S2 paging
> > > domains in the context of this series, we could consider leaving the
> > > option open to associate a vIOMMU object with other normal paging
> > > domains that are not a nested parent?
> > 
> > Why? The nested parent flavour of the domain is basically free to
> > create, what reason would be to not do that?
> > 
> > If the HW doesn't support it, then does the HW really need/support a
> > VIOMMU?
> 
> Now it's agreed to build trusted I/O on top of this new vIOMMU object.
> Format-wise it's probably fine to assume that a nesting parent is
> supported on any new platform which will support trusted I/O. But I'm
> not sure all the conditions for allowing nesting are the same as for
> trusted I/O, e.g. on ARM nesting is allowed only with CANWBS/S2FWB. Are
> those always guaranteed in a trusted I/O configuration?

ARM is a big '?' as to what exactly will come, but I'm expecting that
to be resolved either with continued HW support, or Linux will add the
cache flushing and relax the test.

> Baolu did raise a good open question to confirm, given it will be used
> beyond nesting. 😊

Even CC is "nesting", it is just nested with a fixed Identity S1 in
the baseline case. The S2 translation still exists and still has to be
consistent with whatever the secure world is doing.

So, my feeling is that the S2 nested domain is mandatory for the
viommu; especially for CC, it must exist. In the end there may be
more options than just a nested parent.

For instance if the CC design relies on the secure world sharing the
CPU and IOMMU page table we might need a new HWPT type to represent
that configuration.

From a uapi perspective we seem OK here as the hwpt input could be
anything. We might have to adjust some checks in the kernel someday.

Jason



* Re: [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support
  2024-10-25  9:18   ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Tian, Kevin
@ 2024-10-25 15:41     ` Jason Gunthorpe
  0 siblings, 0 replies; 19+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:41 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 09:18:05AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:20 AM
> > 
> > Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type.
> > Implement an arm_vsmmu_alloc() with its viommu op
> > arm_vsmmu_domain_alloc_nested(), to replace
> > arm_smmu_domain_alloc_nesting(). As an initial step, copy the VMID from
> > s2_parent. A later cleanup series is required to move the VMID
> > allocation out of the stage-2 domain allocation routine to this.
> > 
> > After that, replace nested_domain->s2_parent with nested_domain->vsmmu.
> > 
> > Note that the validating conditions for a nested_domain allocation are
> > moved from arm_vsmmu_domain_alloc_nested to arm_vsmmu_alloc, since
> > there is no point in creating a vIOMMU (vsmmu) from the beginning if it
> > would not support a nested_domain.
> > 
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> 
> hmm I wonder whether this series should be merged with Jason's
> nesting series together and directly use vIOMMU to create nesting.
> Otherwise it looks a bit weird for one series to first enable a uAPI
> which is immediately replaced by another uAPI from the following
> series.

It has changed from my original expectation, that's for sure. I've
wondered the same thing.

For now I've been keeping them separate and was going to review when
this is all settled down.

It is troublesome because of all the branches, but if we don't have a
conflict we could take the whole lot through iommufd.

Jason


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
  2024-10-25  8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
@ 2024-10-25 15:42   ` Jason Gunthorpe
  2024-10-28  2:35     ` Tian, Kevin
  2024-10-25 16:28   ` Nicolin Chen
  1 sibling, 1 reply; 19+ messages in thread
From: Jason Gunthorpe @ 2024-10-25 15:42 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > that is passed to or shared with a VM. That can be some HW/SW resources:
> >  - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> >  - Access to a sharable nesting parent pagetable across physical IOMMUs
> >  - Virtualization of various platforms IDs, e.g. RIDs and others
> >  - Delivery of paravirtualized invalidation
> >  - Direct assigned invalidation queues
> >  - Direct assigned interrupts
> >  - Non-affiliated event reporting
> 
> sorry no idea about 'non-affiliated event'. Can you elaborate?

This would be an event that is not connected to a device.

For instance a CMDQ experienced a problem.

Jason


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC
  2024-10-25  9:04   ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Tian, Kevin
@ 2024-10-25 16:14     ` Nicolin Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:14 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 09:04:15AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > +static struct iommufd_hwpt_nested *
> > +iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu,
> > +                                  const struct iommu_user_data *user_data)
> 
> probably "_for" can be skipped to reduce the name length

That would sound like a hwpt_nested allocating vIOMMU...

It'd probably be more neutral to have iommufd_viommu_alloc_hwpt_nested,
yet we have iommufd_hwpt_nested_alloc (HWPT-based) to align with...

> looks like a check on flags is missing in this path.

Oh yes, I missed that. Will pass in the cmd->flags.

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
  2024-10-25  9:05   ` Tian, Kevin
@ 2024-10-25 16:17     ` Nicolin Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:17 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 09:05:58AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> > +
> > +     viommu->type = cmd->type;
> > +     viommu->ictx = ucmd->ictx;
> > +     viommu->hwpt = hwpt_paging;
> > +     /* Assume physical IOMMUs are unpluggable (the most likely case) */
> > +     viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
> > +
> 
> so what would happen if this assumption breaks?

I had a very verbose comment here previously that Alexey suggested to
optimize away... Perhaps I should add back the part that mentions
adding a refcount for pluggable ones.

Nicolin


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl
  2024-10-25  8:59   ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Tian, Kevin
@ 2024-10-25 16:22     ` Nicolin Chen
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:22 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 08:59:11AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > Add a new ioctl for user space to do a vIOMMU allocation. It must be based
> > on a nesting parent HWPT, so take its refcount.
> >
> > If an IOMMU driver supports a driver-managed vIOMMU object, it must
> > define
> 
> why highlight 'driver-managed', implying a core-managed vIOMMU
> object some day?

Oh, core-managed vIOMMU is gone since this version. I should have
updated the commit message here too.

> > +/**
> > + * struct iommu_viommu_alloc - ioctl(IOMMU_VIOMMU_ALLOC)
> > + * @size: sizeof(struct iommu_viommu_alloc)
> > + * @flags: Must be 0
> > + * @type: Type of the virtual IOMMU. Must be defined in enum iommu_viommu_type
> > + * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU
> > + * @hwpt_id: ID of a nesting parent HWPT to associate to
> > + * @out_viommu_id: Output virtual IOMMU ID for the allocated object
> > + *
> > + * Allocate a virtual IOMMU object that represents the underlying physical
> > + * IOMMU's virtualization support. The vIOMMU object is a security-isolated
> > + * slice of the physical IOMMU HW that is unique to a specific VM.
> 
> the object itself is a software abstraction, while a 'slice' is a set of
> real hw resources.

Yea, let's do this:
 * Allocate a virtual IOMMU object, representing the underlying physical IOMMU's
 * virtualization support that is a security-isolated slice of the real IOMMU HW
 * that is unique to a specific VM.

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
  2024-10-25  8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
  2024-10-25 15:42   ` Jason Gunthorpe
@ 2024-10-25 16:28   ` Nicolin Chen
  1 sibling, 0 replies; 19+ messages in thread
From: Nicolin Chen @ 2024-10-25 16:28 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: jgg@nvidia.com, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Tuesday, October 22, 2024 8:19 AM
> >
> > This series introduces a new vIOMMU infrastructure and related ioctls.
> >
> > IOMMUFD has been using the HWPT infrastructure for all cases, including
> > nested IO page table support. Yet, there are limitations for an HWPT-based
> > structure to support some advanced HW-accelerated features, such as CMDQV
> > on NVIDIA Grace, and HW-accelerated vIOMMU on AMD. Even for a multi-IOMMU
> > environment, it is not straightforward for nested HWPTs to share the same
> > parent HWPT (stage-2 IO pagetable) with the HWPT infrastructure alone: a
> > parent HWPT typically holds one stage-2 IO pagetable and tags it with only
> > one ID in the cache entries. When sharing one large stage-2 IO pagetable
> > across physical IOMMU instances, that one ID may not always be available
> > across all the IOMMU instances. In other words, it's ideal for SW to have
> > a different container for the stage-2 IO pagetable so it can hold another
> > ID that's available.
> 
> Just holding multiple IDs doesn't require a different container. This is
> just a side effect when vIOMMU will be required for other said reasons.
> 
> If we have to put more words here I'd prefer to adding a bit more for
> CMDQV which is more compelling. not a big deal though. 😊

Ack.

> > For this "different container", add vIOMMU, an additional layer to hold
> > extra virtualization information:
> >
> >   _______________________________________________________________________
> >  |                      iommufd (with vIOMMU)                            |
> >  |                                                                       |
> >  |                             [5]                                       |
> >  |                        _____________                                  |
> >  |                       |             |                                 |
> >  |      |----------------|    vIOMMU   |                                 |
> >  |      |                |             |                                 |
> >  |      |                |             |                                 |
> >  |      |      [1]       |             |          [4]             [2]    |
> >  |      |     ______     |             |     _____________     ________  |
> >  |      |    |      |    |     [3]     |    |             |   |        | |
> >  |      |    | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
> >  |      |    |______|    |_____________|    |_____________|   |________| |
> >  |      |        |              |                  |               |     |
> >  |______|________|______________|__________________|_______________|_____|
> >          |        |              |                  |               |
> >    ______v_____   |        ______v_____       ______v_____       ___v__
> >   |   struct   |  |  PFN  |  (paging)  |     |  (nested)  |     |struct|
> >   |iommu_device|  |------>|iommu_domain|<----|iommu_domain|<----|device|
> >   |____________|   storage|____________|     |____________|     |______|
> >
> 
> nit - [1] ... [5] can be removed.

They are copied from the Documentation where numbers are needed.
I will take all the numbers out in the cover-letters.

> > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > that is passed to or shared with a VM. That can be some HW/SW resources:
> >  - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> >  - Access to a sharable nesting parent pagetable across physical IOMMUs
> >  - Virtualization of various platforms IDs, e.g. RIDs and others
> >  - Delivery of paravirtualized invalidation
> >  - Direct assigned invalidation queues
> >  - Direct assigned interrupts
> >  - Non-affiliated event reporting
> 
> sorry no idea about 'non-affiliated event'. Can you elaborate?

I'll put an "e.g.".

> > On a multi-IOMMU system, the vIOMMU object must be instanced to the number
> > of the physical IOMMUs that are passed to (via devices) a guest VM, while
> 
> 'to the number of the physical IOMMUs that have a slice passed to ..."

Ack.

Thanks
Nicolin


^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct
  2024-10-25 15:24             ` Jason Gunthorpe
@ 2024-10-28  2:30               ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-28  2:30 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Baolu Lu, Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 25, 2024 11:24 PM
> 
> On Fri, Oct 25, 2024 at 08:47:40AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Tuesday, October 22, 2024 9:16 PM
> > >
> > > On Tue, Oct 22, 2024 at 04:59:07PM +0800, Baolu Lu wrote:
> > >
> > > > Is it feasible to make vIOMMU object more generic, rather than strictly
> > > > tying it to nested translation? For example, a normal paging domain that
> > > > translates gPAs to hPAs could also have a vIOMMU object associated with
> > > > it.
> > > >
> > > > While we can only support vIOMMU object allocation uAPI for S2 paging
> > > > domains in the context of this series, we could consider leaving the
> > > > option open to associate a vIOMMU object with other normal paging
> > > > domains that are not a nested parent?
> > >
> > > Why? The nested parent flavour of the domain is basically free to
> > > create, what reason would be to not do that?
> > >
> > > If the HW doesn't support it, then does the HW really need/support a
> > > VIOMMU?
> >
> > Now it's agreed to build trusted I/O on top of this new vIOMMU object.
> > Format-wise it's probably free to assume that nested parent is supported
> > on any new platform which will support trusted I/O. But I'm not sure
> > all the conditions around allowing nested are same as for trusted I/O,
> > e.g. for ARM nesting is allowed only for CANWBS/S2FWB. Are they
> > always guaranteed in trusted I/O configuration?
> 
> ARM is a big '?' as to what exactly will come, but I'm expecting that to be
> resolved either with continued HW support or Linux will add the cache
> flushing and relax the test.
> 
> > Baolu did raise a good open to confirm given it will be used beyond
> > nesting. 😊
> 
> Even CC is "nesting", it is just nested with a fixed Identity S1 in
> the baseline case. The S2 translation still exists and still has to be
> consistent with whatever the secure world is doing.

this is true. That is why I asked more about the conditions around
enabling nesting than about the translation/format itself.

> 
> So, my feeling is that the S2 nested domain is mandatory for the
> viommu, especially for CC, it must exists. In the end there may be
> more options than just a nested parent.
> 
> For instance if the CC design relies on the secure world sharing the
> CPU and IOMMU page table we might need a new HWPT type to represent
> that configuration.
> 
> From a uapi perspective we seem OK here as the hwpt input could be
> anything. We might have to adjust some checks in the kernel someday.
> 

yes, that could be extended if the need arises.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* RE: [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1)
  2024-10-25 15:42   ` Jason Gunthorpe
@ 2024-10-28  2:35     ` Tian, Kevin
  0 siblings, 0 replies; 19+ messages in thread
From: Tian, Kevin @ 2024-10-28  2:35 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolin Chen, will@kernel.org, joro@8bytes.org,
	suravee.suthikulpanit@amd.com, robin.murphy@arm.com,
	dwmw2@infradead.org, baolu.lu@linux.intel.com, shuah@kernel.org,
	linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org, eric.auger@redhat.com,
	jean-philippe@linaro.org, mdf@kernel.org, mshavit@google.com,
	shameerali.kolothum.thodi@huawei.com, smostafa@google.com,
	Liu, Yi L, aik@amd.com, zhangfei.gao@linaro.org,
	patches@lists.linux.dev

> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 25, 2024 11:43 PM
> 
> On Fri, Oct 25, 2024 at 08:34:05AM +0000, Tian, Kevin wrote:
> > > The vIOMMU object should be seen as a slice of a physical IOMMU instance
> > > that is passed to or shared with a VM. That can be some HW/SW resources:
> > >  - Security namespace for guest owned ID, e.g. guest-controlled cache tags
> > >  - Access to a sharable nesting parent pagetable across physical IOMMUs
> > >  - Virtualization of various platforms IDs, e.g. RIDs and others
> > >  - Delivery of paravirtualized invalidation
> > >  - Direct assigned invalidation queues
> > >  - Direct assigned interrupts
> > >  - Non-affiliated event reporting
> >
> > sorry no idea about 'non-affiliated event'. Can you elaborate?
> 
> This would be an event that is not connected to a device.
> 
> For instance a CMDQ experienced a problem.
> 


Okay, then 'non-device-affiliated' is probably clearer.


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-10-28  2:37 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <cover.1729553811.git.nicolinc@nvidia.com>
2024-10-25  8:34 ` [PATCH v4 00/11] iommufd: Add vIOMMU infrastructure (Part-1) Tian, Kevin
2024-10-25 15:42   ` Jason Gunthorpe
2024-10-28  2:35     ` Tian, Kevin
2024-10-25 16:28   ` Nicolin Chen
     [not found] ` <1452c535a2e6a6c61f38fa752132db7e88b55770.1729553811.git.nicolinc@nvidia.com>
2024-10-25  8:34   ` [PATCH v4 01/11] iommufd: Move struct iommufd_object to public iommufd header Tian, Kevin
     [not found] ` <74fec8c38a7d568bd88beba9082b4a5a4bc2046f.1729553811.git.nicolinc@nvidia.com>
     [not found]   ` <b2c75705-2998-4e51-90f4-00b8bab785f5@linux.intel.com>
     [not found]     ` <ZxcspVGPBmABjUPu@nvidia.com>
     [not found]       ` <dd7eb37f-13c6-4c6e-8adc-954ad9974b93@linux.intel.com>
     [not found]         ` <20241022131554.GF13034@nvidia.com>
2024-10-25  8:47           ` [PATCH v4 02/11] iommufd: Introduce IOMMUFD_OBJ_VIOMMU and its related struct Tian, Kevin
2024-10-25 15:24             ` Jason Gunthorpe
2024-10-28  2:30               ` Tian, Kevin
     [not found] ` <b48254df9ee530bedaf436e5aff279d9a882a7ca.1729553811.git.nicolinc@nvidia.com>
2024-10-25  8:49   ` [PATCH v4 03/11] iommufd: Add iommufd_verify_unfinalized_object Tian, Kevin
     [not found] ` <9da2cf334a182ced4d4ffa578b87889e9c0856f3.1729553811.git.nicolinc@nvidia.com>
2024-10-25  8:59   ` [PATCH v4 04/11] iommufd/viommu: Add IOMMU_VIOMMU_ALLOC ioctl Tian, Kevin
2024-10-25 16:22     ` Nicolin Chen
2024-10-25  9:05   ` Tian, Kevin
2024-10-25 16:17     ` Nicolin Chen
     [not found] ` <1314e12b76edd84a8c9d0f14a6598538c8eeb50e.1729553811.git.nicolinc@nvidia.com>
2024-10-25  9:00   ` [PATCH v4 05/11] iommufd: Add domain_alloc_nested op to iommufd_viommu_ops Tian, Kevin
     [not found] ` <ba9ce9b3c8a93c9c790fee52961d075dfbb63ca7.1729553811.git.nicolinc@nvidia.com>
2024-10-25  9:04   ` [PATCH v4 06/11] iommufd: Allow pt_id to carry viommu_id for IOMMU_HWPT_ALLOC Tian, Kevin
2024-10-25 16:14     ` Nicolin Chen
     [not found] ` <97fa87a511f0132ee0e233cadf09af075e4404c5.1729553811.git.nicolinc@nvidia.com>
2024-10-25  9:11   ` [PATCH v4 10/11] Documentation: userspace-api: iommufd: Update vIOMMU Tian, Kevin
     [not found] ` <2180fdf423d0f2fcc5c031687690100b12c2ba51.1729553811.git.nicolinc@nvidia.com>
2024-10-25  9:18   ` [PATCH v4 11/11] iommu/arm-smmu-v3: Add IOMMU_VIOMMU_TYPE_ARM_SMMUV3 support Tian, Kevin
2024-10-25 15:41     ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox