linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/3] Remove split on unmap behavior
@ 2024-11-04 17:41 Jason Gunthorpe
  2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 17:41 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

This is the result of the discussion on removing split. We agreed that
split is not required, and no application should ask for anything that
would not unmap a full large IOPTE.

Instead of splitting, the two ARM drivers will now WARN_ONCE() and return 0.
This is in contrast to several other drivers, which remove the whole IOPTE
and return its size.

The kdoc is updated to describe this.

v2:
 - Use WARN_ON instead of duplicating AMD behavior
 - Add arm-v7s patch
 - Write a kdoc for iommu_unmap()
v1: https://patch.msgid.link/r/0-v1-8c5f369ec2e5+75-arm_no_split_jgg@nvidia.com

Jason Gunthorpe (3):
  iommu/io-pgtable-arm: Remove split on unmap behavior
  iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  iommu: Add a kdoc to iommu_unmap()

 drivers/iommu/io-pgtable-arm-v7s.c | 125 +----------------------------
 drivers/iommu/io-pgtable-arm.c     |  68 +---------------
 drivers/iommu/iommu.c              |  14 ++++
 3 files changed, 20 insertions(+), 187 deletions(-)


base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
-- 
2.43.0




* [PATCH v2 1/3] iommu/io-pgtable-arm: Remove split on unmap behavior
  2024-11-04 17:41 [PATCH v2 0/3] Remove split on unmap behavior Jason Gunthorpe
@ 2024-11-04 17:41 ` Jason Gunthorpe
  2024-11-04 18:38   ` Liviu Dudau
  2024-11-06 15:12   ` Steven Price
  2024-11-04 17:41 ` [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: " Jason Gunthorpe
  2024-11-04 17:41 ` [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap() Jason Gunthorpe
  2 siblings, 2 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 17:41 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

A minority of page table implementations (arm_lpae, armv7) are unique in
how they handle partial unmap of large IOPTEs.

Other implementations will unmap the large IOPTE and return its
length. For example, if a 2M IOPTE is present and the first 4K is requested
to be unmapped then unmap will remove the whole 2M and report 2M as the
result.
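
For illustration only (not part of the patch), the two return-value
conventions can be modelled in userspace. Everything here is a hypothetical
stand-in: `struct toy_iopte` and the `toy_unmap_*` helpers are not kernel
APIs, and the "strict" variant only mirrors the WARN-and-return-0 behavior
this series introduces for the ARM formats.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SZ_4K 0x1000UL
#define SZ_2M 0x200000UL

/* Hypothetical stand-in for one leaf IOPTE; not a kernel structure. */
struct toy_iopte {
	unsigned long iova;	/* start of the range the entry translates */
	size_t size;		/* SZ_4K or SZ_2M in this model */
	bool present;
};

static bool toy_hits(const struct toy_iopte *pte, unsigned long iova)
{
	return pte->present && iova >= pte->iova &&
	       iova < pte->iova + pte->size;
}

/* "Other implementations": drop the whole entry and report its length. */
static size_t toy_unmap_whole(struct toy_iopte *pte, unsigned long iova,
			      size_t size)
{
	(void)size;	/* the requested size is ignored in this convention */
	if (!toy_hits(pte, iova))
		return 0;
	pte->present = false;
	return pte->size;
}

/* After this series (ARM formats): refuse a partial unmap, unmap nothing. */
static size_t toy_unmap_strict(struct toy_iopte *pte, unsigned long iova,
			       size_t size)
{
	if (!toy_hits(pte, iova))
		return 0;
	if (iova != pte->iova || size != pte->size) {
		fprintf(stderr, "Unmap of a partial large IOPTE is not allowed\n");
		return 0;
	}
	pte->present = false;
	return pte->size;
}
```

With a 2M entry at IOVA 0x200000, a 4K request returns 2M from
toy_unmap_whole() and 0 from toy_unmap_strict(), which leaves the entry in
place.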

arm_lpae instead replaces the IOPTE with a table of smaller IOPTEs, unmaps
the 4K and returns 4K. This is actually an illegal/non-hitless operation
on at least SMMUv3 because of the BBM level 0 rules.
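
As a rough userspace sketch of the split being removed (assuming a 4K
granule, 512 entries per table, and a made-up "valid" bit in bit 0; none of
these names exist in the kernel), the old path built a next-level table that
maps everything the 2M block mapped except the requested 4K page:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT		12
#define TOY_PAGE_SIZE		(1UL << TOY_PAGE_SHIFT)
#define TOY_PTES_PER_TABLE	512	/* assumed: 4K granule, 8-byte entries */
#define TOY_PTE_VALID		1ULL	/* made-up marker bit for this model */

/* Index of @iova within the next-level table (hypothetical helper). */
static unsigned int toy_lvl_idx(uint64_t iova)
{
	return (iova >> TOY_PAGE_SHIFT) & (TOY_PTES_PER_TABLE - 1);
}

/*
 * Fill @table so it covers the old 2M block starting at @blk_paddr, leaving
 * the slot for @iova empty. Returns the number of bytes "unmapped".
 */
static size_t toy_split_blk(uint64_t blk_paddr, uint64_t iova,
			    uint64_t table[TOY_PTES_PER_TABLE])
{
	unsigned int skip = toy_lvl_idx(iova);
	unsigned int i;

	for (i = 0; i < TOY_PTES_PER_TABLE; i++) {
		if (i == skip)
			table[i] = 0;	/* the page being unmapped */
		else
			table[i] = (blk_paddr + (uint64_t)i * TOY_PAGE_SIZE) |
				   TOY_PTE_VALID;
	}
	return TOY_PAGE_SIZE;
}
```

In the real driver the new table then had to replace a live block entry,
which is the step the BBM concern above refers to.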

Will says this was done to support VFIO, but upon deeper analysis this was
never strictly necessary:

 https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/

In summary, historical VFIO supported the AMD behavior of unmapping the
whole large IOPTE and returning the size, even if asked to unmap a
portion. The driver would see this as a request to split a large IOPTE.
Modern VFIO always unmaps entire large IOPTEs (except on AMD) and drivers
don't see an IOPTE split.

Given it doesn't work fully correctly on SMMUv3 and relying on ARM unique
behavior would create portability problems across IOMMU drivers, retire
this functionality.

Outside the iommu users, this will potentially affect io_pgtable users of
ARM_32_LPAE_S1, ARM_32_LPAE_S2, ARM_64_LPAE_S1, ARM_64_LPAE_S2, and
ARM_MALI_LPAE formats.

Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/io-pgtable-arm.c | 68 +---------------------------------
 1 file changed, 2 insertions(+), 66 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 0e67f1721a3d98..9a16815b3f3434 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -569,66 +569,6 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop)
 	kfree(data);
 }
 
-static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
-				       struct iommu_iotlb_gather *gather,
-				       unsigned long iova, size_t size,
-				       arm_lpae_iopte blk_pte, int lvl,
-				       arm_lpae_iopte *ptep, size_t pgcount)
-{
-	struct io_pgtable_cfg *cfg = &data->iop.cfg;
-	arm_lpae_iopte pte, *tablep;
-	phys_addr_t blk_paddr;
-	size_t tablesz = ARM_LPAE_GRANULE(data);
-	size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
-	int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data);
-	int i, unmap_idx_start = -1, num_entries = 0, max_entries;
-
-	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
-		return 0;
-
-	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg, data->iop.cookie);
-	if (!tablep)
-		return 0; /* Bytes unmapped */
-
-	if (size == split_sz) {
-		unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
-		max_entries = ptes_per_table - unmap_idx_start;
-		num_entries = min_t(int, pgcount, max_entries);
-	}
-
-	blk_paddr = iopte_to_paddr(blk_pte, data);
-	pte = iopte_prot(blk_pte);
-
-	for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) {
-		/* Unmap! */
-		if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries))
-			continue;
-
-		__arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, &tablep[i]);
-	}
-
-	pte = arm_lpae_install_table(tablep, ptep, blk_pte, data);
-	if (pte != blk_pte) {
-		__arm_lpae_free_pages(tablep, tablesz, cfg, data->iop.cookie);
-		/*
-		 * We may race against someone unmapping another part of this
-		 * block, but anything else is invalid. We can't misinterpret
-		 * a page entry here since we're never at the last level.
-		 */
-		if (iopte_type(pte) != ARM_LPAE_PTE_TYPE_TABLE)
-			return 0;
-
-		tablep = iopte_deref(pte, data);
-	} else if (unmap_idx_start >= 0) {
-		for (i = 0; i < num_entries; i++)
-			io_pgtable_tlb_add_page(&data->iop, gather, iova + i * size, size);
-
-		return num_entries * size;
-	}
-
-	return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep);
-}
-
 static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
 			       struct iommu_iotlb_gather *gather,
 			       unsigned long iova, size_t size, size_t pgcount,
@@ -678,12 +618,8 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
 
 		return i * size;
 	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
-		/*
-		 * Insert a table at the next level to map the old region,
-		 * minus the part we want to unmap
-		 */
-		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
-						lvl + 1, ptep, pgcount);
+		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
+		return 0;
 	}
 
 	/* Keep on walkin' */
-- 
2.43.0




* [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  2024-11-04 17:41 [PATCH v2 0/3] Remove split on unmap behavior Jason Gunthorpe
  2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
@ 2024-11-04 17:41 ` Jason Gunthorpe
  2024-11-04 19:53   ` Robin Murphy
  2024-11-04 17:41 ` [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap() Jason Gunthorpe
  2 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 17:41 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

A minority of page table implementations (arm_lpae, armv7) are unique in
how they handle partial unmap of large IOPTEs.

Other implementations will unmap the large IOPTE and return its
length. For example, if a 2M IOPTE is present and the first 4K is requested
to be unmapped then unmap will remove the whole 2M and report 2M as the
result.

armv7 instead will break up contiguous entries and replace an entry with a
whole table so it can unmap the requested 4k.

This seems copied from the arm_lpae implementation, which was analyzed
here:

 https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/

Bring consistency to the implementations and remove this unused
functionality.

There are no uses outside iommu; this affects the ARM_V7S drivers
msm_iommu, mtk_iommu, and arm-smmu.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/io-pgtable-arm-v7s.c | 125 +----------------------------
 1 file changed, 4 insertions(+), 121 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 06ffc683b28fee..7e37459cd28332 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -166,7 +166,6 @@ struct arm_v7s_io_pgtable {
 
 	arm_v7s_iopte		*pgd;
 	struct kmem_cache	*l2_tables;
-	spinlock_t		split_lock;
 };
 
 static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl);
@@ -363,25 +362,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
 	return pte;
 }
 
-static int arm_v7s_pte_to_prot(arm_v7s_iopte pte, int lvl)
-{
-	int prot = IOMMU_READ;
-	arm_v7s_iopte attr = pte >> ARM_V7S_ATTR_SHIFT(lvl);
-
-	if (!(attr & ARM_V7S_PTE_AP_RDONLY))
-		prot |= IOMMU_WRITE;
-	if (!(attr & ARM_V7S_PTE_AP_UNPRIV))
-		prot |= IOMMU_PRIV;
-	if ((attr & (ARM_V7S_TEX_MASK << ARM_V7S_TEX_SHIFT)) == 0)
-		prot |= IOMMU_MMIO;
-	else if (pte & ARM_V7S_ATTR_C)
-		prot |= IOMMU_CACHE;
-	if (pte & ARM_V7S_ATTR_XN(lvl))
-		prot |= IOMMU_NOEXEC;
-
-	return prot;
-}
-
 static arm_v7s_iopte arm_v7s_pte_to_cont(arm_v7s_iopte pte, int lvl)
 {
 	if (lvl == 1) {
@@ -398,23 +378,6 @@ static arm_v7s_iopte arm_v7s_pte_to_cont(arm_v7s_iopte pte, int lvl)
 	return pte;
 }
 
-static arm_v7s_iopte arm_v7s_cont_to_pte(arm_v7s_iopte pte, int lvl)
-{
-	if (lvl == 1) {
-		pte &= ~ARM_V7S_CONT_SECTION;
-	} else if (lvl == 2) {
-		arm_v7s_iopte xn = pte & BIT(ARM_V7S_CONT_PAGE_XN_SHIFT);
-		arm_v7s_iopte tex = pte & (ARM_V7S_CONT_PAGE_TEX_MASK <<
-					   ARM_V7S_CONT_PAGE_TEX_SHIFT);
-
-		pte ^= xn | tex | ARM_V7S_PTE_TYPE_CONT_PAGE;
-		pte |= (xn >> ARM_V7S_CONT_PAGE_XN_SHIFT) |
-		       (tex >> ARM_V7S_CONT_PAGE_TEX_SHIFT) |
-		       ARM_V7S_PTE_TYPE_PAGE;
-	}
-	return pte;
-}
-
 static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl)
 {
 	if (lvl == 1 && !ARM_V7S_PTE_IS_TABLE(pte, lvl))
@@ -591,77 +554,6 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 	kfree(data);
 }
 
-static arm_v7s_iopte arm_v7s_split_cont(struct arm_v7s_io_pgtable *data,
-					unsigned long iova, int idx, int lvl,
-					arm_v7s_iopte *ptep)
-{
-	struct io_pgtable *iop = &data->iop;
-	arm_v7s_iopte pte;
-	size_t size = ARM_V7S_BLOCK_SIZE(lvl);
-	int i;
-
-	/* Check that we didn't lose a race to get the lock */
-	pte = *ptep;
-	if (!arm_v7s_pte_is_cont(pte, lvl))
-		return pte;
-
-	ptep -= idx & (ARM_V7S_CONT_PAGES - 1);
-	pte = arm_v7s_cont_to_pte(pte, lvl);
-	for (i = 0; i < ARM_V7S_CONT_PAGES; i++)
-		ptep[i] = pte + i * size;
-
-	__arm_v7s_pte_sync(ptep, ARM_V7S_CONT_PAGES, &iop->cfg);
-
-	size *= ARM_V7S_CONT_PAGES;
-	io_pgtable_tlb_flush_walk(iop, iova, size, size);
-	return pte;
-}
-
-static size_t arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
-				      struct iommu_iotlb_gather *gather,
-				      unsigned long iova, size_t size,
-				      arm_v7s_iopte blk_pte,
-				      arm_v7s_iopte *ptep)
-{
-	struct io_pgtable_cfg *cfg = &data->iop.cfg;
-	arm_v7s_iopte pte, *tablep;
-	int i, unmap_idx, num_entries, num_ptes;
-
-	tablep = __arm_v7s_alloc_table(2, GFP_ATOMIC, data);
-	if (!tablep)
-		return 0; /* Bytes unmapped */
-
-	num_ptes = ARM_V7S_PTES_PER_LVL(2, cfg);
-	num_entries = size >> ARM_V7S_LVL_SHIFT(2);
-	unmap_idx = ARM_V7S_LVL_IDX(iova, 2, cfg);
-
-	pte = arm_v7s_prot_to_pte(arm_v7s_pte_to_prot(blk_pte, 1), 2, cfg);
-	if (num_entries > 1)
-		pte = arm_v7s_pte_to_cont(pte, 2);
-
-	for (i = 0; i < num_ptes; i += num_entries, pte += size) {
-		/* Unmap! */
-		if (i == unmap_idx)
-			continue;
-
-		__arm_v7s_set_pte(&tablep[i], pte, num_entries, cfg);
-	}
-
-	pte = arm_v7s_install_table(tablep, ptep, blk_pte, cfg);
-	if (pte != blk_pte) {
-		__arm_v7s_free_table(tablep, 2, data);
-
-		if (!ARM_V7S_PTE_IS_TABLE(pte, 1))
-			return 0;
-
-		tablep = iopte_deref(pte, 1, data);
-		return __arm_v7s_unmap(data, gather, iova, size, 2, tablep);
-	}
-
-	io_pgtable_tlb_add_page(&data->iop, gather, iova, size);
-	return size;
-}
-
 static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
 			      struct iommu_iotlb_gather *gather,
 			      unsigned long iova, size_t size, int lvl,
@@ -694,11 +586,8 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
 	 * case in a lock for the sake of correctness and be done with it.
 	 */
 	if (num_entries <= 1 && arm_v7s_pte_is_cont(pte[0], lvl)) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&data->split_lock, flags);
-		pte[0] = arm_v7s_split_cont(data, iova, idx, lvl, ptep);
-		spin_unlock_irqrestore(&data->split_lock, flags);
+		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
+		return 0;
 	}
 
 	/* If the size matches this level, we're in the right place */
@@ -721,12 +610,8 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
 		}
 		return size;
 	} else if (lvl == 1 && !ARM_V7S_PTE_IS_TABLE(pte[0], lvl)) {
-		/*
-		 * Insert a table at the next level to map the old region,
-		 * minus the part we want to unmap
-		 */
-		return arm_v7s_split_blk_unmap(data, gather, iova, size, pte[0],
-					       ptep);
+		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
+		return 0;
 	}
 
 	/* Keep on walkin' */
@@ -811,8 +696,6 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	if (!data)
 		return NULL;
 
-	spin_lock_init(&data->split_lock);
-
 	/*
 	 * ARM_MTK_TTBR_EXT extend the translation table base support larger
 	 * memory address.
-- 
2.43.0




* [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap()
  2024-11-04 17:41 [PATCH v2 0/3] Remove split on unmap behavior Jason Gunthorpe
  2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
  2024-11-04 17:41 ` [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: " Jason Gunthorpe
@ 2024-11-04 17:41 ` Jason Gunthorpe
  2024-11-04 18:42   ` Liviu Dudau
  2024-11-05  3:46   ` kernel test robot
  2 siblings, 2 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 17:41 UTC (permalink / raw)
  To: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

Describe the most conservative version of the driver implementations.
All drivers should support this.

Many drivers support extending the range if a large page is hit, but
let's not make that officially approved API. The main point is to
document explicitly that split is not supported.
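
As a userspace illustration of the rule the kdoc states (a sketch only;
`struct toy_mapping` and toy_unmap_splits_nothing() are hypothetical and not
part of the iommu API), an unmap range is acceptable when every prior
iommu_map() range it touches lies entirely inside it:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one earlier iommu_map() call. */
struct toy_mapping {
	unsigned long iova;
	size_t size;
};

/*
 * Returns true if [iova, iova + size) does not cut through any mapping:
 * each mapping is either fully inside the range or not touched at all.
 */
static bool toy_unmap_splits_nothing(const struct toy_mapping *maps, size_t n,
				     unsigned long iova, size_t size)
{
	unsigned long end = iova + size;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long m_start = maps[i].iova;
		unsigned long m_end = m_start + maps[i].size;
		bool overlaps = m_start < end && iova < m_end;
		bool contained = iova <= m_start && m_end <= end;

		if (overlaps && !contained)
			return false;	/* this mapping would be split */
	}
	return true;
}
```

With mappings of 2M at IOVA 0 and 4K at IOVA 0x200000, unmapping
[0, 0x201000) aggregates both calls and passes, while unmapping only the
first 4K of the 2M mapping fails the check.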

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommu.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 83c8e617a2c588..d3cf7cc69c797c 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2586,6 +2586,20 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 	return unmapped;
 }
 
+/**
+ * iommu_unmap() - Remove mappings from a range of IOVA
+ * @domain: Domain to manipulate
+ * @iova: IO virtual address to start
+ * @size: Length of the range starting from @iova
+ *
+ * iommu_unmap() will remove a translation created by iommu_map(). It cannot
+ * subdivide a mapping created by iommu_map(), so it should be called with IOVA
+ * ranges that match what was passed to iommu_map(). The range can aggregate
+ * contiguous iommu_map() calls so long as no individual range is split.
+ *
+ * Returns: Number of bytes of IOVA unmapped. iova + res will be the point
+ * unmapping stopped.
+ */
 size_t iommu_unmap(struct iommu_domain *domain,
 		   unsigned long iova, size_t size)
 {
-- 
2.43.0




* Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Remove split on unmap behavior
  2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
@ 2024-11-04 18:38   ` Liviu Dudau
  2024-11-06 15:12   ` Steven Price
  1 sibling, 0 replies; 12+ messages in thread
From: Liviu Dudau @ 2024-11-04 18:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Boris Brezillon, dri-devel, patches, Steven Price

On Mon, Nov 04, 2024 at 01:41:29PM -0400, Jason Gunthorpe wrote:
> A minority of page table implementations (arm_lpae, armv7) are unique in
> how they handle partial unmap of large IOPTEs.
> 
> Other implementations will unmap the large IOPTE and return its
> length. For example, if a 2M IOPTE is present and the first 4K is requested
> to be unmapped then unmap will remove the whole 2M and report 2M as the
> result.
> 
> arm_lpae instead replaces the IOPTE with a table of smaller IOPTEs, unmaps
> the 4K and returns 4k. This is actually an illegal/non-hitless operation
> on at least SMMUv3 because of the BBM level 0 rules.
> 
> Will says this was done to support VFIO, but upon deeper analysis this was
> never strictly necessary:
> 
>  https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/
> 
> In summary, historical VFIO supported the AMD behavior of unmapping the
> whole large IOPTE and returning the size, even if asked to unmap a
> portion. The driver would see this as a request to split a large IOPTE.
> Modern VFIO always unmaps entire large IOPTEs (except on AMD) and drivers
> don't see an IOPTE split.
> 
> Given it doesn't work fully correctly on SMMUv3 and relying on ARM unique
> behavior would create portability problems across IOMMU drivers, retire
> this functionality.
> 
> Outside the iommu users, this will potentially affect io_pgtable users of
> ARM_32_LPAE_S1, ARM_32_LPAE_S2, ARM_64_LPAE_S1, ARM_64_LPAE_S2, and
> ARM_MALI_LPAE formats.
> 
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Liviu Dudau <liviu.dudau@arm.com>

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/io-pgtable-arm.c | 68 +---------------------------------
>  1 file changed, 2 insertions(+), 66 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 0e67f1721a3d98..9a16815b3f3434 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -569,66 +569,6 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop)
>  	kfree(data);
>  }
>  
> -static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
> -				       struct iommu_iotlb_gather *gather,
> -				       unsigned long iova, size_t size,
> -				       arm_lpae_iopte blk_pte, int lvl,
> -				       arm_lpae_iopte *ptep, size_t pgcount)
> -{
> -	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> -	arm_lpae_iopte pte, *tablep;
> -	phys_addr_t blk_paddr;
> -	size_t tablesz = ARM_LPAE_GRANULE(data);
> -	size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
> -	int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data);
> -	int i, unmap_idx_start = -1, num_entries = 0, max_entries;
> -
> -	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
> -		return 0;
> -
> -	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg, data->iop.cookie);
> -	if (!tablep)
> -		return 0; /* Bytes unmapped */
> -
> -	if (size == split_sz) {
> -		unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
> -		max_entries = ptes_per_table - unmap_idx_start;
> -		num_entries = min_t(int, pgcount, max_entries);
> -	}
> -
> -	blk_paddr = iopte_to_paddr(blk_pte, data);
> -	pte = iopte_prot(blk_pte);
> -
> -	for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) {
> -		/* Unmap! */
> -		if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries))
> -			continue;
> -
> -		__arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, &tablep[i]);
> -	}
> -
> -	pte = arm_lpae_install_table(tablep, ptep, blk_pte, data);
> -	if (pte != blk_pte) {
> -		__arm_lpae_free_pages(tablep, tablesz, cfg, data->iop.cookie);
> -		/*
> -		 * We may race against someone unmapping another part of this
> -		 * block, but anything else is invalid. We can't misinterpret
> -		 * a page entry here since we're never at the last level.
> -		 */
> -		if (iopte_type(pte) != ARM_LPAE_PTE_TYPE_TABLE)
> -			return 0;
> -
> -		tablep = iopte_deref(pte, data);
> -	} else if (unmap_idx_start >= 0) {
> -		for (i = 0; i < num_entries; i++)
> -			io_pgtable_tlb_add_page(&data->iop, gather, iova + i * size, size);
> -
> -		return num_entries * size;
> -	}
> -
> -	return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep);
> -}
> -
>  static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>  			       struct iommu_iotlb_gather *gather,
>  			       unsigned long iova, size_t size, size_t pgcount,
> @@ -678,12 +618,8 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>  
>  		return i * size;
>  	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
> -		/*
> -		 * Insert a table at the next level to map the old region,
> -		 * minus the part we want to unmap
> -		 */
> -		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
> -						lvl + 1, ptep, pgcount);
> +		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
> +		return 0;
>  	}
>  
>  	/* Keep on walkin' */
> -- 
> 2.43.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯



* Re: [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap()
  2024-11-04 17:41 ` [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap() Jason Gunthorpe
@ 2024-11-04 18:42   ` Liviu Dudau
  2024-11-05  3:46   ` kernel test robot
  1 sibling, 0 replies; 12+ messages in thread
From: Liviu Dudau @ 2024-11-04 18:42 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Robin Murphy, Will Deacon,
	Boris Brezillon, dri-devel, patches, Steven Price

On Mon, Nov 04, 2024 at 01:41:31PM -0400, Jason Gunthorpe wrote:
> Describe the most conservative version of the driver implementations.
> All drivers should support this.
> 
> Many drivers support extending the range if a large page is hit, but
> let's not make that officially approved API. The main point is to
> document explicitly that split is not supported.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/iommu.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 83c8e617a2c588..d3cf7cc69c797c 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2586,6 +2586,20 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
>  	return unmapped;
>  }
>  
> +/**
> + * iommu_unmap() - Remove mappings from a range of IOVA
> + * @domain: Domain to manipulate
> + * @iova: IO virtual address to start
> + * @size: Length of the range starting from @iova
> + *
> + * iommu_unmap() will remove a translation created by iommu_map(). It cannot
> + * subdivide a mapping created by iommu_map(), so it should be called with IOVA
> + * ranges that match what was passed to iommu_map(). The range can aggregate
> + * contiguous iommu_map() calls so long as no individual range is split.
> + *
> + * Returns: Number of bytes of IOVA unmapped. iova + res will be the point
> + * unmapping stopped.

I guess 'res' is the return value here. Not my default name for the variable,
worth replacing it with "return value" ?

Regardless of the acceptance of this nit:

Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>

Best regards,
Liviu

> + */
>  size_t iommu_unmap(struct iommu_domain *domain,
>  		   unsigned long iova, size_t size)
>  {
> -- 
> 2.43.0
> 

-- 
====================
| I would like to |
| fix the world,  |
| but they're not |
| giving me the   |
 \ source code!  /
  ---------------
    ¯\_(ツ)_/¯



* Re: [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  2024-11-04 17:41 ` [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: " Jason Gunthorpe
@ 2024-11-04 19:53   ` Robin Murphy
  2024-11-04 20:09     ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Robin Murphy @ 2024-11-04 19:53 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

On 2024-11-04 5:41 pm, Jason Gunthorpe wrote:
> A minority of page table implementations (arm_lpae, armv7) are unique in
> how they handle partial unmap of large IOPTEs.
> 
> Other implementations will unmap the large IOPTE and return its
> length. For example, if a 2M IOPTE is present and the first 4K is requested
> to be unmapped then unmap will remove the whole 2M and report 2M as the
> result.
> 
> armv7 instead will break up contiguous entries and replace an entry with a
> whole table so it can unmap the requested 4k.
> 
> This seems copied from the arm_lpae implementation, which was analyzed
> here:
> 
>   https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/
> 
> Bring consistency to the implementations and remove this unused
> functionality.
> 
> There are no uses outside iommu; this affects the ARM_V7S drivers
> msm_iommu, mtk_iommu, and arm-smmu.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/iommu/io-pgtable-arm-v7s.c | 125 +----------------------------
>   1 file changed, 4 insertions(+), 121 deletions(-)

Yikes, I'd forgotten quite how much horribleness was devoted to this, 
despite it being the "simpler" non-recursive one...

However, there are also "partial unmap" cases in both sets of selftests, 
so I think there's still a bit more to remove yet :)

Thanks,
Robin.

> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 06ffc683b28fee..7e37459cd28332 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -166,7 +166,6 @@ struct arm_v7s_io_pgtable {
>   
>   	arm_v7s_iopte		*pgd;
>   	struct kmem_cache	*l2_tables;
> -	spinlock_t		split_lock;
>   };
>   
>   static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl);
> @@ -363,25 +362,6 @@ static arm_v7s_iopte arm_v7s_prot_to_pte(int prot, int lvl,
>   	return pte;
>   }
>   
> -static int arm_v7s_pte_to_prot(arm_v7s_iopte pte, int lvl)
> -{
> -	int prot = IOMMU_READ;
> -	arm_v7s_iopte attr = pte >> ARM_V7S_ATTR_SHIFT(lvl);
> -
> -	if (!(attr & ARM_V7S_PTE_AP_RDONLY))
> -		prot |= IOMMU_WRITE;
> -	if (!(attr & ARM_V7S_PTE_AP_UNPRIV))
> -		prot |= IOMMU_PRIV;
> -	if ((attr & (ARM_V7S_TEX_MASK << ARM_V7S_TEX_SHIFT)) == 0)
> -		prot |= IOMMU_MMIO;
> -	else if (pte & ARM_V7S_ATTR_C)
> -		prot |= IOMMU_CACHE;
> -	if (pte & ARM_V7S_ATTR_XN(lvl))
> -		prot |= IOMMU_NOEXEC;
> -
> -	return prot;
> -}
> -
>   static arm_v7s_iopte arm_v7s_pte_to_cont(arm_v7s_iopte pte, int lvl)
>   {
>   	if (lvl == 1) {
> @@ -398,23 +378,6 @@ static arm_v7s_iopte arm_v7s_pte_to_cont(arm_v7s_iopte pte, int lvl)
>   	return pte;
>   }
>   
> -static arm_v7s_iopte arm_v7s_cont_to_pte(arm_v7s_iopte pte, int lvl)
> -{
> -	if (lvl == 1) {
> -		pte &= ~ARM_V7S_CONT_SECTION;
> -	} else if (lvl == 2) {
> -		arm_v7s_iopte xn = pte & BIT(ARM_V7S_CONT_PAGE_XN_SHIFT);
> -		arm_v7s_iopte tex = pte & (ARM_V7S_CONT_PAGE_TEX_MASK <<
> -					   ARM_V7S_CONT_PAGE_TEX_SHIFT);
> -
> -		pte ^= xn | tex | ARM_V7S_PTE_TYPE_CONT_PAGE;
> -		pte |= (xn >> ARM_V7S_CONT_PAGE_XN_SHIFT) |
> -		       (tex >> ARM_V7S_CONT_PAGE_TEX_SHIFT) |
> -		       ARM_V7S_PTE_TYPE_PAGE;
> -	}
> -	return pte;
> -}
> -
>   static bool arm_v7s_pte_is_cont(arm_v7s_iopte pte, int lvl)
>   {
>   	if (lvl == 1 && !ARM_V7S_PTE_IS_TABLE(pte, lvl))
> @@ -591,77 +554,6 @@ static void arm_v7s_free_pgtable(struct io_pgtable *iop)
>   	kfree(data);
>   }
>   
> -static arm_v7s_iopte arm_v7s_split_cont(struct arm_v7s_io_pgtable *data,
> -					unsigned long iova, int idx, int lvl,
> -					arm_v7s_iopte *ptep)
> -{
> -	struct io_pgtable *iop = &data->iop;
> -	arm_v7s_iopte pte;
> -	size_t size = ARM_V7S_BLOCK_SIZE(lvl);
> -	int i;
> -
> -	/* Check that we didn't lose a race to get the lock */
> -	pte = *ptep;
> -	if (!arm_v7s_pte_is_cont(pte, lvl))
> -		return pte;
> -
> -	ptep -= idx & (ARM_V7S_CONT_PAGES - 1);
> -	pte = arm_v7s_cont_to_pte(pte, lvl);
> -	for (i = 0; i < ARM_V7S_CONT_PAGES; i++)
> -		ptep[i] = pte + i * size;
> -
> -	__arm_v7s_pte_sync(ptep, ARM_V7S_CONT_PAGES, &iop->cfg);
> -
> -	size *= ARM_V7S_CONT_PAGES;
> -	io_pgtable_tlb_flush_walk(iop, iova, size, size);
> -	return pte;
> -}
> -
> -static size_t arm_v7s_split_blk_unmap(struct arm_v7s_io_pgtable *data,
> -				      struct iommu_iotlb_gather *gather,
> -				      unsigned long iova, size_t size,
> -				      arm_v7s_iopte blk_pte,
> -				      arm_v7s_iopte *ptep)
> -{
> -	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> -	arm_v7s_iopte pte, *tablep;
> -	int i, unmap_idx, num_entries, num_ptes;
> -
> -	tablep = __arm_v7s_alloc_table(2, GFP_ATOMIC, data);
> -	if (!tablep)
> -		return 0; /* Bytes unmapped */
> -
> -	num_ptes = ARM_V7S_PTES_PER_LVL(2, cfg);
> -	num_entries = size >> ARM_V7S_LVL_SHIFT(2);
> -	unmap_idx = ARM_V7S_LVL_IDX(iova, 2, cfg);
> -
> -	pte = arm_v7s_prot_to_pte(arm_v7s_pte_to_prot(blk_pte, 1), 2, cfg);
> -	if (num_entries > 1)
> -		pte = arm_v7s_pte_to_cont(pte, 2);
> -
> -	for (i = 0; i < num_ptes; i += num_entries, pte += size) {
> -		/* Unmap! */
> -		if (i == unmap_idx)
> -			continue;
> -
> -		__arm_v7s_set_pte(&tablep[i], pte, num_entries, cfg);
> -	}
> -
> -	pte = arm_v7s_install_table(tablep, ptep, blk_pte, cfg);
> -	if (pte != blk_pte) {
> -		__arm_v7s_free_table(tablep, 2, data);
> -
> -		if (!ARM_V7S_PTE_IS_TABLE(pte, 1))
> -			return 0;
> -
> -		tablep = iopte_deref(pte, 1, data);
> -		return __arm_v7s_unmap(data, gather, iova, size, 2, tablep);
> -	}
> -
> -	io_pgtable_tlb_add_page(&data->iop, gather, iova, size);
> -	return size;
> -}
> -
>   static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
>   			      struct iommu_iotlb_gather *gather,
>   			      unsigned long iova, size_t size, int lvl,
> @@ -694,11 +586,8 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
>   	 * case in a lock for the sake of correctness and be done with it.
>   	 */
>   	if (num_entries <= 1 && arm_v7s_pte_is_cont(pte[0], lvl)) {
> -		unsigned long flags;
> -
> -		spin_lock_irqsave(&data->split_lock, flags);
> -		pte[0] = arm_v7s_split_cont(data, iova, idx, lvl, ptep);
> -		spin_unlock_irqrestore(&data->split_lock, flags);
> +		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
> +		return 0;
>   	}
>   
>   	/* If the size matches this level, we're in the right place */
> @@ -721,12 +610,8 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable *data,
>   		}
>   		return size;
>   	} else if (lvl == 1 && !ARM_V7S_PTE_IS_TABLE(pte[0], lvl)) {
> -		/*
> -		 * Insert a table at the next level to map the old region,
> -		 * minus the part we want to unmap
> -		 */
> -		return arm_v7s_split_blk_unmap(data, gather, iova, size, pte[0],
> -					       ptep);
> +		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
> +		return 0;
>   	}
>   
>   	/* Keep on walkin' */
> @@ -811,8 +696,6 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
>   	if (!data)
>   		return NULL;
>   
> -	spin_lock_init(&data->split_lock);
> -
>   	/*
>   	 * ARM_MTK_TTBR_EXT extend the translation table base support larger
>   	 * memory address.




* Re: [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  2024-11-04 19:53   ` Robin Murphy
@ 2024-11-04 20:09     ` Jason Gunthorpe
  2024-11-05 16:59       ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-04 20:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: iommu, Joerg Roedel, linux-arm-kernel, Will Deacon,
	Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

On Mon, Nov 04, 2024 at 07:53:46PM +0000, Robin Murphy wrote:
> On 2024-11-04 5:41 pm, Jason Gunthorpe wrote:
> > A minority of page table implementations (arm_lpae, armv7) are unique in
> > how they handle partial unmap of large IOPTEs.
> > 
> > Other implementations will unmap the large IOPTE and return its
> > length. For example if a 2M IOPTE is present and the first 4K is requested
> > to be unmapped then unmap will remove the whole 2M and report 2M as the
> > result.
> > 
> > armv7 instead will break up contiguous entries and replace an entry with a
> > whole table so it can unmap the requested 4k.
> > 
> > This seems copied from the arm_lpae implementation, which was analyzed
> > here:
> > 
> >   https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/
> > 
> > Bring consistency to the implementations and remove this unused
> > functionality.
> > 
> > There are no uses outside iommu; this affects the ARM_V7S drivers
> > msm_iommu, mtk_iommu, and arm-smmu.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >   drivers/iommu/io-pgtable-arm-v7s.c | 125 +----------------------------
> >   1 file changed, 4 insertions(+), 121 deletions(-)
> 
> Yikes, I'd forgotten quite how much horribleness was devoted to this,
> despite it being the "simpler" non-recursive one...

Yes, it is the contiguous page support that makes it so complex.

> However, there are also "partial unmap" cases in both sets of selftests, so
> I think there's still a bit more to remove yet :)

Sneaky, I got it, thanks.

Runs OK now:

arm-v7s io-pgtable: self test ok
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32

Jason

--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -819,7 +819,7 @@ static int __init arm_v7s_do_selftests(void)
 		.quirks = IO_PGTABLE_QUIRK_ARM_NS,
 		.pgsize_bitmap = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
 	};
-	unsigned int iova, size, iova_start;
+	unsigned int iova, size;
 	unsigned int i, loopnr = 0;
 	size_t mapped;
 
@@ -871,25 +871,6 @@ static int __init arm_v7s_do_selftests(void)
 		loopnr++;
 	}
 
-	/* Partial unmap */
-	i = 1;
-	size = 1UL << __ffs(cfg.pgsize_bitmap);
-	while (i < loopnr) {
-		iova_start = i * SZ_16M;
-		if (ops->unmap_pages(ops, iova_start + size, size, 1, NULL) != size)
-			return __FAIL(ops);
-
-		/* Remap of partial unmap */
-		if (ops->map_pages(ops, iova_start + size, size, size, 1,
-				   IOMMU_READ, GFP_KERNEL, &mapped))
-			return __FAIL(ops);
-
-		if (ops->iova_to_phys(ops, iova_start + size + 42)
-		    != (size + 42))
-			return __FAIL(ops);
-		i++;
-	}
-
 	/* Full unmap */
 	iova = 0;
 	for_each_set_bit(i, &cfg.pgsize_bitmap, BITS_PER_LONG) {



* Re: [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap()
  2024-11-04 17:41 ` [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap() Jason Gunthorpe
  2024-11-04 18:42   ` Liviu Dudau
@ 2024-11-05  3:46   ` kernel test robot
  1 sibling, 0 replies; 12+ messages in thread
From: kernel test robot @ 2024-11-05  3:46 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: oe-kbuild-all, Boris Brezillon, dri-devel, Liviu Dudau, patches,
	Steven Price

Hi Jason,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 8e929cb546ee42c9a61d24fae60605e9e3192354]

url:    https://github.com/intel-lab-lkp/linux/commits/Jason-Gunthorpe/iommu-io-pgtable-arm-Remove-split-on-unmap-behavior/20241105-014356
base:   8e929cb546ee42c9a61d24fae60605e9e3192354
patch link:    https://lore.kernel.org/r/3-v2-fd55d00a60b2%2Bc69-arm_no_split_jgg%40nvidia.com
patch subject: [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap()
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20241105/202411051125.mlgeWlEm-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241105/202411051125.mlgeWlEm-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411051125.mlgeWlEm-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/iommu/iommu.c:2605: warning: Function parameter or struct member 'size' not described in 'iommu_unmap'
>> drivers/iommu/iommu.c:2605: warning: Excess function parameter 'len' description in 'iommu_unmap'
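
Both warnings appear to come from one mismatch: the kdoc added by the patch
documents `@len`, while the prototype names the parameter `size`. A minimal
sketch of one possible fix, assuming the parameter keeps the name `size` and
only the comment changes. The opaque struct declaration and the stub body are
placeholders so the fragment compiles on its own; they are not the kernel
implementation.

```c
#include <stddef.h>

struct iommu_domain;	/* opaque placeholder for this sketch */

/**
 * iommu_unmap() - Remove mappings from a range of IOVA
 * @domain: Domain to manipulate
 * @iova: IO virtual address to start
 * @size: Length of the range starting from @iova
 *
 * iommu_unmap() will remove a translation created by iommu_map(). It cannot
 * subdivide a mapping created by iommu_map(), so it should be called with IOVA
 * ranges that match what was passed to iommu_map(). The range can aggregate
 * contiguous iommu_map() calls so long as no individual range is split.
 *
 * Returns: Number of bytes of IOVA unmapped. iova + res will be the point
 * unmapping stopped.
 */
size_t iommu_unmap(struct iommu_domain *domain,
		   unsigned long iova, size_t size)
{
	/* Placeholder: nothing is mapped in this sketch, so nothing is unmapped. */
	(void)domain;
	(void)iova;
	(void)size;
	return 0;
}
```

With `@size` matching the prototype, kernel-doc should no longer report a
missing `size` description or an excess `len` description.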


vim +2605 drivers/iommu/iommu.c

add02cfdc9bc29 drivers/iommu/iommu.c Joerg Roedel    2017-08-23  2588  
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2589  /**
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2590   * iommu_unmap() - Remove mappings from a range of IOVA
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2591   * @domain: Domain to manipulate
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2592   * @iova: IO virtual address to start
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2593   * @len: Length of the range starting from @iova
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2594   *
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2595   * iommu_unmap() will remove a translation created by iommu_map(). It cannot
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2596   * subdivide a mapping created by iommu_map(), so it should be called with IOVA
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2597   * ranges that match what was passed to iommu_map(). The range can aggregate
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2598   * contiguous iommu_map() calls so long as no individual range is split.
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2599   *
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2600   * Returns: Number of bytes of IOVA unmapped. iova + res will be the point
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2601   * unmapping stopped.
6aa7e03e9dd8b5 drivers/iommu/iommu.c Jason Gunthorpe 2024-11-04  2602   */
add02cfdc9bc29 drivers/iommu/iommu.c Joerg Roedel    2017-08-23  2603  size_t iommu_unmap(struct iommu_domain *domain,
add02cfdc9bc29 drivers/iommu/iommu.c Joerg Roedel    2017-08-23  2604  		   unsigned long iova, size_t size)
add02cfdc9bc29 drivers/iommu/iommu.c Joerg Roedel    2017-08-23 @2605  {
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2606  	struct iommu_iotlb_gather iotlb_gather;
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2607  	size_t ret;
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2608  
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2609  	iommu_iotlb_gather_init(&iotlb_gather);
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2610  	ret = __iommu_unmap(domain, iova, size, &iotlb_gather);
aae4c8e27bd756 drivers/iommu/iommu.c Tom Murphy      2020-08-17  2611  	iommu_iotlb_sync(domain, &iotlb_gather);
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2612  
a7d20dc19d9ea7 drivers/iommu/iommu.c Will Deacon     2019-07-02  2613  	return ret;
add02cfdc9bc29 drivers/iommu/iommu.c Joerg Roedel    2017-08-23  2614  }
cefc53c7f49424 drivers/base/iommu.c  Joerg Roedel    2010-01-08  2615  EXPORT_SYMBOL_GPL(iommu_unmap);
1460432cb513f0 drivers/iommu/iommu.c Alex Williamson 2011-10-21  2616  
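
The rule stated in the kdoc above can be illustrated with a small user-space
model. This is only a sketch: `struct toy_mapping`, `toy_unmap_is_valid()` and
the flat array of mappings are hypothetical names invented for illustration,
not kernel APIs. It checks that an unmap range may cover several contiguous
iommu_map() ranges, provided it never cuts through one of them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one earlier iommu_map() call. */
struct toy_mapping {
	unsigned long iova;
	size_t size;
};

/*
 * Return true if [iova, iova + size) does not split any mapping, i.e. every
 * mapping lies either fully inside or fully outside the requested range.
 */
static bool toy_unmap_is_valid(const struct toy_mapping *maps, size_t n,
			       unsigned long iova, size_t size)
{
	unsigned long end = iova + size;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long m_start = maps[i].iova;
		unsigned long m_end = m_start + maps[i].size;
		bool outside = m_end <= iova || m_start >= end;
		bool inside = m_start >= iova && m_end <= end;

		if (!outside && !inside)
			return false;	/* the range would split this mapping */
	}
	return true;
}
```

For example, with a 2M mapping at IOVA 0 followed by a 4K mapping at 2M, a
request covering both (0, 2M + 4K) is accepted by this model, while a request
for only the first 4K of the 2M mapping is rejected.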

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  2024-11-04 20:09     ` Jason Gunthorpe
@ 2024-11-05 16:59       ` Will Deacon
  2024-11-05 17:11         ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2024-11-05 16:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Robin Murphy, iommu, Joerg Roedel, linux-arm-kernel,
	Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

On Mon, Nov 04, 2024 at 04:09:51PM -0400, Jason Gunthorpe wrote:
> Runs OK now:
> 
> arm-v7s io-pgtable: self test ok
> arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
> 
> Jason
> 
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -819,7 +819,7 @@ static int __init arm_v7s_do_selftests(void)
>  		.quirks = IO_PGTABLE_QUIRK_ARM_NS,
>  		.pgsize_bitmap = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
>  	};
> -	unsigned int iova, size, iova_start;
> +	unsigned int iova, size;
>  	unsigned int i, loopnr = 0;
>  	size_t mapped;
>  
> @@ -871,25 +871,6 @@ static int __init arm_v7s_do_selftests(void)
>  		loopnr++;
>  	}
>  
> -	/* Partial unmap */
> -	i = 1;
> -	size = 1UL << __ffs(cfg.pgsize_bitmap);
> -	while (i < loopnr) {
> -		iova_start = i * SZ_16M;
> -		if (ops->unmap_pages(ops, iova_start + size, size, 1, NULL) != size)
> -			return __FAIL(ops);
> -
> -		/* Remap of partial unmap */
> -		if (ops->map_pages(ops, iova_start + size, size, size, 1,
> -				   IOMMU_READ, GFP_KERNEL, &mapped))
> -			return __FAIL(ops);
> -
> -		if (ops->iova_to_phys(ops, iova_start + size + 42)
> -		    != (size + 42))
> -			return __FAIL(ops);
> -		i++;
> -	}
> -
>  	/* Full unmap */
>  	iova = 0;
>  	for_each_set_bit(i, &cfg.pgsize_bitmap, BITS_PER_LONG) {

Yup, and you can do the same for the other selftest in io-pgtable-arm.c

Will



* Re: [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: Remove split on unmap behavior
  2024-11-05 16:59       ` Will Deacon
@ 2024-11-05 17:11         ` Jason Gunthorpe
  0 siblings, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2024-11-05 17:11 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, iommu, Joerg Roedel, linux-arm-kernel,
	Boris Brezillon, dri-devel, Liviu Dudau, patches, Steven Price

On Tue, Nov 05, 2024 at 04:59:43PM +0000, Will Deacon wrote:
> >  	/* Full unmap */
> >  	iova = 0;
> >  	for_each_set_bit(i, &cfg.pgsize_bitmap, BITS_PER_LONG) {
> 
> Yup, and you can do the same for the other selftest in io-pgtable-arm.c

Ugh, yes. I ran it and thought the log it printed was the success log,
but it did actually fail too.

This seems like the right output:

arm-v7s io-pgtable: self test ok
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 32
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 36
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 40
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 42
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 44
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x40201000, IAS 48
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 32
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 36
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 40
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 42
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 44
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x02004000, IAS 48
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 32
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 36
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 40
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 42
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 44
arm-lpae io-pgtable: selftest: pgsize_bitmap 0x20010000, IAS 48
arm-lpae io-pgtable: selftest: completed with 18 PASS 0 FAIL
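
For reading the log: each set bit in pgsize_bitmap is one supported page size,
so 0x40201000 is 4K | 2M | 1G, 0x02004000 is 16K | 32M, and 0x20010000 is
64K | 512M. The partial unmap test picked the smallest of these with
`1UL << __ffs(bitmap)`. A user-space sketch of the same arithmetic follows;
`smallest_pgsize()` and `pgsize_supported()` are hypothetical helper names, and
isolating the lowest set bit stands in for the kernel's `__ffs()`.

```c
/* Isolate the lowest set bit: the smallest supported page size. */
static unsigned long smallest_pgsize(unsigned long bitmap)
{
	return bitmap & (~bitmap + 1UL);
}

/* A size is supported if it is a single power of two whose bit is set. */
static int pgsize_supported(unsigned long bitmap, unsigned long size)
{
	return size != 0 && (size & (size - 1UL)) == 0 && (bitmap & size) != 0;
}
```

With the first bitmap, the smallest size is 0x1000 (4K), which is why the old
test tried to unmap a single 4K page out of a larger block mapping.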

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 9c9ecfdf87be90..abaf323843e3c0 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -1283,19 +1283,6 @@ static int __init arm_lpae_run_tests(struct io_pgtable_cfg *cfg)
                        iova += SZ_1G;
                }
 
-               /* Partial unmap */
-               size = 1UL << __ffs(cfg->pgsize_bitmap);
-               if (ops->unmap_pages(ops, SZ_1G + size, size, 1, NULL) != size)
-                       return __FAIL(ops, i);
-
-               /* Remap of partial unmap */
-               if (ops->map_pages(ops, SZ_1G + size, size, size, 1,
-                                  IOMMU_READ, GFP_KERNEL, &mapped))
-                       return __FAIL(ops, i);
-
-               if (ops->iova_to_phys(ops, SZ_1G + size + 42) != (size + 42))
-                       return __FAIL(ops, i);
-
                /* Full unmap */
                iova = 0;
                for_each_set_bit(j, &cfg->pgsize_bitmap, BITS_PER_LONG) {



* Re: [PATCH v2 1/3] iommu/io-pgtable-arm: Remove split on unmap behavior
  2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
  2024-11-04 18:38   ` Liviu Dudau
@ 2024-11-06 15:12   ` Steven Price
  1 sibling, 0 replies; 12+ messages in thread
From: Steven Price @ 2024-11-06 15:12 UTC (permalink / raw)
  To: Jason Gunthorpe, iommu, Joerg Roedel, linux-arm-kernel,
	Robin Murphy, Will Deacon
  Cc: Boris Brezillon, dri-devel, Liviu Dudau, patches

On 04/11/2024 17:41, Jason Gunthorpe wrote:
> A minority of page table implementations (arm_lpae, armv7) are unique in
> how they handle partial unmap of large IOPTEs.
> 
> Other implementations will unmap the large IOPTE and return its
> length. For example if a 2M IOPTE is present and the first 4K is requested
> to be unmapped then unmap will remove the whole 2M and report 2M as the
> result.
> 
> arm_lpae instead replaces the IOPTE with a table of smaller IOPTEs, unmaps
> the 4K and returns 4k. This is actually an illegal/non-hitless operation
> on at least SMMUv3 because of the BBM level 0 rules.
> 
> Will says this was done to support VFIO, but upon deeper analysis this was
> never strictly necessary:
> 
>  https://lore.kernel.org/all/20241024134411.GA6956@nvidia.com/
> 
> In summary, historical VFIO supported the AMD behavior of unmapping the
> whole large IOPTE and returning the size, even if asked to unmap a
> portion. The driver would see this as a request to split a large IOPTE.
> Modern VFIO always unmaps entire large IOPTEs (except on AMD) and drivers
> don't see an IOPTE split.
> 
> Given it doesn't work fully correctly on SMMUv3 and relying on ARM unique
> behavior would create portability problems across IOMMU drivers, retire
> this functionality.
> 
> Outside the iommu users, this will potentially affect io_pgtable users of
> ARM_32_LPAE_S1, ARM_32_LPAE_S2, ARM_64_LPAE_S1, ARM_64_LPAE_S2, and
> ARM_MALI_LPAE formats.
> 
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Liviu Dudau <liviu.dudau@arm.com>
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/iommu/io-pgtable-arm.c | 68 +---------------------------------
>  1 file changed, 2 insertions(+), 66 deletions(-)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 0e67f1721a3d98..9a16815b3f3434 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -569,66 +569,6 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop)
>  	kfree(data);
>  }
>  
> -static size_t arm_lpae_split_blk_unmap(struct arm_lpae_io_pgtable *data,
> -				       struct iommu_iotlb_gather *gather,
> -				       unsigned long iova, size_t size,
> -				       arm_lpae_iopte blk_pte, int lvl,
> -				       arm_lpae_iopte *ptep, size_t pgcount)
> -{
> -	struct io_pgtable_cfg *cfg = &data->iop.cfg;
> -	arm_lpae_iopte pte, *tablep;
> -	phys_addr_t blk_paddr;
> -	size_t tablesz = ARM_LPAE_GRANULE(data);
> -	size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
> -	int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data);
> -	int i, unmap_idx_start = -1, num_entries = 0, max_entries;
> -
> -	if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
> -		return 0;
> -
> -	tablep = __arm_lpae_alloc_pages(tablesz, GFP_ATOMIC, cfg, data->iop.cookie);
> -	if (!tablep)
> -		return 0; /* Bytes unmapped */
> -
> -	if (size == split_sz) {
> -		unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
> -		max_entries = ptes_per_table - unmap_idx_start;
> -		num_entries = min_t(int, pgcount, max_entries);
> -	}
> -
> -	blk_paddr = iopte_to_paddr(blk_pte, data);
> -	pte = iopte_prot(blk_pte);
> -
> -	for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) {
> -		/* Unmap! */
> -		if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries))
> -			continue;
> -
> -		__arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, &tablep[i]);
> -	}
> -
> -	pte = arm_lpae_install_table(tablep, ptep, blk_pte, data);
> -	if (pte != blk_pte) {
> -		__arm_lpae_free_pages(tablep, tablesz, cfg, data->iop.cookie);
> -		/*
> -		 * We may race against someone unmapping another part of this
> -		 * block, but anything else is invalid. We can't misinterpret
> -		 * a page entry here since we're never at the last level.
> -		 */
> -		if (iopte_type(pte) != ARM_LPAE_PTE_TYPE_TABLE)
> -			return 0;
> -
> -		tablep = iopte_deref(pte, data);
> -	} else if (unmap_idx_start >= 0) {
> -		for (i = 0; i < num_entries; i++)
> -			io_pgtable_tlb_add_page(&data->iop, gather, iova + i * size, size);
> -
> -		return num_entries * size;
> -	}
> -
> -	return __arm_lpae_unmap(data, gather, iova, size, pgcount, lvl, tablep);
> -}
> -
>  static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>  			       struct iommu_iotlb_gather *gather,
>  			       unsigned long iova, size_t size, size_t pgcount,
> @@ -678,12 +618,8 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
>  
>  		return i * size;
>  	} else if (iopte_leaf(pte, lvl, iop->fmt)) {
> -		/*
> -		 * Insert a table at the next level to map the old region,
> -		 * minus the part we want to unmap
> -		 */
> -		return arm_lpae_split_blk_unmap(data, gather, iova, size, pte,
> -						lvl + 1, ptep, pgcount);
> +		WARN_ONCE(true, "Unmap of a partial large IOPTE is not allowed");
> +		return 0;
>  	}
>  
>  	/* Keep on walkin' */
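
The three behaviors described in the commit message can be compared in a small
user-space model. This is a sketch under stated assumptions: `enum toy_policy`
and `toy_partial_unmap()` are hypothetical names, and the function only models
the value returned for a request that falls within a single large IOPTE; it
does not walk any page table.

```c
#include <stddef.h>

enum toy_policy {
	TOY_UNMAP_WHOLE,	/* e.g. AMD: drop the whole large IOPTE */
	TOY_SPLIT,		/* old arm_lpae: split, then unmap the piece */
	TOY_REJECT,		/* this series: warn and unmap nothing */
};

/*
 * Bytes reported as unmapped when @req bytes are requested from within one
 * large IOPTE of @iopte_size bytes.
 */
static size_t toy_partial_unmap(enum toy_policy policy, size_t iopte_size,
				size_t req)
{
	if (req >= iopte_size)
		return iopte_size;	/* full unmap: every policy agrees */

	switch (policy) {
	case TOY_UNMAP_WHOLE:
		return iopte_size;
	case TOY_SPLIT:
		return req;
	case TOY_REJECT:
	default:
		return 0;
	}
}
```

For the 2M IOPTE and 4K request used in the commit message, the model returns
2M, 4K and 0 for the three policies respectively. A caller that always unmaps
whole IOPTEs sees the same result from all three.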




end of thread, other threads:[~2024-11-06 15:16 UTC | newest]

Thread overview: 12+ messages
2024-11-04 17:41 [PATCH v2 0/3] Remove split on unmap behavior Jason Gunthorpe
2024-11-04 17:41 ` [PATCH v2 1/3] iommu/io-pgtable-arm: " Jason Gunthorpe
2024-11-04 18:38   ` Liviu Dudau
2024-11-06 15:12   ` Steven Price
2024-11-04 17:41 ` [PATCH v2 2/3] iommu/io-pgtable-arm-v7s: " Jason Gunthorpe
2024-11-04 19:53   ` Robin Murphy
2024-11-04 20:09     ` Jason Gunthorpe
2024-11-05 16:59       ` Will Deacon
2024-11-05 17:11         ` Jason Gunthorpe
2024-11-04 17:41 ` [PATCH v2 3/3] iommu: Add a kdoc to iommu_unmap() Jason Gunthorpe
2024-11-04 18:42   ` Liviu Dudau
2024-11-05  3:46   ` kernel test robot
