The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy()
@ 2026-07-01  9:20 Honglei Huang
  2026-07-01 12:36 ` Robin Murphy
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Honglei Huang @ 2026-07-01  9:20 UTC (permalink / raw)
  To: robin.murphy, joro, will, leonro, m.szyprowski
  Cc: iommu, linux-kernel, Ray.Huang, honghuan

dma_iova_try_alloc() reserves IOVA for the whole requested size and
records it in state->__size, but callers may subsequently link only a
part of that reservation, for example the drm_gpusvm mixed range case,
where a device page range is linked incrementally.

The doc for dma_iova_destroy() is:

  "Unlink the IOVA range up to @mapped_len and free the entire IOVA
   space."

However __iommu_dma_iova_unlink() computed the amount of IOVA to free
from @mapped_len rather than from the full reservation. When the
reservation is larger than the linked length, the tail
[mapped_len, reserved size] is never returned to the allocator and
is leaked, contrary to the documented contract.

Free the whole reservation using dma_iova_size(), mirroring
dma_iova_free(). The unmap step still operates on @mapped_len only, and
the same iotlb_gather is reused so a single IOTLB flush is performed.

Fixes: 433a76207dcf ("dma-mapping: Implement link/unlink ranges API")
Cc: stable@vger.kernel.org
Signed-off-by: Honglei Huang <honghuan@amd.com>
---
 drivers/iommu/dma-iommu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9abaec0703e..bb29c82d1c8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -2096,8 +2096,11 @@ static void __iommu_dma_iova_unlink(struct device *dev,
 
 	if (!iotlb_gather.queued)
 		iommu_iotlb_sync(domain, &iotlb_gather);
-	if (free_iova)
+	if (free_iova) {
+		/* Free the whole reservation, not just the linked @size. */
+		size = iova_align(iovad, dma_iova_size(state) + iova_start_pad);
 		iommu_dma_free_iova(domain, addr, size, &iotlb_gather);
+	}
 }
 
 /**

base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy()
  2026-07-01  9:20 [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy() Honglei Huang
@ 2026-07-01 12:36 ` Robin Murphy
  2026-07-01 19:08   ` Leon Romanovsky
  2026-07-01 19:09 ` Leon Romanovsky
  2026-07-02 10:24 ` Leon Romanovsky
  2 siblings, 1 reply; 5+ messages in thread
From: Robin Murphy @ 2026-07-01 12:36 UTC (permalink / raw)
  To: Honglei Huang, joro, will, leonro, m.szyprowski
  Cc: iommu, linux-kernel, Ray.Huang

On 01/07/2026 10:20 am, Honglei Huang wrote:
> dma_iova_try_alloc() reserves IOVA for the whole requested size and
> records it in state->__size, but callers may subsequently link only a
> part of that reservation, for example the drm_gpusvm mixed range case,
> where a device page range is linked incrementally.
> 
> The doc for dma_iova_destroy() is:
> 
>    "Unlink the IOVA range up to @mapped_len and free the entire IOVA
>     space."
> 
> However __iommu_dma_iova_unlink() computed the amount of IOVA to free
> from @mapped_len rather than from the full reservation. When the
> reservation is larger than the linked length, the tail
> [mapped_len, reserved size] is never returned to the allocator and
> is leaked, contrary to the documented contract.

That's not what really happens in practice though - note that 
free_iova() doesn't even take a size, only a pfn with which to look up 
the corresponding rbtree entry. At worst, for sizes small enough for the 
rcaches, a larger IOVA may be put in a cache for a smaller size, which 
although wasteful, is otherwise pretty much benign.

This isn't to say that the allocator behaviour might not eventually 
change in future, but for now I can only assume that dma_iova_destroy() 
doing this is intentional, because I pointed it out at least 3 times 
over the course of the original review from RFC to eventual merge, and 
Leon made a point of refusing to do anything about it :/

Thanks,
Robin.

> Free the whole reservation using dma_iova_size(), mirroring
> dma_iova_free(). The unmap step still operates on @mapped_len only, and
> the same iotlb_gather is reused so a single IOTLB flush is performed.
> 
> Fixes: 433a76207dcf ("dma-mapping: Implement link/unlink ranges API")
> Cc: stable@vger.kernel.org
> Signed-off-by: Honglei Huang <honghuan@amd.com>
> ---
>   drivers/iommu/dma-iommu.c | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9abaec0703e..bb29c82d1c8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -2096,8 +2096,11 @@ static void __iommu_dma_iova_unlink(struct device *dev,
>   
>   	if (!iotlb_gather.queued)
>   		iommu_iotlb_sync(domain, &iotlb_gather);
> -	if (free_iova)
> +	if (free_iova) {
> +		/* Free the whole reservation, not just the linked @size. */
> +		size = iova_align(iovad, dma_iova_size(state) + iova_start_pad);
>   		iommu_dma_free_iova(domain, addr, size, &iotlb_gather);
> +	}
>   }
>   
>   /**
> 
> base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy()
  2026-07-01 12:36 ` Robin Murphy
@ 2026-07-01 19:08   ` Leon Romanovsky
  0 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2026-07-01 19:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Honglei Huang, joro, will, m.szyprowski, iommu, linux-kernel,
	Ray.Huang

On Wed, Jul 01, 2026 at 01:36:00PM +0100, Robin Murphy wrote:
> On 01/07/2026 10:20 am, Honglei Huang wrote:
> > dma_iova_try_alloc() reserves IOVA for the whole requested size and
> > records it in state->__size, but callers may subsequently link only a
> > part of that reservation, for example the drm_gpusvm mixed range case,
> > where a device page range is linked incrementally.
> > 
> > The doc for dma_iova_destroy() is:
> > 
> >    "Unlink the IOVA range up to @mapped_len and free the entire IOVA
> >     space."
> > 
> > However __iommu_dma_iova_unlink() computed the amount of IOVA to free
> > from @mapped_len rather than from the full reservation. When the
> > reservation is larger than the linked length, the tail
> > [mapped_len, reserved size] is never returned to the allocator and
> > is leaked, contrary to the documented contract.
> 
> That's not what really happens in practice though - note that free_iova()
> doesn't even take a size, only a pfn with which to look up the corresponding
> rbtree entry. At worst, for sizes small enough for the rcaches, a larger
> IOVA may be put in a cache for a smaller size, which although wasteful, is
> otherwise pretty much benign.
> 
> This isn't to say that the allocator behaviour might not eventually change
> in future, but for now I can only assume that dma_iova_destroy() doing this
> is intentional, because I pointed it out at least 3 times over the course of
> the original review from RFC to eventual merge, and Leon made a point of
> refusing to do anything about it :/

I found only one sentence that was somewhat unclear in this
context. Are you referring to it?
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
https://lore.kernel.org/all/?q=dma_iova_destroy+robin+romanovsky
https://lore.kernel.org/all/ad2312e0-10d5-467a-be5e-75e80805b311@arm.com/
> +	if (free_iova)
> +		iommu_dma_free_iova(cookie, addr, size, &iotlb_gather);

Case in point, can you spot the bug here if dma_iova_destroy() is used
as intended? At least it's the relatively benign direction of this bug,
not the really fun pagetable corruption one.
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Regarding the fix, I think it is the correct one. I intended to free
all space previously allocated by iommu_dma_alloc_iova(), not just
"mapped_len", as I wrote at the end.

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy()
  2026-07-01  9:20 [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy() Honglei Huang
  2026-07-01 12:36 ` Robin Murphy
@ 2026-07-01 19:09 ` Leon Romanovsky
  2026-07-02 10:24 ` Leon Romanovsky
  2 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2026-07-01 19:09 UTC (permalink / raw)
  To: Honglei Huang
  Cc: robin.murphy, joro, will, m.szyprowski, iommu, linux-kernel,
	Ray.Huang

On Wed, Jul 01, 2026 at 05:20:33PM +0800, Honglei Huang wrote:
> dma_iova_try_alloc() reserves IOVA for the whole requested size and
> records it in state->__size, but callers may subsequently link only a
> part of that reservation, for example the drm_gpusvm mixed range case,
> where a device page range is linked incrementally.
> 
> The doc for dma_iova_destroy() is:
> 
>   "Unlink the IOVA range up to @mapped_len and free the entire IOVA
>    space."
> 
> However __iommu_dma_iova_unlink() computed the amount of IOVA to free
> from @mapped_len rather than from the full reservation. When the
> reservation is larger than the linked length, the tail
> [mapped_len, reserved size] is never returned to the allocator and
> is leaked, contrary to the documented contract.
> 
> Free the whole reservation using dma_iova_size(), mirroring
> dma_iova_free(). The unmap step still operates on @mapped_len only, and
> the same iotlb_gather is reused so a single IOTLB flush is performed.
> 
> Fixes: 433a76207dcf ("dma-mapping: Implement link/unlink ranges API")
> Cc: stable@vger.kernel.org
> Signed-off-by: Honglei Huang <honghuan@amd.com>
> ---
>  drivers/iommu/dma-iommu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy()
  2026-07-01  9:20 [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy() Honglei Huang
  2026-07-01 12:36 ` Robin Murphy
  2026-07-01 19:09 ` Leon Romanovsky
@ 2026-07-02 10:24 ` Leon Romanovsky
  2 siblings, 0 replies; 5+ messages in thread
From: Leon Romanovsky @ 2026-07-02 10:24 UTC (permalink / raw)
  To: Honglei Huang
  Cc: robin.murphy, joro, will, m.szyprowski, iommu, linux-kernel,
	Ray.Huang

On Wed, Jul 01, 2026 at 05:20:33PM +0800, Honglei Huang wrote:
> dma_iova_try_alloc() reserves IOVA for the whole requested size and
> records it in state->__size, but callers may subsequently link only a
> part of that reservation, for example the drm_gpusvm mixed range case,
> where a device page range is linked incrementally.
> 
> The doc for dma_iova_destroy() is:
> 
>   "Unlink the IOVA range up to @mapped_len and free the entire IOVA
>    space."
> 
> However __iommu_dma_iova_unlink() computed the amount of IOVA to free
> from @mapped_len rather than from the full reservation. When the
> reservation is larger than the linked length, the tail
> [mapped_len, reserved size] is never returned to the allocator and
> is leaked, contrary to the documented contract.
> 
> Free the whole reservation using dma_iova_size(), mirroring
> dma_iova_free(). The unmap step still operates on @mapped_len only, and
> the same iotlb_gather is reused so a single IOTLB flush is performed.
> 
> Fixes: 433a76207dcf ("dma-mapping: Implement link/unlink ranges API")
> Cc: stable@vger.kernel.org
> Signed-off-by: Honglei Huang <honghuan@amd.com>
> ---
>  drivers/iommu/dma-iommu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 9abaec0703e..bb29c82d1c8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -2096,8 +2096,11 @@ static void __iommu_dma_iova_unlink(struct device *dev,
>  
>  	if (!iotlb_gather.queued)
>  		iommu_iotlb_sync(domain, &iotlb_gather);
> -	if (free_iova)
> +	if (free_iova) {
> +		/* Free the whole reservation, not just the linked @size. */
> +		size = iova_align(iovad, dma_iova_size(state) + iova_start_pad);
>  		iommu_dma_free_iova(domain, addr, size, &iotlb_gather);
> +	}

Probably the best change will be something like this:

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9abaec0703ef..56173e24c8cc 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -2068,10 +2068,20 @@ static void iommu_dma_iova_unlink_range_slow(struct device *dev,
 		arch_sync_dma_flush();
 }
 
-static void __iommu_dma_iova_unlink(struct device *dev,
-		struct dma_iova_state *state, size_t offset, size_t size,
-		enum dma_data_direction dir, unsigned long attrs,
-		bool free_iova)
+/**
+ * dma_iova_unlink - Unlink a range of IOVA space
+ * @dev: DMA device
+ * @state: IOVA state
+ * @offset: offset into the IOVA state to unlink
+ * @size: size of the buffer
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink a range of IOVA space for the given IOVA state.
+ */
+void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
 {
 	struct iommu_domain *domain = iommu_get_dma_domain(dev);
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -2096,26 +2106,6 @@ static void __iommu_dma_iova_unlink(struct device *dev,
 
 	if (!iotlb_gather.queued)
 		iommu_iotlb_sync(domain, &iotlb_gather);
-	if (free_iova)
-		iommu_dma_free_iova(domain, addr, size, &iotlb_gather);
-}
-
-/**
- * dma_iova_unlink - Unlink a range of IOVA space
- * @dev: DMA device
- * @state: IOVA state
- * @offset: offset into the IOVA state to unlink
- * @size: size of the buffer
- * @dir: DMA direction
- * @attrs: attributes of mapping properties
- *
- * Unlink a range of IOVA space for the given IOVA state.
- */
-void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
-		size_t offset, size_t size, enum dma_data_direction dir,
-		unsigned long attrs)
-{
-	 __iommu_dma_iova_unlink(dev, state, offset, size, dir, attrs, false);
 }
 EXPORT_SYMBOL_GPL(dma_iova_unlink);
 
@@ -2136,14 +2126,13 @@ void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
 		unsigned long attrs)
 {
 	if (mapped_len)
-		__iommu_dma_iova_unlink(dev, state, 0, mapped_len, dir, attrs,
-				true);
-	else
-		/*
-		 * We can be here if first call to dma_iova_link() failed and
-		 * there is nothing to unlink, so let's be more clear.
-		 */
-		dma_iova_free(dev, state);
+		dma_iova_unlink(dev, state, 0, mapped_len, dir, attrs);
+
+	/*
+	 * We can be here if first call to dma_iova_link() failed and
+	 * there is nothing to unlink, so let's be more clear.
+	 */
+	dma_iova_free(dev, state);
 }
 EXPORT_SYMBOL_GPL(dma_iova_destroy);
 

>  }
>  
>  /**
> 
> base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> -- 
> 2.34.1
> 

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-07-02 10:25 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-01  9:20 [PATCH] iommu/dma: free the entire IOVA reservation in dma_iova_destroy() Honglei Huang
2026-07-01 12:36 ` Robin Murphy
2026-07-01 19:08   ` Leon Romanovsky
2026-07-01 19:09 ` Leon Romanovsky
2026-07-02 10:24 ` Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox