* [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode
@ 2026-02-27 15:27 Jason Gunthorpe
2026-02-27 21:02 ` Samiullah Khawaja
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2026-02-27 15:27 UTC (permalink / raw)
To: iommu, Joerg Roedel, Will Deacon; +Cc: patches, Robin Murphy
In PT_FEAT_FLUSH_RANGE mode the gather was accumulated but never flushed
and then the accumulated range was discarded by the dma-iommu code in
DMA-FQ mode. This is basically optimal.

However for PT_FEAT_FLUSH_RANGE_NO_GAPS the page table would push flushes
that are redundant with the flush all generated by the DMA-FQ mode.

Disable all range accumulation in the gather, and iommu_pt triggered
flushing, when iommu_iotlb_gather_queued() indicates it is in DMA-FQ
mode.
Reported-by: Robin Murphy <robin.murphy@arm.com>
Closes: https://lore.kernel.org/r/794b6121-b66b-4819-b291-9761ed21cd83@arm.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/iommu/generic_pt/iommu_pt.h | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h
index 3e33fe64feab22..9c08bb594e4173 100644
--- a/drivers/iommu/generic_pt/iommu_pt.h
+++ b/drivers/iommu/generic_pt/iommu_pt.h
@@ -51,16 +51,27 @@ static void gather_range_pages(struct iommu_iotlb_gather *iotlb_gather,
iommu_pages_stop_incoherent_list(free_list,
iommu_table->iommu_device);
- if (pt_feature(common, PT_FEAT_FLUSH_RANGE_NO_GAPS) &&
- iommu_iotlb_gather_is_disjoint(iotlb_gather, iova, len)) {
- iommu_iotlb_sync(&iommu_table->domain, iotlb_gather);
- /*
- * Note that the sync frees the gather's free list, so we must
- * not have any pages on that list that are covered by iova/len
- */
+ /*
+ * If running in DMA-FQ mode then the unmap will be followed by an IOTLB
+ * flush all so we need to optimize by never flushing the IOTLB here.
+ *
+ * For NO_GAPS the user gets to pick if flushing all or doing micro
+ * flushes is better for their work load by choosing DMA vs DMA-FQ
+ * operation. Drivers should also see shadow_on_flush.
+ */
+ if (!iommu_iotlb_gather_queued(iotlb_gather)) {
+ if (pt_feature(common, PT_FEAT_FLUSH_RANGE_NO_GAPS) &&
+ iommu_iotlb_gather_is_disjoint(iotlb_gather, iova, len)) {
+ iommu_iotlb_sync(&iommu_table->domain, iotlb_gather);
+ /*
+ * Note that the sync frees the gather's free list, so
+ * we must not have any pages on that list that are
+ * covered by iova/len
+ */
+ }
+ iommu_iotlb_gather_add_range(iotlb_gather, iova, len);
}
- iommu_iotlb_gather_add_range(iotlb_gather, iova, len);
iommu_pages_list_splice(free_list, &iotlb_gather->freelist);
}
base-commit: 851dbe76d47e790db0d6ac5151e7f22be357846a
--
2.43.0
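
For readers following along outside the kernel tree, the control flow the
patch introduces can be sketched as a small userspace model. This is only an
illustration under stated assumptions: every toy_* name is hypothetical, the
struct is a stand-in for struct iommu_iotlb_gather, and the sync merely counts
calls and resets the range rather than touching any hardware. The disjoint
and add-range helpers are modeled on my reading of the inline helpers in
include/linux/iommu.h and may not match them exactly.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct iommu_iotlb_gather; not the kernel type. */
struct toy_gather {
	unsigned long start;	/* lowest IOVA accumulated so far */
	unsigned long end;	/* highest IOVA accumulated, 0 means empty */
	bool queued;		/* true models DMA-FQ (flush queue) mode */
	unsigned int syncs;	/* range flushes issued from the unmap path */
};

static void toy_gather_init(struct toy_gather *g, bool queued)
{
	g->start = ULONG_MAX;
	g->end = 0;
	g->queued = queued;
	g->syncs = 0;
}

/* True when [iova, iova + len) neither overlaps nor touches the gather. */
static bool toy_is_disjoint(const struct toy_gather *g, unsigned long iova,
			    size_t len)
{
	unsigned long end = iova + len - 1;

	return g->end != 0 && (end + 1 < g->start || iova > g->end + 1);
}

/* Widen the accumulated range so it also covers [iova, iova + len). */
static void toy_add_range(struct toy_gather *g, unsigned long iova, size_t len)
{
	unsigned long end = iova + len - 1;

	assert(len > 0);
	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Assumed simplification: count the flush and empty the range. */
static void toy_sync(struct toy_gather *g)
{
	g->syncs++;
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Mirrors the shape of the patched branch in gather_range_pages(). */
static void toy_gather_range(struct toy_gather *g, bool no_gaps,
			     unsigned long iova, size_t len)
{
	if (g->queued)
		return;	/* DMA-FQ: a later flush all covers this unmap */
	if (no_gaps && toy_is_disjoint(g, iova, len))
		toy_sync(g);	/* NO_GAPS: flush before a gap would form */
	toy_add_range(g, iova, len);
}

/* Unmap two non-adjacent ranges and report how many syncs were pushed. */
static unsigned int toy_two_disjoint_unmaps(bool queued, bool no_gaps)
{
	struct toy_gather g;

	toy_gather_init(&g, queued);
	toy_gather_range(&g, no_gaps, 0x1000, 0x1000);
	toy_gather_range(&g, no_gaps, 0x10000, 0x1000);
	return g.syncs;
}
```

In this model, two disjoint unmaps with no_gaps set push one sync when the
gather is not queued and none when it is queued, which is the redundancy the
patch removes. The free list splice is left out because the patch keeps it
unconditional.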
* Re: [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode
2026-02-27 15:27 [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode Jason Gunthorpe
@ 2026-02-27 21:02 ` Samiullah Khawaja
2026-03-03 3:55 ` Baolu Lu
2026-03-17 12:15 ` Joerg Roedel
2 siblings, 0 replies; 4+ messages in thread
From: Samiullah Khawaja @ 2026-02-27 21:02 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Joerg Roedel, Will Deacon, patches, Robin Murphy
On Fri, Feb 27, 2026 at 11:27:02AM -0400, Jason Gunthorpe wrote:
>In PT_FEAT_FLUSH_RANGE mode the gather was accumulated but never flushed
>and then the accumulated range was discarded by the dma-iommu code in
>DMA-FQ mode. This is basically optimal.
>
>However for PT_FEAT_FLUSH_RANGE_NO_GAPS the page table would push flushes
>that are redundant with the flush all generated by the DMA-FQ mode.
>
>Disable all range accumulation in the gather, and iommu_pt triggered
>flushing when in iommu_iotlb_gather_queued() indicates it is in DMA-FQ
>mode.
>
>Reported-by: Robin Murphy <robin.murphy@arm.com>
>Closes: https://lore.kernel.org/r/794b6121-b66b-4819-b291-9761ed21cd83@arm.com
>Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>---
> drivers/iommu/generic_pt/iommu_pt.h | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
>diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h
>index 3e33fe64feab22..9c08bb594e4173 100644
>--- a/drivers/iommu/generic_pt/iommu_pt.h
>+++ b/drivers/iommu/generic_pt/iommu_pt.h
>@@ -51,16 +51,27 @@ static void gather_range_pages(struct iommu_iotlb_gather *iotlb_gather,
> iommu_pages_stop_incoherent_list(free_list,
> iommu_table->iommu_device);
>
>- if (pt_feature(common, PT_FEAT_FLUSH_RANGE_NO_GAPS) &&
>- iommu_iotlb_gather_is_disjoint(iotlb_gather, iova, len)) {
>- iommu_iotlb_sync(&iommu_table->domain, iotlb_gather);
>- /*
>- * Note that the sync frees the gather's free list, so we must
>- * not have any pages on that list that are covered by iova/len
>- */
>+ /*
>+ * If running in DMA-FQ mode then the unmap will be followed by an IOTLB
>+ * flush all so we need to optimize by never flushing the IOTLB here.
>+ *
>+ * For NO_GAPS the user gets to pick if flushing all or doing micro
>+ * flushes is better for their work load by choosing DMA vs DMA-FQ
>+ * operation. Drivers should also see shadow_on_flush.
>+ */
>+ if (!iommu_iotlb_gather_queued(iotlb_gather)) {
>+ if (pt_feature(common, PT_FEAT_FLUSH_RANGE_NO_GAPS) &&
>+ iommu_iotlb_gather_is_disjoint(iotlb_gather, iova, len)) {
>+ iommu_iotlb_sync(&iommu_table->domain, iotlb_gather);
>+ /*
>+ * Note that the sync frees the gather's free list, so
>+ * we must not have any pages on that list that are
>+ * covered by iova/len
>+ */
>+ }
>+ iommu_iotlb_gather_add_range(iotlb_gather, iova, len);
> }
>
>- iommu_iotlb_gather_add_range(iotlb_gather, iova, len);
> iommu_pages_list_splice(free_list, &iotlb_gather->freelist);
> }
>
>
>base-commit: 851dbe76d47e790db0d6ac5151e7f22be357846a
>--
>2.43.0
>
>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
* Re: [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode
2026-02-27 15:27 [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode Jason Gunthorpe
2026-02-27 21:02 ` Samiullah Khawaja
@ 2026-03-03 3:55 ` Baolu Lu
2026-03-17 12:15 ` Joerg Roedel
2 siblings, 0 replies; 4+ messages in thread
From: Baolu Lu @ 2026-03-03 3:55 UTC (permalink / raw)
To: Jason Gunthorpe, iommu, Joerg Roedel, Will Deacon; +Cc: patches, Robin Murphy
On 2/27/26 23:27, Jason Gunthorpe wrote:
> In PT_FEAT_FLUSH_RANGE mode the gather was accumulated but never flushed
> and then the accumulated range was discarded by the dma-iommu code in
> DMA-FQ mode. This is basically optimal.
>
> However for PT_FEAT_FLUSH_RANGE_NO_GAPS the page table would push flushes
> that are redundant with the flush all generated by the DMA-FQ mode.
>
> Disable all range accumulation in the gather, and iommu_pt triggered
> flushing when in iommu_iotlb_gather_queued() indicates it is in DMA-FQ
> mode.
>
> Reported-by: Robin Murphy<robin.murphy@arm.com>
> Closes:https://lore.kernel.org/r/794b6121-b66b-4819-b291-9761ed21cd83@arm.com
> Signed-off-by: Jason Gunthorpe<jgg@nvidia.com>
> ---
> drivers/iommu/generic_pt/iommu_pt.h | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
* Re: [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode
2026-02-27 15:27 [PATCH] iommupt: Optimize the gather processing for DMA-FQ mode Jason Gunthorpe
2026-02-27 21:02 ` Samiullah Khawaja
2026-03-03 3:55 ` Baolu Lu
@ 2026-03-17 12:15 ` Joerg Roedel
2 siblings, 0 replies; 4+ messages in thread
From: Joerg Roedel @ 2026-03-17 12:15 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: iommu, Will Deacon, patches, Robin Murphy
On Fri, Feb 27, 2026 at 11:27:02AM -0400, Jason Gunthorpe wrote:
> In PT_FEAT_FLUSH_RANGE mode the gather was accumulated but never flushed
> and then the accumulated range was discarded by the dma-iommu code in
> DMA-FQ mode. This is basically optimal.
>
> However for PT_FEAT_FLUSH_RANGE_NO_GAPS the page table would push flushes
> that are redundant with the flush all generated by the DMA-FQ mode.
>
> Disable all range accumulation in the gather, and iommu_pt triggered
> flushing when in iommu_iotlb_gather_queued() indicates it is in DMA-FQ
> mode.
>
> Reported-by: Robin Murphy <robin.murphy@arm.com>
> Closes: https://lore.kernel.org/r/794b6121-b66b-4819-b291-9761ed21cd83@arm.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/generic_pt/iommu_pt.h | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
Applied, thanks.