* Re: [PATCH] iommu: Always fill in gather when unmapping
[not found] <0-v1-664d3acaabb9+78b-iommu_gather_always_jgg@nvidia.com>
@ 2026-04-01 10:40 ` Jon Hunter
[not found] ` <ee2c2044-e329-4cdd-ac35-9365824d3677@arm.com>
1 sibling, 0 replies; 3+ messages in thread
From: Jon Hunter @ 2026-04-01 10:40 UTC (permalink / raw)
To: Jason Gunthorpe, Alexandre Ghiti, AngeloGioacchino Del Regno,
Albert Ou, asahi, Baolin Wang, iommu, Janne Grunau,
Jernej Skrabec, Joerg Roedel, Jean-Philippe Brucker,
linux-arm-kernel, linux-mediatek, linux-riscv, linux-sunxi,
Matthias Brugger, Neal Gompa, Orson Zhai, Palmer Dabbelt,
Paul Walmsley, Samuel Holland, Sven Peter, virtualization,
Chen-Yu Tsai, Will Deacon, Yong Wu, Chunyan Zhang
Cc: Lu Baolu, Janusz Krzysztofik, Joerg Roedel, patches, Robin Murphy,
Samiullah Khawaja, stable, Vasant Hegde,
linux-tegra@vger.kernel.org
On 31/03/2026 20:56, Jason Gunthorpe wrote:
> The fixed commit assumed that the gather would always be populated if
> an iotlb_sync was required.
>
> arm-smmu-v3, amd, VT-d, riscv, s390, mtk all use information from the
> gather during their iotlb_sync() and this approach works for them.
>
> However, arm-smmu, qcom_iommu, ipmmu-vmsa, sun50i, sprd, virtio,
> apple-dart all ignore the gather during their iotlb_sync(). They
> mostly issue a full flush.
>
> Unfortunately the latter set of drivers often don't bother to add
> anything to the gather since they don't intend on using it. Since the
> core code now blocks gathers that were never filled, this caused those
> drivers to stop getting their iotlb_sync() calls and breaks them.
>
> Since it is impossible to tell the difference between gathers that are
> empty because there is nothing to do and gathers that are empty
> because they are not used, fill in the gathers for the missing cases.
>
> io-pgtable might have intended to allow the driver to choose between
> gather or immediate flush because it passed gather to
> ops->tlb_add_page(), however no driver does anything with it.
>
> mtk uses io-pgtable-arm-v7s but added the range to the gather in the
> unmap callback. Move this into the io-pgtable-arm unmap itself. That
> will fix all the armv7 using drivers (arm-smmu, qcom_iommu,
> ipmmu-vmsa).
>
> arm-smmu uses both ARM_V7S and ARM LPAE formats. The LPAE formats
> already have the gather population because SMMUv3 requires it, so it
> becomes consistent.
>
> Add a trivial gather population to io-pgtable-dart.
>
> Add trivial populations to sprd, sun50i and virtio-iommu in their
> unmap functions.
>
> Fixes: 90c5def10bea ("iommu: Do not call drivers for empty gathers")
> Reported-by: Jon Hunter <jonathanh@nvidia.com>
> Closes: https://lore.kernel.org/r/8800a38b-8515-4bbe-af15-0dae81274bf7@nvidia.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/io-pgtable-arm.c | 4 +++-
> drivers/iommu/io-pgtable-dart.c | 3 +++
> drivers/iommu/mtk_iommu.c | 1 -
> drivers/iommu/sprd-iommu.c | 1 +
> drivers/iommu/sun50i-iommu.c | 1 +
> drivers/iommu/virtio-iommu.c | 2 ++
> 6 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 0208e5897c299a..8572713a42ca29 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -666,9 +666,11 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> /* Clear the remaining entries */
> __arm_lpae_clear_pte(ptep, &iop->cfg, i);
>
> - if (gather && !iommu_iotlb_gather_queued(gather))
> + if (gather && !iommu_iotlb_gather_queued(gather)) {
> + iommu_iotlb_gather_add_range(gather, iova, i * size);
> for (int j = 0; j < i; j++)
> io_pgtable_tlb_add_page(iop, gather, iova + j * size, size);
> + }
>
> return i * size;
> } else if (iopte_leaf(pte, lvl, iop->fmt)) {
> diff --git a/drivers/iommu/io-pgtable-dart.c b/drivers/iommu/io-pgtable-dart.c
> index cbc5d6aa2daa23..75d699dc28e7b0 100644
> --- a/drivers/iommu/io-pgtable-dart.c
> +++ b/drivers/iommu/io-pgtable-dart.c
> @@ -330,6 +330,9 @@ static size_t dart_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova,
> i++;
> }
>
> + if (i && !iommu_iotlb_gather_queued(gather))
> + iommu_iotlb_gather_add_range(gather, iova, i * pgsize);
> +
> return i * pgsize;
> }
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 2be990c108de2b..a2f80a92f51f2c 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -828,7 +828,6 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
> {
> struct mtk_iommu_domain *dom = to_mtk_domain(domain);
>
> - iommu_iotlb_gather_add_range(gather, iova, pgsize * pgcount);
> return dom->iop->unmap_pages(dom->iop, iova, pgsize, pgcount, gather);
> }
>
> diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
> index c1a34445d244fb..893ea67d322644 100644
> --- a/drivers/iommu/sprd-iommu.c
> +++ b/drivers/iommu/sprd-iommu.c
> @@ -340,6 +340,7 @@ static size_t sprd_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
> spin_lock_irqsave(&dom->pgtlock, flags);
> memset(pgt_base_iova, 0, pgcount * sizeof(u32));
> spin_unlock_irqrestore(&dom->pgtlock, flags);
> + iommu_iotlb_gather_add_range(iotlb_gather, iova, size);
>
> return size;
> }
> diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c
> index be3f1ce696ba29..b9aa4bbc82acad 100644
> --- a/drivers/iommu/sun50i-iommu.c
> +++ b/drivers/iommu/sun50i-iommu.c
> @@ -655,6 +655,7 @@ static size_t sun50i_iommu_unmap(struct iommu_domain *domain, unsigned long iova
>
> memset(pte_addr, 0, sizeof(*pte_addr));
> sun50i_table_flush(sun50i_domain, pte_addr, 1);
> + iommu_iotlb_gather_add_range(gather, iova, SZ_4K);
>
> return SZ_4K;
> }
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 587fc13197f122..5865b8f6c6e67a 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -897,6 +897,8 @@ static size_t viommu_unmap_pages(struct iommu_domain *domain, unsigned long iova
> if (unmapped < size)
> return 0;
>
> + iommu_iotlb_gather_add_range(gather, iova, unmapped);
> +
> /* Device already removed all mappings after detach. */
> if (!vdomain->nr_endpoints)
> return unmapped;
>
> base-commit: fcbe430399ca5c318e99bfda6df9beee90ab051c
Fixes the issue I was seeing ...
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Thanks!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 3+ messages in thread