From: Jason Gunthorpe <jgg@nvidia.com>
To: Will Deacon <will@kernel.org>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Boris Brezillon <boris.brezillon@collabora.com>,
dri-devel@lists.freedesktop.org,
Liviu Dudau <liviu.dudau@arm.com>,
patches@lists.linux.dev, Steven Price <steven.price@arm.com>
Subject: Re: [PATCH] iommu/io-pgtable-arm: Remove split on unmap behavior
Date: Thu, 24 Oct 2024 10:44:11 -0300 [thread overview]
Message-ID: <20241024134411.GA6956@nvidia.com> (raw)
In-Reply-To: <20241024130550.GE30704@willie-the-truck>
On Thu, Oct 24, 2024 at 02:05:53PM +0100, Will Deacon wrote:
> My recollection is hazy, but I seem to remember VFIO using the largest
> page sizes in the IOMMU 'pgsize_bitmap' for map() requests but then
> using the smallest page size for unmap() requests, so you'd end up
> cracking block mappings when tearing down a VM with assigne devices.
>
> Is this what you're referring to when you say?
Sounds like it, but I'm really hazy on the long ago history here.
> > Long ago VFIO could trigger a path like this, today I know of no user of
> > this functionality.
>
> If so, please can you provide a reference to the patch that moved VFIO
> off that problematic behaviour?
Looking more deeply, I'm not really sure VFIO ever required this.
vfio commit 166fd7d94afd ("vfio: hugepage support for
vfio_iommu_type1") is the thing that added the huge page support, and
it called map like:
+ ret = iommu_map(iommu->domain, iova,
+ (phys_addr_t)pfn << PAGE_SHIFT,
+ npage << PAGE_SHIFT, prot);
But then the unmap path still looked like:
+ unmapped = iommu_unmap(iommu->domain, iova, PAGE_SIZE);
+ unlocked += vfio_unpin_pages(phys >> PAGE_SHIFT,
+ unmapped >> PAGE_SHIFT,
+ dma->prot, false);
So at that time it was relying on the "unmap smaller gives the larger
size" trick that we see Intel and AMD implementing today. No
requirement for split, but the ARM split code could be triggered.
Next came the introduction of VFIO_TYPE1v2_IOMMU which eliminated the
ability for userspace to request splitting a mapping. Userspace can
only unmap what userspace maps. commit 1ef3e2bc0422
("vfio/iommu_type1: Multi-IOMMU domain support")
To do this, our DMA tracking needs to change. We currently try to
coalesce user mappings into as few tracking entries as possible. The
problem then becomes that we lose granularity of user mappings. We've
never guaranteed that a user is able to unmap at a finer granularity
than the original mapping, but we must honor the granularity of the
original mapping. This coalescing code is therefore removed, allowing
only unmaps covering complete maps. The change in accounting is
fairly small here, a typical QEMU VM will start out with roughly a
dozen entries, so it's arguable if this coalescing was ever needed.
That blocked any requirement for splitting driven by the uAPI. Noting
uAPI based splitting never worked right and AFAICT AMD didn't
implement split at the time.
Finally, commit 6fe1010d6d9c ("vfio/type1: DMA unmap chunking")
changed the unmap loop to this:
- unmapped = iommu_unmap(domain->domain, iova, PAGE_SIZE);
+ /*
+ * To optimize for fewer iommu_unmap() calls, each of which
+ * may require hardware cache flushing, try to find the
+ * largest contiguous physical memory chunk to unmap.
+ */
+ for (len = PAGE_SIZE;
+ !domain->fgsp && iova + len < end; len += PAGE_SIZE) {
+ next = iommu_iova_to_phys(domain->domain, iova + len);
+ if (next != phys + len)
+ break;
+ }
+
+ unmapped = iommu_unmap(domain->domain, iova, len);
fgsp=true is only possible on AMD, and this loop effectively
guarantees that the iommu driver will never see a partial huge page
unmap as the size is discovered via iommu_iova_to_phys() before
calling unmap.
At that point only the AMD driver will ever see a page size that is
smaller than what is in the radix tree.
Jason
next prev parent reply other threads:[~2024-10-24 13:46 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-18 17:19 [PATCH] iommu/io-pgtable-arm: Remove split on unmap behavior Jason Gunthorpe
2024-10-21 9:17 ` Boris Brezillon
2024-10-21 11:32 ` Steven Price
2024-10-21 12:17 ` Jason Gunthorpe
2024-10-21 13:50 ` Robin Murphy
2024-10-21 14:20 ` Jason Gunthorpe
2024-10-24 13:05 ` Will Deacon
2024-10-24 13:44 ` Jason Gunthorpe [this message]
2024-11-01 11:54 ` Will Deacon
2024-11-01 11:58 ` Will Deacon
2024-11-01 15:37 ` Jason Gunthorpe
2024-11-01 13:40 ` Jason Gunthorpe
2024-11-04 11:32 ` Will Deacon
2025-07-05 20:12 ` Daniel Mentz
2025-07-07 16:05 ` Liviu Dudau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241024134411.GA6956@nvidia.com \
--to=jgg@nvidia.com \
--cc=boris.brezillon@collabora.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=iommu@lists.linux.dev \
--cc=joro@8bytes.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=liviu.dudau@arm.com \
--cc=patches@lists.linux.dev \
--cc=robin.murphy@arm.com \
--cc=steven.price@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).