public inbox for kvm@vger.kernel.org
From: Alex Williamson <alex@shazbot.org>
To: Alex Mastro <amastro@fb.com>
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 0/3] vfio: handle DMA map/unmap up to the addressable limit
Date: Thu, 16 Oct 2025 16:01:38 -0600	[thread overview]
Message-ID: <20251016160138.374c8cfb@shazbot.org> (raw)
In-Reply-To: <aPFheZru+U+C4jT7@devgpu015.cco6.facebook.com>

On Thu, 16 Oct 2025 14:19:53 -0700
Alex Mastro <amastro@fb.com> wrote:

> On Wed, Oct 15, 2025 at 05:25:14PM -0400, Alejandro Jimenez wrote:
> > 
> > 
> > On 10/15/25 3:24 PM, Alex Williamson wrote:  
> > > On Sun, 12 Oct 2025 22:32:23 -0700
> > > Alex Mastro <amastro@fb.com> wrote:
> > >   
> > > > This patch series aims to fix vfio_iommu_type1.c to support
> > > > VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA operations targeting IOVA
> > > > ranges which lie against the addressable limit, i.e. ranges where
> > > > iova_start + iova_size would overflow to exactly zero.  
> > > 
> > > The series looks good to me and passes my testing.  Any further reviews
> > > from anyone?  I think we should make this v6.18-rc material.  Thanks,
> > >   
> > 
> > I haven't had a chance yet to closely review the latest patchset versions,
> > but I did test this v4 and confirmed that it solves the issue of not being
> > able to unmap an IOVA range extending up to the address space boundary. I
> > verified both with the simplified test case at:
> > https://gist.github.com/aljimenezb/f3338c9c2eda9b0a7bf5f76b40354db8
> > 
> > plus using QEMU's amd-iommu and a guest with iommu.passthrough=0
> > iommu.forcedac=1 (which is how I first found the problem).
> > 
> > So Alex Mastro, please feel free to add:
> > 
> > Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
> > 
> > for the series. I'll try to find time to review the patches in detail.
> > 
> > Thank you,
> > Alejandro
> >   
> > > Alex  
> >   
> 
> Thanks all. I would like additional scrutiny around vfio_iommu_replay. It was
> the one block of affected code I have not been able to test, since I don't
> have, and am not sure how to simulate, a setup which causes mappings to be
> replayed on a newly added IOMMU domain. My confidence in that code comes from
> close review only.
> 
> I explicitly tested various combinations of the following with mappings up to
> the addressable limit:
> - VFIO_IOMMU_MAP_DMA
> - VFIO_IOMMU_UNMAP_DMA range-based, and VFIO_DMA_UNMAP_FLAG_ALL
> - VFIO_IOMMU_DIRTY_PAGES with VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP
> 
> My understanding is that my changes to vfio_iommu_replay would be traversed
> when binding a group to a container with existing mappings where the group's
> IOMMU domain is not already one of the domains in the container.

I really appreciate you noting that you're uneasy about the change.
I'll take a closer look there, and hopefully others can as well.

The legacy vfio container represents a single IOMMU context, which is
typically managed by a single domain.  The replay comes into play when
groups are under IOMMUs with different properties that prevent us from
re-using the domain.  The case that comes to mind most for this is
Intel platforms with integrated graphics, where there's a separate
IOMMU for the GPU which, IIRC, has different coherency settings.

That mechanism for triggering replay requires a specific hardware
configuration, but we can easily trigger it through code
instrumentation, e.g.:

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5167bec14e36..2cb19ddbb524 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2368,7 +2368,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
                    d->enforce_cache_coherency ==
                            domain->enforce_cache_coherency) {
                        iommu_detach_group(domain->domain, group->iommu_group);
-                       if (!iommu_attach_group(d->domain,
+                       if (0 && !iommu_attach_group(d->domain,
                                                group->iommu_group)) {
                                list_add(&group->next, &d->group_list);
                                iommu_domain_free(domain->domain);

We might consider whether it's useful for testing purposes to expose a
mechanism to toggle this.  For a unit test, we could create a
container, add a group, and build up some suspect mappings; adding
another group to the container with the above bypass should then
trigger the replay.

In general, though, the replay shouldn't have a way to trigger
overflows; we're simply iterating the current set of mappings, which
have already been validated, and applying them to a new domain.

In any case, we can all take a second look at the changes there.
Thanks,

Alex


Thread overview: 27+ messages
2025-10-13  5:32 [PATCH v4 0/3] vfio: handle DMA map/unmap up to the addressable limit Alex Mastro
2025-10-13  5:32 ` [PATCH v4 1/3] vfio/type1: sanitize for overflow using check_*_overflow Alex Mastro
2025-10-13  5:32 ` [PATCH v4 2/3] vfio/type1: move iova increment to unmap_unpin_* caller Alex Mastro
2025-10-13  5:32 ` [PATCH v4 3/3] vfio/type1: handle DMA map/unmap up to the addressable limit Alex Mastro
2025-10-21 22:18   ` Alejandro Jimenez
2025-10-22 14:24     ` Alex Mastro
2025-10-15 19:24 ` [PATCH v4 0/3] vfio: " Alex Williamson
2025-10-15 21:25   ` Alejandro Jimenez
2025-10-16 21:19     ` Alex Mastro
2025-10-16 22:01       ` Alex Williamson [this message]
2025-10-17 16:29         ` Alex Mastro
2025-10-20 21:36           ` Alex Williamson
2025-10-21 16:25             ` Alex Mastro
2025-10-21 16:31               ` David Matlack
2025-10-21 19:13                 ` Alex Mastro
2025-10-22  0:38                   ` David Matlack
2025-10-22 14:55                     ` Alex Mastro
2025-10-23 20:52               ` Alex Mastro
2025-10-23 22:33                 ` Alex Williamson
2025-10-27 16:02               ` Alex Mastro
2025-10-28  1:57                 ` Alex Williamson
2025-10-28 15:29                   ` Alex Mastro
2025-10-21 12:36 ` Jason Gunthorpe
2025-10-21 22:21 ` Alejandro Jimenez
2025-10-25 18:11 ` Alex Mastro
2025-10-27 13:39   ` Jason Gunthorpe
2025-10-28 18:42     ` Alex Mastro
