From: Leon Romanovsky <leon@kernel.org>
To: Matthew Brost <matthew.brost@intel.com>
Cc: "Jason Gunthorpe" <jgg@nvidia.com>,
"Francois Dugast" <francois.dugast@intel.com>,
iommu@lists.linux.dev, intel-xe@lists.freedesktop.org,
"Joerg Roedel" <joerg.roedel@amd.com>,
"Calvin Owens" <calvin@wbinvd.org>,
"David Woodhouse" <dwmw2@infradead.org>,
"Will Deacon" <will@kernel.org>,
"Robin Murphy" <robin.murphy@arm.com>,
"Samiullah Khawaja" <skhawaja@google.com>,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Tina Zhang" <tina.zhang@intel.com>,
"Lu Baolu" <baolu.lu@linux.intel.com>,
"Kevin Tian" <kevin.tian@intel.com>
Subject: Re: Xe performance regression with recent IOMMU changes
Date: Thu, 22 Jan 2026 12:26:07 +0200 [thread overview]
Message-ID: <20260122102607.GK13201@unreal> (raw)
In-Reply-To: <aXHTjz6SP5w/PTPa@lstrano-desk.jf.intel.com>
On Wed, Jan 21, 2026 at 11:36:47PM -0800, Matthew Brost wrote:
> On Thu, Jan 22, 2026 at 09:29:13AM +0200, Leon Romanovsky wrote:
> > On Wed, Jan 21, 2026 at 10:15:14PM -0800, Matthew Brost wrote:
> > > On Wed, Jan 21, 2026 at 02:04:49PM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Jan 21, 2026 at 09:11:35AM -0400, Jason Gunthorpe wrote:
> > > > > On Wed, Jan 21, 2026 at 02:02:16PM +0100, Francois Dugast wrote:
> > > > > > I am reporting a slowdown in Xe caused by a couple of IOMMU changes. It
> > > > > > can be observed during DMA mappings/unmappings required to issue copies
> > > > > > between system memory and the device, when handling GPU faults. Not sure
> > > > > > how other use cases or vendors are affected but below is the impact on
> > > > > > execution times for BMG:
> > > > > >
> > > > > > Before changes:
> > > > > > 4KB
> > > > > > drm_pagemap_migrate_map_pages: 0.4 us
> > > > > > drm_pagemap_migrate_unmap_pages: 0.4 us
> > > > > > 64KB
> > > > > > drm_pagemap_migrate_map_pages: 2.5 us
> > > > > > drm_pagemap_migrate_unmap_pages: 3.5 us
> > > > > > 2MB
> > > > > > drm_pagemap_migrate_map_pages: 88 us
> > > > > > drm_pagemap_migrate_unmap_pages: 108 us
> > > > > >
> > > > > > After changes:
> > > > > > 4KB
> > > > > > drm_pagemap_migrate_map_pages: 0.7 us
> > > > > > drm_pagemap_migrate_unmap_pages: 0.7 us
> > > > > > 64KB
> > > > > > drm_pagemap_migrate_map_pages: 3.5 us
> > > > > > drm_pagemap_migrate_unmap_pages: 10.5 us
> > > > > > 2MB
> > > > > > drm_pagemap_migrate_map_pages: 102 us
> > > > > > drm_pagemap_migrate_unmap_pages: 330 us
> > > > >
> > > > > I posted some more optimizations for these cases; they should reduce
> > > > > the numbers.
> > > > >
> > >
> > > We can try those; do you have a link? I believe I know the series, but
> > > want to make sure we're on the same page.
> > >
> > > > > This is the opposite of the benchmark numbers I ran which showed
> > > > > significant gains as the page count and sizes increased.
> > > > >
> > > > > But something weird is going on to see a 3x increase in unmap, that
> > > > > shouldn't be just algorithm overhead. That almost seems like
> > > > > additional IOTLB invalidation overhead or something else going wrong.
> > > > >
> > > > > Is this from a system with the VT-d cache flushing requirement? That
> > > > > logic changed around too and could have this kind of big impact.
> > > >
> > > > Oh looking at the code a bit you've got pretty much the slowest
> > > > possible thing you can do here:
> > >
> > > This was a fairly common pattern prior to Leon's series, I believe. The
> > > cross-references show this pattern appearing frequently in the kernel
> > > [1]. I do agree with the point below that, with Leon's changes applied,
> > > this could be refactored into an IOVA alloc/link/unlink/free flow, which
> > > would work better (also, 2M device pages make the common 2M case moot).
> > >
> > > But that’s not what we’re discussing here. We’re talking about a
> > > regression introduced in the dma-mapping API for x86, which in my view
> > > is unacceptable for a kernel release. So IMO we should revert those
> > > changes [2].
> > >
> > > [1] https://elixir.bootlin.com/linux/v6.18.6/A/ident/dma_unmap_page
> >
> > I think this comparison is unfair. The previous behavior was bad for
> > everyone, while the current issue affects only the specific
> > drm_pagemap_migrate_unmap_pages() flow. Cases where the performance of
> > dma_unmap_page() in non-direct mode matters are extremely rare.
> >
>
> I don’t think you can reason about this without extensive testing across
> multiple platforms. Nor is it fair to say "sorry, we slowed down your
> existing code, good luck."
That is not what I said. I only pointed out that a loop over
dma_unmap_page() is not universally performance critical.
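For readers following along, the two patterns being compared look roughly
like this. This is a hedged sketch, not the actual drm_pagemap code: exact
signatures are per <linux/dma-mapping.h> from the IOVA series, error
handling is elided, and `pages`/`addrs`/`npages` are placeholder names.

```c
/* Pattern A: per-page loop (the pre-series style). Each
 * dma_unmap_page() can incur its own IOTLB invalidation. */
for (i = 0; i < npages; i++)
	dma_unmap_page(dev, addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);

/* Pattern B: one IOVA range, linked page by page and torn down in a
 * single call (sketch of the alloc/link/unlink/free flow). */
struct dma_iova_state state;

if (dma_iova_try_alloc(dev, &state, 0, npages << PAGE_SHIFT)) {
	for (i = 0; i < npages; i++)
		ret = dma_iova_link(dev, &state, page_to_phys(pages[i]),
				    i << PAGE_SHIFT, PAGE_SIZE,
				    DMA_BIDIRECTIONAL, 0);
	ret = dma_iova_sync(dev, &state, 0, npages << PAGE_SHIFT);
	/* ... use the mapping ... */
	dma_iova_destroy(dev, &state, npages << PAGE_SHIFT,
			 DMA_BIDIRECTIONAL, 0);
}
```

The point of pattern B is that the unmap side collapses into one
dma_iova_destroy() call, batching the IOTLB invalidation instead of paying
for it on every page.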
Thanks
Thread overview:
2026-01-21 13:02 Xe performance regression with recent IOMMU changes Francois Dugast
2026-01-21 13:11 ` Jason Gunthorpe
2026-01-21 18:04 ` Jason Gunthorpe
2026-01-22 6:15 ` Matthew Brost
2026-01-22 7:29 ` Leon Romanovsky
2026-01-22 7:36 ` Matthew Brost
2026-01-22 10:26 ` Leon Romanovsky [this message]
2026-01-22 13:31 ` Jason Gunthorpe
2026-01-23 16:27 ` Francois Dugast
2026-01-23 19:07 ` Jason Gunthorpe