From: Leon Romanovsky <leon@kernel.org>
To: Matthew Brost <matthew.brost@intel.com>
Cc: "Jason Gunthorpe" <jgg@nvidia.com>,
	"Francois Dugast" <francois.dugast@intel.com>,
	iommu@lists.linux.dev, intel-xe@lists.freedesktop.org,
	"Joerg Roedel" <joerg.roedel@amd.com>,
	"Calvin Owens" <calvin@wbinvd.org>,
	"David Woodhouse" <dwmw2@infradead.org>,
	"Will Deacon" <will@kernel.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Samiullah Khawaja" <skhawaja@google.com>,
	"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
	"Tina Zhang" <tina.zhang@intel.com>,
	"Lu Baolu" <baolu.lu@linux.intel.com>,
	"Kevin Tian" <kevin.tian@intel.com>
Subject: Re: Xe performance regression with recent IOMMU changes
Date: Thu, 22 Jan 2026 12:26:07 +0200
Message-ID: <20260122102607.GK13201@unreal>
In-Reply-To: <aXHTjz6SP5w/PTPa@lstrano-desk.jf.intel.com>

On Wed, Jan 21, 2026 at 11:36:47PM -0800, Matthew Brost wrote:
> On Thu, Jan 22, 2026 at 09:29:13AM +0200, Leon Romanovsky wrote:
> > On Wed, Jan 21, 2026 at 10:15:14PM -0800, Matthew Brost wrote:
> > > On Wed, Jan 21, 2026 at 02:04:49PM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Jan 21, 2026 at 09:11:35AM -0400, Jason Gunthorpe wrote:
> > > > > On Wed, Jan 21, 2026 at 02:02:16PM +0100, Francois Dugast wrote:
> > > > > > I am reporting a slowdown in Xe caused by a couple of IOMMU changes. It
> > > > > > can be observed during DMA mappings/unmappings required to issue copies
> > > > > > between system memory and the device, when handling GPU faults. Not sure
> > > > > > how other use cases or vendors are affected but below is the impact on
> > > > > > execution times for BMG:
> > > > > > 
> > > > > > Before changes:
> > > > > >   4KB
> > > > > >     drm_pagemap_migrate_map_pages: 0.4 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 0.4 us
> > > > > >   64KB
> > > > > >     drm_pagemap_migrate_map_pages: 2.5 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 3.5 us
> > > > > >   2MB
> > > > > >     drm_pagemap_migrate_map_pages: 88 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 108 us
> > > > > > 
> > > > > > After changes:
> > > > > >   4KB
> > > > > >     drm_pagemap_migrate_map_pages: 0.7 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 0.7 us
> > > > > >   64KB
> > > > > >     drm_pagemap_migrate_map_pages: 3.5 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 10.5 us
> > > > > >   2MB
> > > > > >     drm_pagemap_migrate_map_pages: 102 us
> > > > > >     drm_pagemap_migrate_unmap_pages: 330 us
> > > > > 
> > > > > I posted some more optimizations for these cases, it should reduce the
> > > > > numbers.
> > > > > 
> > > 
> > > We can try those — link? I believe I know the series, but just to make
> > > sure we’re on the same page.
> > > 
> > > > > This is the opposite of the benchmark numbers I ran which showed
> > > > > significant gains as the page count and sizes increased.
> > > > > 
> > > > > But something weird is going on to see a 3x increase in unmap, that
> > > > > shouldn't be just algorithm overhead. That almost seems like
> > > > > additional IOTLB invalidation overhead or something else going wrong.
> > > > > 
> > > > > Is this from a system with the VT-d cache flushing requirement? That
> > > > > logic changed around too and could have this kind of big impact.
> > > > 
> > > > Oh looking at the code a bit you've got pretty much the slowest
> > > > possible thing you can do here:
> > > 
> > > This was a fairly common pattern prior to Leon’s series, I believe. The
> > > cross-references show this pattern appearing frequently in the kernel
> > > [1]. I do agree with the point below that, with Leon’s changes applied,
> > > this could be refactored into an IOVA alloc/link/unlink/free flow, which
> > > would work better (also, 2M device pages reduce the common 2M case to a
> > > moot point).
> > > 
> > > But that’s not what we’re discussing here. We’re talking about a
> > > regression introduced in the dma-mapping API for x86, which in my view
> > > is unacceptable for a kernel release. So IMO we should revert those
> > > changes [2].
> > > 
> > > [1] https://elixir.bootlin.com/linux/v6.18.6/A/ident/dma_unmap_page
> > 
> > I think this comparison is unfair. The previous behavior was bad for
> > everyone, while the current issue affects only the specific
> > drm_pagemap_migrate_unmap_pages() flow. Cases where the performance of
> > dma_unmap_page() in non-direct mode matters are extremely rare.
> > 
> 
> I don’t think you can reason about this without extensive testing across
> multiple platforms. Nor is it fair to say - sorry we slowed down your
> existing code, good luck.

That is not what I said. I only pointed out that a loop over
dma_unmap_page() is not universally performance critical.

Thanks
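
For reference, the slowdown implied by the figures quoted above can be
worked out directly. This is a minimal sketch and not part of the original
report: the timings are copied from Francois's numbers, while the
TIMINGS_US table and the slowdown_table() helper are names made up here for
illustration.

```python
# Timings in microseconds, copied from the quoted report (BMG).
# Layout per size: (map_before, map_after, unmap_before, unmap_after).
# TIMINGS_US and slowdown_table() are illustrative names, not kernel or Xe APIs.
TIMINGS_US = {
    "4KB": (0.4, 0.7, 0.4, 0.7),
    "64KB": (2.5, 3.5, 3.5, 10.5),
    "2MB": (88.0, 102.0, 108.0, 330.0),
}


def slowdown_table(timings=TIMINGS_US):
    """Return {size: (map_ratio, unmap_ratio)}, each ratio being after / before."""
    table = {}
    for size, (map_before, map_after, unmap_before, unmap_after) in timings.items():
        map_ratio = round(map_after / map_before, 2)
        unmap_ratio = round(unmap_after / unmap_before, 2)
        table[size] = (map_ratio, unmap_ratio)
    return table


if __name__ == "__main__":
    for size, (map_ratio, unmap_ratio) in slowdown_table().items():
        print(f"{size:>5}: map x{map_ratio}, unmap x{unmap_ratio}")
```

With these inputs the 64KB and 2MB unmap paths come out at roughly 3x,
which matches the "3x increase in unmap" Jason mentions, while map slows
down by at most 1.75x.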
